Repository: jikexueyuanwiki/tensorflow-zh Branch: master Commit: 28dd80bc65d6 Files: 181 Total size: 2.2 MB Directory structure: gitextract_iq8tfs9_/ ├── .gitignore ├── README.md ├── README副本.md ├── SOURCE/ │ ├── api_docs/ │ │ ├── cc/ │ │ │ ├── ClassEnv.md │ │ │ ├── ClassEnvWrapper.md │ │ │ ├── ClassRandomAccessFile.md │ │ │ ├── ClassSession.md │ │ │ ├── ClassStatus.md │ │ │ ├── ClassTensor.md │ │ │ ├── ClassTensorBuffer.md │ │ │ ├── ClassTensorShape.md │ │ │ ├── ClassTensorShapeIter.md │ │ │ ├── ClassTensorShapeUtils.md │ │ │ ├── ClassThread.md │ │ │ ├── ClassWritableFile.md │ │ │ ├── StructSessionOptions.md │ │ │ ├── StructState.md │ │ │ ├── StructTensorShapeDim.md │ │ │ ├── StructThreadOptions.md │ │ │ └── index.md │ │ ├── index.md │ │ └── python/ │ │ ├── array_ops.md │ │ ├── client.md │ │ ├── constant_op.md │ │ ├── control_flow_ops.md │ │ ├── framework.md │ │ ├── image.md │ │ ├── index.md │ │ ├── io_ops.md │ │ ├── math_ops.md │ │ ├── nn.md │ │ ├── python_io.md │ │ ├── sparse_ops.md │ │ ├── state_ops.md │ │ └── train.md │ ├── extras/ │ │ └── README.txt │ ├── faq.md │ ├── get_started/ │ │ ├── basic_usage.md │ │ ├── introduction.md │ │ └── os_setup.md │ ├── how_tos/ │ │ ├── __init__.py │ │ ├── adding_an_op/ │ │ │ ├── __init__.py │ │ │ ├── attr_examples.cc │ │ │ ├── fact_test.py │ │ │ ├── index.md │ │ │ ├── zero_out_1_test.py │ │ │ ├── zero_out_2_test.py │ │ │ ├── zero_out_3_test.py │ │ │ ├── zero_out_grad_2.py │ │ │ ├── zero_out_op_kernel_1.cc │ │ │ ├── zero_out_op_kernel_2.cc │ │ │ └── zero_out_op_kernel_3.cc │ │ ├── adding_an_op.md │ │ ├── graph_viz/ │ │ │ └── index.md │ │ ├── graph_viz.md │ │ ├── new_data_formats/ │ │ │ ├── Custom Data Readers.md │ │ │ └── index.md │ │ ├── new_data_formats.md │ │ ├── overview.md │ │ ├── reading_data/ │ │ │ ├── __init__.py │ │ │ ├── convert_to_records.py │ │ │ ├── fully_connected_preloaded.py │ │ │ ├── fully_connected_preloaded_var.py │ │ │ ├── fully_connected_reader.py │ │ │ └── index.md │ │ ├── reading_data.md │ │ ├── 
summaries_and_tensorboard/ │ │ │ └── index.md │ │ ├── summaries_and_tensorboard.md │ │ ├── threading_and_queues/ │ │ │ └── index.md │ │ ├── threading_and_queues.md │ │ ├── using_gpu/ │ │ │ └── index.md │ │ ├── using_gpu.md │ │ ├── variable_scope/ │ │ │ └── index.md │ │ ├── variable_scope.md │ │ ├── variables/ │ │ │ └── index.md │ │ └── variables.md │ ├── images/ │ │ └── getting_started.dot │ ├── index.md │ ├── personal.md │ ├── resource.md │ ├── resources/ │ │ ├── bib.md │ │ ├── dims_types.md │ │ ├── faq.md │ │ ├── glossary.md │ │ ├── overview.md │ │ └── uses.md │ └── tutorials/ │ ├── BUILD │ ├── __init__.py │ ├── deep_cnn/ │ │ ├── cifar_tensorboard.html │ │ └── index.md │ ├── deep_cnn.md │ ├── mandelbrot/ │ │ └── index.md │ ├── mandelbrot.md │ ├── mnist/ │ │ ├── __init__.py │ │ ├── beginners/ │ │ │ └── index.md │ │ ├── download/ │ │ │ └── index.md │ │ ├── fully_connected_feed.py │ │ ├── input_data.py │ │ ├── mnist.py │ │ ├── mnist_softmax.py │ │ ├── pros/ │ │ │ └── index.md │ │ └── tf/ │ │ └── index.md │ ├── mnist_beginners.md │ ├── mnist_download.md │ ├── mnist_pros.md │ ├── mnist_tf.md │ ├── overview.md │ ├── pdes/ │ │ └── index.md │ ├── pdes.md │ ├── recurrent/ │ │ └── index.md │ ├── recurrent.md │ ├── seq2seq/ │ │ └── index.md │ ├── seq2seq.md │ ├── word2vec/ │ │ ├── __init__.py │ │ ├── index.md │ │ └── word2vec_basic.py │ └── word2vec.md ├── TOC.md ├── config.json ├── learn-github.md ├── markdown.md └── tex_pdf/ ├── api/ │ ├── c4s00.tex │ ├── cc/ │ │ ├── ClassEnv.md │ │ ├── ClassEnvWrapper.md │ │ ├── ClassRandomAccessFile.md │ │ ├── ClassSession.md │ │ ├── ClassStatus.md │ │ ├── ClassTensor.md │ │ ├── ClassTensorBuffer.md │ │ ├── ClassTensorShape.md │ │ ├── ClassTensorShapeIter.md │ │ ├── ClassTensorShapeUtils.md │ │ ├── ClassThread.md │ │ ├── ClassWritableFile.md │ │ ├── StructSessionOptions.md │ │ ├── StructState.md │ │ ├── StructTensorShapeDim.md │ │ ├── StructThreadOptions.md │ │ └── index.md │ └── python/ │ ├── c4s01_framework.tex │ ├── 
c4s02_constant_op.tex │ ├── c4s03_state_ops.tex │ ├── c4s04_array_ops.tex │ ├── c4s05_math_ops.tex │ ├── c4s06_control_flow_ops.tex │ ├── c4s07_image.tex │ ├── c4s08_sparse_ops.tex │ ├── c4s09_io_ops.tex │ ├── c4s10_python_io.tex │ ├── c4s11_nn.tex │ ├── c4s12_client.tex │ ├── c4s13_train.tex │ └── index.tex ├── get_started/ │ ├── c1s01_introduction.tex │ ├── c1s02_os_setup.tex │ └── c1s03_basic_usage.tex ├── how_tos/ │ ├── c3s00_overview.tex │ ├── c3s01_variables.tex │ ├── c3s02_variable_scope.tex │ ├── c3s03_viz_learning.tex │ ├── c3s04_graph_viz.tex │ ├── c3s05_reading_data.tex │ ├── c3s06_threading_and_queues.tex │ ├── c3s07_adding_an_op.tex │ ├── c3s08_new_data_formats.tex │ └── c3s09_using_gpu.tex ├── tensorflow_manual_cn.idx ├── tensorflow_manual_cn.ilg ├── tensorflow_manual_cn.ind ├── tensorflow_manual_cn.tex └── tutorials/ ├── c2s00_overview.tex ├── c2s01_minist_beginners.tex ├── c2s02_minist_pros.tex ├── c2s03_.tex ├── c2s04_.tex ├── c2s05_word2vec.tex ├── c2s06_recurrent.tex ├── c2s07_seq2seq.tex ├── c2s08_2_9_mandelbrot.tex ├── c2s09_pdes.tex └── c2s10_mnist_download.tex

================================================
FILE CONTENTS
================================================

================================================
FILE: .gitignore
================================================
*.aux
*.fdb_latexmk
*.log
*.out
*.synctex.gz
*.toc
*.fls
*.sublime-project
*.sublime-workspace
*.DS_Store
*.idea

================================================
FILE: README.md
================================================
# TensorFlow Official Documentation, Chinese Edition

![](SOURCE/images/TensorFlow.jpg)

### The project you are reading may influence the world even more profoundly than the Android system!

## Origins

On November 9, 2015, Google released its artificial intelligence system TensorFlow and announced that it was open source. On the same day, Jikexueyuan (极客学院) organized an online effort to translate the TensorFlow documentation into Chinese.

Machine learning, as one type of artificial intelligence, lets software describe or predict future situations based on large amounts of data. Today, every leading technology giant is investing heavily in machine learning: Facebook, Apple, Microsoft, and even Baidu in China. Google is naturally among them. "TensorFlow" is the machine learning system Google has used internally for years. Google is now turning this system into an open-source one and releasing its parameters to industry engineers, academics, and technically capable programmers. What does this mean?
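To use a not entirely apt analogy, Google's treatment of TensorFlow today somewhat resembles how the company treats its mobile operating system, Android. If more data scientists start using Google's system for machine learning research, Google will gain more influence over the growing machine learning industry.

To help engineers in China master this world-leading AI system in the shortest possible time, the Jikexueyuan Wiki team launched a collaborative Chinese translation of the official TensorFlow documentation. Within a week, every chapter had been claimed. A month later, all 30 chapters had been translated and proofread, published on the Jikexueyuan Wiki platform, and made available for download.

Jeff Dean, head of the TensorFlow project at Google, wrote back about this Chinese translation project (quoted here as rendered in the Chinese README): "*I am very excited to see TensorFlow being translated into Chinese. One of the main reasons we open-sourced TensorFlow is so that people all over the world can benefit from machine learning and artificial intelligence. Collaborative translations like this let more people reach the TensorFlow project more easily, and I look forward to seeing how the project is used around the world!*"

Jeff's original reply:

![jeff](SOURCE/images/jeff.png)

Once again, heartfelt thanks to everyone who contributed to this translation project. We will keep following TensorFlow, the AI field, and other new technologies, keep maintaining this collaborative translation, and keep providing more and better content for IT learners.

## Sources

English official website:

Official GitHub repository:

Chinese edition GitHub repository:

## Contributors (in order of chapters claimed)

### Translation

- [@PFZheng](https://github.com/PFZheng)
- [@Tony Jin](https://github.com/linbojin)
- [@chenweican](https://github.com/chenweican)
- [@bingjin](https://github.com/bingjin)
- [@oskycar](https://github.com/oskycar)
- [@btpeter](https://github.com/btpeter)
- [@Warln](https://github.com/Warln)
- [@ericxk](https://github.com/ericxk)
- [@wangaicc](https://github.com/wangaicc)
- [@Terence Cooper](https://github.com/TerenceCooper)
- [@zhyhooo](https://github.com/zhyhooo)
- [@thylaco1eo](https://github.com/thylaco1eo)
- [@volvet](https://github.com/volvet)
- [@zhangkom](https://github.com/zhangkom)
- [@derekshang](https://github.com/derekshang)
- [@lianghyv](https://github.com/lianghyv)
- [@nb312](https://github.com/nb312)
- [@Jim-Zenn](https://github.com/Jim-Zenn)
- [@andyiac](https://github.com/andyiac)
- [@Terence Cooper](https://github.com/TerenceCooper)
- [@leege100](https://github.com/leege100)

### Proofreading

- [@yangtze](https://github.com/sstruct)
- [@ericxk](https://github.com/ericxk)
- [@HongyangWang](https://github.com/WangHong-yang)
- [@LichAmnesia](https://github.com/LichAmnesia)
- [@zhyhooo](https://github.com/zhyhooo)
- [@waiwaizheng](https://github.com/waiwaizheng)
- [@HongyangWang](https://github.com/WangHong-yang)
- [@tensorfly](https://github.com/tensorfly)
- [@lonlonago](https://github.com/lonlonago)
- [@jishaoming](https://github.com/jishaoming)
- [@lucky521](https://github.com/lucky521)
- [@allensummer](http://github.com/allensummer)
- [@volvet](https://github.com/volvet)
- [@ZHNathanielLee](https://github.com/ZHNathanielLee)
- [@pengfoo](https://github.com/PengFoo)
- [@qiaohaijun](https://github.com/qiaohaijun)
- [@Seika](https://github.com/SeikaScarlet)

## Progress Log

- 2015-11-10: Google released its new artificial intelligence system TensorFlow and announced it as open source; Jikexueyuan Wiki launched the collaborative translation, created the GitHub repository, and set the collaboration guidelines
- 2015-11-18: All chapters claimed; 18 chapters translated; 7 chapters claimed for proofreading; 361 stars and 100 forks; nearly 300 TF enthusiasts in the collaborative translation QQ group and the technical discussion group; ranked second in GitHub search results for TensorFlow
- 2015-12-10: More than 500 stars
- 2015-12-15: Project officially launched

## Behind the Scenes

Some moments from organizing the translation left a deep impression. We record them here so that those who later study from this documentation know where it came from:

- The translators included students and teachers; people who specialize in AI/ML research and people who are simply interested in it; people in China and people as far away as New York; engineers as well as PhDs and experts
- One of them, [Enze (恩泽)](http://www.longmotto.com), had not slept the day before and still pushed through to finish translating a document, going 20 hours without rest
- Another, a teacher, sent us revision suggestions for the translation right after stepping down from the lectern
- Many participants spontaneously wrote up the problems they hit while setting up the environment and added them to the FAQ to help others
- Settling a single translation detail often took several rounds of discussion with others

## Continuous Improvement

For documentation in such a highly technical field, our translation inevitably has imperfections. We hope you will help us keep improving its quality so that it can help more people. Ways to do so:

- Open an Issue or Pull Request on GitHub at [https://github.com/jikexueyuanwiki/tensorflow-zh](https://github.com/jikexueyuanwiki/tensorflow-zh)
- Join the TensorFlow technical discussion group to exchange practical knowledge with other TensorFlowers. TensorFlow technical discussion group: 782484288
- Interested in translation? Join the collaborative translation group, 248320884, and work on localizing TensorFlow alongside the translators
- Email us: wiki@jikexueyuan.com

## Acknowledgments

- [Jikexueyuan Wiki](http://wiki.jikexueyuan.com) provides hosting for the illustrated tutorials

## Offline Versions

The offline versions (PDF, ePub) can currently be downloaded and used normally.

## Tex-PDF Revised Edition

The [Tex-PDF revised edition](tex_pdf) is currently being edited, and you are welcome to join the revision. You can view the latest state of the [preview](tex_pdf/tensorflow_manual_cn.pdf) here.

================================================
FILE: README副本.md
================================================
# TensorFlow Official Documentation, Chinese Edition

![](SOURCE/images/tensorflow_logo.png)

### The project you are translating may influence the world even more profoundly than the Android system!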
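## Origins

On November 9, 2015, Google stated on its official blog that Google Research had launched TensorFlow, its second-generation machine learning system, which improves on the shortcomings of the earlier DistBelief in many respects. More importantly, it is open source, and anyone can use it.

Machine learning, as one type of artificial intelligence, lets software describe or predict future situations based on large amounts of data. Today, every leading technology giant is investing heavily in machine learning: Facebook, Apple, Microsoft, and even Baidu in China. Google is naturally among them. "TensorFlow" is the machine learning system Google has used internally for years. Google is now turning this system into an open-source one and releasing its parameters to industry engineers, academics, and technically capable programmers. What does this mean?

To use a not entirely apt analogy, Google's treatment of TensorFlow today somewhat resembles how the company treats its mobile operating system, Android. If more data scientists start using Google's system for machine learning research, Google will gain more influence over the growing machine learning industry.

To help engineers in China master this world-leading AI system in the shortest possible time, the Jikexueyuan (极客学院) Wiki team launched a collaborative Chinese translation of the official TensorFlow documentation.

Experts and enthusiasts in artificial intelligence and machine learning are welcome to join this translation project. We will credit every translator and proofreader by name.

## Get Involved

If you would like to contribute (translating or proofreading), please join QQ group 248320884. Thank you!

PS: To discuss TensorFlow technology, you can join the "TensorFlow technical discussion group": 495115006

## Sources

English official website:

Official GitHub repository:

Chinese edition GitHub repository:

## Press Coverage of TensorFlow

[Ambition: as it did with Android, Google has this time decided to open up its machine learning system, TensorFlow](http://www.huxiu.com/article/130763/1.html?f=wangzhan)

[The most powerful machine learning platform goes open source: is Google preparing to dominate artificial intelligence?](http://m.wallstreetcn.com/node/226012)

## How to Participate

* Fork the main repository ()
* Claim a chapter to translate (one chapter per request) or to proofread (several chapters may be requested): in the `README.md` below, find a chapter that nobody has claimed yet, write (@your GitHub handle) next to it, and open a pull request against the `master` branch of the main repository;
* Once your pull request is approved and merged into the main repository, your chapter counts as *claimed* and you can start translating or proofreading;
* The files to translate or proofread are the md files referenced from README.md or TOC.md; please do not translate or proofread the index.md files in the separate folders
* While translating, follow the *Translation Collaboration Guidelines* (see the next section); when the translation is done, submit a pull request against the `master` branch of the main repository;
* When proofreading is done, likewise submit a pull request against the `master` branch of the main repository;
* After all translation and proofreading is complete, we will generate PDF/ePub documents, publish them on the Jikexueyuan Wiki platform, and credit all participants!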
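## Translation Collaboration Guidelines

To keep collaboration smooth, everyone needs to follow these guidelines:

- If you are not familiar with Markdown and GitHub, first read [How to Use Markdown](markdown.md) and [How to Use GitHub](learn-github.md)
- Translate in Markdown; file names must be in English
- Put translated documents in the corresponding chapter under the SOURCE folder, then open a pull request
- Put any images from the text in the SOURCE/images directory
- Do not modify or translate HTML tags and code in the original
- For any other questions, feel free to open an issue; we will reply as soon as we see it
- Translators must add the URL of the original and their own name at the end of the translation; proofreaders must add their own name at the end of the translation. See this sample for the exact format:

> Original: [Color Palettes](http://www.google.com/design/spec/resources/color-palettes.html) Translation: [iceskysl](https://github.com/iceskysl) Proofreading: [PoppinLp](https://github.com/poppinlp)

## Proofreading Guidelines

- When claiming a chapter to proofread, describe your professional background in machine learning or AI, to ensure proofreading quality
- The text should read smoothly and be suitable for humans to read and understand; do not let readers think it was machine-translated
- Make sure all images display correctly, that their relative paths point to SOURCE/images, and that the images are stored in the right place
- Technical terms must follow the translations required by the glossary; for new terms you are unsure of, use a Chinese translation followed by the English in parentheses
- Proofreaders need some professional background, to ensure the translation contains no errors of domain knowledge
- Check that the translator has put the link to the original and the translator link at the bottom of the translation; proofreaders must add their own name at the end of the translation. See this sample for the exact format:

> Original: [Color Palettes](http://www.google.com/design/spec/resources/color-palettes.html) Translation: [iceskysl](https://github.com/iceskysl) Proofreading: [PoppinLp](https://github.com/poppinlp)

- For any questions, open an Issue or discuss in the collaborative translation group; after proofreading, open a PR and wait for it to be merged. Once an administrator approves it, a √ is added after the entry in the table of contents to mark it complete

## Contributors (in order of chapters claimed)

### Translation & Proofreading

- Getting Started
    - [Introduction](get_started/introduction.md) Translation: ([@PFZheng](https://github.com/PFZheng)) √ Proofreading: ([@yangtze](https://github.com/sstruct)) √
    - [Download and Setup](get_started/os_setup.md) Translation: ([@PFZheng](https://github.com/PFZheng)) √ Proofreading: ([@yangtze](https://github.com/sstruct)) √
    - [Basic Usage](get_started/basic_usage.md) Translation: ([@PFZheng](https://github.com/PFZheng)) √ Proofreading: ([@yangtze](https://github.com/sstruct)) √
- Tutorials
    - [Overview](tutorials/overview.md) Translation: ([@PFZheng](https://github.com/PFZheng)) √ Proofreading: ([@ericxk](https://github.com/ericxk)) √
    - [MNIST for ML Beginners](tutorials/mnist_beginners.md) Translation: ([@Tony Jin](https://github.com/linbojin)) √ Proofreading: ([@ericxk](https://github.com/ericxk))
    - [Deep MNIST](tutorials/mnist_pros.md) Translation: ([@chenweican](https://github.com/chenweican)) √ Proofreading: ([@HongyangWang](https://github.com/WangHong-yang)) √
    - [Introduction to TensorFlow Mechanics](tutorials/mnist_tf.md) Translation: ([@bingjin](https://github.com/bingjin)) √ Proofreading: ([@LichAmnesia](https://github.com/LichAmnesia)) √
    - [Convolutional Neural Networks](tutorials/deep_cnn.md) Translation: ([@oskycar](https://github.com/oskycar)) √ Proofreading: ([@zhyhooo](https://github.com/zhyhooo))
    - [Vector Representations of Words](tutorials/word2vec.md) Translation: ([@btpeter](https://github.com/btpeter)) √ Proofreading: ([@waiwaizheng](https://github.com/waiwaizheng))
    - [Recurrent Neural Networks](tutorials/recurrent.md) Translation: ([@Warln](https://github.com/Warln)) √ Proofreading: ([@HongyangWang](https://github.com/WangHong-yang))
    - [Mandelbrot Set](tutorials/mandelbrot.md) Translation: ([@ericxk](https://github.com/ericxk)) √ Proofreading: ([@tensorfly](https://github.com/tensorfly)) √
    - [Partial Differential Equations](tutorials/pdes.md) Translation: ([@wangaicc](https://github.com/wangaicc)) √ Proofreading: ([@tensorfly](https://github.com/tensorfly)) √
    - [MNIST Data Download](tutorials/mnist_download.md) Translation: ([@btpeter](https://github.com/btpeter)) √ Proofreading: ([@waiwaizheng](https://github.com/waiwaizheng))
- How-Tos
    - [Overview](how_tos/overview.md) Translation: ([@Terence Cooper](https://github.com/TerenceCooper)) √ Proofreading: ([@lonlonago](https://github.com/lonlonago)) √
    - [Variables: Creation, Initialization, Saving, and Loading](how_tos/variables.md) Translation: ([@zhyhooo](https://github.com/zhyhooo)) √ Proofreading: ([@jishaoming](https://github.com/jishaoming))
    - [TensorBoard: Visualizing Learning](how_tos/summaries_and_tensorboard.md) Translation: ([@thylaco1eo](https://github.com/thylaco1eo)) √ Proofreading: ([@lucky521](https://github.com/lucky521)) √
    - [TensorBoard: Graph Visualization](how_tos/graph_viz.md) Translation: ([@Warln](https://github.com/Warln)) √ Proofreading: ([@lucky521](https://github.com/lucky521)) √
    - [Reading Data](how_tos/reading_data.md) Translation: ([@volvet](https://github.com/volvet)) Proofreading: ([@allensummer](http://github.com/allensummer))
    - [Threading and Queues](how_tos/threading_and_queues.md) Translation: ([@zhangkom](https://github.com/zhangkom)) √ Proofreading: ([@volvet](https://github.com/volvet)) √
    - [Adding a New Op](how_tos/adding_an_op.md) Translation: ([@PFZheng](https://github.com/PFZheng)) √ Proofreading: ([@ZHNathanielLee](https://github.com/ZHNathanielLee)) √
    - [Custom Data Readers](how_tos/new_data_formats.md) Translation: ([@derekshang](https://github.com/derekshang)) √ Proofreading: ([@pengfoo](https://github.com/PengFoo))
    - [Using GPUs](how_tos/using_gpu.md) Translation: ([@lianghyv](https://github.com/lianghyv)) √ Proofreading: ([@qiaohaijun](https://github.com/qiaohaijun))
    - [Sharing Variables](how_tos/variable_scope.md) Translation: ([@nb312](https://github.com/nb312)) √ Proofreading: ([@pengfoo](https://github.com/PengFoo))
- Resources
    - [Overview](resources/overview.md) Translation: ([@Jim-Zenn](https://github.com/Jim-Zenn)) √ Proofreading: ([@lonlonago](https://github.com/lonlonago)) √
    - [BibTex Citation](resources/bib.md) Translation: ([@Jim-Zenn](https://github.com/Jim-Zenn)) √ Proofreading: ([@pengfoo](https://github.com/PengFoo))
    - [Example Uses](resources/uses.md) Translation: ([@andyiac](https://github.com/andyiac)) √ Proofreading: ([@lonlonago](https://github.com/lonlonago)) √
    - [FAQ](resources/faq.md) Translation: ([@Terence Cooper](https://github.com/TerenceCooper)) √ Proofreading: ([@pengfoo](https://github.com/PengFoo))
    - [Glossary](resources/glossary.md) Translation: ([@leege100](https://github.com/leege100)) √ Proofreading: ([@lonlonago](https://github.com/lonlonago)) √
    - [Tensor Ranks, Shapes, and Data Types](resources/dims_types.md) Translation: ([@nb312](https://github.com/nb312)) √ Proofreading: ([@lonlonago](https://github.com/lonlonago)) √
- Other
    - [Collected Common Problems](faq.md)

## Progress Log

- 2015-11-10: Google released its new artificial intelligence system TensorFlow and announced it as open source; Jikexueyuan Wiki launched the collaborative translation, created the GitHub repository, and set the collaboration guidelines
- 2015-11-18: All chapters claimed; 18 chapters translated; 7 chapters claimed for proofreading; 361 stars and 100 forks; nearly 300 TF enthusiasts in the collaborative translation QQ group and the technical discussion group; ranked second in GitHub search results for TensorFlow
- 2015-12-10: More than 500 stars

## TensorFlow Resources

- [Google's official blog post announcing that TensorFlow is open source](https://googleblog.blogspot.com/2015/11/tensorflow-smarter-machine-learning-for.html)
- [TensorFlow white paper (PDF download)](http://download.tensorflow.org/paper/whitepaper2015.pdf)
- [Jeff Dean introduces TensorFlow (video)](https://www.youtube.com/watch?v=90-S1M7Ny_o&t=21m2s)
- [Scikit Flow, a simplified interface for TensorFlow](https://github.com/google/skflow)
- [TensorFlow usage examples](https://github.com/aymericdamien/TensorFlow-Examples)
- [TensorFlow compared with mxnet and caffe @chenrudan](http://chenrudan.github.io/blog/2015/11/18/comparethreeopenlib.html)
- [TensorFlow compared with torch7, caffe, and mxnet in memory usage and performance @mli](https://github.com/dmlc/mxnet/blob/master/doc/overview_chn.md)

## Acknowledgments

## Offline Versions

================================================
FILE: SOURCE/api_docs/cc/ClassEnv.md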
================================================
# Class `tensorflow::Env`

An interface used by the TensorFlow implementation to access operating system functionality such as the filesystem.

Callers may wish to provide a custom Env object to get fine-grained control.

All Env implementations are safe for concurrent access from multiple threads without any external synchronization.

## Member Summary

* [`tensorflow::Env::Env()`](#tensorflow_Env_Env)
* [`virtual tensorflow::Env::~Env()`](#virtual_tensorflow_Env_Env)
* [`virtual Status tensorflow::Env::NewRandomAccessFile(const string &fname, RandomAccessFile **result)=0`](#virtual_Status_tensorflow_Env_NewRandomAccessFile)
  * Creates a brand new random access read-only file with the specified name.
* [`virtual Status tensorflow::Env::NewWritableFile(const string &fname, WritableFile **result)=0`](#virtual_Status_tensorflow_Env_NewWritableFile)
  * Creates an object that writes to a new file with the specified name.
* [`virtual Status tensorflow::Env::NewAppendableFile(const string &fname, WritableFile **result)=0`](#virtual_Status_tensorflow_Env_NewAppendableFile)
  * Creates an object that either appends to an existing file, or writes to a new file (if the file does not exist to begin with).
* [`virtual bool tensorflow::Env::FileExists(const string &fname)=0`](#virtual_bool_tensorflow_Env_FileExists)
  * Returns true iff the named file exists.
* [`virtual Status tensorflow::Env::GetChildren(const string &dir, std::vector< string > *result)=0`](#virtual_Status_tensorflow_Env_GetChildren)
  * Stores in *result the names of the children of the specified directory. The names are relative to "dir".
* [`virtual Status tensorflow::Env::DeleteFile(const string &fname)=0`](#virtual_Status_tensorflow_Env_DeleteFile)
  * Deletes the named file.
* [`virtual Status tensorflow::Env::CreateDir(const string &dirname)=0`](#virtual_Status_tensorflow_Env_CreateDir)
  * Creates the specified directory.
* [`virtual Status tensorflow::Env::DeleteDir(const string &dirname)=0`](#virtual_Status_tensorflow_Env_DeleteDir)
  * Deletes the specified directory.
* [`virtual Status tensorflow::Env::GetFileSize(const string &fname, uint64 *file_size)=0`](#virtual_Status_tensorflow_Env_GetFileSize)
  * Stores the size of fname in *file_size.
* [`virtual Status tensorflow::Env::RenameFile(const string &src, const string &target)=0`](#virtual_Status_tensorflow_Env_RenameFile)
  * Renames file src to target. If target already exists, it will be replaced.
* [`virtual uint64 tensorflow::Env::NowMicros()=0`](#virtual_uint64_tensorflow_Env_NowMicros)
  * Returns the number of micro-seconds since some fixed point in time. Only useful for computing deltas of time.
* [`virtual void tensorflow::Env::SleepForMicroseconds(int micros)=0`](#virtual_void_tensorflow_Env_SleepForMicroseconds)
  * Sleeps/delays the thread for the prescribed number of micro-seconds.
* [`virtual Thread* tensorflow::Env::StartThread(const ThreadOptions &thread_options, const string &name, std::function< void()> fn) TF_MUST_USE_RESULT=0`](#virtual_Thread_tensorflow_Env_StartThread)
  * Returns a new thread that is running fn() and is identified (for debugging/performance-analysis) by "name".
* [`static Env* tensorflow::Env::Default()`](#static_Env_tensorflow_Env_Default)
  * Returns a default environment suitable for the current operating system.

## Member Details

#### `tensorflow::Env::Env()`

#### `virtual tensorflow::Env::~Env()`

#### `virtual Status tensorflow::Env::NewRandomAccessFile(const string &fname, RandomAccessFile **result)=0`

Creates a brand new random access read-only file with the specified name.

On success, stores a pointer to the new file in *result and returns OK. On failure stores NULL in *result and returns non-OK. If the file does not exist, returns a non-OK status.

The returned file may be concurrently accessed by multiple threads.
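The out-parameter convention described above (a pointer stored in `*result` on success, NULL plus a non-OK status on failure) can be sketched without the TensorFlow headers. Everything named here is hypothetical: `MiniStatus` and `MiniReadOnlyFile` are reduced stand-ins for `Status` and `RandomAccessFile`, `OpenReadOnly` is not part of `Env`, and handing ownership of the new object to the caller is an assumption of this sketch.

```cpp
#include <fstream>
#include <memory>
#include <string>

// Hypothetical, much-reduced stand-in for tensorflow::Status.
struct MiniStatus {
  bool ok_flag;
  std::string message;
  bool ok() const { return ok_flag; }
};

// Hypothetical read-only file handle. The real RandomAccessFile is an
// abstract class with a Read() method; this one only wraps a stream.
struct MiniReadOnlyFile {
  std::ifstream stream;
};

// Mirrors the documented contract: on success, store a pointer to the new
// file in *result and return OK; on failure (including a missing file),
// store a null pointer in *result and return non-OK.
inline MiniStatus OpenReadOnly(const std::string& fname,
                               MiniReadOnlyFile** result) {
  std::unique_ptr<MiniReadOnlyFile> file(new MiniReadOnlyFile);
  file->stream.open(fname, std::ios::in | std::ios::binary);
  if (!file->stream.is_open()) {
    *result = nullptr;
    return MiniStatus{false, "could not open " + fname};
  }
  // Assumption of this sketch: the caller owns the object from here on.
  *result = file.release();
  return MiniStatus{true, ""};
}
```

A caller following this pattern would check `ok()` before touching `*result`, and would never need to inspect a stale pointer after a failure because the function always overwrites it.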
#### `virtual Status tensorflow::Env::NewWritableFile(const string &fname, WritableFile **result)=0`

Creates an object that writes to a new file with the specified name.

Deletes any existing file with the same name and creates a new file. On success, stores a pointer to the new file in *result and returns OK. On failure stores NULL in *result and returns non-OK.

The returned file will only be accessed by one thread at a time.

#### `virtual Status tensorflow::Env::NewAppendableFile(const string &fname, WritableFile **result)=0`

Creates an object that either appends to an existing file, or writes to a new file (if the file does not exist to begin with).

On success, stores a pointer to the new file in *result and returns OK. On failure stores NULL in *result and returns non-OK.

The returned file will only be accessed by one thread at a time.

#### `virtual bool tensorflow::Env::FileExists(const string &fname)=0`

Returns true iff the named file exists.

#### `virtual Status tensorflow::Env::GetChildren(const string &dir, std::vector< string > *result)=0`

Stores in *result the names of the children of the specified directory. The names are relative to "dir".

Original contents of *result are dropped.

#### `virtual Status tensorflow::Env::DeleteFile(const string &fname)=0`

Deletes the named file.

#### `virtual Status tensorflow::Env::CreateDir(const string &dirname)=0`

Creates the specified directory.

#### `virtual Status tensorflow::Env::DeleteDir(const string &dirname)=0`

Deletes the specified directory.

#### `virtual Status tensorflow::Env::GetFileSize(const string &fname, uint64 *file_size)=0`

Stores the size of fname in *file_size.

#### `virtual Status tensorflow::Env::RenameFile(const string &src, const string &target)=0`

Renames file src to target. If target already exists, it will be replaced.

#### `virtual uint64 tensorflow::Env::NowMicros()=0`

Returns the number of micro-seconds since some fixed point in time. Only useful for computing deltas of time.
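As a rough illustration of the "only useful for computing deltas" remark on `NowMicros()`, the sketch below times a piece of work by subtracting two clock readings. It does not use the TensorFlow headers: `NowMicrosStandIn` and `TimeMicros` are hypothetical names, and `std::chrono::steady_clock` is assumed here as a stand-in for whatever clock a real `Env` implementation uses.

```cpp
#include <chrono>
#include <cstdint>
#include <functional>

// Hypothetical stand-in for Env::NowMicros(): microseconds since an
// arbitrary fixed point (here, the steady_clock epoch). The absolute value
// carries no meaning; only the difference between two readings does.
inline std::uint64_t NowMicrosStandIn() {
  using namespace std::chrono;
  return static_cast<std::uint64_t>(
      duration_cast<microseconds>(steady_clock::now().time_since_epoch())
          .count());
}

// Runs `work` and returns how long it took, in microseconds, as the
// difference between a reading taken before and a reading taken after.
inline std::uint64_t TimeMicros(const std::function<void()>& work) {
  const std::uint64_t start = NowMicrosStandIn();
  work();
  return NowMicrosStandIn() - start;
}
```

Because the fixed point is unspecified, two readings are only comparable when they come from the same clock in the same process, which is why the sketch never prints or stores a raw reading.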
#### `virtual void tensorflow::Env::SleepForMicroseconds(int micros)=0`

Sleeps/delays the thread for the prescribed number of micro-seconds.

#### `virtual Thread* tensorflow::Env::StartThread(const ThreadOptions &thread_options, const string &name, std::function< void()> fn) TF_MUST_USE_RESULT=0`

Returns a new thread that is running fn() and is identified (for debugging/performance-analysis) by "name".

Caller takes ownership of the result and must delete it eventually (the deletion will block until fn() stops running).

#### `static Env* tensorflow::Env::Default()`

Returns a default environment suitable for the current operating system.

Sophisticated users may wish to provide their own Env implementation instead of relying on this default environment.

The result of Default() belongs to this library and must never be deleted.

================================================
FILE: SOURCE/api_docs/cc/ClassEnvWrapper.md
================================================
# Class `tensorflow::EnvWrapper`

An implementation of Env that forwards all calls to another Env.

May be useful to clients who wish to override just part of the functionality of another Env.

## Member Summary

* [`tensorflow::EnvWrapper::EnvWrapper(Env *t)`](#tensorflow_EnvWrapper_EnvWrapper)
  * Initializes an EnvWrapper that delegates all calls to *t.
* [`virtual tensorflow::EnvWrapper::~EnvWrapper()`](#virtual_tensorflow_EnvWrapper_EnvWrapper)
* [`Env* tensorflow::EnvWrapper::target() const`](#Env_tensorflow_EnvWrapper_target)
  * Returns the target to which this Env forwards all calls.
* [`Status tensorflow::EnvWrapper::NewRandomAccessFile(const string &f, RandomAccessFile **r) override`](#Status_tensorflow_EnvWrapper_NewRandomAccessFile)
  * Creates a brand new random access read-only file with the specified name.
* [`Status tensorflow::EnvWrapper::NewWritableFile(const string &f, WritableFile **r) override`](#Status_tensorflow_EnvWrapper_NewWritableFile)
  * Creates an object that writes to a new file with the specified name.
* [`Status tensorflow::EnvWrapper::NewAppendableFile(const string &f, WritableFile **r) override`](#Status_tensorflow_EnvWrapper_NewAppendableFile)
  * Creates an object that either appends to an existing file, or writes to a new file (if the file does not exist to begin with).
* [`bool tensorflow::EnvWrapper::FileExists(const string &f) override`](#bool_tensorflow_EnvWrapper_FileExists)
  * Returns true iff the named file exists.
* [`Status tensorflow::EnvWrapper::GetChildren(const string &dir, std::vector< string > *r) override`](#Status_tensorflow_EnvWrapper_GetChildren)
  * Stores in *result the names of the children of the specified directory. The names are relative to "dir".
* [`Status tensorflow::EnvWrapper::DeleteFile(const string &f) override`](#Status_tensorflow_EnvWrapper_DeleteFile)
  * Deletes the named file.
* [`Status tensorflow::EnvWrapper::CreateDir(const string &d) override`](#Status_tensorflow_EnvWrapper_CreateDir)
  * Creates the specified directory.
* [`Status tensorflow::EnvWrapper::DeleteDir(const string &d) override`](#Status_tensorflow_EnvWrapper_DeleteDir)
  * Deletes the specified directory.
* [`Status tensorflow::EnvWrapper::GetFileSize(const string &f, uint64 *s) override`](#Status_tensorflow_EnvWrapper_GetFileSize)
  * Stores the size of fname in *file_size.
* [`Status tensorflow::EnvWrapper::RenameFile(const string &s, const string &t) override`](#Status_tensorflow_EnvWrapper_RenameFile)
  * Renames file src to target. If target already exists, it will be replaced.
* [`uint64 tensorflow::EnvWrapper::NowMicros() override`](#uint64_tensorflow_EnvWrapper_NowMicros)
  * Returns the number of micro-seconds since some fixed point in time. Only useful for computing deltas of time.
* [`void tensorflow::EnvWrapper::SleepForMicroseconds(int micros) override`](#void_tensorflow_EnvWrapper_SleepForMicroseconds)
  * Sleeps/delays the thread for the prescribed number of micro-seconds.
* [`Thread* tensorflow::EnvWrapper::StartThread(const ThreadOptions &thread_options, const string &name, std::function< void()> fn) override`](#Thread_tensorflow_EnvWrapper_StartThread)
  * Returns a new thread that is running fn() and is identified (for debugging/performance-analysis) by "name".

## Member Details

#### `tensorflow::EnvWrapper::EnvWrapper(Env *t)`

Initializes an EnvWrapper that delegates all calls to *t.

#### `virtual tensorflow::EnvWrapper::~EnvWrapper()`

#### `Env* tensorflow::EnvWrapper::target() const`

Returns the target to which this Env forwards all calls.

#### `Status tensorflow::EnvWrapper::NewRandomAccessFile(const string &f, RandomAccessFile **r) override`

Creates a brand new random access read-only file with the specified name.

On success, stores a pointer to the new file in *result and returns OK. On failure stores NULL in *result and returns non-OK. If the file does not exist, returns a non-OK status.

The returned file may be concurrently accessed by multiple threads.

#### `Status tensorflow::EnvWrapper::NewWritableFile(const string &f, WritableFile **r) override`

Creates an object that writes to a new file with the specified name.

Deletes any existing file with the same name and creates a new file. On success, stores a pointer to the new file in *result and returns OK. On failure stores NULL in *result and returns non-OK.

The returned file will only be accessed by one thread at a time.

#### `Status tensorflow::EnvWrapper::NewAppendableFile(const string &f, WritableFile **r) override`

Creates an object that either appends to an existing file, or writes to a new file (if the file does not exist to begin with).

On success, stores a pointer to the new file in *result and returns OK. On failure stores NULL in *result and returns non-OK.
The returned file will only be accessed by one thread at a time.

#### `bool tensorflow::EnvWrapper::FileExists(const string &f) override`

Returns true iff the named file exists.

#### `Status tensorflow::EnvWrapper::GetChildren(const string &dir, std::vector< string > *r) override`

Stores in *result the names of the children of the specified directory. The names are relative to "dir".

Original contents of *result are dropped.

#### `Status tensorflow::EnvWrapper::DeleteFile(const string &f) override`

Deletes the named file.

#### `Status tensorflow::EnvWrapper::CreateDir(const string &d) override`

Creates the specified directory.

#### `Status tensorflow::EnvWrapper::DeleteDir(const string &d) override`

Deletes the specified directory.

#### `Status tensorflow::EnvWrapper::GetFileSize(const string &f, uint64 *s) override`

Stores the size of fname in *file_size.

#### `Status tensorflow::EnvWrapper::RenameFile(const string &s, const string &t) override`

Renames file src to target. If target already exists, it will be replaced.

#### `uint64 tensorflow::EnvWrapper::NowMicros() override`

Returns the number of micro-seconds since some fixed point in time. Only useful for computing deltas of time.

#### `void tensorflow::EnvWrapper::SleepForMicroseconds(int micros) override`

Sleeps/delays the thread for the prescribed number of micro-seconds.

#### `Thread* tensorflow::EnvWrapper::StartThread(const ThreadOptions &thread_options, const string &name, std::function< void()> fn) override`

Returns a new thread that is running fn() and is identified (for debugging/performance-analysis) by "name".

Caller takes ownership of the result and must delete it eventually (the deletion will block until fn() stops running).

================================================
FILE: SOURCE/api_docs/cc/ClassRandomAccessFile.md
================================================
# Class `tensorflow::RandomAccessFile`

A file abstraction for randomly reading the contents of a file.
## Member Summary

* [`tensorflow::RandomAccessFile::RandomAccessFile()`](#tensorflow_RandomAccessFile_RandomAccessFile)
* [`virtual tensorflow::RandomAccessFile::~RandomAccessFile()`](#virtual_tensorflow_RandomAccessFile_RandomAccessFile)
* [`virtual Status tensorflow::RandomAccessFile::Read(uint64 offset, size_t n, StringPiece *result, char *scratch) const =0`](#virtual_Status_tensorflow_RandomAccessFile_Read)
  * Reads up to "n" bytes from the file starting at "offset".

## Member Details

#### `tensorflow::RandomAccessFile::RandomAccessFile()`

#### `virtual tensorflow::RandomAccessFile::~RandomAccessFile()`

#### `virtual Status tensorflow::RandomAccessFile::Read(uint64 offset, size_t n, StringPiece *result, char *scratch) const =0`

Reads up to "n" bytes from the file starting at "offset".

"scratch[0..n-1]" may be written by this routine. Sets "*result" to the data that was read (including if fewer than "n" bytes were successfully read). May set "*result" to point at data in "scratch[0..n-1]", so "scratch[0..n-1]" must be live when "*result" is used.

On OK returned status: "n" bytes have been stored in "*result". On non-OK returned status: [0..n] bytes have been stored in "*result".

Returns `OUT_OF_RANGE` if fewer than n bytes were stored in "*result" because of EOF.

Safe for concurrent use by multiple threads.

================================================
FILE: SOURCE/api_docs/cc/ClassSession.md
================================================
# Class `tensorflow::Session`

A Session instance lets a caller drive a TensorFlow graph computation.

When a Session is created with a given target, a new Session object is bound to the universe of resources specified by that target. Those resources are available to this session to perform computation described in the GraphDef. After extending the session with a graph, the caller uses the Run() API to perform the computation and potentially fetch outputs as Tensors.

Example:

```c++
tensorflow::GraphDef graph;
// ... Create or load graph into "graph".

// This example uses the default options which connects
// to a local runtime.
tensorflow::SessionOptions options;
std::unique_ptr<tensorflow::Session>
session(tensorflow::NewSession(options));

// Create the session with this graph.
tensorflow::Status s = session->Create(graph);
if (!s.ok()) { ... }

// Run the graph and fetch the first output of the "output"
// operation, and also run to but do not return anything
// for the "update_state" operation.
std::vector<tensorflow::Tensor> outputs;
s = session->Run({}, {"output:0"}, {"update_state"}, &outputs);
if (!s.ok()) { ... }

// Map the output as a flattened float tensor, and do something
// with it.
auto output_tensor = outputs[0].flat<float>();
if (output_tensor(0) > 0.5) { ... }

// Close the session to release the resources associated with
// this session.
session->Close();
```

A Session allows concurrent calls to Run(), though a Session must be created / extended by a single thread.

Only one thread must call Close(), and Close() must only be called after all other calls to Run() have returned.

## Member Summary

* [`virtual Status tensorflow::Session::Create(const GraphDef &graph)=0`](#virtual_Status_tensorflow_Session_Create)
  * Create the graph to be used for the session.
* [`virtual Status tensorflow::Session::Extend(const GraphDef &graph)=0`](#virtual_Status_tensorflow_Session_Extend)
  * Adds operations to the graph that is already registered with the Session.
* [`virtual Status tensorflow::Session::Run(const std::vector< std::pair< string, Tensor > > &inputs, const std::vector< string > &output_tensor_names, const std::vector< string > &target_node_names, std::vector< Tensor > *outputs)=0`](#virtual_Status_tensorflow_Session_Run)
  * Runs the graph with the provided input tensors and fills `outputs` for the endpoints specified in `output_tensor_names`. Runs to but does not return Tensors for the nodes in `target_node_names`.
* [`virtual Status tensorflow::Session::Close()=0`](#virtual_Status_tensorflow_Session_Close) * Closes this session. * [`virtual tensorflow::Session::~Session()`](#virtual_tensorflow_Session_Session) ##Member Details #### `virtual Status tensorflow::Session::Create(const GraphDef &graph)=0` Create the graph to be used for the session. Returns an error if this session has already been created with a graph. To re-use the session with a different graph, the caller must Close() the session first. #### `virtual Status tensorflow::Session::Extend(const GraphDef &graph)=0` Adds operations to the graph that is already registered with the Session . The names of new operations in "graph" must not exist in the graph that is already registered. #### `virtual Status tensorflow::Session::Run(const std::vector< std::pair< string, Tensor > > &inputs, const std::vector< string > &output_tensor_names, const std::vector< string > &target_node_names, std::vector< Tensor > *outputs)=0` Runs the graph with the provided input tensors and fills `outputs` for the endpoints specified in `output_tensor_names`. Runs to but does not return Tensors for the nodes in `target_node_names`. The order of tensors in `outputs` will match the order provided by `output_tensor_names`. If `Run` returns `OK()`, then `outputs->size()` will be equal to `output_tensor_names.size()`. If `Run` does not return `OK()`, the state of `outputs` is undefined. REQUIRES: The name of each Tensor of the input or output must match a "Tensor endpoint" in the `GraphDef` passed to ` Create() `. REQUIRES: outputs is not nullptr if `output_tensor_names` is non-empty. #### `virtual Status tensorflow::Session::Close()=0` Closes this session. Closing a session releases the resources used by this session on the TensorFlow runtime (specified during session creation by the ` SessionOptions::target ` field). 
#### `virtual tensorflow::Session::~Session()` ================================================ FILE: SOURCE/api_docs/cc/ClassStatus.md ================================================ # Class `tensorflow::Status` ##Member Summary * [`tensorflow::Status::Status()`](#tensorflow_Status_Status) * Create a success status. * [`tensorflow::Status::~Status()`](#tensorflow_Status_Status) * [`tensorflow::Status::Status(tensorflow::error::Code code, tensorflow::StringPiece msg)`](#tensorflow_Status_Status) * Create a status with the specified error code and msg as a human-readable string containing more detailed information. * [`tensorflow::Status::Status(const Status &s)`](#tensorflow_Status_Status) * Copy the specified status. * [`void tensorflow::Status::operator=(const Status &s)`](#void_tensorflow_Status_operator_) * [`bool tensorflow::Status::ok() const`](#bool_tensorflow_Status_ok) * Returns true iff the status indicates success. * [`tensorflow::error::Code tensorflow::Status::code() const`](#tensorflow_error_Code_tensorflow_Status_code) * [`const string& tensorflow::Status::error_message() const`](#const_string_tensorflow_Status_error_message) * [`bool tensorflow::Status::operator==(const Status &x) const`](#bool_tensorflow_Status_operator_) * [`bool tensorflow::Status::operator!=(const Status &x) const`](#bool_tensorflow_Status_operator_) * [`void tensorflow::Status::Update(const Status &new_status)`](#void_tensorflow_Status_Update) * If ` ok() `, stores `new_status` into `*this`. If `!ok()`, preserves the current status, but may augment with additional information about `new_status`. * [`string tensorflow::Status::ToString() const`](#string_tensorflow_Status_ToString) * Return a string representation of this status suitable for printing. Returns the string `"OK"` for success. * [`static Status tensorflow::Status::OK()`](#static_Status_tensorflow_Status_OK) ##Member Details #### `tensorflow::Status::Status()` Create a success status. 
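
The `Update()` entry in the member summary above describes a first-error-wins rule. The following is a minimal sketch of that rule only. `MiniStatus` and `RunSteps` are hypothetical names invented for this illustration, not TensorFlow types, and the sketch leaves out the optional augmentation of the stored error that the real `Update()` may perform.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-in for tensorflow::Status; NOT the real class.
// It models only ok(), error_message(), and a first-error-wins Update().
class MiniStatus {
 public:
  MiniStatus() = default;  // success status
  explicit MiniStatus(std::string msg) : ok_(false), msg_(std::move(msg)) {}

  bool ok() const { return ok_; }
  const std::string& error_message() const { return msg_; }

  // If ok(), take new_status; otherwise keep the error already held.
  void Update(const MiniStatus& new_status) {
    if (ok_) *this = new_status;
  }

 private:
  bool ok_ = true;
  std::string msg_;
};

// Folds the results of several steps into one overall status.
inline MiniStatus RunSteps() {
  MiniStatus overall;                       // starts as success
  overall.Update(MiniStatus());             // still success
  overall.Update(MiniStatus("disk full"));  // first error is recorded
  overall.Update(MiniStatus("timeout"));    // ignored: an error is already held
  return overall;
}
```

Calling `RunSteps()` yields a non-OK status that still carries the first message, which is the bookkeeping that `overall_status.Update(new_status)` is meant to replace.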
#### `tensorflow::Status::~Status()` #### `tensorflow::Status::Status(tensorflow::error::Code code, tensorflow::StringPiece msg)` Create a status with the specified error code and msg as a human-readable string containing more detailed information. #### `tensorflow::Status::Status(const Status &s)` Copy the specified status. #### `void tensorflow::Status::operator=(const Status &s)` #### `bool tensorflow::Status::ok() const` Returns true iff the status indicates success. #### `tensorflow::error::Code tensorflow::Status::code() const` #### `const string& tensorflow::Status::error_message() const` #### `bool tensorflow::Status::operator==(const Status &x) const` #### `bool tensorflow::Status::operator!=(const Status &x) const` #### `void tensorflow::Status::Update(const Status &new_status)` If ` ok() `, stores `new_status` into `*this`. If `!ok()`, preserves the current status, but may augment with additional information about `new_status`. Convenient way of keeping track of the first error encountered. Instead of: `if (overall_status.ok()) overall_status = new_status` Use: `overall_status.Update(new_status);` #### `string tensorflow::Status::ToString() const` Return a string representation of this status suitable for printing. Returns the string `"OK"` for success. #### `static Status tensorflow::Status::OK()` ================================================ FILE: SOURCE/api_docs/cc/ClassTensor.md ================================================ # Class `tensorflow::Tensor` Represents an n-dimensional array of values. ##Member Summary * [`tensorflow::Tensor::Tensor()`](#tensorflow_Tensor_Tensor) * Default Tensor constructor. Creates a 1-dimension, 0-element float tensor. * [`tensorflow::Tensor::Tensor(DataType type, const TensorShape &shape)`](#tensorflow_Tensor_Tensor) * Creates a Tensor of the given `type` and `shape`. 
* [`tensorflow::Tensor::Tensor(Allocator *a, DataType type, const TensorShape &shape)`](#tensorflow_Tensor_Tensor)
  * Creates a tensor with the input `type` and `shape`, using the allocator `a` to allocate the underlying buffer.
* [`tensorflow::Tensor::Tensor(DataType type)`](#tensorflow_Tensor_Tensor)
  * Creates an uninitialized Tensor of the given data type.
* [`tensorflow::Tensor::Tensor(const Tensor &other)`](#tensorflow_Tensor_Tensor)
  * Copy constructor.
* [`tensorflow::Tensor::~Tensor()`](#tensorflow_Tensor_Tensor)
* [`DataType tensorflow::Tensor::dtype() const`](#DataType_tensorflow_Tensor_dtype)
  * Returns the data type.
* [`const TensorShape& tensorflow::Tensor::shape() const`](#const_TensorShape_tensorflow_Tensor_shape)
  * Returns the shape of the tensor.
* [`int tensorflow::Tensor::dims() const`](#int_tensorflow_Tensor_dims)
  * Convenience accessor for the tensor shape.
* [`int64 tensorflow::Tensor::dim_size(int d) const`](#int64_tensorflow_Tensor_dim_size)
  * Convenience accessor for the tensor shape.
* [`int64 tensorflow::Tensor::NumElements() const`](#int64_tensorflow_Tensor_NumElements)
  * Convenience accessor for the tensor shape.
* [`bool tensorflow::Tensor::IsSameSize(const Tensor &b) const`](#bool_tensorflow_Tensor_IsSameSize)
* [`bool tensorflow::Tensor::IsInitialized() const`](#bool_tensorflow_Tensor_IsInitialized)
  * Has this Tensor been initialized?
* [`size_t tensorflow::Tensor::TotalBytes() const`](#size_t_tensorflow_Tensor_TotalBytes)
  * Returns the estimated memory usage of this tensor.
* [`Tensor& tensorflow::Tensor::operator=(const Tensor &other)`](#Tensor_tensorflow_Tensor_operator_)
  * Assignment operator. This tensor shares other's underlying storage.
* [`bool tensorflow::Tensor::CopyFrom(const Tensor &other, const TensorShape &shape) TF_MUST_USE_RESULT`](#bool_tensorflow_Tensor_CopyFrom)
  * Copy the other tensor into this tensor and reshape it.
* [`Tensor tensorflow::Tensor::Slice(int64 dim0_start, int64 dim0_limit) const`](#Tensor_tensorflow_Tensor_Slice)
  * Slice this tensor along the 1st dimension.
* [`bool tensorflow::Tensor::FromProto(const TensorProto &other) TF_MUST_USE_RESULT`](#bool_tensorflow_Tensor_FromProto)
  * Parse `other` and construct the tensor.
* [`bool tensorflow::Tensor::FromProto(Allocator *a, const TensorProto &other) TF_MUST_USE_RESULT`](#bool_tensorflow_Tensor_FromProto)
* [`void tensorflow::Tensor::AsProtoField(TensorProto *proto) const`](#void_tensorflow_Tensor_AsProtoField)
  * Fills in `proto` with `*this` tensor's content.
* [`void tensorflow::Tensor::AsProtoTensorContent(TensorProto *proto) const`](#void_tensorflow_Tensor_AsProtoTensorContent)
* [`TTypes< T >::Vec tensorflow::Tensor::vec()`](#TTypes_T_Vec_tensorflow_Tensor_vec)
  * Return the tensor data as an `Eigen::Tensor` with the type and sizes of this `Tensor`.
* [`TTypes< T >::Matrix tensorflow::Tensor::matrix()`](#TTypes_T_Matrix_tensorflow_Tensor_matrix)
* [`TTypes< T, NDIMS >::Tensor tensorflow::Tensor::tensor()`](#TTypes_T_NDIMS_Tensor_tensorflow_Tensor_tensor)
* [`TTypes< T >::Flat tensorflow::Tensor::flat()`](#TTypes_T_Flat_tensorflow_Tensor_flat)
  * Return the tensor data as an `Eigen::Tensor` of the data type and a specified shape.
* [`TTypes< T >::UnalignedFlat tensorflow::Tensor::unaligned_flat()`](#TTypes_T_UnalignedFlat_tensorflow_Tensor_unaligned_flat)
* [`TTypes< T >::Matrix tensorflow::Tensor::flat_inner_dims()`](#TTypes_T_Matrix_tensorflow_Tensor_flat_inner_dims)
* [`TTypes< T >::Matrix tensorflow::Tensor::flat_outer_dims()`](#TTypes_T_Matrix_tensorflow_Tensor_flat_outer_dims)
* [`TTypes< T, NDIMS >::Tensor tensorflow::Tensor::shaped(gtl::ArraySlice< int64 > new_sizes)`](#TTypes_T_NDIMS_Tensor_tensorflow_Tensor_shaped)
* [`TTypes< T, NDIMS >::UnalignedTensor tensorflow::Tensor::unaligned_shaped(gtl::ArraySlice< int64 > new_sizes)`](#TTypes_T_NDIMS_UnalignedTensor_tensorflow_Tensor_unaligned_shaped)
* [`TTypes< T >::Scalar tensorflow::Tensor::scalar()`](#TTypes_T_Scalar_tensorflow_Tensor_scalar)
  * Return the Tensor data as a `TensorMap` of fixed size 1: `TensorMap<TensorFixedSize<T, 1>>`.
* [`TTypes< T >::ConstVec tensorflow::Tensor::vec() const`](#TTypes_T_ConstVec_tensorflow_Tensor_vec)
  * Const versions of all the methods above.
* [`TTypes< T >::ConstMatrix tensorflow::Tensor::matrix() const`](#TTypes_T_ConstMatrix_tensorflow_Tensor_matrix)
* [`TTypes< T, NDIMS >::ConstTensor tensorflow::Tensor::tensor() const`](#TTypes_T_NDIMS_ConstTensor_tensorflow_Tensor_tensor)
* [`TTypes< T >::ConstFlat tensorflow::Tensor::flat() const`](#TTypes_T_ConstFlat_tensorflow_Tensor_flat)
* [`TTypes< T >::UnalignedConstFlat tensorflow::Tensor::unaligned_flat() const`](#TTypes_T_UnalignedConstFlat_tensorflow_Tensor_unaligned_flat)
* [`TTypes< T >::ConstMatrix tensorflow::Tensor::flat_inner_dims() const`](#TTypes_T_ConstMatrix_tensorflow_Tensor_flat_inner_dims)
* [`TTypes< T >::ConstMatrix tensorflow::Tensor::flat_outer_dims() const`](#TTypes_T_ConstMatrix_tensorflow_Tensor_flat_outer_dims)
* [`TTypes< T, NDIMS >::ConstTensor tensorflow::Tensor::shaped(gtl::ArraySlice< int64 > new_sizes) const`](#TTypes_T_NDIMS_ConstTensor_tensorflow_Tensor_shaped)
* [`TTypes< T, NDIMS >::UnalignedConstTensor tensorflow::Tensor::unaligned_shaped(gtl::ArraySlice< int64 > new_sizes) const`](#TTypes_T_NDIMS_UnalignedConstTensor_tensorflow_Tensor_unaligned_shaped)
* [`TTypes< T >::ConstScalar tensorflow::Tensor::scalar() const`](#TTypes_T_ConstScalar_tensorflow_Tensor_scalar)
* [`string tensorflow::Tensor::SummarizeValue(int64 max_entries) const`](#string_tensorflow_Tensor_SummarizeValue)
  * Render the first `max_entries` values in `*this` into a string.
* [`string tensorflow::Tensor::DebugString() const`](#string_tensorflow_Tensor_DebugString)
  * A human-readable summary of the tensor suitable for debugging.
* [`void tensorflow::Tensor::FillDescription(TensorDescription *description) const`](#void_tensorflow_Tensor_FillDescription)
* [`StringPiece tensorflow::Tensor::tensor_data() const`](#StringPiece_tensorflow_Tensor_tensor_data)
  * Returns a `StringPiece` mapping the current tensor's buffer.

##Member Details

#### `tensorflow::Tensor::Tensor()`

Default Tensor constructor. Creates a 1-dimension, 0-element float tensor.

#### `tensorflow::Tensor::Tensor(DataType type, const TensorShape &shape)`

Creates a Tensor of the given `type` and `shape`.

The underlying buffer is allocated using a `CPUAllocator`.

#### `tensorflow::Tensor::Tensor(Allocator *a, DataType type, const TensorShape &shape)`

Creates a tensor with the input `type` and `shape`, using the allocator `a` to allocate the underlying buffer.

`a` must outlive the lifetime of this Tensor.

#### `tensorflow::Tensor::Tensor(DataType type)`

Creates an uninitialized Tensor of the given data type.

#### `tensorflow::Tensor::Tensor(const Tensor &other)`

Copy constructor.

#### `tensorflow::Tensor::~Tensor()`

#### `DataType tensorflow::Tensor::dtype() const`

Returns the data type.

#### `const TensorShape& tensorflow::Tensor::shape() const`

Returns the shape of the tensor.

#### `int tensorflow::Tensor::dims() const`

Convenience accessor for the tensor shape.

For all shape accessors, see comments for relevant methods of `TensorShape` in `tensor_shape.h`.
#### `int64 tensorflow::Tensor::dim_size(int d) const`

Convenience accessor for the tensor shape.

#### `int64 tensorflow::Tensor::NumElements() const`

Convenience accessor for the tensor shape.

#### `bool tensorflow::Tensor::IsSameSize(const Tensor &b) const`

#### `bool tensorflow::Tensor::IsInitialized() const`

Has this Tensor been initialized?

#### `size_t tensorflow::Tensor::TotalBytes() const`

Returns the estimated memory usage of this tensor.

#### `Tensor& tensorflow::Tensor::operator=(const Tensor &other)`

Assignment operator. This tensor shares other's underlying storage.

#### `bool tensorflow::Tensor::CopyFrom(const Tensor &other, const TensorShape &shape) TF_MUST_USE_RESULT`

Copy the other tensor into this tensor and reshape it.

This tensor shares other's underlying storage. Returns `true` iff `other.shape()` has the same number of elements as the given `shape`.

#### `Tensor tensorflow::Tensor::Slice(int64 dim0_start, int64 dim0_limit) const`

Slice this tensor along the 1st dimension.

I.e., the returned tensor satisfies `returned[i, ...] == this[dim0_start + i, ...]`. The returned tensor shares the underlying tensor buffer with this tensor.

NOTE: Depending on the shape, the returned tensor may not satisfy the same alignment requirement as this tensor. The caller must check the returned tensor's alignment before calling methods that have an alignment requirement (e.g., `flat()`, `tensor()`).

REQUIRES: `dims() >= 1`

REQUIRES: `0 <= dim0_start <= dim0_limit <= dim_size(0)`

#### `bool tensorflow::Tensor::FromProto(const TensorProto &other) TF_MUST_USE_RESULT`

Parse `other` and construct the tensor.

Returns `true` iff the parsing succeeds. If the parsing fails, the state of `*this` is unchanged.

#### `bool tensorflow::Tensor::FromProto(Allocator *a, const TensorProto &other) TF_MUST_USE_RESULT`

#### `void tensorflow::Tensor::AsProtoField(TensorProto *proto) const`

Fills in `proto` with `*this` tensor's content.
`AsProtoField()` fills in the repeated field for `proto.dtype()`, while `AsProtoTensorContent()` encodes the content in `proto.tensor_content()` in a compact form.

#### `void tensorflow::Tensor::AsProtoTensorContent(TensorProto *proto) const`

#### `TTypes< T >::Vec tensorflow::Tensor::vec()`

Return the tensor data as an `Eigen::Tensor` with the type and sizes of this `Tensor`.

Use these methods when you know the data type and the number of dimensions of the Tensor and you want an `Eigen::Tensor` automatically sized to the `Tensor` sizes. The implementation `CHECK` fails if either the type or the sizes mismatch.

Example:

```c++
typedef float T;
Tensor my_mat(...built with Shape{rows: 3, cols: 5}...);
auto mat = my_mat.matrix<T>();    // 2D Eigen::Tensor, 3 x 5.
auto mat = my_mat.tensor<T, 2>(); // 2D Eigen::Tensor, 3 x 5.
auto vec = my_mat.vec<T>();       // CHECK fails as my_mat is 2D.
auto vec = my_mat.tensor<T, 3>(); // CHECK fails as my_mat is 2D.
auto mat = my_mat.matrix<int32>();// CHECK fails as type mismatch.
```

#### `TTypes< T >::Matrix tensorflow::Tensor::matrix()`

#### `TTypes< T, NDIMS >::Tensor tensorflow::Tensor::tensor()`

#### `TTypes< T >::Flat tensorflow::Tensor::flat()`

Return the tensor data as an `Eigen::Tensor` of the data type and a specified shape.

These methods allow you to access the data with the dimensions and sizes of your choice. You do not need to know the number of dimensions of the Tensor to call them. However, they `CHECK` that the type matches and that the requested dimensions create an `Eigen::Tensor` with the same number of elements as the tensor.
Example:

```c++
typedef float T;
Tensor my_ten(...built with Shape{planes: 4, rows: 3, cols: 5}...);
// 1D Eigen::Tensor, size 60:
auto flat = my_ten.flat<T>();
// 2D Eigen::Tensor 12 x 5:
auto inner = my_ten.flat_inner_dims<T>();
// 2D Eigen::Tensor 4 x 15:
auto outer = my_ten.shaped<T, 2>({4, 15});
// CHECK fails, bad num elements:
auto outer = my_ten.shaped<T, 2>({4, 8});
// 3D Eigen::Tensor 6 x 5 x 2:
auto weird = my_ten.shaped<T, 3>({6, 5, 2});
// CHECK fails, type mismatch:
auto bad   = my_ten.flat<int32>();
```

#### `TTypes< T >::UnalignedFlat tensorflow::Tensor::unaligned_flat()`

#### `TTypes< T >::Matrix tensorflow::Tensor::flat_inner_dims()`

Returns the data as an `Eigen::Tensor` with 2 dimensions, collapsing all Tensor dimensions but the last one into the first dimension of the result.

#### `TTypes< T >::Matrix tensorflow::Tensor::flat_outer_dims()`

Returns the data as an `Eigen::Tensor` with 2 dimensions, collapsing all Tensor dimensions but the first one into the last dimension of the result.

#### `TTypes< T, NDIMS >::Tensor tensorflow::Tensor::shaped(gtl::ArraySlice< int64 > new_sizes)`

#### `TTypes< T, NDIMS >::UnalignedTensor tensorflow::Tensor::unaligned_shaped(gtl::ArraySlice< int64 > new_sizes)`

#### `TTypes< T >::Scalar tensorflow::Tensor::scalar()`

Return the Tensor data as a `TensorMap` of fixed size 1: `TensorMap<TensorFixedSize<T, 1>>`.

Using `scalar()` allows the compiler to perform optimizations as the size of the tensor is known at compile time.

#### `TTypes< T >::ConstVec tensorflow::Tensor::vec() const`

Const versions of all the methods above.
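
The shape arithmetic behind `flat_inner_dims()`, `flat_outer_dims()` and the element-count `CHECK` in `shaped()` can be sketched without TensorFlow. The helper names below (`Dims`, `NumElements`, `FlatInnerDims`, `FlatOuterDims`, `CanReshape`) are hypothetical and exist only for this illustration; the sketch models the dimension bookkeeping and nothing about the underlying buffer or the type check.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

using Dims = std::vector<std::int64_t>;

// Product of all dimension sizes (1 for an empty list).
inline std::int64_t NumElements(const Dims& dims) {
  std::int64_t n = 1;
  for (std::int64_t d : dims) n *= d;
  return n;
}

// Collapses every dimension except the last into the first result dimension.
inline std::pair<std::int64_t, std::int64_t> FlatInnerDims(const Dims& dims) {
  assert(!dims.empty());
  std::int64_t outer = 1;
  for (std::size_t i = 0; i + 1 < dims.size(); ++i) outer *= dims[i];
  return {outer, dims.back()};
}

// Collapses every dimension except the first into the last result dimension.
inline std::pair<std::int64_t, std::int64_t> FlatOuterDims(const Dims& dims) {
  assert(!dims.empty());
  std::int64_t inner = 1;
  for (std::size_t i = 1; i < dims.size(); ++i) inner *= dims[i];
  return {dims.front(), inner};
}

// The element-count condition that a reshape to `new_sizes` has to satisfy.
inline bool CanReshape(const Dims& dims, const Dims& new_sizes) {
  return NumElements(dims) == NumElements(new_sizes);
}
```

For the `{4, 3, 5}` shape used in the example above, `FlatInnerDims` gives 12 x 5 and `FlatOuterDims` gives 4 x 15, while `CanReshape` accepts `{6, 5, 2}` and rejects `{4, 8}` because 32 is not 60.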
#### `TTypes< T >::ConstMatrix tensorflow::Tensor::matrix() const`

#### `TTypes< T, NDIMS >::ConstTensor tensorflow::Tensor::tensor() const`

#### `TTypes< T >::ConstFlat tensorflow::Tensor::flat() const`

#### `TTypes< T >::UnalignedConstFlat tensorflow::Tensor::unaligned_flat() const`

#### `TTypes< T >::ConstMatrix tensorflow::Tensor::flat_inner_dims() const`

#### `TTypes< T >::ConstMatrix tensorflow::Tensor::flat_outer_dims() const`

#### `TTypes< T, NDIMS >::ConstTensor tensorflow::Tensor::shaped(gtl::ArraySlice< int64 > new_sizes) const`

#### `TTypes< T, NDIMS >::UnalignedConstTensor tensorflow::Tensor::unaligned_shaped(gtl::ArraySlice< int64 > new_sizes) const`

#### `TTypes< T >::ConstScalar tensorflow::Tensor::scalar() const`

#### `string tensorflow::Tensor::SummarizeValue(int64 max_entries) const`

Render the first `max_entries` values in `*this` into a string.

#### `string tensorflow::Tensor::DebugString() const`

A human-readable summary of the tensor suitable for debugging.

#### `void tensorflow::Tensor::FillDescription(TensorDescription *description) const`

Fill in the `TensorDescription` proto with metadata about the tensor that is useful for monitoring and debugging.

#### `StringPiece tensorflow::Tensor::tensor_data() const`

Returns a `StringPiece` mapping the current tensor's buffer.

The returned `StringPiece` may point to a memory location on devices that the CPU cannot address directly.

NOTE: The underlying tensor buffer is refcounted, so the lifetime of the contents mapped by the `StringPiece` matches the lifetime of the buffer; callers should arrange to make sure the buffer does not get destroyed while the `StringPiece` is still used.

REQUIRES: `DataTypeCanUseMemcpy(dtype())`.
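
Returning to `Slice()`: its contract (`returned[i, ...] == this[dim0_start + i, ...]`, with the buffer shared rather than copied) amounts to pointer arithmetic on a row-major buffer. The sketch below assumes row-major storage and uses a hypothetical `RowView` struct and `SliceRows` function that are not part of TensorFlow. It also suggests why a slice may start at a differently aligned address than its parent, since its data pointer is offset into the same storage.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical view over a row-major float buffer; not a TensorFlow type.
struct RowView {
  const float* data;        // points into storage owned elsewhere
  std::int64_t rows;        // size of dimension 0
  std::int64_t row_stride;  // number of elements per index of dimension 0
};

// Returns a view of rows [dim0_start, dim0_limit) that shares t's storage.
// Mirrors REQUIRES: 0 <= dim0_start <= dim0_limit <= dim_size(0).
inline RowView SliceRows(const RowView& t, std::int64_t dim0_start,
                         std::int64_t dim0_limit) {
  assert(0 <= dim0_start && dim0_start <= dim0_limit && dim0_limit <= t.rows);
  return RowView{t.data + dim0_start * t.row_stride,
                 dim0_limit - dim0_start, t.row_stride};
}
```

No elements are copied: the slice is only a new starting pointer and a new row count over the parent's buffer.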
================================================ FILE: SOURCE/api_docs/cc/ClassTensorBuffer.md ================================================ # Class `tensorflow::TensorBuffer` ##Member Summary * [`tensorflow::TensorBuffer::~TensorBuffer() override`](#tensorflow_TensorBuffer_TensorBuffer) * [`virtual void* tensorflow::TensorBuffer::data() const =0`](#virtual_void_tensorflow_TensorBuffer_data) * [`virtual size_t tensorflow::TensorBuffer::size() const =0`](#virtual_size_t_tensorflow_TensorBuffer_size) * [`virtual TensorBuffer* tensorflow::TensorBuffer::root_buffer()=0`](#virtual_TensorBuffer_tensorflow_TensorBuffer_root_buffer) * [`virtual void tensorflow::TensorBuffer::FillAllocationDescription(AllocationDescription *proto) const =0`](#virtual_void_tensorflow_TensorBuffer_FillAllocationDescription) * [`T* tensorflow::TensorBuffer::base() const`](#T_tensorflow_TensorBuffer_base) ##Member Details #### `tensorflow::TensorBuffer::~TensorBuffer() override` #### `virtual void* tensorflow::TensorBuffer::data() const =0` #### `virtual size_t tensorflow::TensorBuffer::size() const =0` #### `virtual TensorBuffer* tensorflow::TensorBuffer::root_buffer()=0` #### `virtual void tensorflow::TensorBuffer::FillAllocationDescription(AllocationDescription *proto) const =0` #### `T* tensorflow::TensorBuffer::base() const` ================================================ FILE: SOURCE/api_docs/cc/ClassTensorShape.md ================================================ # Class `tensorflow::TensorShape` Manages the dimensions of a Tensor and their sizes. ##Member Summary * [`tensorflow::TensorShape::TensorShape(gtl::ArraySlice< int64 > dim_sizes)`](#tensorflow_TensorShape_TensorShape) * Construct a ` TensorShape ` from the provided sizes. 
REQUIRES: `dim_sizes[i] >= 0`
* [`tensorflow::TensorShape::TensorShape(std::initializer_list< int64 > dim_sizes)`](#tensorflow_TensorShape_TensorShape)
* [`tensorflow::TensorShape::TensorShape(const TensorShapeProto &proto)`](#tensorflow_TensorShape_TensorShape)
  * REQUIRES: `IsValid(proto)`
* [`tensorflow::TensorShape::TensorShape()`](#tensorflow_TensorShape_TensorShape)
* [`void tensorflow::TensorShape::Clear()`](#void_tensorflow_TensorShape_Clear)
  * Clear a tensor shape.
* [`void tensorflow::TensorShape::AddDim(int64 size)`](#void_tensorflow_TensorShape_AddDim)
  * Add a dimension to the end ("inner-most"). REQUIRES: `size >= 0`
* [`void tensorflow::TensorShape::AppendShape(const TensorShape &shape)`](#void_tensorflow_TensorShape_AppendShape)
  * Appends all the dimensions from `shape`.
* [`void tensorflow::TensorShape::InsertDim(int d, int64 size)`](#void_tensorflow_TensorShape_InsertDim)
  * Insert a dimension somewhere in the `TensorShape`. REQUIRES: `0 <= d <= dims()` REQUIRES: `size >= 0`
* [`void tensorflow::TensorShape::set_dim(int d, int64 size)`](#void_tensorflow_TensorShape_set_dim)
  * Modifies the size of the dimension `d` to be `size`. REQUIRES: `0 <= d < dims()` REQUIRES: `size >= 0`
* [`void tensorflow::TensorShape::RemoveDim(int d)`](#void_tensorflow_TensorShape_RemoveDim)
  * Removes dimension `d` from the `TensorShape`. REQUIRES: `0 <= d < dims()`
* [`int tensorflow::TensorShape::dims() const`](#int_tensorflow_TensorShape_dims)
  * Return the number of dimensions in the tensor.
* [`int64 tensorflow::TensorShape::dim_size(int d) const`](#int64_tensorflow_TensorShape_dim_size)
  * Returns the number of elements in dimension `d`. REQUIRES: `0 <= d < dims()`
* [`gtl::ArraySlice< int64 > tensorflow::TensorShape::dim_sizes() const`](#gtl_ArraySlice_int64_tensorflow_TensorShape_dim_sizes)
  * Returns sizes of all dimensions.
* [`int64 tensorflow::TensorShape::num_elements() const`](#int64_tensorflow_TensorShape_num_elements) * Returns the number of elements in the tensor. * [`bool tensorflow::TensorShape::IsSameSize(const TensorShape &b) const`](#bool_tensorflow_TensorShape_IsSameSize) * [`bool tensorflow::TensorShape::operator==(const TensorShape &b) const`](#bool_tensorflow_TensorShape_operator_) * [`void tensorflow::TensorShape::AsProto(TensorShapeProto *proto) const`](#void_tensorflow_TensorShape_AsProto) * Fill `*proto` from `*this`. * [`Eigen::DSizes< Eigen::DenseIndex, NDIMS > tensorflow::TensorShape::AsEigenDSizes() const`](#Eigen_DSizes_Eigen_DenseIndex_NDIMS_tensorflow_TensorShape_AsEigenDSizes) * Fill `*dsizes` from `*this`. * [`Eigen::DSizes< Eigen::DenseIndex, NDIMS > tensorflow::TensorShape::AsEigenDSizesWithPadding() const`](#Eigen_DSizes_Eigen_DenseIndex_NDIMS_tensorflow_TensorShape_AsEigenDSizesWithPadding) * [`TensorShapeIter tensorflow::TensorShape::begin() const`](#TensorShapeIter_tensorflow_TensorShape_begin) * For iterating through the dimensions. * [`TensorShapeIter tensorflow::TensorShape::end() const`](#TensorShapeIter_tensorflow_TensorShape_end) * [`string tensorflow::TensorShape::DebugString() const`](#string_tensorflow_TensorShape_DebugString) * For error messages. * [`string tensorflow::TensorShape::ShortDebugString() const`](#string_tensorflow_TensorShape_ShortDebugString) * [`static bool tensorflow::TensorShape::IsValid(const TensorShapeProto &proto)`](#static_bool_tensorflow_TensorShape_IsValid) * Returns `true` iff `proto` is a valid tensor shape. ##Member Details #### `tensorflow::TensorShape::TensorShape(gtl::ArraySlice< int64 > dim_sizes)` Construct a ` TensorShape ` from the provided sizes. 
REQUIRES: `dim_sizes[i] >= 0`

#### `tensorflow::TensorShape::TensorShape(std::initializer_list< int64 > dim_sizes)`

#### `tensorflow::TensorShape::TensorShape(const TensorShapeProto &proto)`

REQUIRES: `IsValid(proto)`

#### `tensorflow::TensorShape::TensorShape()`

Create a tensor shape with no dimensions and one element, which you can then call `AddDim()` on.

#### `void tensorflow::TensorShape::Clear()`

Clear a tensor shape.

#### `void tensorflow::TensorShape::AddDim(int64 size)`

Add a dimension to the end ("inner-most"). REQUIRES: `size >= 0`

#### `void tensorflow::TensorShape::AppendShape(const TensorShape &shape)`

Appends all the dimensions from `shape`.

#### `void tensorflow::TensorShape::InsertDim(int d, int64 size)`

Insert a dimension somewhere in the `TensorShape`. REQUIRES: `0 <= d <= dims()` REQUIRES: `size >= 0`

#### `void tensorflow::TensorShape::set_dim(int d, int64 size)`

Modifies the size of the dimension `d` to be `size`. REQUIRES: `0 <= d < dims()` REQUIRES: `size >= 0`

#### `void tensorflow::TensorShape::RemoveDim(int d)`

Removes dimension `d` from the `TensorShape`. REQUIRES: `0 <= d < dims()`

#### `int tensorflow::TensorShape::dims() const`

Return the number of dimensions in the tensor.

#### `int64 tensorflow::TensorShape::dim_size(int d) const`

Returns the number of elements in dimension `d`. REQUIRES: `0 <= d < dims()`

#### `gtl::ArraySlice< int64 > tensorflow::TensorShape::dim_sizes() const`

Returns sizes of all dimensions.

#### `int64 tensorflow::TensorShape::num_elements() const`

Returns the number of elements in the tensor.

We use `int64` and not `size_t` to be compatible with `Eigen::Tensor`, which uses `ptrdiff_t`.

#### `bool tensorflow::TensorShape::IsSameSize(const TensorShape &b) const`

Returns true if `*this` and `b` have the same sizes. Ignores dimension names.
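
The mutators and accessors above can be modelled with a plain vector of sizes. This is a rough sketch: `MiniShape` is a hypothetical stand-in written for this illustration and not `tensorflow::TensorShape`, and it uses `assert` where the real class states a `REQUIRES` precondition. It also shows why a default-constructed shape has one element: the product over zero dimensions is 1.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for tensorflow::TensorShape; NOT the real class.
class MiniShape {
 public:
  // Appends a dimension at the end ("inner-most").
  void AddDim(std::int64_t size) {
    assert(size >= 0);
    dims_.push_back(size);
  }

  // Inserts a dimension before position d; d == dims() appends.
  void InsertDim(int d, std::int64_t size) {
    assert(0 <= d && d <= dims());
    assert(size >= 0);
    dims_.insert(dims_.begin() + d, size);
  }

  // Removes dimension d.
  void RemoveDim(int d) {
    assert(0 <= d && d < dims());
    dims_.erase(dims_.begin() + d);
  }

  int dims() const { return static_cast<int>(dims_.size()); }

  std::int64_t dim_size(int d) const {
    assert(0 <= d && d < dims());
    return dims_[static_cast<std::size_t>(d)];
  }

  // Product of all sizes; 1 when there are no dimensions.
  std::int64_t num_elements() const {
    std::int64_t n = 1;
    for (std::int64_t s : dims_) n *= s;
    return n;
  }

 private:
  std::vector<std::int64_t> dims_;
};
```

Starting from an empty `MiniShape`, `AddDim(3)` then `AddDim(5)` gives 15 elements, `InsertDim(0, 4)` raises that to 60, and `RemoveDim(1)` leaves a 4 x 5 shape with 20 elements.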
#### `bool tensorflow::TensorShape::operator==(const TensorShape &b) const` #### `void tensorflow::TensorShape::AsProto(TensorShapeProto *proto) const` Fill `*proto` from `*this`. #### `Eigen::DSizes< Eigen::DenseIndex, NDIMS > tensorflow::TensorShape::AsEigenDSizes() const` Fill `*dsizes` from `*this`. #### `Eigen::DSizes< Eigen::DenseIndex, NDIMS > tensorflow::TensorShape::AsEigenDSizesWithPadding() const` Same as ` AsEigenDSizes() ` but allows for `NDIMS > dims() ` in which case we pad the rest of the sizes with 1. #### `TensorShapeIter tensorflow::TensorShape::begin() const` For iterating through the dimensions. #### `TensorShapeIter tensorflow::TensorShape::end() const` #### `string tensorflow::TensorShape::DebugString() const` For error messages. #### `string tensorflow::TensorShape::ShortDebugString() const` #### `static bool tensorflow::TensorShape::IsValid(const TensorShapeProto &proto)` Returns `true` iff `proto` is a valid tensor shape. ================================================ FILE: SOURCE/api_docs/cc/ClassTensorShapeIter.md ================================================ # Class `tensorflow::TensorShapeIter` ##Member Summary * [`tensorflow::TensorShapeIter::TensorShapeIter(const TensorShape *shape, int d)`](#tensorflow_TensorShapeIter_TensorShapeIter) * [`bool tensorflow::TensorShapeIter::operator==(const TensorShapeIter &rhs)`](#bool_tensorflow_TensorShapeIter_operator_) * [`bool tensorflow::TensorShapeIter::operator!=(const TensorShapeIter &rhs)`](#bool_tensorflow_TensorShapeIter_operator_) * [`void tensorflow::TensorShapeIter::operator++()`](#void_tensorflow_TensorShapeIter_operator_) * [`TensorShapeDim tensorflow::TensorShapeIter::operator*()`](#TensorShapeDim_tensorflow_TensorShapeIter_operator_) ##Member Details #### `tensorflow::TensorShapeIter::TensorShapeIter(const TensorShape *shape, int d)` #### `bool tensorflow::TensorShapeIter::operator==(const TensorShapeIter &rhs)` #### `bool tensorflow::TensorShapeIter::operator!=(const 
TensorShapeIter &rhs)` #### `void tensorflow::TensorShapeIter::operator++()` #### `TensorShapeDim tensorflow::TensorShapeIter::operator*()` ================================================ FILE: SOURCE/api_docs/cc/ClassTensorShapeUtils.md ================================================ # Class `tensorflow::TensorShapeUtils` Static helper routines for ` TensorShape `. Includes a few common predicates on a tensor shape. ##Member Summary * [`static bool tensorflow::TensorShapeUtils::IsScalar(const TensorShape &shape)`](#static_bool_tensorflow_TensorShapeUtils_IsScalar) * [`static bool tensorflow::TensorShapeUtils::IsVector(const TensorShape &shape)`](#static_bool_tensorflow_TensorShapeUtils_IsVector) * [`static bool tensorflow::TensorShapeUtils::IsLegacyScalar(const TensorShape &shape)`](#static_bool_tensorflow_TensorShapeUtils_IsLegacyScalar) * [`static bool tensorflow::TensorShapeUtils::IsLegacyVector(const TensorShape &shape)`](#static_bool_tensorflow_TensorShapeUtils_IsLegacyVector) * [`static bool tensorflow::TensorShapeUtils::IsVectorOrHigher(const TensorShape &shape)`](#static_bool_tensorflow_TensorShapeUtils_IsVectorOrHigher) * [`static bool tensorflow::TensorShapeUtils::IsMatrix(const TensorShape &shape)`](#static_bool_tensorflow_TensorShapeUtils_IsMatrix) * [`static bool tensorflow::TensorShapeUtils::IsMatrixOrHigher(const TensorShape &shape)`](#static_bool_tensorflow_TensorShapeUtils_IsMatrixOrHigher) * [`static TensorShape tensorflow::TensorShapeUtils::MakeShape(const T *dims, int n)`](#static_TensorShape_tensorflow_TensorShapeUtils_MakeShape) * Returns a ` TensorShape ` whose dimensions are `dims[0]`, `dims[1]`, ..., `dims[n-1]`. 
* [`static string tensorflow::TensorShapeUtils::ShapeListString(const gtl::ArraySlice< TensorShape > &shapes)`](#static_string_tensorflow_TensorShapeUtils_ShapeListString) * [`static bool tensorflow::TensorShapeUtils::StartsWith(const TensorShape &shape0, const TensorShape &shape1)`](#static_bool_tensorflow_TensorShapeUtils_StartsWith) ##Member Details #### `static bool tensorflow::TensorShapeUtils::IsScalar(const TensorShape &shape)` #### `static bool tensorflow::TensorShapeUtils::IsVector(const TensorShape &shape)` #### `static bool tensorflow::TensorShapeUtils::IsLegacyScalar(const TensorShape &shape)` #### `static bool tensorflow::TensorShapeUtils::IsLegacyVector(const TensorShape &shape)` #### `static bool tensorflow::TensorShapeUtils::IsVectorOrHigher(const TensorShape &shape)` #### `static bool tensorflow::TensorShapeUtils::IsMatrix(const TensorShape &shape)` #### `static bool tensorflow::TensorShapeUtils::IsMatrixOrHigher(const TensorShape &shape)` #### `static TensorShape tensorflow::TensorShapeUtils::MakeShape(const T *dims, int n)` Returns a ` TensorShape ` whose dimensions are `dims[0]`, `dims[1]`, ..., `dims[n-1]`. #### `static string tensorflow::TensorShapeUtils::ShapeListString(const gtl::ArraySlice< TensorShape > &shapes)` #### `static bool tensorflow::TensorShapeUtils::StartsWith(const TensorShape &shape0, const TensorShape &shape1)` ================================================ FILE: SOURCE/api_docs/cc/ClassThread.md ================================================ # Class `tensorflow::Thread` ##Member Summary * [`tensorflow::Thread::Thread()`](#tensorflow_Thread_Thread) * [`virtual tensorflow::Thread::~Thread()`](#virtual_tensorflow_Thread_Thread) * Blocks until the thread of control stops running. ##Member Details #### `tensorflow::Thread::Thread()` #### `virtual tensorflow::Thread::~Thread()` Blocks until the thread of control stops running. 
================================================ FILE: SOURCE/api_docs/cc/ClassWritableFile.md ================================================ # Class `tensorflow::WritableFile` A file abstraction for sequential writing. The implementation must provide buffering since callers may append small fragments at a time to the file. ##Member Summary * [`tensorflow::WritableFile::WritableFile()`](#tensorflow_WritableFile_WritableFile) * [`virtual tensorflow::WritableFile::~WritableFile()`](#virtual_tensorflow_WritableFile_WritableFile) * [`virtual Status tensorflow::WritableFile::Append(const StringPiece &data)=0`](#virtual_Status_tensorflow_WritableFile_Append) * [`virtual Status tensorflow::WritableFile::Close()=0`](#virtual_Status_tensorflow_WritableFile_Close) * [`virtual Status tensorflow::WritableFile::Flush()=0`](#virtual_Status_tensorflow_WritableFile_Flush) * [`virtual Status tensorflow::WritableFile::Sync()=0`](#virtual_Status_tensorflow_WritableFile_Sync) ##Member Details #### `tensorflow::WritableFile::WritableFile()` #### `virtual tensorflow::WritableFile::~WritableFile()` #### `virtual Status tensorflow::WritableFile::Append(const StringPiece &data)=0` #### `virtual Status tensorflow::WritableFile::Close()=0` #### `virtual Status tensorflow::WritableFile::Flush()=0` #### `virtual Status tensorflow::WritableFile::Sync()=0` ================================================ FILE: SOURCE/api_docs/cc/StructSessionOptions.md ================================================ # Struct `tensorflow::SessionOptions` Configuration information for a Session . ##Member Summary * [`Env* tensorflow::SessionOptions::env`](#Env_tensorflow_SessionOptions_env) * The environment to use. * [`string tensorflow::SessionOptions::target`](#string_tensorflow_SessionOptions_target) * The TensorFlow runtime to connect to. * [`ConfigProto tensorflow::SessionOptions::config`](#ConfigProto_tensorflow_SessionOptions_config) * Configuration options. 
* [`tensorflow::SessionOptions::SessionOptions()`](#tensorflow_SessionOptions_SessionOptions) ##Member Details #### `Env* tensorflow::SessionOptions::env` The environment to use. #### `string tensorflow::SessionOptions::target` The TensorFlow runtime to connect to. If 'target' is empty or unspecified, the local TensorFlow runtime implementation will be used. Otherwise, the TensorFlow engine defined by 'target' will be used to perform all computations. "target" can be either a single entry or a comma separated list of entries. Each entry is a resolvable address of the following format: local ip:port host:port ... other system-specific formats to identify tasks and jobs ... NOTE: at the moment 'local' maps to an in-process service-based runtime. Upon creation, a single session affines itself to one of the remote processes, with possible load balancing choices when the "target" resolves to a list of possible processes. If the session disconnects from the remote process during its lifetime, session calls may fail immediately. #### `ConfigProto tensorflow::SessionOptions::config` Configuration options. 
#### `tensorflow::SessionOptions::SessionOptions()` ================================================ FILE: SOURCE/api_docs/cc/StructState.md ================================================ # Struct `tensorflow::Status::State` ##Member Summary * [`tensorflow::error::Code tensorflow::Status::State::code`](#tensorflow_error_Code_tensorflow_Status_State_code) * [`string tensorflow::Status::State::msg`](#string_tensorflow_Status_State_msg) ##Member Details #### `tensorflow::error::Code tensorflow::Status::State::code` #### `string tensorflow::Status::State::msg` ================================================ FILE: SOURCE/api_docs/cc/StructTensorShapeDim.md ================================================ # Struct `tensorflow::TensorShapeDim` ##Member Summary * [`int tensorflow::TensorShapeDim::size`](#int_tensorflow_TensorShapeDim_size) * [`tensorflow::TensorShapeDim::TensorShapeDim(int64 s)`](#tensorflow_TensorShapeDim_TensorShapeDim) ##Member Details #### `int tensorflow::TensorShapeDim::size` #### `tensorflow::TensorShapeDim::TensorShapeDim(int64 s)` ================================================ FILE: SOURCE/api_docs/cc/StructThreadOptions.md ================================================ # Struct `tensorflow::ThreadOptions` Options to configure a Thread . Note that the options are all hints, and the underlying implementation may choose to ignore it. ##Member Summary * [`size_t tensorflow::ThreadOptions::stack_size`](#size_t_tensorflow_ThreadOptions_stack_size) * Thread stack size to use (in bytes). * [`size_t tensorflow::ThreadOptions::guard_size`](#size_t_tensorflow_ThreadOptions_guard_size) * Guard area size to use near thread stacks to use (in bytes) ##Member Details #### `size_t tensorflow::ThreadOptions::stack_size` Thread stack size to use (in bytes). 
#### `size_t tensorflow::ThreadOptions::guard_size`

Guard area size to use near thread stacks (in bytes).

================================================
FILE: SOURCE/api_docs/cc/index.md
================================================
# TensorFlow C++ Session API reference documentation

TensorFlow's public C++ API includes only the API for executing graphs, as of version 0.5. To control the execution of a graph from C++:

1. Build the computation graph using the [Python API](../../api_docs/python/).
1. Use [tf.train.write_graph()](../../api_docs/python/train.md#write_graph) to write the graph to a file.
1. Load the graph using the C++ Session API. For example:

  ```c++
  // Reads a model graph definition from disk, and creates a session object you
  // can use to run it.
  Status LoadGraph(string graph_file_name, Session** session) {
    GraphDef graph_def;
    TF_RETURN_IF_ERROR(
        ReadBinaryProto(Env::Default(), graph_file_name, &graph_def));
    TF_RETURN_IF_ERROR(NewSession(SessionOptions(), session));
    TF_RETURN_IF_ERROR((*session)->Create(graph_def));
    return Status::OK();
  }
  ```

1. 
Run the graph with a call to `session->Run()`.

##Classes

* [tensorflow::Env](../../api_docs/cc/ClassEnv.md)
* [tensorflow::EnvWrapper](../../api_docs/cc/ClassEnvWrapper.md)
* [tensorflow::RandomAccessFile](../../api_docs/cc/ClassRandomAccessFile.md)
* [tensorflow::Session](../../api_docs/cc/ClassSession.md)
* [tensorflow::Status](../../api_docs/cc/ClassStatus.md)
* [tensorflow::Tensor](../../api_docs/cc/ClassTensor.md)
* [tensorflow::TensorBuffer](../../api_docs/cc/ClassTensorBuffer.md)
* [tensorflow::TensorShape](../../api_docs/cc/ClassTensorShape.md)
* [tensorflow::TensorShapeIter](../../api_docs/cc/ClassTensorShapeIter.md)
* [tensorflow::TensorShapeUtils](../../api_docs/cc/ClassTensorShapeUtils.md)
* [tensorflow::Thread](../../api_docs/cc/ClassThread.md)
* [tensorflow::WritableFile](../../api_docs/cc/ClassWritableFile.md)

##Structs

* [tensorflow::SessionOptions](../../api_docs/cc/StructSessionOptions.md)
* [tensorflow::Status::State](../../api_docs/cc/StructState.md)
* [tensorflow::TensorShapeDim](../../api_docs/cc/StructTensorShapeDim.md)
* [tensorflow::ThreadOptions](../../api_docs/cc/StructThreadOptions.md)

================================================
FILE: SOURCE/api_docs/index.md
================================================
# Overview

TensorFlow has APIs available in several languages, both for constructing and for executing a TensorFlow graph. The Python API is at present the most complete and the easiest to use, but the C++ API may offer some performance advantages in graph execution, and supports deployment to small devices such as Android.

Over time, we hope that the TensorFlow community will develop front ends for languages like Go, Java, JavaScript, Lua, R, and perhaps others. With [SWIG](http://swig.org), it's relatively easy to develop a TensorFlow interface for your favorite language.
Note: Many practical aspects of usage are covered in the Mechanics tab, and some additional documentation not specific to any particular language API is available in the Resources tab. ================================================ FILE: SOURCE/api_docs/python/array_ops.md ================================================ # Tensor Transformations Note: Functions taking `Tensor` arguments can also take anything accepted by [`tf.convert_to_tensor`](../../api_docs/python/framework.md#convert_to_tensor). ## Contents ### [Tensor Transformations](#AUTOGENERATED-tensor-transformations) * [Casting](#AUTOGENERATED-casting) * [`tf.string_to_number(string_tensor, out_type=None, name=None)`](#string_to_number) * [`tf.to_double(x, name='ToDouble')`](#to_double) * [`tf.to_float(x, name='ToFloat')`](#to_float) * [`tf.to_bfloat16(x, name='ToBFloat16')`](#to_bfloat16) * [`tf.to_int32(x, name='ToInt32')`](#to_int32) * [`tf.to_int64(x, name='ToInt64')`](#to_int64) * [`tf.cast(x, dtype, name=None)`](#cast) * [Shapes and Shaping](#AUTOGENERATED-shapes-and-shaping) * [`tf.shape(input, name=None)`](#shape) * [`tf.size(input, name=None)`](#size) * [`tf.rank(input, name=None)`](#rank) * [`tf.reshape(tensor, shape, name=None)`](#reshape) * [`tf.squeeze(input, squeeze_dims=None, name=None)`](#squeeze) * [`tf.expand_dims(input, dim, name=None)`](#expand_dims) * [Slicing and Joining](#AUTOGENERATED-slicing-and-joining) * [`tf.slice(input_, begin, size, name=None)`](#slice) * [`tf.split(split_dim, num_split, value, name='split')`](#split) * [`tf.tile(input, multiples, name=None)`](#tile) * [`tf.pad(input, paddings, name=None)`](#pad) * [`tf.concat(concat_dim, values, name='concat')`](#concat) * [`tf.pack(values, name='pack')`](#pack) * [`tf.unpack(value, num=None, name='unpack')`](#unpack) * [`tf.reverse_sequence(input, seq_lengths, seq_dim, name=None)`](#reverse_sequence) * [`tf.reverse(tensor, dims, name=None)`](#reverse) * [`tf.transpose(a, perm=None, name='transpose')`](#transpose) * 
[`tf.gather(params, indices, name=None)`](#gather) * [`tf.dynamic_partition(data, partitions, num_partitions, name=None)`](#dynamic_partition) * [`tf.dynamic_stitch(indices, data, name=None)`](#dynamic_stitch) ## Casting TensorFlow provides several operations that you can use to cast tensor data types in your graph. - - - ### `tf.string_to_number(string_tensor, out_type=None, name=None)` Converts each string in the input Tensor to the specified numeric type. (Note that int32 overflow results in an error while float overflow results in a rounded value.) ##### Args: * `string_tensor`: A `Tensor` of type `string`. * `out_type`: An optional `tf.DType` from: `tf.float32, tf.int32`. Defaults to `tf.float32`. The numeric type to interpret each string in string_tensor as. * `name`: A name for the operation (optional). ##### Returns: A `Tensor` of type `out_type`. A Tensor of the same shape as the input string_tensor. - - - ### `tf.to_double(x, name='ToDouble')` Casts a tensor to type `float64`. ##### Args: * `x`: A `Tensor` or `SparseTensor`. * `name`: A name for the operation (optional). ##### Returns: A `Tensor` or `SparseTensor` with same shape as `x` with type `float64`. ##### Raises: * `TypeError`: If `x` cannot be cast to the `float64`. - - - ### `tf.to_float(x, name='ToFloat')` Casts a tensor to type `float32`. ##### Args: * `x`: A `Tensor` or `SparseTensor`. * `name`: A name for the operation (optional). ##### Returns: A `Tensor` or `SparseTensor` with same shape as `x` with type `float32`. ##### Raises: * `TypeError`: If `x` cannot be cast to the `float32`. - - - ### `tf.to_bfloat16(x, name='ToBFloat16')` Casts a tensor to type `bfloat16`. ##### Args: * `x`: A `Tensor` or `SparseTensor`. * `name`: A name for the operation (optional). ##### Returns: A `Tensor` or `SparseTensor` with same shape as `x` with type `bfloat16`. ##### Raises: * `TypeError`: If `x` cannot be cast to the `bfloat16`. - - - ### `tf.to_int32(x, name='ToInt32')` Casts a tensor to type `int32`. 
##### Args:

* `x`: A `Tensor` or `SparseTensor`.
* `name`: A name for the operation (optional).

##### Returns:

A `Tensor` or `SparseTensor` with same shape as `x` with type `int32`.

##### Raises:

* `TypeError`: If `x` cannot be cast to the `int32`.

- - -

### `tf.to_int64(x, name='ToInt64')`

Casts a tensor to type `int64`.

##### Args:

* `x`: A `Tensor` or `SparseTensor`.
* `name`: A name for the operation (optional).

##### Returns:

A `Tensor` or `SparseTensor` with same shape as `x` with type `int64`.

##### Raises:

* `TypeError`: If `x` cannot be cast to the `int64`.

- - -

### `tf.cast(x, dtype, name=None)`

Casts a tensor to a new type.

The operation casts `x` (in case of `Tensor`) or `x.values` (in case of `SparseTensor`) to `dtype`.

For example:

```python
# tensor `a` is [1.8, 2.2], dtype=tf.float32
tf.cast(a, tf.int32) ==> [1, 2]  # dtype=tf.int32
```

##### Args:

* `x`: A `Tensor` or `SparseTensor`.
* `dtype`: The destination type.
* `name`: A name for the operation (optional).

##### Returns:

A `Tensor` or `SparseTensor` with same shape as `x`.

##### Raises:

* `TypeError`: If `x` cannot be cast to the `dtype`.

## Shapes and Shaping

TensorFlow provides several operations that you can use to determine the shape of a tensor and change the shape of a tensor.

- - -

### `tf.shape(input, name=None)`

Returns the shape of a tensor.

This operation returns a 1-D integer tensor representing the shape of `input`.

For example:

```prettyprint
# 't' is [[[1, 1, 1], [2, 2, 2]], [[3, 3, 3], [4, 4, 4]]]
shape(t) ==> [2, 2, 3]
```

##### Args:

* `input`: A `Tensor`.
* `name`: A name for the operation (optional).

##### Returns:

A `Tensor` of type `int32`.

- - -

### `tf.size(input, name=None)`

Returns the size of a tensor.

This operation returns an integer representing the number of elements in `input`.

For example:

```prettyprint
# 't' is [[[1, 1, 1], [2, 2, 2]], [[3, 3, 3], [4, 4, 4]]]
size(t) ==> 12
```

##### Args:

* `input`: A `Tensor`.
* `name`: A name for the operation (optional).
##### Returns:

A `Tensor` of type `int32`.

- - -

### `tf.rank(input, name=None)`

Returns the rank of a tensor.

This operation returns an integer representing the rank of `input`.

For example:

```prettyprint
# 't' is [[[1, 1, 1], [2, 2, 2]], [[3, 3, 3], [4, 4, 4]]]
# shape of tensor 't' is [2, 2, 3]
rank(t) ==> 3
```

**Note**: The rank of a tensor is not the same as the rank of a matrix. The rank of a tensor is the number of indices required to uniquely select each element of the tensor. Rank is also known as "order", "degree", or "ndims."

##### Args:

* `input`: A `Tensor`.
* `name`: A name for the operation (optional).

##### Returns:

A `Tensor` of type `int32`.

- - -

### `tf.reshape(tensor, shape, name=None)`

Reshapes a tensor.

Given `tensor`, this operation returns a tensor that has the same values as `tensor` with shape `shape`.

If `shape` is the special value `[-1]`, then `tensor` is flattened and the operation outputs a 1-D tensor with all elements of `tensor`.

If `shape` is 1-D or higher, then the operation returns a tensor with shape `shape` filled with the values of `tensor`. In this case, the number of elements implied by `shape` must be the same as the number of elements in `tensor`.

For example:

```prettyprint
# tensor 't' is [1, 2, 3, 4, 5, 6, 7, 8, 9]
# tensor 't' has shape [9]
reshape(t, [3, 3]) ==> [[1, 2, 3]
                        [4, 5, 6]
                        [7, 8, 9]]

# tensor 't' is [[[1, 1], [2, 2]]
#                [[3, 3], [4, 4]]]
# tensor 't' has shape [2, 2, 2]
reshape(t, [2, 4]) ==> [[1, 1, 2, 2]
                        [3, 3, 4, 4]]

# tensor 't' is [[[1, 1, 1],
#                 [2, 2, 2]],
#                [[3, 3, 3],
#                 [4, 4, 4]],
#                [[5, 5, 5],
#                 [6, 6, 6]]]
# tensor 't' has shape [3, 2, 3]
# pass '[-1]' to flatten 't'
reshape(t, [-1]) ==> [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 6, 6, 6]
```

##### Args:

* `tensor`: A `Tensor`.
* `shape`: A `Tensor` of type `int32`. Defines the shape of the output tensor.
* `name`: A name for the operation (optional).

##### Returns:

A `Tensor`. Has the same type as `tensor`.
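
The reshape rule described above (same values in row-major order, regrouped to the requested shape) can be modelled in a few lines of plain Python. This is only a sketch of the documented semantics on nested lists, not TensorFlow's implementation; the helper names `flatten` and `reshape` are hypothetical and exist only for this illustration.

```python
def flatten(nested):
    """Return the scalars of an arbitrarily nested list in row-major order."""
    if not isinstance(nested, list):
        return [nested]
    flat = []
    for item in nested:
        flat.extend(flatten(item))
    return flat


def reshape(nested, shape):
    """Pure-Python model of the documented reshape semantics (illustration only)."""
    flat = flatten(nested)
    if list(shape) == [-1]:
        # The special value [-1] flattens the input.
        return flat

    # The element count implied by `shape` must match the input.
    count = 1
    for dim in shape:
        count *= dim
    if count != len(flat):
        raise ValueError("shape %r implies %d elements, input has %d"
                         % (list(shape), count, len(flat)))

    def build(values, dims):
        if not dims:
            return values[0]  # rank-0 result: a single scalar
        if len(dims) == 1:
            return list(values)
        step = len(values) // dims[0] if dims[0] else 0
        return [build(values[i * step:(i + 1) * step], dims[1:])
                for i in range(dims[0])]

    return build(flat, list(shape))


print(reshape([1, 2, 3, 4, 5, 6, 7, 8, 9], [3, 3]))
print(reshape([[[1, 1], [2, 2]], [[3, 3], [4, 4]]], [2, 4]))
print(reshape([[1, 2], [3, 4]], [-1]))
```

Because the values are always read and written in row-major order, reshaping never reorders elements; it only changes how they are grouped.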
- - - ### `tf.squeeze(input, squeeze_dims=None, name=None)` Removes dimensions of size 1 from the shape of a tensor. Given a tensor `input`, this operation returns a tensor of the same type with all dimensions of size 1 removed. If you don't want to remove all size 1 dimensions, you can remove specific size 1 dimensions by specifying `squeeze_dims`. For example: ```prettyprint # 't' is a tensor of shape [1, 2, 1, 3, 1, 1] shape(squeeze(t)) ==> [2, 3] ``` Or, to remove specific size 1 dimensions: ```prettyprint # 't' is a tensor of shape [1, 2, 1, 3, 1, 1] shape(squeeze(t, [2, 4])) ==> [1, 2, 3, 1] ``` ##### Args: * `input`: A `Tensor`. The `input` to squeeze. * `squeeze_dims`: An optional list of `ints`. Defaults to `[]`. If specified, only squeezes the dimensions listed. The dimension index starts at 0. It is an error to squeeze a dimension that is not 1. * `name`: A name for the operation (optional). ##### Returns: A `Tensor`. Has the same type as `input`. Contains the same data as `input`, but has one or more dimensions of size 1 removed. - - - ### `tf.expand_dims(input, dim, name=None)` Inserts a dimension of 1 into a tensor's shape. Given a tensor `input`, this operation inserts a dimension of 1 at the dimension index `dim` of `input`'s shape. The dimension index `dim` starts at zero; if you specify a negative number for `dim` it is counted backward from the end. This operation is useful if you want to add a batch dimension to a single element. For example, if you have a single image of shape `[height, width, channels]`, you can make it a batch of 1 image with `expand_dims(image, 0)`, which will make the shape `[1, height, width, channels]`. 
Other examples: ```prettyprint # 't' is a tensor of shape [2] shape(expand_dims(t, 0)) ==> [1, 2] shape(expand_dims(t, 1)) ==> [2, 1] shape(expand_dims(t, -1)) ==> [2, 1] # 't2' is a tensor of shape [2, 3, 5] shape(expand_dims(t2, 0)) ==> [1, 2, 3, 5] shape(expand_dims(t2, 2)) ==> [2, 3, 1, 5] shape(expand_dims(t2, 3)) ==> [2, 3, 5, 1] ``` This operation requires that: `-1-input.dims() <= dim <= input.dims()` This operation is related to `squeeze()`, which removes dimensions of size 1. ##### Args: * `input`: A `Tensor`. * `dim`: A `Tensor` of type `int32`. 0-D (scalar). Specifies the dimension index at which to expand the shape of `input`. * `name`: A name for the operation (optional). ##### Returns: A `Tensor`. Has the same type as `input`. Contains the same data as `input`, but its shape has an additional dimension of size 1 added. ## Slicing and Joining TensorFlow provides several operations to slice or extract parts of a tensor, or join multiple tensors together. - - - ### `tf.slice(input_, begin, size, name=None)` Extracts a slice from a tensor. This operation extracts a slice of size `size` from a tensor `input` starting at the location specified by `begin`. The slice `size` is represented as a tensor shape, where `size[i]` is the number of elements of the 'i'th dimension of `input` that you want to slice. The starting location (`begin`) for the slice is represented as an offset in each dimension of `input`. In other words, `begin[i]` is the offset into the 'i'th dimension of `input` that you want to slice from. `begin` is zero-based; `size` is one-based. If `size[i]` is -1, all remaining elements in dimension i are included in the slice. 
In other words, this is equivalent to setting:

`size[i] = input.dim_size(i) - begin[i]`

This operation requires that:

`0 <= begin[i] <= begin[i] + size[i] <= Di  for i in [0, n)`

For example:

```
# 'input' is [[[1, 1, 1], [2, 2, 2]],
#             [[3, 3, 3], [4, 4, 4]],
#             [[5, 5, 5], [6, 6, 6]]]
tf.slice(input, [1, 0, 0], [1, 1, 3]) ==> [[[3, 3, 3]]]
tf.slice(input, [1, 0, 0], [1, 2, 3]) ==> [[[3, 3, 3],
                                            [4, 4, 4]]]
tf.slice(input, [1, 0, 0], [2, 1, 3]) ==> [[[3, 3, 3]],
                                           [[5, 5, 5]]]
```

##### Args:

* `input_`: A `Tensor`.
* `begin`: An `int32` or `int64` `Tensor`.
* `size`: An `int32` or `int64` `Tensor`.
* `name`: A name for the operation (optional).

##### Returns:

A `Tensor` of the same type as `input_`.

- - -

### `tf.split(split_dim, num_split, value, name='split')`

Splits a tensor into `num_split` tensors along one dimension.

Splits `value` along dimension `split_dim` into `num_split` smaller tensors. Requires that `num_split` evenly divide `value.shape[split_dim]`.

For example:

```python
# 'value' is a tensor with shape [5, 30]
# Split 'value' into 3 tensors along dimension 1
split0, split1, split2 = tf.split(1, 3, value)
tf.shape(split0) ==> [5, 10]
```

##### Args:

* `split_dim`: A 0-D `int32` `Tensor`. The dimension along which to split. Must be in the range `[0, rank(value))`.
* `num_split`: A 0-D `int32` `Tensor`. The number of ways to split.
* `value`: The `Tensor` to split.
* `name`: A name for the operation (optional).

##### Returns:

`num_split` `Tensor` objects resulting from splitting `value`.

- - -

### `tf.tile(input, multiples, name=None)`

Constructs a tensor by tiling a given tensor.

This operation creates a new tensor by replicating `input` `multiples` times. The output tensor's i'th dimension has `input.dims(i) * multiples[i]` elements, and the values of `input` are replicated `multiples[i]` times along the 'i'th dimension. For example, tiling `[a b c d]` by `[2]` produces `[a b c d a b c d]`.

##### Args:

* `input`: A `Tensor`. 1-D or higher.
* `multiples`: A `Tensor` of type `int32`. 1-D. Length must be the same as the number of dimensions in `input`.
* `name`: A name for the operation (optional).

##### Returns:

A `Tensor`. Has the same type as `input`.

- - -

### `tf.pad(input, paddings, name=None)`

Pads a tensor with zeros.

This operation pads `input` with zeros according to the `paddings` you specify. `paddings` is an integer tensor with shape `[n, 2]`, where n is the rank of `input`. For each dimension D of `input`, `paddings[D, 0]` indicates how many zeros to add before the contents of `input` in that dimension, and `paddings[D, 1]` indicates how many zeros to add after the contents of `input` in that dimension.

The padded size of each dimension D of the output is:

`paddings(D, 0) + input.dim_size(D) + paddings(D, 1)`

For example:

```prettyprint
# 't' is [[1, 1], [2, 2]]
# 'paddings' is [[1, 1], [2, 2]]
# rank of 't' is 2
pad(t, paddings) ==> [[0, 0, 0, 0, 0, 0]
                      [0, 0, 1, 1, 0, 0]
                      [0, 0, 2, 2, 0, 0]
                      [0, 0, 0, 0, 0, 0]]
```

##### Args:

* `input`: A `Tensor`.
* `paddings`: A `Tensor` of type `int32`.
* `name`: A name for the operation (optional).

##### Returns:

A `Tensor`. Has the same type as `input`.

- - -

### `tf.concat(concat_dim, values, name='concat')`

Concatenates tensors along one dimension.

Concatenates the list of tensors `values` along dimension `concat_dim`. If `values[i].shape = [D0, D1, ... Dconcat_dim(i), ...Dn]`, the concatenated result has shape

    [D0, D1, ... Rconcat_dim, ...Dn]

where

    Rconcat_dim = sum(Dconcat_dim(i))

That is, the data from the input tensors is joined along the `concat_dim` dimension.

The number of dimensions of the input tensors must match, and all dimensions except `concat_dim` must be equal.
For example: ```python t1 = [[1, 2, 3], [4, 5, 6]] t2 = [[7, 8, 9], [10, 11, 12]] tf.concat(0, [t1, t2]) ==> [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]] tf.concat(1, [t1, t2]) ==> [[1, 2, 3, 7, 8, 9], [4, 5, 6, 10, 11, 12]] # tensor t3 with shape [2, 3] # tensor t4 with shape [2, 3] tf.shape(tf.concat(0, [t3, t4])) ==> [4, 3] tf.shape(tf.concat(1, [t3, t4])) ==> [2, 6] ``` ##### Args: * `concat_dim`: 0-D `int32` `Tensor`. Dimension along which to concatenate. * `values`: A list of `Tensor` objects or a single `Tensor`. * `name`: A name for the operation (optional). ##### Returns: A `Tensor` resulting from concatenation of the input tensors. - - - ### `tf.pack(values, name='pack')` Packs a list of rank-`R` tensors into one rank-`(R+1)` tensor. Packs tensors in `values` into a tensor with rank one higher than each tensor in `values` and shape `[len(values)] + values[0].shape`. The output satisfies `output[i, ...] = values[i][...]`. This is the opposite of unpack. The numpy equivalent is tf.pack([x, y, z]) = np.asarray([x, y, z]) ##### Args: * `values`: A list of `Tensor` objects with the same shape and type. * `name`: A name for this operation (optional). ##### Returns: * `output`: A packed `Tensor` with the same type as `values`. - - - ### `tf.unpack(value, num=None, name='unpack')` Unpacks the outer dimension of a rank-`R` tensor into rank-`(R-1)` tensors. Unpacks `num` tensors from `value` along the first dimension. If `num` is not specified (the default), it is inferred from `value`'s shape. If `value.shape[0]` is not known, `ValueError` is raised. The ith tensor in `output` is the slice `value[i, ...]`. Each tensor in `output` has shape `value.shape[1:]`. This is the opposite of pack. The numpy equivalent is tf.unpack(x, n) = list(x) ##### Args: * `value`: A rank `R > 0` `Tensor` to be unpacked. * `num`: An `int`. The first dimension of value. Automatically inferred if `None` (the default). * `name`: A name for the operation (optional). 
##### Returns:

The list of `Tensor` objects unpacked from `value`.

##### Raises:

* `ValueError`: If `num` is unspecified and cannot be inferred.

- - -

### `tf.reverse_sequence(input, seq_lengths, seq_dim, name=None)`

Reverses variable length slices in dimension `seq_dim`.

This op first slices `input` along the first dimension, and for each slice `i`, reverses the first `seq_lengths[i]` elements along the dimension `seq_dim`.

The elements of `seq_lengths` must obey `seq_lengths[i] < input.dims(seq_dim)`, and `seq_lengths` must be a vector of length `input.dims(0)`.

The output slice `i` along dimension 0 is then given by input slice `i`, with the first `seq_lengths[i]` slices along dimension `seq_dim` reversed.

For example:

```prettyprint
# Given this:
seq_dim = 1
input.dims = (4, ...)
seq_lengths = [7, 2, 3, 5]

# then slices of input are reversed on seq_dim, but only up to seq_lengths:
output[0, 0:7, :, ...] = input[0, 7:0:-1, :, ...]
output[1, 0:2, :, ...] = input[1, 2:0:-1, :, ...]
output[2, 0:3, :, ...] = input[2, 3:0:-1, :, ...]
output[3, 0:5, :, ...] = input[3, 5:0:-1, :, ...]

# while entries past seq_lengths are copied through:
output[0, 7:, :, ...] = input[0, 7:, :, ...]
output[1, 2:, :, ...] = input[1, 2:, :, ...]
output[2, 3:, :, ...] = input[2, 3:, :, ...]
output[3, 5:, :, ...] = input[3, 5:, :, ...]
```

##### Args:

* `input`: A `Tensor`. The input to reverse.
* `seq_lengths`: A `Tensor` of type `int64`. 1-D with length `input.dims(0)` and `max(seq_lengths) < input.dims(seq_dim)`.
* `seq_dim`: An `int`. The dimension which is partially reversed.
* `name`: A name for the operation (optional).

##### Returns:

A `Tensor`. Has the same type as `input`. The partially reversed input. It has the same shape as `input`.

- - -

### `tf.reverse(tensor, dims, name=None)`

Reverses specific dimensions of a tensor.
Given a `tensor`, and a `bool` tensor `dims` representing the dimensions of `tensor`, this operation reverses each dimension i of `tensor` where `dims[i]` is `True`.

`tensor` can have up to 8 dimensions. The number of dimensions of `tensor` must equal the number of elements in `dims`. In other words:

`rank(tensor) = size(dims)`

For example:

```prettyprint
# tensor 't' is [[[[ 0,  1,  2,  3],
#                  [ 4,  5,  6,  7],
#                  [ 8,  9, 10, 11]],
#                 [[12, 13, 14, 15],
#                  [16, 17, 18, 19],
#                  [20, 21, 22, 23]]]]
# tensor 't' shape is [1, 2, 3, 4]

# 'dims' is [False, False, False, True]
reverse(t, dims) ==> [[[[ 3,  2,  1,  0],
                        [ 7,  6,  5,  4],
                        [11, 10,  9,  8]],
                       [[15, 14, 13, 12],
                        [19, 18, 17, 16],
                        [23, 22, 21, 20]]]]

# 'dims' is [False, True, False, False]
reverse(t, dims) ==> [[[[12, 13, 14, 15],
                        [16, 17, 18, 19],
                        [20, 21, 22, 23]],
                       [[ 0,  1,  2,  3],
                        [ 4,  5,  6,  7],
                        [ 8,  9, 10, 11]]]]

# 'dims' is [False, False, True, False]
reverse(t, dims) ==> [[[[ 8,  9, 10, 11],
                        [ 4,  5,  6,  7],
                        [ 0,  1,  2,  3]],
                       [[20, 21, 22, 23],
                        [16, 17, 18, 19],
                        [12, 13, 14, 15]]]]
```

##### Args:

* `tensor`: A `Tensor`. Must be one of the following types: `uint8`, `int8`, `int32`, `bool`, `float32`, `float64`. Up to 8-D.
* `dims`: A `Tensor` of type `bool`. 1-D. The dimensions to reverse.
* `name`: A name for the operation (optional).

##### Returns:

A `Tensor`. Has the same type as `tensor`. The same shape as `tensor`.

- - -

### `tf.transpose(a, perm=None, name='transpose')`

Transposes `a`. Permutes the dimensions according to `perm`.

The returned tensor's dimension i will correspond to the input dimension `perm[i]`. If `perm` is not given, it is set to (n-1...0), where n is the rank of the input tensor. Hence by default, this operation performs a regular matrix transpose on 2-D input Tensors.
For example:

```python
# 'x' is [[1 2 3]
#         [4 5 6]]
tf.transpose(x) ==> [[1 4]
                     [2 5]
                     [3 6]]

# Equivalently
tf.transpose(x, perm=[1, 0]) ==> [[1 4]
                                  [2 5]
                                  [3 6]]

# 'perm' is more useful for n-dimensional tensors, for n > 2
# 'x' is   [[[1  2  3]
#            [4  5  6]]
#           [[7  8  9]
#            [10 11 12]]]
# Take the transpose of the matrices in dimension-0
tf.transpose(x, perm=[0, 2, 1]) ==> [[[1  4]
                                      [2  5]
                                      [3  6]]

                                     [[7 10]
                                      [8 11]
                                      [9 12]]]
```

##### Args:

* `a`: A `Tensor`.
* `perm`: A permutation of the dimensions of `a`.
* `name`: A name for the operation (optional).

##### Returns:

A transposed `Tensor`.

- - -

### `tf.gather(params, indices, name=None)`

Gather slices from `params` according to `indices`.

`indices` must be an integer tensor of any dimension (usually 0-D or 1-D). Produces an output tensor with shape `indices.shape + params.shape[1:]` where:

    # Scalar indices
    output[:, ..., :] = params[indices, :, ... :]

    # Vector indices
    output[i, :, ..., :] = params[indices[i], :, ... :]

    # Higher rank indices
    output[i, ..., j, :, ... :] = params[indices[i, ..., j], :, ..., :]

If `indices` is a permutation and `len(indices) == params.shape[0]` then this operation will permute `params` accordingly.
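
The indexing rules for gather can be read as "replace every integer in `indices` with the matching slice of `params`". The following plain-Python sketch models that reading on nested lists. It is an illustration of the documented semantics only, not the TensorFlow kernel, and the function name `gather` here is a stand-in defined for this example.

```python
def gather(params, indices):
    """Model of gather on nested lists: each integer index selects params[index]."""
    if isinstance(indices, list):
        # Recurse so the result keeps the nesting (shape) of `indices`.
        return [gather(params, index) for index in indices]
    return params[indices]


params = [[10, 11], [20, 21], [30, 31]]  # shape [3, 2]

print(gather(params, 2))                 # scalar index, result shape [2]
print(gather(params, [2, 0]))            # vector indices, result shape [2, 2]
print(gather(params, [[0, 1], [2, 2]]))  # rank-2 indices, result shape [2, 2, 2]
print(gather(params, [2, 0, 1]))         # a permutation reorders the rows
```

In each call the result shape is the shape of `indices` followed by `params.shape[1:]`, which matches the rule stated above.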
##### Args: * `params`: A `Tensor`. * `indices`: A `Tensor`. Must be one of the following types: `int32`, `int64`. * `name`: A name for the operation (optional). ##### Returns: A `Tensor`. Has the same type as `params`. - - - ### `tf.dynamic_partition(data, partitions, num_partitions, name=None)` Partitions `data` into `num_partitions` tensors using indices from `partitions`. For each index tuple `js` of size `partitions.ndim`, the slice `data[js, ...]` becomes part of `outputs[partitions[js]]`. The slices with `partitions[js] = i` are placed in `outputs[i]` in lexicographic order of `js`, and the first dimension of `outputs[i]` is the number of entries in `partitions` equal to `i`. In detail, outputs[i].shape = [sum(partitions == i)] + data.shape[partitions.ndim:] outputs[i] = pack([data[js, ...] for js if partitions[js] == i]) `data.shape` must start with `partitions.shape`. For example: # Scalar partitions partitions = 1 num_partitions = 2 data = [10, 20] outputs[0] = [] # Empty with shape [0, 2] outputs[1] = [[10, 20]] # Vector partitions partitions = [0, 0, 1, 1, 0] num_partitions = 2 data = [10, 20, 30, 40, 50] outputs[0] = [10, 20, 50] outputs[1] = [30, 40]
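
The two examples above can be reproduced with a short plain-Python model. This is a sketch of the documented behaviour on nested lists, not the TensorFlow implementation; the function below is a stand-in written for this illustration.

```python
def dynamic_partition(data, partitions, num_partitions):
    """Model of dynamic_partition: route each slice of `data` to the output
    list named by the matching entry of `partitions`."""
    outputs = [[] for _ in range(num_partitions)]

    def visit(data_part, partition_part):
        if isinstance(partition_part, list):
            # Walk both structures together; row-major traversal gives the
            # lexicographic order of index tuples described above.
            for d, p in zip(data_part, partition_part):
                visit(d, p)
        else:
            outputs[partition_part].append(data_part)

    visit(data, partitions)
    return outputs


# Scalar partitions: the whole of `data` goes to output 1.
print(dynamic_partition([10, 20], 1, 2))

# Vector partitions: elements are routed one by one.
print(dynamic_partition([10, 20, 30, 40, 50], [0, 0, 1, 1, 0], 2))
```

Note that in this list-based model the empty output is simply `[]`; the shape information (`[0, 2]` in the scalar example) is something only a real tensor carries.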
##### Args: * `data`: A `Tensor`. * `partitions`: A `Tensor` of type `int32`. Any shape. Indices in the range `[0, num_partitions)`. * `num_partitions`: An `int` that is `>= 1`. The number of partitions to output. * `name`: A name for the operation (optional). ##### Returns: A list of `num_partitions` `Tensor` objects of the same type as data. - - - ### `tf.dynamic_stitch(indices, data, name=None)` Interleave the values from the `data` tensors into a single tensor. Builds a merged tensor such that merged[indices[m][i, ..., j], ...] = data[m][i, ..., j, ...] For example, if each `indices[m]` is scalar or vector, we have # Scalar indices merged[indices[m], ...] = data[m][...] # Vector indices merged[indices[m][i], ...] = data[m][i, ...] Each `data[i].shape` must start with the corresponding `indices[i].shape`, and the rest of `data[i].shape` must be constant w.r.t. `i`. That is, we must have `data[i].shape = indices[i].shape + constant`. In terms of this `constant`, the output shape is merged.shape = [max(indices) + 1] + constant Values are merged in order, so if an index appears in both `indices[m][i]` and `indices[n][j]` for `(m,i) < (n,j)` the slice `data[n][j]` will appear in the merged result. For example: indices[0] = 6 indices[1] = [4, 1] indices[2] = [[5, 2], [0, 3]] data[0] = [61, 62] data[1] = [[41, 42], [11, 12]] data[2] = [[[51, 52], [21, 22]], [[1, 2], [31, 32]]] merged = [[1, 2], [11, 12], [21, 22], [31, 32], [41, 42], [51, 52], [61, 62]]
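The merge rule can be checked against the example above with a small pure-Python model. This is a sketch of the semantics on nested lists, not TensorFlow's implementation; the helper names are hypothetical. Note that the merged result needs one row per index from 0 up to the largest index, so the example with largest index 6 yields seven rows.

```python
def _pairs(indices, data):
    """Yield (index, slice) pairs, descending through the dims of `indices`."""
    if isinstance(indices, int):
        yield indices, data
    else:
        for idx, item in zip(indices, data):
            for pair in _pairs(idx, item):
                yield pair


def dynamic_stitch(indices, data):
    """Model of the stitching rule on nested Python lists."""
    pairs = [p for idx, d in zip(indices, data) for p in _pairs(idx, d)]
    merged = [None] * (max(i for i, _ in pairs) + 1)
    for i, value in pairs:
        merged[i] = value  # later pairs overwrite earlier ones
    return merged


indices = [6, [4, 1], [[5, 2], [0, 3]]]
data = [[61, 62],
        [[41, 42], [11, 12]],
        [[[51, 52], [21, 22]], [[1, 2], [31, 32]]]]
merged = dynamic_stitch(indices, data)
print(merged)
```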
##### Args: * `indices`: A list of at least 2 `Tensor` objects of type `int32`. * `data`: A list with the same number of `Tensor` objects as `indices` of `Tensor` objects of the same type. * `name`: A name for the operation (optional). ##### Returns: A `Tensor`. Has the same type as `data`. ================================================ FILE: SOURCE/api_docs/python/client.md ================================================ # Running Graphs ## Contents ### [Running Graphs](#AUTOGENERATED-running-graphs) * [Session management](#AUTOGENERATED-session-management) * [`class tf.Session`](#Session) * [`class tf.InteractiveSession`](#InteractiveSession) * [`tf.get_default_session()`](#get_default_session) * [Error classes](#AUTOGENERATED-error-classes) * [`class tf.OpError`](#OpError) * [`class tf.errors.CancelledError`](#CancelledError) * [`class tf.errors.UnknownError`](#UnknownError) * [`class tf.errors.InvalidArgumentError`](#InvalidArgumentError) * [`class tf.errors.DeadlineExceededError`](#DeadlineExceededError) * [`class tf.errors.NotFoundError`](#NotFoundError) * [`class tf.errors.AlreadyExistsError`](#AlreadyExistsError) * [`class tf.errors.PermissionDeniedError`](#PermissionDeniedError) * [`class tf.errors.UnauthenticatedError`](#UnauthenticatedError) * [`class tf.errors.ResourceExhaustedError`](#ResourceExhaustedError) * [`class tf.errors.FailedPreconditionError`](#FailedPreconditionError) * [`class tf.errors.AbortedError`](#AbortedError) * [`class tf.errors.OutOfRangeError`](#OutOfRangeError) * [`class tf.errors.UnimplementedError`](#UnimplementedError) * [`class tf.errors.InternalError`](#InternalError) * [`class tf.errors.UnavailableError`](#UnavailableError) * [`class tf.errors.DataLossError`](#DataLossError) This library contains classes for launching graphs and executing operations. The [basic usage](../../get_started/introduction.md#basic-usage) guide has examples of how a graph is launched in a [`tf.Session`](#Session). 
## Session management - - - ### `class tf.Session` A class for running TensorFlow operations. A `Session` object encapsulates the environment in which `Operation` objects are executed, and `Tensor` objects are evaluated. For example: ```python # Build a graph. a = tf.constant(5.0) b = tf.constant(6.0) c = a * b # Launch the graph in a session. sess = tf.Session() # Evaluate the tensor `c`. print sess.run(c) ``` A session may own resources, such as [variables](../../api_docs/python/state_ops.md#Variable), [queues](../../api_docs/python/io_ops.md#QueueBase), and [readers](../../api_docs/python/io_ops.md#ReaderBase). It is important to release these resources when they are no longer required. To do this, either invoke the [`close()`](#Session.close) method on the session, or use the session as a context manager. The following two examples are equivalent: ```python # Using the `close()` method. sess = tf.Session() sess.run(...) sess.close() # Using the context manager. with tf.Session() as sess: sess.run(...) ``` The [`ConfigProto`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/config.proto) protocol buffer exposes various configuration options for a session. For example, to create a session that uses soft constraints for device placement, and logs the resulting placement decisions, create a session as follows: ```python # Launch the graph in a session that allows soft device placement and # logs the placement decisions. sess = tf.Session(config=tf.ConfigProto(allow_soft_placement=True, log_device_placement=True)) ``` - - - #### `tf.Session.__init__(target='', graph=None, config=None)` Creates a new TensorFlow session. If no `graph` argument is specified when constructing the session, the default graph will be launched in the session. 
If you are using more than one graph (created with `tf.Graph()`) in the same process, you will have to use different sessions for each graph, but each graph can be used in multiple sessions.
In this case, it is often clearer to pass the graph to be launched explicitly to the session constructor. ##### Args: * `target`: (Optional.) The execution engine to connect to. Defaults to using an in-process engine. At present, no value other than the empty string is supported. * `graph`: (Optional.) The `Graph` to be launched (described above). * `config`: (Optional.) A [`ConfigProto`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/config.proto) protocol buffer with configuration options for the session. - - - #### `tf.Session.run(fetches, feed_dict=None)` Runs the operations and evaluates the tensors in `fetches`. This method runs one "step" of TensorFlow computation, by running the necessary graph fragment to execute every `Operation` and evaluate every `Tensor` in `fetches`, substituting the values in `feed_dict` for the corresponding input values. The `fetches` argument may be a list of graph elements or a single graph element, and these determine the return value of this method. A graph element can be one of the following types: * If the *i*th element of `fetches` is an [`Operation`](../../api_docs/python/framework.md#Operation), the *i*th return value will be `None`. * If the *i*th element of `fetches` is a [`Tensor`](../../api_docs/python/framework.md#Tensor), the *i*th return value will be a numpy ndarray containing the value of that tensor. * If the *i*th element of `fetches` is a [`SparseTensor`](../../api_docs/python/sparse_ops.md#SparseTensor), the *i*th return value will be a [`SparseTensorValue`](../../api_docs/python/sparse_ops.md#SparseTensorValue) containing the value of that sparse tensor. The optional `feed_dict` argument allows the caller to override the value of tensors in the graph. 
Each key in `feed_dict` can be one of the following types: * If the key is a [`Tensor`](../../api_docs/python/framework.md#Tensor), the value may be a Python scalar, string, list, or numpy ndarray that can be converted to the same `dtype` as that tensor. Additionally, if the key is a [placeholder](../../api_docs/python/io_ops.md#placeholder), the shape of the value will be checked for compatibility with the placeholder. * If the key is a [`SparseTensor`](../../api_docs/python/sparse_ops.md#SparseTensor), the value should be a [`SparseTensorValue`](../../api_docs/python/sparse_ops.md#SparseTensorValue). ##### Args: * `fetches`: A single graph element, or a list of graph elements (described above). * `feed_dict`: A dictionary that maps graph elements to values (described above). ##### Returns: Either a single value if `fetches` is a single graph element, or a list of values if `fetches` is a list (described above). ##### Raises: * `RuntimeError`: If this `Session` is in an invalid state (e.g. has been closed). * `TypeError`: If `fetches` or `feed_dict` keys are of an inappropriate type. * `ValueError`: If `fetches` or `feed_dict` keys are invalid or refer to a `Tensor` that doesn't exist. - - - #### `tf.Session.close()` Closes this session. Calling this method frees all resources associated with the session. ##### Raises: * `RuntimeError`: If an error occurs while closing the session. - - - #### `tf.Session.graph` The graph that was launched in this session. - - - #### `tf.Session.as_default()` Returns a context manager that makes this object the default session. Use with the `with` keyword to specify that calls to [`Operation.run()`](../../api_docs/python/framework.md#Operation.run) or [`Tensor.eval()`](../../api_docs/python/framework.md#Tensor.eval) should be executed in this session. ```python c = tf.constant(...)
sess = tf.Session() with sess.as_default(): assert tf.get_default_session() is sess print c.eval() ``` To get the current default session, use [`tf.get_default_session()`](#get_default_session). *N.B.* The `as_default` context manager *does not* close the session when you exit the context, and you must close the session explicitly. ```python c = tf.constant(...) sess = tf.Session() with sess.as_default(): print c.eval() # ... with sess.as_default(): print c.eval() sess.close() ``` Alternatively, you can use `with tf.Session():` to create a session that is automatically closed on exiting the context, including when an uncaught exception is raised. *N.B.* The default session is a property of the current thread. If you create a new thread, and wish to use the default session in that thread, you must explicitly add a `with sess.as_default():` in that thread's function. ##### Returns: A context manager using this session as the default session. - - - ### `class tf.InteractiveSession` A TensorFlow `Session` for use in interactive contexts, such as a shell. The only difference with a regular `Session` is that an `InteractiveSession` installs itself as the default session on construction. The methods [`Tensor.eval()`](../../api_docs/python/framework.md#Tensor.eval) and [`Operation.run()`](../../api_docs/python/framework.md#Operation.run) will use that session to run ops. This is convenient in interactive shells and [IPython notebooks](http://ipython.org), as it avoids having to pass an explicit `Session` object to run ops. For example: ```python sess = tf.InteractiveSession() a = tf.constant(5.0) b = tf.constant(6.0) c = a * b # We can just use 'c.eval()' without passing 'sess' print c.eval() sess.close() ``` Note that a regular session installs itself as the default session when it is created in a `with` statement.
The common usage in non-interactive programs is to follow that pattern: ```python a = tf.constant(5.0) b = tf.constant(6.0) c = a * b with tf.Session(): # We can also use 'c.eval()' here. print c.eval() ``` - - - #### `tf.InteractiveSession.__init__(target='', graph=None)` Creates a new interactive TensorFlow session. If no `graph` argument is specified when constructing the session, the default graph will be launched in the session. If you are using more than one graph (created with `tf.Graph()`) in the same process, you will have to use different sessions for each graph, but each graph can be used in multiple sessions. In this case, it is often clearer to pass the graph to be launched explicitly to the session constructor. ##### Args: * `target`: (Optional.) The execution engine to connect to. Defaults to using an in-process engine. At present, no value other than the empty string is supported. * `graph`: (Optional.) The `Graph` to be launched (described above). - - - #### `tf.InteractiveSession.close()` Closes an `InteractiveSession`. - - - ### `tf.get_default_session()` Returns the default session for the current thread. The returned `Session` will be the innermost session on which a `Session` or `Session.as_default()` context has been entered. *N.B.* The default session is a property of the current thread. If you create a new thread, and wish to use the default session in that thread, you must explicitly add a `with sess.as_default():` in that thread's function. ##### Returns: The default `Session` being used in the current thread. ## Error classes - - - ### `class tf.OpError` A generic error that is raised when TensorFlow execution fails. Whenever possible, the session will raise a more specific subclass of `OpError` from the `tf.errors` module. - - - #### `tf.OpError.op` The operation that failed, if known. *N.B.* If the failed op was synthesized at runtime, e.g.
a `Send` or `Recv` op, there will be no corresponding [`Operation`](../../api_docs/python/framework.md#Operation) object. In that case, this will return `None`, and you should instead use the [`OpError.node_def`](#OpError.node_def) to discover information about the op. ##### Returns: The `Operation` that failed, or None. - - - #### `tf.OpError.node_def` The `NodeDef` proto representing the op that failed. #### Other Methods - - - #### `tf.OpError.__init__(node_def, op, message, error_code)` Creates a new OpError indicating that a particular op failed. ##### Args: * `node_def`: The graph_pb2.NodeDef proto representing the op that failed. * `op`: The ops.Operation that failed, if known; otherwise None. * `message`: The message string describing the failure. * `error_code`: The error_codes_pb2.Code describing the error. - - - #### `tf.OpError.error_code` The integer error code that describes the error. - - - #### `tf.OpError.message` The error message that describes the error. - - - ### `class tf.errors.CancelledError` Raised when an operation or step is cancelled. For example, a long-running operation (e.g. [`queue.enqueue()`](../../api_docs/python/io_ops.md#QueueBase.enqueue)) may be cancelled by running another operation (e.g. [`queue.close(cancel_pending_enqueues=True)`](../../api_docs/python/io_ops.md#QueueBase.close)), or by [closing the session](../../api_docs/python/client.md#Session.close). A step that is running such a long-running operation will fail by raising `CancelledError`. - - - #### `tf.errors.CancelledError.__init__(node_def, op, message)` Creates a `CancelledError`. - - - ### `class tf.errors.UnknownError` Unknown error. An example of where this error may be returned is if a Status value received from another address space belongs to an error-space that is not known to this address space. Also errors raised by APIs that do not return enough error information may be converted to this error.
- - - #### `tf.errors.UnknownError.__init__(node_def, op, message, error_code=2)` Creates an `UnknownError`. - - - ### `class tf.errors.InvalidArgumentError` Raised when an operation receives an invalid argument. This may occur, for example, if an operation receives an input tensor that has an invalid value or shape. For example, the [`tf.matmul()`](../../api_docs/python/math_ops.md#matmul) op will raise this error if it receives an input that is not a matrix, and the [`tf.reshape()`](../../api_docs/python/array_ops.md#reshape) op will raise this error if the new shape does not match the number of elements in the input tensor. - - - #### `tf.errors.InvalidArgumentError.__init__(node_def, op, message)` Creates an `InvalidArgumentError`. - - - ### `class tf.errors.DeadlineExceededError` Raised when a deadline expires before an operation could complete. This exception is not currently used. - - - #### `tf.errors.DeadlineExceededError.__init__(node_def, op, message)` Creates a `DeadlineExceededError`. - - - ### `class tf.errors.NotFoundError` Raised when a requested entity (e.g., a file or directory) was not found. For example, running the [`tf.WholeFileReader.read()`](../../api_docs/python/io_ops.md#WholeFileReader) operation could raise `NotFoundError` if it receives the name of a file that does not exist. - - - #### `tf.errors.NotFoundError.__init__(node_def, op, message)` Creates a `NotFoundError`. - - - ### `class tf.errors.AlreadyExistsError` Raised when an entity that we attempted to create already exists. For example, running an operation that saves a file (e.g. [`tf.train.Saver.save()`](../../api_docs/python/train.md#Saver.save)) could potentially raise this exception if an explicit filename for an existing file was passed. - - - #### `tf.errors.AlreadyExistsError.__init__(node_def, op, message)` Creates an `AlreadyExistsError`. - - - ### `class tf.errors.PermissionDeniedError` Raised when the caller does not have permission to run an operation.
For example, running the [`tf.WholeFileReader.read()`](../../api_docs/python/io_ops.md#WholeFileReader) operation could raise `PermissionDeniedError` if it receives the name of a file for which the user does not have the read file permission. - - - #### `tf.errors.PermissionDeniedError.__init__(node_def, op, message)` Creates a `PermissionDeniedError`. - - - ### `class tf.errors.UnauthenticatedError` The request does not have valid authentication credentials. This exception is not currently used. - - - #### `tf.errors.UnauthenticatedError.__init__(node_def, op, message)` Creates an `UnauthenticatedError`. - - - ### `class tf.errors.ResourceExhaustedError` Some resource has been exhausted. For example, this error might be raised if a per-user quota is exhausted, or perhaps the entire file system is out of space. - - - #### `tf.errors.ResourceExhaustedError.__init__(node_def, op, message)` Creates a `ResourceExhaustedError`. - - - ### `class tf.errors.FailedPreconditionError` Operation was rejected because the system is not in a state to execute it. This exception is most commonly raised when running an operation that reads a [`tf.Variable`](../../api_docs/python/state_ops.md#Variable) before it has been initialized. - - - #### `tf.errors.FailedPreconditionError.__init__(node_def, op, message)` Creates a `FailedPreconditionError`. - - - ### `class tf.errors.AbortedError` The operation was aborted, typically due to a concurrent action. For example, running a [`queue.enqueue()`](../../api_docs/python/io_ops.md#QueueBase.enqueue) operation may raise `AbortedError` if a [`queue.close()`](../../api_docs/python/io_ops.md#QueueBase.close) operation previously ran. - - - #### `tf.errors.AbortedError.__init__(node_def, op, message)` Creates an `AbortedError`. - - - ### `class tf.errors.OutOfRangeError` Raised when an operation executed past the valid range. 
This exception is raised in "end-of-file" conditions, such as when a [`queue.dequeue()`](../../api_docs/python/io_ops.md#QueueBase.dequeue) operation is blocked on an empty queue, and a [`queue.close()`](../../api_docs/python/io_ops.md#QueueBase.close) operation executes. - - - #### `tf.errors.OutOfRangeError.__init__(node_def, op, message)` Creates an `OutOfRangeError`. - - - ### `class tf.errors.UnimplementedError` Raised when an operation has not been implemented. Some operations may raise this error when passed otherwise-valid arguments that they do not currently support. For example, running the [`tf.nn.max_pool()`](../../api_docs/python/nn.md#max_pool) operation would raise this error if pooling was requested on the batch dimension, because this is not yet supported. - - - #### `tf.errors.UnimplementedError.__init__(node_def, op, message)` Creates an `UnimplementedError`. - - - ### `class tf.errors.InternalError` Raised when the system experiences an internal error. This exception is raised when some invariant expected by the runtime has been broken. Catching this exception is not recommended. - - - #### `tf.errors.InternalError.__init__(node_def, op, message)` Creates an `InternalError`. - - - ### `class tf.errors.UnavailableError` Raised when the runtime is currently unavailable. This exception is not currently used. - - - #### `tf.errors.UnavailableError.__init__(node_def, op, message)` Creates an `UnavailableError`. - - - ### `class tf.errors.DataLossError` Raised when unrecoverable data loss or corruption is encountered. For example, this may be raised by running a [`tf.WholeFileReader.read()`](../../api_docs/python/io_ops.md#WholeFileReader) operation, if the file is truncated while it is being read. - - - #### `tf.errors.DataLossError.__init__(node_def, op, message)` Creates a `DataLossError`.
================================================ FILE: SOURCE/api_docs/python/constant_op.md ================================================ # Constants, Sequences, and Random Values Note: Functions taking `Tensor` arguments can also take anything accepted by [`tf.convert_to_tensor`](../../api_docs/python/framework.md#convert_to_tensor). ## Contents ### [Constants, Sequences, and Random Values](#AUTOGENERATED-constants--sequences--and-random-values) * [Constant Value Tensors](#AUTOGENERATED-constant-value-tensors) * [`tf.zeros(shape, dtype=tf.float32, name=None)`](#zeros) * [`tf.zeros_like(tensor, dtype=None, name=None)`](#zeros_like) * [`tf.ones(shape, dtype=tf.float32, name=None)`](#ones) * [`tf.ones_like(tensor, dtype=None, name=None)`](#ones_like) * [`tf.fill(dims, value, name=None)`](#fill) * [`tf.constant(value, dtype=None, shape=None, name='Const')`](#constant) * [Sequences](#AUTOGENERATED-sequences) * [`tf.linspace(start, stop, num, name=None)`](#linspace) * [`tf.range(start, limit, delta=1, name='range')`](#range) * [Random Tensors](#AUTOGENERATED-random-tensors) * [Examples:](#AUTOGENERATED-examples-) * [`tf.random_normal(shape, mean=0.0, stddev=1.0, dtype=tf.float32, seed=None, name=None)`](#random_normal) * [`tf.truncated_normal(shape, mean=0.0, stddev=1.0, dtype=tf.float32, seed=None, name=None)`](#truncated_normal) * [`tf.random_uniform(shape, minval=0.0, maxval=1.0, dtype=tf.float32, seed=None, name=None)`](#random_uniform) * [`tf.random_shuffle(value, seed=None, name=None)`](#random_shuffle) * [`tf.set_random_seed(seed)`](#set_random_seed) ## Constant Value Tensors TensorFlow provides several operations that you can use to generate constants. - - - ### `tf.zeros(shape, dtype=tf.float32, name=None)` Creates a tensor with all elements set to zero. This operation returns a tensor of type `dtype` with shape `shape` and all elements set to zero. 
For example: ```python tf.zeros([3, 4], tf.int32) ==> [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]] ``` ##### Args: * `shape`: Either a list of integers, or a 1-D `Tensor` of type `int32`. * `dtype`: The type of an element in the resulting `Tensor`. * `name`: A name for the operation (optional). ##### Returns: A `Tensor` with all elements set to zero. - - - ### `tf.zeros_like(tensor, dtype=None, name=None)` Creates a tensor with all elements set to zero. Given a single tensor (`tensor`), this operation returns a tensor of the same type and shape as `tensor` with all elements set to zero. Optionally, you can use `dtype` to specify a new type for the returned tensor. For example: ```python # 'tensor' is [[1, 2, 3], [4, 5, 6]] tf.zeros_like(tensor) ==> [[0, 0, 0], [0, 0, 0]] ``` ##### Args: * `tensor`: A `Tensor`. * `dtype`: A type for the returned `Tensor`. Must be `float32`, `float64`, `int8`, `int16`, `int32`, `int64`, `uint8`, or `complex64`. * `name`: A name for the operation (optional). ##### Returns: A `Tensor` with all elements set to zero. - - - ### `tf.ones(shape, dtype=tf.float32, name=None)` Creates a tensor with all elements set to 1. This operation returns a tensor of type `dtype` with shape `shape` and all elements set to 1. For example: ```python tf.ones([2, 3], tf.int32) ==> [[1, 1, 1], [1, 1, 1]] ``` ##### Args: * `shape`: Either a list of integers, or a 1-D `Tensor` of type `int32`. * `dtype`: The type of an element in the resulting `Tensor`. * `name`: A name for the operation (optional). ##### Returns: A `Tensor` with all elements set to 1. - - - ### `tf.ones_like(tensor, dtype=None, name=None)` Creates a tensor with all elements set to 1. Given a single tensor (`tensor`), this operation returns a tensor of the same type and shape as `tensor` with all elements set to 1. Optionally, you can specify a new type (`dtype`) for the returned tensor.
For example: ```python # 'tensor' is [[1, 2, 3], [4, 5, 6]] tf.ones_like(tensor) ==> [[1, 1, 1], [1, 1, 1]] ``` ##### Args: * `tensor`: A `Tensor`. * `dtype`: A type for the returned `Tensor`. Must be `float32`, `float64`, `int8`, `int16`, `int32`, `int64`, `uint8`, or `complex64`. * `name`: A name for the operation (optional). ##### Returns: A `Tensor` with all elements set to 1. - - - ### `tf.fill(dims, value, name=None)` Creates a tensor filled with a scalar value. This operation creates a tensor of shape `dims` and fills it with `value`. For example: ```prettyprint # output tensor shape needs to be [2, 3] # so 'dims' is [2, 3] fill(dims, 9) ==> [[9, 9, 9] [9, 9, 9]] ``` ##### Args: * `dims`: A `Tensor` of type `int32`. 1-D. Represents the shape of the output tensor. * `value`: A `Tensor`. 0-D (scalar). Value to fill the returned tensor. * `name`: A name for the operation (optional). ##### Returns: A `Tensor`. Has the same type as `value`. - - - ### `tf.constant(value, dtype=None, shape=None, name='Const')` Creates a constant tensor. The resulting tensor is populated with values of type `dtype`, as specified by arguments `value` and (optionally) `shape` (see examples below). The argument `value` can be a constant value, or a list of values of type `dtype`. If `value` is a list, then the length of the list must be less than or equal to the number of elements implied by the `shape` argument (if specified). In the case where the list length is less than the number of elements specified by `shape`, the last element in the list will be used to fill the remaining entries. The argument `shape` is optional. If present, it specifies the dimensions of the resulting tensor. If not present, then the tensor is a scalar (0-D) if `value` is a scalar, or 1-D otherwise. If the argument `dtype` is not specified, then the type is inferred from the type of `value`. For example: ```python # Constant 1-D Tensor populated with value list. 
tensor = tf.constant([1, 2, 3, 4, 5, 6, 7]) => [1 2 3 4 5 6 7] # Constant 2-D tensor populated with scalar value -1. tensor = tf.constant(-1.0, shape=[2, 3]) => [[-1. -1. -1.] [-1. -1. -1.]] ``` ##### Args: * `value`: A constant value (or list) of output type `dtype`. * `dtype`: The type of the elements of the resulting tensor. * `shape`: Optional dimensions of resulting tensor. * `name`: Optional name for the tensor. ##### Returns: A Constant Tensor. ## Sequences - - - ### `tf.linspace(start, stop, num, name=None)` Generates values in an interval. A sequence of `num` evenly-spaced values is generated beginning at `start`. If `num > 1`, the values in the sequence increase by `(stop - start) / (num - 1)`, so that the last one is exactly `stop`. For example: ``` tf.linspace(10.0, 12.0, 3, name="linspace") => [ 10.0 11.0 12.0] ``` ##### Args: * `start`: A `Tensor`. Must be one of the following types: `float32`, `float64`. First entry in the range. * `stop`: A `Tensor`. Must have the same type as `start`. Last entry in the range. * `num`: A `Tensor` of type `int32`. Number of values to generate. * `name`: A name for the operation (optional). ##### Returns: A `Tensor`. Has the same type as `start`. 1-D. The generated values. - - - ### `tf.range(start, limit, delta=1, name='range')` Creates a sequence of integers. This operation creates a sequence of integers that begins at `start` and extends by increments of `delta` up to but not including `limit`. For example: ``` # 'start' is 3 # 'limit' is 18 # 'delta' is 3 tf.range(start, limit, delta) ==> [3, 6, 9, 12, 15] ``` ##### Args: * `start`: A 0-D (scalar) of type `int32`. First entry in sequence. * `limit`: A 0-D (scalar) of type `int32`. Upper limit of sequence, exclusive. * `delta`: A 0-D `Tensor` (scalar) of type `int32`. Optional. Default is 1. Number that increments `start`. * `name`: A name for the operation (optional). ##### Returns: A 1-D `int32` `Tensor`.
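The arithmetic behind the two sequence ops can be sketched in plain Python. The helpers below (`linspace` and `int_range`, both hypothetical names) only model the spacing rules described in this section on Python lists; they are not the TensorFlow ops, and `int_range` assumes a positive `delta`.

```python
def linspace(start, stop, num):
    """`num` evenly spaced floats from `start` to `stop`, both included."""
    if num == 1:
        return [float(start)]
    step = (stop - start) / float(num - 1)
    values = [start + i * step for i in range(num)]
    values[-1] = float(stop)  # pin the end point against rounding drift
    return values


def int_range(start, limit, delta=1):
    """Integers from `start` in steps of `delta`, stopping before `limit`."""
    values = []
    current = start
    while current < limit:  # `limit` itself is excluded
        values.append(current)
        current += delta
    return values


print(linspace(10.0, 12.0, 3))
print(int_range(3, 18, 3))
```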
## Random Tensors TensorFlow has several ops that create random tensors with different distributions. The random ops are stateful, and create new random values each time they are evaluated. The `seed` keyword argument in these functions acts in conjunction with the graph-level random seed. Changing either the graph-level seed using [`set_random_seed`](../../api_docs/python/constant_op.md#set_random_seed) or the op-level seed will change the underlying seed of these operations. Setting neither the graph-level nor the op-level seed results in a random seed for all operations. See [`set_random_seed`](../../api_docs/python/constant_op.md#set_random_seed) for details on the interaction between operation-level and graph-level random seeds. ### Examples: ```python # Create a tensor of shape [2, 3] consisting of random normal values, with mean # -1 and standard deviation 4. norm = tf.random_normal([2, 3], mean=-1, stddev=4) # Shuffle the first dimension of a tensor c = tf.constant([[1, 2], [3, 4], [5, 6]]) shuff = tf.random_shuffle(c) # Each time we run these ops, different results are generated sess = tf.Session() print sess.run(norm) print sess.run(norm) # Set an op-level seed to generate repeatable sequences across sessions. sess = tf.Session() norm = tf.random_normal([2, 3], seed=1234) print sess.run(norm) print sess.run(norm) ``` Another common use of random values is the initialization of variables. Also see the [Variables How To](../../how_tos/variables/index.md). ```python # Use random uniform values in [0, 1) as the initializer for a variable of shape # [2, 3]. The default type is float32. var = tf.Variable(tf.random_uniform([2, 3]), name="var") init = tf.initialize_all_variables() sess = tf.Session() sess.run(init) print sess.run(var) ``` - - - ### `tf.random_normal(shape, mean=0.0, stddev=1.0, dtype=tf.float32, seed=None, name=None)` Outputs random values from a normal distribution.
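To make the roles of `shape`, `mean`, `stddev`, and `seed` concrete, here is a minimal pure-Python sketch built on the standard `random` module. It is only an analogy: the helper name is hypothetical, it returns nested lists rather than tensors, it will not reproduce TensorFlow's values, and each call here plays the part of a fresh session rather than a repeated `run` within one session.

```python
import random


def random_normal(shape, mean=0.0, stddev=1.0, seed=None):
    """Nested list of the given shape filled with Gaussian samples."""
    rng = random.Random(seed)  # seed=None picks an unpredictable seed

    def build(dims):
        if not dims:
            return rng.gauss(mean, stddev)
        return [build(dims[1:]) for _ in range(dims[0])]

    return build(list(shape))


# The same seed reproduces the same values; no seed generally does not.
first = random_normal([2, 3], mean=-1, stddev=4, seed=1234)
second = random_normal([2, 3], mean=-1, stddev=4, seed=1234)
print(first == second)
```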
##### Args: * `shape`: A 1-D integer Tensor or Python array. The shape of the output tensor. * `mean`: A 0-D Tensor or Python value of type `dtype`. The mean of the normal distribution. * `stddev`: A 0-D Tensor or Python value of type `dtype`. The standard deviation of the normal distribution. * `dtype`: The type of the output. * `seed`: A Python integer. Used to create a random seed for the distribution. See [`set_random_seed`](../../api_docs/python/constant_op.md#set_random_seed) for behavior. * `name`: A name for the operation (optional). ##### Returns: A tensor of the specified shape filled with random normal values. - - - ### `tf.truncated_normal(shape, mean=0.0, stddev=1.0, dtype=tf.float32, seed=None, name=None)` Outputs random values from a truncated normal distribution. The generated values follow a normal distribution with specified mean and standard deviation, except that values whose magnitude is more than 2 standard deviations from the mean are dropped and re-picked. ##### Args: * `shape`: A 1-D integer Tensor or Python array. The shape of the output tensor. * `mean`: A 0-D Tensor or Python value of type `dtype`. The mean of the truncated normal distribution. * `stddev`: A 0-D Tensor or Python value of type `dtype`. The standard deviation of the truncated normal distribution. * `dtype`: The type of the output. * `seed`: A Python integer. Used to create a random seed for the distribution. See [`set_random_seed`](../../api_docs/python/constant_op.md#set_random_seed) for behavior. * `name`: A name for the operation (optional). ##### Returns: A tensor of the specified shape filled with random truncated normal values. - - - ### `tf.random_uniform(shape, minval=0.0, maxval=1.0, dtype=tf.float32, seed=None, name=None)` Outputs random values from a uniform distribution. The generated values follow a uniform distribution in the range `[minval, maxval)`. The lower bound `minval` is included in the range, while the upper bound `maxval` is excluded. 
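The re-picking rule of `tf.truncated_normal` and the half-open range of `tf.random_uniform` can be sketched with the standard `random` module. This is an illustrative model under stated assumptions (scalar samples, hypothetical helper names, an optional `rng` argument that is not part of the TensorFlow signatures), not how TensorFlow generates its values.

```python
import random


def truncated_normal(mean=0.0, stddev=1.0, rng=random):
    """Draw one sample, re-picking until it lies within 2 stddev of the mean."""
    while True:
        value = rng.gauss(mean, stddev)
        if abs(value - mean) <= 2 * stddev:
            return value


def random_uniform(minval=0.0, maxval=1.0, rng=random):
    """Draw one sample; rng.random() is in [0, 1), so `maxval` is excluded
    (up to floating-point rounding for arbitrary bounds)."""
    return minval + (maxval - minval) * rng.random()


rng = random.Random(0)
normals = [truncated_normal(rng=rng) for _ in range(1000)]
uniforms = [random_uniform(rng=rng) for _ in range(1000)]
print(max(abs(v) for v in normals) <= 2.0)
print(all(0.0 <= u < 1.0 for u in uniforms))
```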
##### Args: * `shape`: A 1-D integer Tensor or Python array. The shape of the output tensor. * `minval`: A 0-D Tensor or Python value of type `dtype`. The lower bound on the range of random values to generate. * `maxval`: A 0-D Tensor or Python value of type `dtype`. The upper bound on the range of random values to generate. * `dtype`: The type of the output. * `seed`: A Python integer. Used to create a random seed for the distribution. See [`set_random_seed`](../../api_docs/python/constant_op.md#set_random_seed) for behavior. * `name`: A name for the operation (optional). ##### Returns: A tensor of the specified shape filled with random uniform values. - - - ### `tf.random_shuffle(value, seed=None, name=None)` Randomly shuffles a tensor along its first dimension. The tensor is shuffled along dimension 0, such that each `value[j]` is mapped to one and only one `output[i]`. For example, a mapping that might occur for a 3x2 tensor is: ```python [[1, 2], [[5, 6], [3, 4], ==> [1, 2], [5, 6]] [3, 4]] ``` ##### Args: * `value`: A Tensor to be shuffled. * `seed`: A Python integer. Used to create a random seed for the distribution. See [`set_random_seed`](../../api_docs/python/constant_op.md#set_random_seed) for behavior. * `name`: A name for the operation (optional). ##### Returns: A tensor of same shape and type as `value`, shuffled along its first dimension. - - - ### `tf.set_random_seed(seed)` Sets the graph-level random seed. Operations that rely on a random seed actually derive it from two seeds: the graph-level and operation-level seeds. This sets the graph-level seed. Its interactions with operation-level seeds is as follows: 1. If neither the graph-level nor the operation seed is set: A random seed is used for this op. 2. If the graph-level seed is set, but the operation seed is not: The system deterministically picks an operation seed in conjunction with the graph-level seed so that it gets a unique random sequence. 3. 
If the graph-level seed is not set, but the operation seed is set:
A default graph-level seed and the specified operation seed are used to
determine the random sequence.
4. If both the graph-level and the operation seed are set:
Both seeds are used in conjunction to determine the random sequence.

To illustrate the user-visible effects, consider these examples:

To generate different sequences across sessions, set neither
graph-level nor op-level seeds:

```python
a = tf.random_uniform([1])
b = tf.random_normal([1])

print "Session 1"
with tf.Session() as sess1:
  print sess1.run(a)  # generates 'A1'
  print sess1.run(a)  # generates 'A2'
  print sess1.run(b)  # generates 'B1'
  print sess1.run(b)  # generates 'B2'

print "Session 2"
with tf.Session() as sess2:
  print sess2.run(a)  # generates 'A3'
  print sess2.run(a)  # generates 'A4'
  print sess2.run(b)  # generates 'B3'
  print sess2.run(b)  # generates 'B4'
```

To generate the same repeatable sequence for an op across sessions, set the
seed for the op:

```python
a = tf.random_uniform([1], seed=1)
b = tf.random_normal([1])

# Repeatedly running this block with the same graph will generate the same
# sequence of values for 'a', but different sequences of values for 'b'.
print "Session 1"
with tf.Session() as sess1:
  print sess1.run(a)  # generates 'A1'
  print sess1.run(a)  # generates 'A2'
  print sess1.run(b)  # generates 'B1'
  print sess1.run(b)  # generates 'B2'

print "Session 2"
with tf.Session() as sess2:
  print sess2.run(a)  # generates 'A1'
  print sess2.run(a)  # generates 'A2'
  print sess2.run(b)  # generates 'B3'
  print sess2.run(b)  # generates 'B4'
```

To make the random sequences generated by all ops be repeatable across
sessions, set a graph-level seed:

```python
tf.set_random_seed(1234)
a = tf.random_uniform([1])
b = tf.random_normal([1])

# Repeatedly running this block with the same graph will generate the same
# sequences of 'a' and 'b'.
print "Session 1"
with tf.Session() as sess1:
  print sess1.run(a)  # generates 'A1'
  print sess1.run(a)  # generates 'A2'
  print sess1.run(b)  # generates 'B1'
  print sess1.run(b)  # generates 'B2'

print "Session 2"
with tf.Session() as sess2:
  print sess2.run(a)  # generates 'A1'
  print sess2.run(a)  # generates 'A2'
  print sess2.run(b)  # generates 'B1'
  print sess2.run(b)  # generates 'B2'
```

##### Args:

* `seed`: integer.

================================================
FILE: SOURCE/api_docs/python/control_flow_ops.md
================================================

# Control Flow

Note: Functions taking `Tensor` arguments can also take anything
accepted by [`tf.convert_to_tensor`](../../api_docs/python/framework.md#convert_to_tensor).

## Contents

### [Control Flow](#AUTOGENERATED-control-flow)

* [Control Flow Operations](#AUTOGENERATED-control-flow-operations)
  * [`tf.identity(input, name=None)`](#identity)
  * [`tf.tuple(tensors, name=None, control_inputs=None)`](#tuple)
  * [`tf.group(*inputs, **kwargs)`](#group)
  * [`tf.no_op(name=None)`](#no_op)
  * [`tf.count_up_to(ref, limit, name=None)`](#count_up_to)
* [Logical Operators](#AUTOGENERATED-logical-operators)
  * [`tf.logical_and(x, y, name=None)`](#logical_and)
  * [`tf.logical_not(x, name=None)`](#logical_not)
  * [`tf.logical_or(x, y, name=None)`](#logical_or)
  * [`tf.logical_xor(x, y, name='LogicalXor')`](#logical_xor)
* [Comparison Operators](#AUTOGENERATED-comparison-operators)
  * [`tf.equal(x, y, name=None)`](#equal)
  * [`tf.not_equal(x, y, name=None)`](#not_equal)
  * [`tf.less(x, y, name=None)`](#less)
  * [`tf.less_equal(x, y, name=None)`](#less_equal)
  * [`tf.greater(x, y, name=None)`](#greater)
  * [`tf.greater_equal(x, y, name=None)`](#greater_equal)
  * [`tf.select(condition, t, e, name=None)`](#select)
  * [`tf.where(input, name=None)`](#where)
* [Debugging Operations](#AUTOGENERATED-debugging-operations)
  * [`tf.is_finite(x, name=None)`](#is_finite)
  * [`tf.is_inf(x, name=None)`](#is_inf)
  * [`tf.is_nan(x, name=None)`](#is_nan)
  *
[`tf.verify_tensor_all_finite(t, msg, name=None)`](#verify_tensor_all_finite)
  * [`tf.check_numerics(tensor, message, name=None)`](#check_numerics)
  * [`tf.add_check_numerics_ops()`](#add_check_numerics_ops)
  * [`tf.Assert(condition, data, summarize=None, name=None)`](#Assert)
  * [`tf.Print(input_, data, message=None, first_n=None, summarize=None, name=None)`](#Print)

## Control Flow Operations

TensorFlow provides several operations and classes that you can use to control
the execution of operations and add conditional dependencies to your graph.

- - -

### `tf.identity(input, name=None)`

Return a tensor with the same shape and contents as the input tensor or value.

##### Args:

* `input`: A `Tensor`.
* `name`: A name for the operation (optional).

##### Returns:

A `Tensor`. Has the same type as `input`.

- - -

### `tf.tuple(tensors, name=None, control_inputs=None)`

Group tensors together.

This creates a tuple of tensors with the same values as the `tensors`
argument, except that the value of each tensor is only returned after the
values of all tensors have been computed.

`control_inputs` contains additional ops that have to finish before this op
finishes, but whose outputs are not returned.

This can be used as a "join" mechanism for parallel computations: all the
argument tensors can be computed in parallel, but the values of any tensor
returned by `tuple` are only available after all the parallel computations
are done.

See also `group` and `with_dependencies`.

##### Args:

* `tensors`: A list of `Tensor`s or `IndexedSlices`, some entries can be `None`.
* `name`: (optional) A name to use as a `name_scope` for the operation.
* `control_inputs`: List of additional ops to finish before returning.

##### Returns:

Same as `tensors`.

##### Raises:

* `ValueError`: If `tensors` does not contain any `Tensor` or `IndexedSlices`.

- - -

### `tf.group(*inputs, **kwargs)`

Create an op that groups multiple operations.

When this op finishes, all ops in `inputs` have finished.
This op has no output. See also `tuple` and `with_dependencies`. ##### Args: * `*inputs`: One or more tensors to group. * `**kwargs`: Optional parameters to pass when constructing the NodeDef. * `name`: A name for this operation (optional). ##### Returns: An Operation that executes all its inputs. ##### Raises: * `ValueError`: If an unknown keyword argument is provided, or if there are no inputs. - - - ### `tf.no_op(name=None)` Does nothing. Only useful as a placeholder for control edges. ##### Args: * `name`: A name for the operation (optional). ##### Returns: The created Operation. - - - ### `tf.count_up_to(ref, limit, name=None)` Increments 'ref' until it reaches 'limit'. This operation outputs "ref" after the update is done. This makes it easier to chain operations that need to use the updated value. ##### Args: * `ref`: A mutable `Tensor`. Must be one of the following types: `int32`, `int64`. Should be from a scalar `Variable` node. * `limit`: An `int`. If incrementing ref would bring it above limit, instead generates an 'OutOfRange' error. * `name`: A name for the operation (optional). ##### Returns: A `Tensor`. Has the same type as `ref`. A copy of the input before increment. If nothing else modifies the input, the values produced will all be distinct. ## Logical Operators TensorFlow provides several operations that you can use to add logical operators to your graph. - - - ### `tf.logical_and(x, y, name=None)` Returns the truth value of x AND y element-wise. ##### Args: * `x`: A `Tensor` of type `bool`. * `y`: A `Tensor` of type `bool`. * `name`: A name for the operation (optional). ##### Returns: A `Tensor` of type `bool`. - - - ### `tf.logical_not(x, name=None)` Returns the truth value of NOT x element-wise. ##### Args: * `x`: A `Tensor` of type `bool`. * `name`: A name for the operation (optional). ##### Returns: A `Tensor` of type `bool`. - - - ### `tf.logical_or(x, y, name=None)` Returns the truth value of x OR y element-wise. 
##### Args: * `x`: A `Tensor` of type `bool`. * `y`: A `Tensor` of type `bool`. * `name`: A name for the operation (optional). ##### Returns: A `Tensor` of type `bool`. - - - ### `tf.logical_xor(x, y, name='LogicalXor')` x ^ y = (x | y) & ~(x & y). ## Comparison Operators TensorFlow provides several operations that you can use to add comparison operators to your graph. - - - ### `tf.equal(x, y, name=None)` Returns the truth value of (x == y) element-wise. ##### Args: * `x`: A `Tensor`. Must be one of the following types: `float32`, `float64`, `int32`, `int64`, `complex64`, `quint8`, `qint8`, `qint32`. * `y`: A `Tensor`. Must have the same type as `x`. * `name`: A name for the operation (optional). ##### Returns: A `Tensor` of type `bool`. - - - ### `tf.not_equal(x, y, name=None)` Returns the truth value of (x != y) element-wise. ##### Args: * `x`: A `Tensor`. Must be one of the following types: `float32`, `float64`, `int32`, `int64`, `complex64`, `quint8`, `qint8`, `qint32`. * `y`: A `Tensor`. Must have the same type as `x`. * `name`: A name for the operation (optional). ##### Returns: A `Tensor` of type `bool`. - - - ### `tf.less(x, y, name=None)` Returns the truth value of (x < y) element-wise. ##### Args: * `x`: A `Tensor`. Must be one of the following types: `float32`, `float64`, `int32`, `int64`. * `y`: A `Tensor`. Must have the same type as `x`. * `name`: A name for the operation (optional). ##### Returns: A `Tensor` of type `bool`. - - - ### `tf.less_equal(x, y, name=None)` Returns the truth value of (x <= y) element-wise. ##### Args: * `x`: A `Tensor`. Must be one of the following types: `float32`, `float64`, `int32`, `int64`. * `y`: A `Tensor`. Must have the same type as `x`. * `name`: A name for the operation (optional). ##### Returns: A `Tensor` of type `bool`. - - - ### `tf.greater(x, y, name=None)` Returns the truth value of (x > y) element-wise. ##### Args: * `x`: A `Tensor`. Must be one of the following types: `float32`, `float64`, `int32`, `int64`. 
* `y`: A `Tensor`. Must have the same type as `x`.
* `name`: A name for the operation (optional).

##### Returns:

A `Tensor` of type `bool`.

- - -

### `tf.greater_equal(x, y, name=None)`

Returns the truth value of (x >= y) element-wise.

##### Args:

* `x`: A `Tensor`. Must be one of the following types: `float32`, `float64`, `int32`, `int64`.
* `y`: A `Tensor`. Must have the same type as `x`.
* `name`: A name for the operation (optional).

##### Returns:

A `Tensor` of type `bool`.

- - -

### `tf.select(condition, t, e, name=None)`

Selects elements from `t` or `e`, depending on `condition`.

The `condition`, `t`, and `e` tensors must all have the same shape,
and the output will also have that shape. The `condition` tensor acts
as an element-wise mask that chooses, based on the value at each
element, whether the corresponding element in the output should be
taken from `t` (if true) or `e` (if false).

For example:

```prettyprint
# 'condition' tensor is [[True, False]
#                        [True, False]]
# 't' is [[1, 1],
#         [1, 1]]
# 'e' is [[2, 2],
#         [2, 2]]
select(condition, t, e) ==> [[1, 2],
                             [1, 2]]
```

##### Args:

* `condition`: A `Tensor` of type `bool`.
* `t`: A `Tensor` with the same shape as `condition`.
* `e`: A `Tensor` with the same type and shape as `t`.
* `name`: A name for the operation (optional).

##### Returns:

A `Tensor` with the same type and shape as `t` and `e`.

- - -

### `tf.where(input, name=None)`

Returns locations of true values in a boolean tensor.

This operation returns the coordinates of true elements in `input`. The
coordinates are returned in a 2-D tensor where the first dimension (rows)
represents the number of true elements, and the second dimension (columns)
represents the coordinates of the true elements. Keep in mind, the shape of
the output tensor can vary depending on how many true values there are in
`input`. Indices are output in row-major order.
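As a rough plain-Python illustration of that row-major ordering, the sketch below handles only the rank-2 case on nested lists. The helper name `where_2d` is hypothetical and is not part of TensorFlow.

```python
def where_2d(mask):
    """Return [row, col] coordinates of the true entries, in row-major order."""
    coords = []
    for i, row in enumerate(mask):        # outer loop walks rows first
        for j, flag in enumerate(row):    # inner loop walks columns
            if flag:
                coords.append([i, j])
    return coords


coords = where_2d([[True, False],
                   [True, False]])
```

Here `coords` has one row per true element and one column per dimension of the input, which is the output layout the description gives.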
For example: ```prettyprint # 'input' tensor is [[True, False] # [True, False]] # 'input' has two true values, so output has two coordinates. # 'input' has rank of 2, so coordinates have two indices. where(input) ==> [[0, 0], [1, 0]] # `input` tensor is [[[True, False] # [True, False]] # [[False, True] # [False, True]] # [[False, False] # [False, True]]] # 'input' has 5 true values, so output has 5 coordinates. # 'input' has rank of 3, so coordinates have three indices. where(input) ==> [[0, 0, 0], [0, 1, 0], [1, 0, 1], [1, 1, 1], [2, 1, 1]] ``` ##### Args: * `input`: A `Tensor` of type `bool`. * `name`: A name for the operation (optional). ##### Returns: A `Tensor` of type `int64`. ## Debugging Operations TensorFlow provides several operations that you can use to validate values and debug your graph. - - - ### `tf.is_finite(x, name=None)` Returns which elements of x are finite. ##### Args: * `x`: A `Tensor`. Must be one of the following types: `float32`, `float64`. * `name`: A name for the operation (optional). ##### Returns: A `Tensor` of type `bool`. - - - ### `tf.is_inf(x, name=None)` Returns which elements of x are Inf. ##### Args: * `x`: A `Tensor`. Must be one of the following types: `float32`, `float64`. * `name`: A name for the operation (optional). ##### Returns: A `Tensor` of type `bool`. - - - ### `tf.is_nan(x, name=None)` Returns which elements of x are NaN. ##### Args: * `x`: A `Tensor`. Must be one of the following types: `float32`, `float64`. * `name`: A name for the operation (optional). ##### Returns: A `Tensor` of type `bool`. - - - ### `tf.verify_tensor_all_finite(t, msg, name=None)` Assert that the tensor does not contain any NaN's or Inf's. ##### Args: * `t`: Tensor to check. * `msg`: Message to log on failure. * `name`: A name for this operation (optional). ##### Returns: Same tensor as `t`. - - - ### `tf.check_numerics(tensor, message, name=None)` Checks a tensor for NaN and Inf values. 
When run, reports an `InvalidArgument` error if `tensor` has any values that are not a number (NaN) or infinity (Inf). Otherwise, passes `tensor` as-is. ##### Args: * `tensor`: A `Tensor`. Must be one of the following types: `float32`, `float64`. * `message`: A `string`. Prefix of the error message. * `name`: A name for the operation (optional). ##### Returns: A `Tensor`. Has the same type as `tensor`. - - - ### `tf.add_check_numerics_ops()` Connect a check_numerics to every floating point tensor. `check_numerics` operations themselves are added for each `float` or `double` tensor in the graph. For all ops in the graph, the `check_numerics` op for all of its (`float` or `double`) inputs is guaranteed to run before the `check_numerics` op on any of its outputs. ##### Returns: A `group` op depending on all `check_numerics` ops added. - - - ### `tf.Assert(condition, data, summarize=None, name=None)` Asserts that the given condition is true. If `condition` evaluates to false, print the list of tensors in `data`. `summarize` determines how many entries of the tensors to print. ##### Args: * `condition`: The condition to evaluate. * `data`: The tensors to print out when condition is false. * `summarize`: Print this many entries of each tensor. * `name`: A name for this operation (optional). - - - ### `tf.Print(input_, data, message=None, first_n=None, summarize=None, name=None)` Prints a list of tensors. This is an identity op with the side effect of printing `data` when evaluating. ##### Args: * `input_`: A tensor passed through this op. * `data`: A list of tensors to print out when op is evaluated. * `message`: A string, prefix of the error message. * `first_n`: Only log `first_n` number of times. Negative numbers log always; this is the default. * `summarize`: Only print this many entries of each tensor. * `name`: A name for the operation (optional). ##### Returns: Same tensor as `input_`. 
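Several of the debugging ops above (`check_numerics`, `verify_tensor_all_finite`, `Print`) share a pass-through shape: they return their input unchanged and only add a check or a side effect. The plain-Python sketch below mimics the `check_numerics` contract on a flat list of floats. The function name `check_numerics_sketch` and the use of `ValueError` are assumptions made for illustration; TensorFlow itself reports an `InvalidArgument` error.

```python
import math


def check_numerics_sketch(values, message):
    """Return `values` unchanged, or raise if any entry is NaN or Inf."""
    for v in values:
        if math.isnan(v) or math.isinf(v):
            # `message` is used as the prefix of the error text.
            # ValueError is a stand-in for TensorFlow's InvalidArgument error.
            raise ValueError("%s: found a NaN or Inf value" % message)
    return values


checked = check_numerics_sketch([1.0, 2.5, -3.0], "activations")
```

Because the input is handed back as-is, a check like this can be spliced between two existing steps of a computation without changing its result.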
================================================ FILE: SOURCE/api_docs/python/framework.md ================================================ # Building Graphs ## Contents ### [Building Graphs](#AUTOGENERATED-building-graphs) * [Core graph data structures](#AUTOGENERATED-core-graph-data-structures) * [`class tf.Graph`](#Graph) * [`class tf.Operation`](#Operation) * [`class tf.Tensor`](#Tensor) * [Tensor types](#AUTOGENERATED-tensor-types) * [`class tf.DType`](#DType) * [`tf.as_dtype(type_value)`](#as_dtype) * [Utility functions](#AUTOGENERATED-utility-functions) * [`tf.device(dev)`](#device) * [`tf.name_scope(name)`](#name_scope) * [`tf.control_dependencies(control_inputs)`](#control_dependencies) * [`tf.convert_to_tensor(value, dtype=None, name=None)`](#convert_to_tensor) * [`tf.get_default_graph()`](#get_default_graph) * [`tf.import_graph_def(graph_def, input_map=None, return_elements=None, name=None, op_dict=None)`](#import_graph_def) * [Graph collections](#AUTOGENERATED-graph-collections) * [`tf.add_to_collection(name, value)`](#add_to_collection) * [`tf.get_collection(key, scope=None)`](#get_collection) * [`class tf.GraphKeys`](#GraphKeys) * [Defining new operations](#AUTOGENERATED-defining-new-operations) * [`class tf.RegisterGradient`](#RegisterGradient) * [`tf.NoGradient(op_type)`](#NoGradient) * [`class tf.RegisterShape`](#RegisterShape) * [`class tf.TensorShape`](#TensorShape) * [`class tf.Dimension`](#Dimension) * [`tf.op_scope(values, name, default_name)`](#op_scope) * [`tf.get_seed(op_seed)`](#get_seed) Classes and functions for building TensorFlow graphs. ## Core graph data structures - - - ### `class tf.Graph` A TensorFlow computation, represented as a dataflow graph. A `Graph` contains a set of [`Operation`](../../api_docs/python/framework.md#Operation) objects, which represent units of computation; and [`Tensor`](../../api_docs/python/framework.md#Tensor) objects, which represent the units of data that flow between operations. 
A default `Graph` is always registered, and accessible by calling
[`tf.get_default_graph()`](../../api_docs/python/framework.md#get_default_graph). To add an
operation to the default graph, simply call one of the functions that defines
a new `Operation`:

```python
c = tf.constant(4.0)
assert c.graph is tf.get_default_graph()
```

Another typical usage involves the
[`Graph.as_default()`](../../api_docs/python/framework.md#Graph.as_default)
context manager, which overrides the current default graph for the
lifetime of the context:

```python
g = tf.Graph()
with g.as_default():
  # Define operations and tensors in `g`.
  c = tf.constant(30.0)
  assert c.graph is g
```

Important note: This class *is not* thread-safe for graph construction. All
operations should be created from a single thread, or external
synchronization must be provided. Unless otherwise specified, all methods
are not thread-safe.

- - -

#### `tf.Graph.__init__()`

Creates a new, empty Graph.

- - -

#### `tf.Graph.as_default()`

Returns a context manager that makes this `Graph` the default graph.

This method should be used if you want to create multiple graphs
in the same process. For convenience, a global default graph is
provided, and all ops will be added to this graph if you do not
create a new graph explicitly. Use this method with the `with` keyword
to specify that ops created within the scope of a block should be
added to this graph.

The default graph is a property of the current thread. If you
create a new thread, and wish to use the default graph in that
thread, you must explicitly add a `with g.as_default():` in that
thread's function.

The following code examples are equivalent:

```python
# 1. Using Graph.as_default():
g = tf.Graph()
with g.as_default():
  c = tf.constant(5.0)
  assert c.graph is g

# 2. Constructing and making default:
with tf.Graph().as_default() as g:
  c = tf.constant(5.0)
  assert c.graph is g
```

##### Returns:

A context manager for using this graph as the default graph.
- - - #### `tf.Graph.as_graph_def(from_version=None)` Returns a serialized `GraphDef` representation of this graph. The serialized `GraphDef` can be imported into another `Graph` (using [`import_graph_def()`](#import_graph_def)) or used with the [C++ Session API](../../api_docs/cc/index.md). This method is thread-safe. ##### Args: * `from_version`: Optional. If this is set, returns a `GraphDef` containing only the nodes that were added to this graph since its `version` property had the given value. ##### Returns: A [`GraphDef`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/graph.proto) protocol buffer. - - - #### `tf.Graph.finalize()` Finalizes this graph, making it read-only. After calling `g.finalize()`, no new operations can be added to `g`. This method is used to ensure that no operations are added to a graph when it is shared between multiple threads, for example when using a [`QueueRunner`](../../api_docs/python/train.md#QueueRunner). - - - #### `tf.Graph.finalized` True if this graph has been finalized. - - - #### `tf.Graph.control_dependencies(control_inputs)` Returns a context manager that specifies control dependencies. Use with the `with` keyword to specify that all operations constructed within the context should have control dependencies on `control_inputs`. For example: ```python with g.control_dependencies([a, b, c]): # `d` and `e` will only run after `a`, `b`, and `c` have executed. d = ... e = ... ``` Multiple calls to `control_dependencies()` can be nested, and in that case a new `Operation` will have control dependencies on the union of `control_inputs` from all active contexts. ```python with g.control_dependencies([a, b]): # Ops declared here run after `a` and `b`. with g.control_dependencies([c, d]): # Ops declared here run after `a`, `b`, `c`, and `d`. ``` *N.B.* The control dependencies context applies *only* to ops that are constructed within the context. 
Merely using an op or tensor in the context does not add a control dependency. The following example illustrates this point: ```python # WRONG def my_func(pred, tensor): t = tf.matmul(tensor, tensor) with tf.control_dependencies([pred]): # The matmul op is created outside the context, so no control # dependency will be added. return t # RIGHT def my_func(pred, tensor): with tf.control_dependencies([pred]): # The matmul op is created in the context, so a control dependency # will be added. return tf.matmul(tensor, tensor) ``` ##### Args: * `control_inputs`: A list of `Operation` or `Tensor` objects, which must be executed or computed before running the operations defined in the context. ##### Returns: A context manager that specifies control dependencies for all operations constructed within the context. ##### Raises: * `TypeError`: If `control_inputs` is not a list of `Operation` or `Tensor` objects. - - - #### `tf.Graph.device(device_name_or_function)` Returns a context manager that specifies the default device to use. The `device_name_or_function` argument may either be a device name string, a device function, or None: * If it is a device name string, all operations constructed in this context will be assigned to the device with that name. * If it is a function, it will be treated as function from Operation objects to device name strings, and invoked each time a new Operation is created. The Operation will be assigned to the device with the returned name. * If it is None, the default device will be cleared. For example: ```python with g.device('/gpu:0'): # All operations constructed in this context will be placed # on GPU 0. with g.device(None): # All operations constructed in this context will have no # assigned device. # Defines a function from `Operation` to device string. 
def matmul_on_gpu(n):
  if n.type == "MatMul":
    return "/gpu:0"
  else:
    return "/cpu:0"

with g.device(matmul_on_gpu):
  # All operations of type "MatMul" constructed in this context
  # will be placed on GPU 0; all other operations will be placed
  # on CPU 0.
```

##### Args:

* `device_name_or_function`: The device name or function to use in
  the context.

##### Returns:

A context manager that specifies the default device to use for newly
created ops.

- - -

#### `tf.Graph.name_scope(name)`

Returns a context manager that creates hierarchical names for operations.

A graph maintains a stack of name scopes. A `with name_scope(...):`
statement pushes a new name onto the stack for the lifetime of the context.

The `name` argument will be interpreted as follows:

* A string (not ending with '/') will create a new name scope, in which
  `name` is appended to the prefix of all operations created in the
  context. If `name` has been used before, it will be made unique by
  calling `self.unique_name(name)`.
* A scope previously captured from a `with g.name_scope(...) as
  scope:` statement will be treated as an "absolute" name scope, which
  makes it possible to re-enter existing scopes.
* A value of `None` or the empty string will reset the current name scope
  to the top-level (empty) name scope.

For example:

```python
with tf.Graph().as_default() as g:
  c = tf.constant(5.0, name="c")
  assert c.name == "c"
  c_1 = tf.constant(6.0, name="c")
  assert c_1.name == "c_1"

  # Creates a scope called "nested"
  with g.name_scope("nested") as scope:
    nested_c = tf.constant(10.0, name="c")
    assert nested_c.name == "nested/c"

    # Creates a nested scope called "inner".
    with g.name_scope("inner"):
      nested_inner_c = tf.constant(20.0, name="c")
      assert nested_inner_c.name == "nested/inner/c"

    # Create a nested scope called "inner_1".
    with g.name_scope("inner"):
      nested_inner_1_c = tf.constant(30.0, name="c")
      assert nested_inner_1_c.name == "nested/inner_1/c"

      # Treats `scope` as an absolute name scope, and
      # switches to the "nested/" scope.
      with g.name_scope(scope):
        nested_d = tf.constant(40.0, name="d")
        assert nested_d.name == "nested/d"

        with g.name_scope(""):
          e = tf.constant(50.0, name="e")
          assert e.name == "e"
```

The name of the scope itself can be captured by `with
g.name_scope(...) as scope:`, which stores the name of the scope
in the variable `scope`. This value can be used to name an
operation that represents the overall result of executing the ops
in a scope. For example:

```python
inputs = tf.constant(...)
with g.name_scope('my_layer') as scope:
  weights = tf.Variable(..., name="weights")
  biases = tf.Variable(..., name="biases")
  affine = tf.matmul(inputs, weights) + biases
  output = tf.nn.relu(affine, name=scope)
```

##### Args:

* `name`: A name for the scope.

##### Returns:

A context manager that installs `name` as a new name scope.

A `Graph` instance supports an arbitrary number of "collections"
that are identified by name. For convenience when building a large
graph, collections can store groups of related objects: for
example, `tf.Variable` uses a collection (named
[`tf.GraphKeys.VARIABLES`](../../api_docs/python/framework.md#GraphKeys)) for all variables that are
created during the construction of a graph. The caller may define
additional collections by specifying a new name.

- - -

#### `tf.Graph.add_to_collection(name, value)`

Stores `value` in the collection with the given `name`.

##### Args:

* `name`: The key for the collection. For example, the `GraphKeys` class
  contains many standard names for collections.
* `value`: The value to add to the collection.

- - -

#### `tf.Graph.get_collection(name, scope=None)`

Returns a list of values in the collection with the given `name`.

##### Args:

* `name`: The key for the collection.
For example, the `GraphKeys` class contains many standard names for collections. * `scope`: (Optional.) If supplied, the resulting list is filtered to include only items whose name begins with this string. ##### Returns: The list of values in the collection with the given `name`, or an empty list if no value has been added to that collection. The list contains the values in the order under which they were collected. - - - #### `tf.Graph.as_graph_element(obj, allow_tensor=True, allow_operation=True)` Returns the object referred to by `obj`, as an `Operation` or `Tensor`. This function validates that `obj` represents an element of this graph, and gives an informative error message if it is not. This function is the canonical way to get/validate an object of one of the allowed types from an external argument reference in the Session API. This method may be called concurrently from multiple threads. ##### Args: * `obj`: A `Tensor`, an `Operation`, or the name of a tensor or operation. Can also be any object with an `_as_graph_element()` method that returns a value of one of these types. * `allow_tensor`: If true, `obj` may refer to a `Tensor`. * `allow_operation`: If true, `obj` may refer to an `Operation`. ##### Returns: The `Tensor` or `Operation` in the Graph corresponding to `obj`. ##### Raises: * `TypeError`: If `obj` is not a type we support attempting to convert to types. * `ValueError`: If `obj` is of an appropriate type but invalid. For example, an invalid string. * `KeyError`: If `obj` is not an object in the graph. - - - #### `tf.Graph.get_operation_by_name(name)` Returns the `Operation` with the given `name`. This method may be called concurrently from multiple threads. ##### Args: * `name`: The name of the `Operation` to return. ##### Returns: The `Operation` with the given `name`. ##### Raises: * `TypeError`: If `name` is not a string. * `KeyError`: If `name` does not correspond to an operation in this graph. 
- - - #### `tf.Graph.get_tensor_by_name(name)` Returns the `Tensor` with the given `name`. This method may be called concurrently from multiple threads. ##### Args: * `name`: The name of the `Tensor` to return. ##### Returns: The `Tensor` with the given `name`. ##### Raises: * `TypeError`: If `name` is not a string. * `KeyError`: If `name` does not correspond to a tensor in this graph. - - - #### `tf.Graph.get_operations()` Return the list of operations in the graph. You can modify the operations in place, but modifications to the list such as inserts/delete have no effect on the list of operations known to the graph. This method may be called concurrently from multiple threads. ##### Returns: A list of Operations. - - - #### `tf.Graph.get_default_device()` Returns the default device. ##### Returns: A string. - - - #### `tf.Graph.seed` - - - #### `tf.Graph.unique_name(name)` Return a unique Operation name for "name". Note: You rarely need to call unique_name() directly. Most of the time you just need to create "with g.name_scope()" blocks to generate structured names. `unique_name` is used to generate structured names, separated by "/", to help identify Operations when debugging a Graph. Operation names are displayed in error messages reported by the TensorFlow runtime, and in various visualization tools such as TensorBoard. ##### Args: * `name`: The name for an `Operation`. ##### Returns: A string to be passed to `create_op()` that will be used to name the operation being created. - - - #### `tf.Graph.version` Returns a version number that increases as ops are added to the graph. - - - #### `tf.Graph.create_op(op_type, inputs, dtypes, input_types=None, name=None, attrs=None, op_def=None, compute_shapes=True)` Creates an `Operation` in this graph. This is a low-level interface for creating an `Operation`. Most programs will not call this method directly, and instead use the Python op constructors, such as `tf.constant()`, which add ops to the default graph. 
##### Args:

* `op_type`: The `Operation` type to create. This corresponds to the `OpDef.name` field for the proto that defines the operation.
* `inputs`: A list of `Tensor` objects that will be inputs to the `Operation`.
* `dtypes`: A list of `DType` objects that will be the types of the tensors that the operation produces.
* `input_types`: (Optional.) A list of `DType`s that will be the types of the tensors that the operation consumes. By default, uses the base `DType` of each input in `inputs`. Operations that expect reference-typed inputs must specify `input_types` explicitly.
* `name`: (Optional.) A string name for the operation. If not specified, a name is generated based on `op_type`.
* `attrs`: (Optional.) A list of `AttrValue` protos for the `attr` field of the `NodeDef` proto that will represent the operation.
* `op_def`: (Optional.) The `OpDef` proto that describes the `op_type` that the operation will have.
* `compute_shapes`: (Optional.) If True, shape inference will be performed to compute the shapes of the outputs.

##### Raises:

* `TypeError`: if any of the inputs is not a `Tensor`.

##### Returns:

An `Operation` object.

- - -

#### `tf.Graph.gradient_override_map(op_type_map)`

EXPERIMENTAL: A context manager for overriding gradient functions.

This context manager can be used to override the gradient function that will be used for ops within the scope of the context.

For example:

```python
@tf.RegisterGradient("CustomSquare")
def _custom_square_grad(op, grad):
  # ...

with tf.Graph().as_default() as g:
  c = tf.constant(5.0)
  s_1 = tf.square(c)  # Uses the default gradient for tf.square.
  with g.gradient_override_map({"Square": "CustomSquare"}):
    s_2 = tf.square(c)  # Uses _custom_square_grad to compute the
                        # gradient of s_2.
```

##### Args:

* `op_type_map`: A dictionary mapping op type strings to alternative op type strings.

##### Returns:

A context manager that sets the alternative op type to be used for one or more ops created in that context.
##### Raises: * `TypeError`: If `op_type_map` is not a dictionary mapping strings to strings. - - - ### `class tf.Operation` Represents a graph node that performs computation on tensors. An `Operation` is a node in a TensorFlow `Graph` that takes zero or more `Tensor` objects as input, and produces zero or more `Tensor` objects as output. Objects of type `Operation` are created by calling a Python op constructor (such as [`tf.matmul()`](../../api_docs/python/math_ops.md#matmul)) or [`Graph.create_op()`](../../api_docs/python/framework.md#Graph.create_op). For example `c = tf.matmul(a, b)` creates an `Operation` of type "MatMul" that takes tensors `a` and `b` as input, and produces `c` as output. After the graph has been launched in a session, an `Operation` can be executed by passing it to [`Session.run()`](../../api_docs/python/client.md#Session.run). `op.run()` is a shortcut for calling `tf.get_default_session().run(op)`. - - - #### `tf.Operation.name` The full name of this operation. - - - #### `tf.Operation.type` The type of the op (e.g. `"MatMul"`). - - - #### `tf.Operation.inputs` The list of `Tensor` objects representing the data inputs of this op. - - - #### `tf.Operation.control_inputs` The `Operation` objects on which this op has a control dependency. Before this op is executed, TensorFlow will ensure that the operations in `self.control_inputs` have finished executing. This mechanism can be used to run ops sequentially for performance reasons, or to ensure that the side effects of an op are observed in the correct order. ##### Returns: A list of `Operation` objects. - - - #### `tf.Operation.outputs` The list of `Tensor` objects representing the outputs of this op. - - - #### `tf.Operation.device` The name of the device to which this op has been assigned, if any. ##### Returns: The string name of the device to which this op has been assigned, or None if it has not been assigned to a device. 
- - - #### `tf.Operation.graph` The `Graph` that contains this operation. - - - #### `tf.Operation.run(feed_dict=None, session=None)` Runs this operation in a `Session`. Calling this method will execute all preceding operations that produce the inputs needed for this operation. *N.B.* Before invoking `Operation.run()`, its graph must have been launched in a session, and either a default session must be available, or `session` must be specified explicitly. ##### Args: * `feed_dict`: A dictionary that maps `Tensor` objects to feed values. See [`Session.run()`](../../api_docs/python/client.md#Session.run) for a description of the valid feed values. * `session`: (Optional.) The `Session` to be used to run to this operation. If none, the default session will be used. - - - #### `tf.Operation.get_attr(name)` Returns the value of the attr of this op with the given `name`. ##### Args: * `name`: The name of the attr to fetch. ##### Returns: The value of the attr, as a Python object. ##### Raises: * `ValueError`: If this op does not have an attr with the given `name`. - - - #### `tf.Operation.traceback` Returns the call stack from when this operation was constructed. #### Other Methods - - - #### `tf.Operation.__init__(node_def, g, inputs=None, output_types=None, control_inputs=None, input_types=None, original_op=None, op_def=None)` Creates an `Operation`. NOTE: This constructor validates the name of the Operation (passed as "node_def.name"). Valid Operation names match the following regular expression: [A-Za-z0-9.][A-Za-z0-9_.\-/]* ##### Args: * `node_def`: graph_pb2.NodeDef. NodeDef for the Operation. Used for attributes of graph_pb2.NodeDef, typically "name", "op", and "device". The "input" attribute is irrelevant here as it will be computed when generating the model. * `g`: Graph. The parent graph. * `inputs`: list of Tensor objects. The inputs to this Operation. * `output_types`: list of types_pb2.DataType. List of the types of the Tensors computed by this operation. 
The length of this list indicates the number of output endpoints of the Operation. * `control_inputs`: list of operations or tensors from which to have a control dependency. * `input_types`: List of types_pb2.DataType representing the types of the Tensors accepted by the Operation. By default uses [x.dtype.base_dtype for x in inputs]. Operations that expect reference-typed inputs must specify these explicitly. * `original_op`: Optional. Used to associate the new Operation with an existing Operation (for example, a replica with the op that was replicated). * `op_def`: Optional. The op_def_pb2.OpDef proto that describes the op type that this Operation represents. ##### Raises: * `TypeError`: if control inputs are not Operations or Tensors, or if node_def is not a NodeDef, or if g is not a Graph, or if inputs are not Tensors, or if inputs and input_types are incompatible. * `ValueError`: if the node_def name is not valid. - - - #### `tf.Operation.node_def` Returns a serialized `NodeDef` representation of this operation. ##### Returns: A [`NodeDef`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/graph.proto) protocol buffer. - - - #### `tf.Operation.op_def` Returns the `OpDef` proto that represents the type of this op. ##### Returns: An [`OpDef`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/op_def.proto) protocol buffer. - - - #### `tf.Operation.values()` DEPRECATED: Use outputs. - - - ### `class tf.Tensor` Represents a value produced by an `Operation`. A `Tensor` is a symbolic handle to one of the outputs of an `Operation`. It does not hold the values of that operation's output, but instead provides a means of computing those values in a TensorFlow [`Session`](../../api_docs/python/client.md#Session). This class has two primary purposes: 1. A `Tensor` can be passed as an input to another `Operation`. 
   This builds a dataflow connection between operations, which enables TensorFlow to execute an entire `Graph` that represents a large, multi-step computation.

2. After the graph has been launched in a session, the value of the `Tensor` can be computed by passing it to [`Session.run()`](../../api_docs/python/client.md#Session.run). `t.eval()` is a shortcut for calling `tf.get_default_session().run(t)`.

In the following example, `c`, `d`, and `e` are symbolic `Tensor` objects, whereas `result` is a numpy array that stores a concrete value:

```python
# Build a dataflow graph.
c = tf.constant([[1.0, 2.0], [3.0, 4.0]])
d = tf.constant([[1.0, 1.0], [0.0, 1.0]])
e = tf.matmul(c, d)

# Construct a `Session` to execute the graph.
sess = tf.Session()

# Execute the graph and store the value that `e` represents in `result`.
result = sess.run(e)
```

- - -

#### `tf.Tensor.dtype`

The `DType` of elements in this tensor.

- - -

#### `tf.Tensor.name`

The string name of this tensor.

- - -

#### `tf.Tensor.value_index`

The index of this tensor in the outputs of its `Operation`.

- - -

#### `tf.Tensor.graph`

The `Graph` that contains this tensor.

- - -

#### `tf.Tensor.op`

The `Operation` that produces this tensor as an output.

- - -

#### `tf.Tensor.consumers()`

Returns a list of `Operation`s that consume this tensor.

##### Returns:

A list of `Operation`s.

- - -

#### `tf.Tensor.eval(feed_dict=None, session=None)`

Evaluates this tensor in a `Session`.

Calling this method will execute all preceding operations that produce the inputs needed for the operation that produces this tensor.

*N.B.* Before invoking `Tensor.eval()`, its graph must have been launched in a session, and either a default session must be available, or `session` must be specified explicitly.

##### Args:

* `feed_dict`: A dictionary that maps `Tensor` objects to feed values. See [`Session.run()`](../../api_docs/python/client.md#Session.run) for a description of the valid feed values.
* `session`: (Optional.)
The `Session` to be used to evaluate this tensor. If none, the default session will be used. ##### Returns: A numpy array corresponding to the value of this tensor. - - - #### `tf.Tensor.get_shape()` Returns the `TensorShape` that represents the shape of this tensor. The shape is computed using shape inference functions that are registered for each `Operation` type using `tf.RegisterShape`. See [`TensorShape`](../../api_docs/python/framework.md#TensorShape) for more details of what a shape represents. The inferred shape of a tensor is used to provide shape information without having to launch the graph in a session. This can be used for debugging, and providing early error messages. For example: ```python c = tf.constant([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]) print c.get_shape() ==> TensorShape([Dimension(2), Dimension(3)]) d = tf.constant([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]) print d.get_shape() ==> TensorShape([Dimension(4), Dimension(2)]) # Raises a ValueError, because `c` and `d` do not have compatible # inner dimensions. e = tf.matmul(c, d) f = tf.matmul(c, d, transpose_a=True, transpose_b=True) print f.get_shape() ==> TensorShape([Dimension(3), Dimension(4)]) ``` In some cases, the inferred shape may have unknown dimensions. If the caller has additional information about the values of these dimensions, `Tensor.set_shape()` can be used to augment the inferred shape. ##### Returns: A `TensorShape` representing the shape of this tensor. - - - #### `tf.Tensor.set_shape(shape)` Updates the shape of this tensor. This method can be called multiple times, and will merge the given `shape` with the current shape of this tensor. It can be used to provide additional information about the shape of this tensor that cannot be inferred from the graph alone. For example, this can be used to provide additional information about the shapes of images: ```python _, image_data = tf.TFRecordReader(...).read(...) 
image = tf.image.decode_png(image_data, channels=3) # The height and width dimensions of `image` are data dependent, and # cannot be computed without executing the op. print image.get_shape() ==> TensorShape([Dimension(None), Dimension(None), Dimension(3)]) # We know that each image in this dataset is 28 x 28 pixels. image.set_shape([28, 28, 3]) print image.get_shape() ==> TensorShape([Dimension(28), Dimension(28), Dimension(3)]) ``` ##### Args: * `shape`: A `TensorShape` representing the shape of this tensor. ##### Raises: * `ValueError`: If `shape` is not compatible with the current shape of this tensor. #### Other Methods - - - #### `tf.Tensor.__init__(op, value_index, dtype)` Creates a new `Tensor`. ##### Args: * `op`: An `Operation`. `Operation` that computes this tensor. * `value_index`: An `int`. Index of the operation's endpoint that produces this tensor. * `dtype`: A `types.DType`. Type of data stored in this tensor. ##### Raises: * `TypeError`: If the op is not an `Operation`. - - - #### `tf.Tensor.device` The name of the device on which this tensor will be produced, or None. ## Tensor types - - - ### `class tf.DType` Represents the type of the elements in a `Tensor`. The following `DType` objects are defined: * `tf.float32`: 32-bit single-precision floating-point. * `tf.float64`: 64-bit double-precision floating-point. * `tf.bfloat16`: 16-bit truncated floating-point. * `tf.complex64`: 64-bit single-precision complex. * `tf.int8`: 8-bit signed integer. * `tf.uint8`: 8-bit unsigned integer. * `tf.int32`: 32-bit signed integer. * `tf.int64`: 64-bit signed integer. * `tf.bool`: Boolean. * `tf.string`: String. * `tf.qint8`: Quantized 8-bit signed integer. * `tf.quint8`: Quantized 8-bit unsigned integer. * `tf.qint32`: Quantized 32-bit signed integer. In addition, variants of these types with the `_ref` suffix are defined for reference-typed tensors. The `tf.as_dtype()` function converts numpy types and string type names to a `DType` object. 
- - - #### `tf.DType.is_compatible_with(other)` Returns True if the `other` DType will be converted to this DType. The conversion rules are as follows: ``` DType(T) .is_compatible_with(DType(T)) == True DType(T) .is_compatible_with(DType(T).as_ref) == True DType(T).as_ref.is_compatible_with(DType(T)) == False DType(T).as_ref.is_compatible_with(DType(T).as_ref) == True ``` ##### Args: * `other`: A `DType` (or object that may be converted to a `DType`). ##### Returns: True if a Tensor of the `other` `DType` will be implicitly converted to this `DType`. - - - #### `tf.DType.name` Returns the string name for this `DType`. - - - #### `tf.DType.base_dtype` Returns a non-reference `DType` based on this `DType`. - - - #### `tf.DType.is_ref_dtype` Returns `True` if this `DType` represents a reference type. - - - #### `tf.DType.as_ref` Returns a reference `DType` based on this `DType`. - - - #### `tf.DType.is_integer` Returns whether this is a (non-quantized) integer type. - - - #### `tf.DType.is_quantized` Returns whether this is a quantized data type. - - - #### `tf.DType.as_numpy_dtype` Returns a `numpy.dtype` based on this `DType`. - - - #### `tf.DType.as_datatype_enum` Returns a `types_pb2.DataType` enum value based on this `DType`. #### Other Methods - - - #### `tf.DType.__init__(type_enum)` Creates a new `DataType`. NOTE(mrry): In normal circumstances, you should not need to construct a DataType object directly. Instead, use the types.as_dtype() function. ##### Args: * `type_enum`: A `types_pb2.DataType` enum value. ##### Raises: * `TypeError`: If `type_enum` is not a value `types_pb2.DataType`. - - - #### `tf.DType.max` Returns the maximum representable value in this data type. ##### Raises: * `TypeError`: if this is a non-numeric, unordered, or quantized type. - - - #### `tf.DType.min` Returns the minimum representable value in this data type. ##### Raises: * `TypeError`: if this is a non-numeric, unordered, or quantized type. 
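The four conversion rules listed for `DType.is_compatible_with()` above reduce to a single check: a reference type accepts only another reference type, while a non-reference type accepts both. The sketch below models this in plain Python. `ToyDType` is a hypothetical stand-in, not TensorFlow's class, and the rule that different base types are never compatible is an assumption added for completeness.

```python
class ToyDType(object):
    """Hypothetical stand-in for tf.DType: a base type name plus a ref flag."""

    def __init__(self, name, is_ref=False):
        self.name = name
        self.is_ref = is_ref

    @property
    def as_ref(self):
        # Reference variant of this type.
        return ToyDType(self.name, is_ref=True)

    @property
    def base_dtype(self):
        # Non-reference variant of this type.
        return ToyDType(self.name, is_ref=False)

    def is_compatible_with(self, other):
        # Assumption: types with different base names are never compatible.
        if self.name != other.name:
            return False
        # A reference type accepts only references; a plain type accepts both.
        return other.is_ref or not self.is_ref


float32 = ToyDType("float32")
rules = [
    float32.is_compatible_with(float32),                # True
    float32.is_compatible_with(float32.as_ref),         # True
    float32.as_ref.is_compatible_with(float32),         # False
    float32.as_ref.is_compatible_with(float32.as_ref),  # True
]
```

The asymmetry is deliberate: a reference can be read wherever a plain value is expected, but a plain value cannot stand in where a reference is required.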
- - -

### `tf.as_dtype(type_value)`

Converts the given `type_value` to a `DType`.

##### Args:

* `type_value`: A value that can be converted to a `tf.DType` object. This may currently be a `tf.DType` object, a [`DataType` enum](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/types.proto), a string type name, or a `numpy.dtype`.

##### Returns:

A `DType` corresponding to `type_value`.

##### Raises:

* `TypeError`: If `type_value` cannot be converted to a `DType`.

## Utility functions

- - -

### `tf.device(dev)`

Wrapper for `Graph.device()` using the default graph.

See [`Graph.device()`](../../api_docs/python/framework.md#Graph.device) for more details.

##### Args:

* `dev`: The device name or function to use in the context.

##### Returns:

A context manager that specifies the default device to use for newly created ops.

- - -

### `tf.name_scope(name)`

Wrapper for `Graph.name_scope()` using the default graph.

See [`Graph.name_scope()`](../../api_docs/python/framework.md#Graph.name_scope) for more details.

##### Args:

* `name`: A name for the scope.

##### Returns:

A context manager that installs `name` as a new name scope in the default graph.

- - -

### `tf.control_dependencies(control_inputs)`

Wrapper for `Graph.control_dependencies()` using the default graph.

See [`Graph.control_dependencies()`](../../api_docs/python/framework.md#Graph.control_dependencies) for more details.

##### Args:

* `control_inputs`: A list of `Operation` or `Tensor` objects, which must be executed or computed before running the operations defined in the context.

##### Returns:

A context manager that specifies control dependencies for all operations constructed within the context.

- - -

### `tf.convert_to_tensor(value, dtype=None, name=None)`

Converts the given `value` to a `Tensor`.

This function converts Python objects of various types to `Tensor` objects. It accepts `Tensor` objects, numpy arrays, Python lists, and Python scalars.
For example:

```python
import numpy as np
import tensorflow as tf

# np.random.rand takes each dimension as a separate argument, not a tuple.
array = np.random.rand(32, 100, 100)

def my_func(arg):
  arg = tf.convert_to_tensor(arg, dtype=tf.float32)
  return tf.matmul(arg, arg) + arg

# The following calls are equivalent.
value_1 = my_func(tf.constant([[1.0, 2.0], [3.0, 4.0]]))
value_2 = my_func([[1.0, 2.0], [3.0, 4.0]])
value_3 = my_func(np.array([[1.0, 2.0], [3.0, 4.0]], dtype=np.float32))
```

This function can be useful when composing a new operation in Python (such as `my_func` in the example above). All standard Python op constructors apply this function to each of their Tensor-valued inputs, which allows those ops to accept numpy arrays, Python lists, and scalars in addition to `Tensor` objects.

##### Args:

* `value`: An object whose type has a registered `Tensor` conversion function.
* `dtype`: Optional element type for the returned tensor. If missing, the type is inferred from the type of `value`.
* `name`: Optional name to use if a new `Tensor` is created.

##### Returns:

A `Tensor` based on `value`.

##### Raises:

* `TypeError`: If no conversion function is registered for `value`.
* `RuntimeError`: If a registered conversion function returns an invalid value.

- - -

### `tf.get_default_graph()`

Returns the default graph for the current thread.

The returned graph will be the innermost graph on which a `Graph.as_default()` context has been entered, or a global default graph if none has been explicitly created.

*N.B.* The default graph is a property of the current thread. If you create a new thread, and wish to use the default graph in that thread, you must explicitly add a `with g.as_default():` in that thread's function.

##### Returns:

The default `Graph` being used in the current thread.

- - -

### `tf.import_graph_def(graph_def, input_map=None, return_elements=None, name=None, op_dict=None)`

Imports the TensorFlow graph in `graph_def` into the Python `Graph`.
This function provides a way to import a serialized TensorFlow [`GraphDef`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/graph.proto) protocol buffer, and extract individual objects in the `GraphDef` as [`Tensor`](#Tensor) and [`Operation`](#Operation) objects. See [`Graph.as_graph_def()`](#Graph.as_graph_def) for a way to create a `GraphDef` proto.

##### Args:

* `graph_def`: A `GraphDef` proto containing operations to be imported into the default graph.
* `input_map`: A dictionary mapping input names (as strings) in `graph_def` to `Tensor` objects. The values of the named input tensors in the imported graph will be re-mapped to the respective `Tensor` values.
* `return_elements`: A list of strings containing operation names in `graph_def` that will be returned as `Operation` objects; and/or tensor names in `graph_def` that will be returned as `Tensor` objects.
* `name`: (Optional.) A prefix that will be prepended to the names in `graph_def`. Defaults to `"import"`.
* `op_dict`: (Optional.) A dictionary mapping op type names to `OpDef` protos. Must contain an `OpDef` proto for each op type named in `graph_def`. If omitted, uses the `OpDef` protos registered in the global registry.

##### Returns:

A list of `Operation` and/or `Tensor` objects from the imported graph, corresponding to the names in `return_elements`.

##### Raises:

* `TypeError`: If `graph_def` is not a `GraphDef` proto, `input_map` is not a dictionary mapping strings to `Tensor` objects, or `return_elements` is not a list of strings.
* `ValueError`: If `input_map` or `return_elements` contains names that do not appear in `graph_def`, or `graph_def` is not well-formed (e.g. it refers to an unknown tensor).

## Graph collections

- - -

### `tf.add_to_collection(name, value)`

Wrapper for `Graph.add_to_collection()` using the default graph.

See [`Graph.add_to_collection()`](../../api_docs/python/framework.md#Graph.add_to_collection) for more details.
##### Args: * `name`: The key for the collection. For example, the `GraphKeys` class contains many standard names for collections. * `value`: The value to add to the collection. - - - ### `tf.get_collection(key, scope=None)` Wrapper for `Graph.get_collection()` using the default graph. See [`Graph.get_collection()`](../../api_docs/python/framework.md#Graph.get_collection) for more details. ##### Args: * `key`: The key for the collection. For example, the `GraphKeys` class contains many standard names for collections. * `scope`: (Optional.) If supplied, the resulting list is filtered to include only items whose name begins with this string. ##### Returns: The list of values in the collection with the given `name`, or an empty list if no value has been added to that collection. The list contains the values in the order under which they were collected. - - - ### `class tf.GraphKeys` Standard names to use for graph collections. The standard library uses various well-known names to collect and retrieve values associated with a graph. For example, the `tf.Optimizer` subclasses default to optimizing the variables collected under `tf.GraphKeys.TRAINABLE_VARIABLES` if none is specified, but it is also possible to pass an explicit list of variables. The following standard keys are defined: * `VARIABLES`: the `Variable` objects that comprise a model, and must be saved and restored together. See [`tf.all_variables()`](../../api_docs/python/state_ops.md#all_variables) for more details. * `TRAINABLE_VARIABLES`: the subset of `Variable` objects that will be trained by an optimizer. See [`tf.trainable_variables()`](../../api_docs/python/state_ops.md#trainable_variables) for more details. * `SUMMARIES`: the summary `Tensor` objects that have been created in the graph. See [`tf.merge_all_summaries()`](../../api_docs/python/train.md#merge_all_summaries) for more details. * `QUEUE_RUNNERS`: the `QueueRunner` objects that are used to produce input for a computation. 
See [`tf.start_queue_runners()`](../../api_docs/python/train.md#start_queue_runners) for more details.

## Defining new operations

- - -

### `class tf.RegisterGradient`

A decorator for registering the gradient function for an op type.

This decorator is only used when defining a new op type. For an op with `m` inputs and `n` outputs, the gradient function is a function that takes the original `Operation` and `n` `Tensor` objects (representing the gradients with respect to each output of the op), and returns `m` `Tensor` objects (representing the partial gradients with respect to each input of the op).

For example, assuming that operations of type `"Sub"` take two inputs `x` and `y`, and return a single output `x - y`, the following gradient function would be registered:

```python
@tf.RegisterGradient("Sub")
def _sub_grad(unused_op, grad):
  # d(x - y)/dx = 1 and d(x - y)/dy = -1.
  return grad, tf.neg(grad)
```

The decorator argument `op_type` is the string type of an operation. This corresponds to the `OpDef.name` field for the proto that defines the operation.

- - -

#### `tf.RegisterGradient.__init__(op_type)`

Creates a new decorator with `op_type` as the Operation type.

##### Args:

* `op_type`: The string type of an operation. This corresponds to the `OpDef.name` field for the proto that defines the operation.

- - -

### `tf.NoGradient(op_type)`

Specifies that ops of type `op_type` do not have a defined gradient.

This function is only used when defining a new op type. It may be used for ops such as `tf.size()` that are not differentiable. For example:

```python
tf.NoGradient("Size")
```

##### Args:

* `op_type`: The string type of an operation. This corresponds to the `OpDef.name` field for the proto that defines the operation.

##### Raises:

* `TypeError`: If `op_type` is not a string.

- - -

### `class tf.RegisterShape`

A decorator for registering the shape function for an op type.

This decorator is only used when defining a new op type.
A shape function is a function from an `Operation` object to a list of `TensorShape` objects, with one `TensorShape` for each output of the operation. For example, assuming that operations of type `"Sub"` take two inputs `x` and `y`, and return a single output `x - y`, all with the same shape, the following shape function would be registered: ```python @tf.RegisterShape("Sub") def _sub_shape(op): return [op.inputs[0].get_shape().merge_with(op.inputs[1].get_shape())] ``` The decorator argument `op_type` is the string type of an operation. This corresponds to the `OpDef.name` field for the proto that defines the operation. - - - #### `tf.RegisterShape.__init__(op_type)` Saves the "op_type" as the Operation type. - - - ### `class tf.TensorShape` Represents the shape of a `Tensor`. A `TensorShape` represents a possibly-partial shape specification for a `Tensor`. It may be one of the following: * *Fully-known shape:* has a known number of dimensions and a known size for each dimension. * *Partially-known shape:* has a known number of dimensions, and an unknown size for one or more dimension. * *Unknown shape:* has an unknown number of dimensions, and an unknown size in all dimensions. If a tensor is produced by an operation of type `"Foo"`, its shape may be inferred if there is a registered shape function for `"Foo"`. See [`tf.RegisterShape()`](../../api_docs/python/framework.md#RegisterShape) for details of shape functions and how to register them. Alternatively, the shape may be set explicitly using [`Tensor.set_shape()`](../../api_docs/python/framework.md#Tensor.set_shape). - - - #### `tf.TensorShape.merge_with(other)` Returns a `TensorShape` combining the information in `self` and `other`. The dimensions in `self` and `other` are merged elementwise, according to the rules defined for `Dimension.merge_with()`. ##### Args: * `other`: Another `TensorShape`. ##### Returns: A `TensorShape` containing the combined information of `self` and `other`. 
##### Raises: * `ValueError`: If `self` and `other` are not compatible. - - - #### `tf.TensorShape.concatenate(other)` Returns the concatenation of the dimension in `self` and `other`. *N.B.* If either `self` or `other` is completely unknown, concatenation will discard information about the other shape. In future, we might support concatenation that preserves this information for use with slicing. ##### Args: * `other`: Another `TensorShape`. ##### Returns: A `TensorShape` whose dimensions are the concatenation of the dimensions in `self` and `other`. - - - #### `tf.TensorShape.ndims` Returns the rank of this shape, or None if it is unspecified. - - - #### `tf.TensorShape.dims` Returns a list of Dimensions, or None if the shape is unspecified. - - - #### `tf.TensorShape.as_list()` Returns a list of integers or None for each dimension. - - - #### `tf.TensorShape.is_compatible_with(other)` Returns True iff `self` is compatible with `other`. Two possibly-partially-defined shapes are compatible if there exists a fully-defined shape that both shapes can represent. Thus, compatibility allows the shape inference code to reason about partially-defined shapes. For example: * TensorShape(None) is compatible with all shapes. * TensorShape([None, None]) is compatible with all two-dimensional shapes, such as TensorShape([32, 784]), and also TensorShape(None). It is not compatible with, for example, TensorShape([None]) or TensorShape([None, None, None]). * TensorShape([32, None]) is compatible with all two-dimensional shapes with size 32 in the 0th dimension, and also TensorShape([None, None]) and TensorShape(None). It is not compatible with, for example, TensorShape([32]), TensorShape([32, None, 1]) or TensorShape([64, None]). * TensorShape([32, 784]) is compatible with itself, and also TensorShape([32, None]), TensorShape([None, 784]), TensorShape([None, None]) and TensorShape(None). It is not compatible with, for example, TensorShape([32, 1, 784]) or TensorShape([None]). 
The compatibility relation is reflexive and symmetric, but not transitive. For example, TensorShape([32, 784]) is compatible with TensorShape(None), and TensorShape(None) is compatible with TensorShape([4, 4]), but TensorShape([32, 784]) is not compatible with TensorShape([4, 4]). ##### Args: * `other`: Another TensorShape. ##### Returns: True iff `self` is compatible with `other`. - - - #### `tf.TensorShape.is_fully_defined()` Returns True iff `self` is fully defined in every dimension. - - - #### `tf.TensorShape.with_rank(rank)` Returns a shape based on `self` with the given rank. This method promotes a completely unknown shape to one with a known rank. ##### Args: * `rank`: An integer. ##### Returns: A shape that is at least as specific as `self` with the given rank. ##### Raises: * `ValueError`: If `self` does not represent a shape with the given `rank`. - - - #### `tf.TensorShape.with_rank_at_least(rank)` Returns a shape based on `self` with at least the given rank. ##### Args: * `rank`: An integer. ##### Returns: A shape that is at least as specific as `self` with at least the given rank. ##### Raises: * `ValueError`: If `self` does not represent a shape with at least the given `rank`. - - - #### `tf.TensorShape.with_rank_at_most(rank)` Returns a shape based on `self` with at most the given rank. ##### Args: * `rank`: An integer. ##### Returns: A shape that is at least as specific as `self` with at most the given rank. ##### Raises: * `ValueError`: If `self` does not represent a shape with at most the given `rank`. - - - #### `tf.TensorShape.assert_has_rank(rank)` Raises an exception if `self` is not compatible with the given `rank`. ##### Args: * `rank`: An integer. ##### Raises: * `ValueError`: If `self` does not represent a shape with the given `rank`. - - - #### `tf.TensorShape.assert_same_rank(other)` Raises an exception if `self` and `other` do not have compatible ranks. ##### Args: * `other`: Another `TensorShape`. 
##### Raises: * `ValueError`: If `self` and `other` do not represent shapes with the same rank. - - - #### `tf.TensorShape.assert_is_compatible_with(other)` Raises exception if `self` and `other` do not represent the same shape. This method can be used to assert that there exists a shape that both `self` and `other` represent. ##### Args: * `other`: Another TensorShape. ##### Raises: * `ValueError`: If `self` and `other` do not represent the same shape. - - - #### `tf.TensorShape.assert_is_fully_defined()` Raises an exception if `self` is not fully defined in every dimension. ##### Raises: * `ValueError`: If `self` does not have a known value for every dimension. #### Other Methods - - - #### `tf.TensorShape.__init__(dims)` Creates a new TensorShape with the given dimensions. ##### Args: * `dims`: A list of Dimensions, or None if the shape is unspecified. * `DEPRECATED`: A single integer is treated as a singleton list. - - - #### `tf.TensorShape.as_dimension_list()` DEPRECATED: use as_list(). - - - #### `tf.TensorShape.num_elements()` Returns the total number of elements, or none for incomplete shapes. - - - ### `class tf.Dimension` Represents the value of one dimension in a TensorShape. - - - #### `tf.Dimension.__init__(value)` Creates a new Dimension with the given value. - - - #### `tf.Dimension.assert_is_compatible_with(other)` Raises an exception if `other` is not compatible with this Dimension. ##### Args: * `other`: Another Dimension. ##### Raises: * `ValueError`: If `self` and `other` are not compatible (see is_compatible_with). - - - #### `tf.Dimension.is_compatible_with(other)` Returns true if `other` is compatible with this Dimension. Two known Dimensions are compatible if they have the same value. An unknown Dimension is compatible with all other Dimensions. ##### Args: * `other`: Another Dimension. ##### Returns: True if this Dimension and `other` are compatible. 
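The compatibility rules for `Dimension` and `TensorShape` described above can be restated in a few lines of plain Python. This is a sketch, not TensorFlow's implementation: shapes are modelled as lists of integers or `None`, a completely unknown shape is modelled as `None`, and the helper names `dims_compatible` and `shapes_compatible` are hypothetical.

```python
def dims_compatible(a, b):
    """Two dimensions are compatible if either is unknown (None) or both are equal."""
    return a is None or b is None or a == b


def shapes_compatible(a, b):
    """Shapes are lists of ints/None; None stands for a completely unknown shape."""
    if a is None or b is None:
        return True   # An unknown shape is compatible with every shape.
    if len(a) != len(b):
        return False  # Known ranks must match.
    return all(dims_compatible(x, y) for x, y in zip(a, b))


# Cases taken from the `TensorShape.is_compatible_with()` description.
same_rank = shapes_compatible([32, None], [None, None])    # True
wrong_rank = shapes_compatible([32, None], [32, None, 1])  # False
wrong_size = shapes_compatible([32, None], [64, None])     # False

# The relation is not transitive: both shapes are compatible with the
# unknown shape, but they are not compatible with each other.
a, unknown, b = [32, 784], None, [4, 4]
not_transitive = (shapes_compatible(a, unknown)
                  and shapes_compatible(unknown, b)
                  and not shapes_compatible(a, b))         # True
```

Because `None` matches anything, the check is symmetric in its arguments, which matches the reflexive and symmetric behaviour stated for `is_compatible_with()`.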
- - -

#### `tf.Dimension.merge_with(other)`

Returns a Dimension that combines the information in `self` and `other`.

Dimensions are combined as follows:

    Dimension(n)   .merge_with(Dimension(n))    == Dimension(n)
    Dimension(n)   .merge_with(Dimension(None)) == Dimension(n)
    Dimension(None).merge_with(Dimension(n))    == Dimension(n)
    Dimension(None).merge_with(Dimension(None)) == Dimension(None)
    Dimension(n)   .merge_with(Dimension(m)) raises ValueError for n != m

##### Args:

* `other`: Another Dimension.

##### Returns:

A Dimension containing the combined information of `self` and `other`.

##### Raises:

* `ValueError`: If `self` and `other` are not compatible (see is_compatible_with).

- - -

#### `tf.Dimension.value`

The value of this dimension, or None if it is unknown.

- - -

### `tf.op_scope(values, name, default_name)`

Returns a context manager for use when defining a Python op.

This context manager validates that the given `values` are from the same graph, ensures that that graph is the default graph, and pushes a name scope.

For example, to define a new Python op called `my_op`:

```python
def my_op(a, b, c, name=None):
  with tf.op_scope([a, b, c], name, "MyOp") as scope:
    a = tf.convert_to_tensor(a, name="a")
    b = tf.convert_to_tensor(b, name="b")
    c = tf.convert_to_tensor(c, name="c")
    # Define some computation that uses `a`, `b`, and `c`.
    return foo_op(..., name=scope)
```

##### Args:

* `values`: The list of `Tensor` arguments that are passed to the op function.
* `name`: The name argument that is passed to the op function.
* `default_name`: The default name to use if the `name` argument is `None`.

##### Returns:

A context manager for use in defining a Python op.

- - -

### `tf.get_seed(op_seed)`

Returns the local seeds an operation should use given an op-specific seed.

Given operation-specific seed, `op_seed`, this helper function returns two seeds derived from graph-level and op-level seeds.
Many random operations internally use the two seeds to allow user to change the seed globally for a graph, or for only specific operations. For details on how the graph-level seed interacts with op seeds, see [`set_random_seed`](../../api_docs/python/constant_op.md#set_random_seed). ##### Args: * `op_seed`: integer. ##### Returns: A tuple of two integers that should be used for the local seed of this operation. ================================================ FILE: SOURCE/api_docs/python/image.md ================================================ # Images Note: Functions taking `Tensor` arguments can also take anything accepted by [`tf.convert_to_tensor`](../../api_docs/python/framework.md#convert_to_tensor). ## Contents ### [Images](#AUTOGENERATED-images) * [Encoding and Decoding](#AUTOGENERATED-encoding-and-decoding) * [`tf.image.decode_jpeg(contents, channels=None, ratio=None, fancy_upscaling=None, try_recover_truncated=None, acceptable_fraction=None, name=None)`](#decode_jpeg) * [`tf.image.encode_jpeg(image, format=None, quality=None, progressive=None, optimize_size=None, chroma_downsampling=None, density_unit=None, x_density=None, y_density=None, xmp_metadata=None, name=None)`](#encode_jpeg) * [`tf.image.decode_png(contents, channels=None, name=None)`](#decode_png) * [`tf.image.encode_png(image, compression=None, name=None)`](#encode_png) * [Resizing](#AUTOGENERATED-resizing) * [`tf.image.resize_images(images, new_height, new_width, method=0)`](#resize_images) * [`tf.image.resize_area(images, size, name=None)`](#resize_area) * [`tf.image.resize_bicubic(images, size, name=None)`](#resize_bicubic) * [`tf.image.resize_bilinear(images, size, name=None)`](#resize_bilinear) * [`tf.image.resize_nearest_neighbor(images, size, name=None)`](#resize_nearest_neighbor) * [Cropping](#AUTOGENERATED-cropping) * [`tf.image.resize_image_with_crop_or_pad(image, target_height, target_width)`](#resize_image_with_crop_or_pad) * [`tf.image.pad_to_bounding_box(image, offset_height, 
offset_width, target_height, target_width)`](#pad_to_bounding_box) * [`tf.image.crop_to_bounding_box(image, offset_height, offset_width, target_height, target_width)`](#crop_to_bounding_box) * [`tf.image.random_crop(image, size, seed=None, name=None)`](#random_crop) * [`tf.image.extract_glimpse(input, size, offsets, centered=None, normalized=None, uniform_noise=None, name=None)`](#extract_glimpse) * [Flipping and Transposing](#AUTOGENERATED-flipping-and-transposing) * [`tf.image.flip_up_down(image)`](#flip_up_down) * [`tf.image.random_flip_up_down(image, seed=None)`](#random_flip_up_down) * [`tf.image.flip_left_right(image)`](#flip_left_right) * [`tf.image.random_flip_left_right(image, seed=None)`](#random_flip_left_right) * [`tf.image.transpose_image(image)`](#transpose_image) * [Image Adjustments](#AUTOGENERATED-image-adjustments) * [`tf.image.adjust_brightness(image, delta, min_value=None, max_value=None)`](#adjust_brightness) * [`tf.image.random_brightness(image, max_delta, seed=None)`](#random_brightness) * [`tf.image.adjust_contrast(images, contrast_factor, min_value=None, max_value=None)`](#adjust_contrast) * [`tf.image.random_contrast(image, lower, upper, seed=None)`](#random_contrast) * [`tf.image.per_image_whitening(image)`](#per_image_whitening) ## Encoding and Decoding TensorFlow provides Ops to decode and encode JPEG and PNG formats. Encoded images are represented by scalar string Tensors, decoded images by 3-D uint8 tensors of shape `[height, width, channels]`. The encode and decode Ops apply to one image at a time. Their input and output are all of variable size. If you need fixed size images, pass the output of the decode Ops to one of the cropping and resizing Ops. Note: The PNG encode and decode Ops support RGBA, but the conversions Ops presently only support RGB, HSV, and GrayScale. 
- - - ### `tf.image.decode_jpeg(contents, channels=None, ratio=None, fancy_upscaling=None, try_recover_truncated=None, acceptable_fraction=None, name=None)` Decode a JPEG-encoded image to a uint8 tensor. The attr `channels` indicates the desired number of color channels for the decoded image. Accepted values are: * 0: Use the number of channels in the JPEG-encoded image. * 1: output a grayscale image. * 3: output an RGB image. If needed, the JPEG-encoded image is transformed to match the requested number of color channels. The attr `ratio` allows downscaling the image by an integer factor during decoding. Allowed values are: 1, 2, 4, and 8. This is much faster than downscaling the image later. ##### Args: * `contents`: A `Tensor` of type `string`. 0-D. The JPEG-encoded image. * `channels`: An optional `int`. Defaults to `0`. Number of color channels for the decoded image. * `ratio`: An optional `int`. Defaults to `1`. Downscaling ratio. * `fancy_upscaling`: An optional `bool`. Defaults to `True`. If true use a slower but nicer upscaling of the chroma planes (yuv420/422 only). * `try_recover_truncated`: An optional `bool`. Defaults to `False`. If true try to recover an image from truncated input. * `acceptable_fraction`: An optional `float`. Defaults to `1`. The minimum required fraction of lines before a truncated input is accepted. * `name`: A name for the operation (optional). ##### Returns: A `Tensor` of type `uint8`. 3-D with shape `[height, width, channels]`.. - - - ### `tf.image.encode_jpeg(image, format=None, quality=None, progressive=None, optimize_size=None, chroma_downsampling=None, density_unit=None, x_density=None, y_density=None, xmp_metadata=None, name=None)` JPEG-encode an image. `image` is a 3-D uint8 Tensor of shape `[height, width, channels]`. The attr `format` can be used to override the color format of the encoded output. Values can be: * `''`: Use a default format based on the number of channels in the image. 
* `grayscale`: Output a grayscale JPEG image. The `channels` dimension of `image` must be 1. * `rgb`: Output an RGB JPEG image. The `channels` dimension of `image` must be 3. If `format` is not specified or is the empty string, a default format is picked in function of the number of channels in `image`: * 1: Output a grayscale image. * 3: Output an RGB image. ##### Args: * `image`: A `Tensor` of type `uint8`. 3-D with shape `[height, width, channels]`. * `format`: An optional `string` from: `"", "grayscale", "rgb"`. Defaults to `""`. Per pixel image format. * `quality`: An optional `int`. Defaults to `95`. Quality of the compression from 0 to 100 (higher is better and slower). * `progressive`: An optional `bool`. Defaults to `False`. If True, create a JPEG that loads progressively (coarse to fine). * `optimize_size`: An optional `bool`. Defaults to `False`. If True, spend CPU/RAM to reduce size with no quality change. * `chroma_downsampling`: An optional `bool`. Defaults to `True`. See http://en.wikipedia.org/wiki/Chroma_subsampling. * `density_unit`: An optional `string` from: `"in", "cm"`. Defaults to `"in"`. Unit used to specify `x_density` and `y_density`: pixels per inch (`'in'`) or centimeter (`'cm'`). * `x_density`: An optional `int`. Defaults to `300`. Horizontal pixels per density unit. * `y_density`: An optional `int`. Defaults to `300`. Vertical pixels per density unit. * `xmp_metadata`: An optional `string`. Defaults to `""`. If not empty, embed this XMP metadata in the image header. * `name`: A name for the operation (optional). ##### Returns: A `Tensor` of type `string`. 0-D. JPEG-encoded image. - - - ### `tf.image.decode_png(contents, channels=None, name=None)` Decode a PNG-encoded image to a uint8 tensor. The attr `channels` indicates the desired number of color channels for the decoded image. Accepted values are: * 0: Use the number of channels in the PNG-encoded image. * 1: output a grayscale image. * 3: output an RGB image. 
* 4: output an RGBA image.

If needed, the PNG-encoded image is transformed to match the requested number of color channels.

##### Args:

* `contents`: A `Tensor` of type `string`. 0-D. The PNG-encoded image.
* `channels`: An optional `int`. Defaults to `0`. Number of color channels for the decoded image.
* `name`: A name for the operation (optional).

##### Returns:

A `Tensor` of type `uint8`. 3-D with shape `[height, width, channels]`.

- - -

### `tf.image.encode_png(image, compression=None, name=None)`

PNG-encode an image.

`image` is a 3-D uint8 Tensor of shape `[height, width, channels]` where `channels` is:

* 1: for grayscale.
* 3: for RGB.
* 4: for RGBA.

The ZLIB compression level, `compression`, can be -1 for the PNG-encoder default or a value from 0 to 9. 9 is the highest compression level, generating the smallest output, but is slower.

##### Args:

* `image`: A `Tensor` of type `uint8`. 3-D with shape `[height, width, channels]`.
* `compression`: An optional `int`. Defaults to `-1`. Compression level.
* `name`: A name for the operation (optional).

##### Returns:

A `Tensor` of type `string`. 0-D. PNG-encoded image.

## Resizing

The resizing Ops accept input images as tensors of several types. They always output resized images as float32 tensors.

The convenience function [resize_images()](#resize_images) supports both 4-D and 3-D tensors as input and output. 4-D tensors are for batches of images, 3-D tensors for individual images.

Other resizing Ops only support 4-D batches of images as input: [resize_area](#resize_area), [resize_bicubic](#resize_bicubic), [resize_bilinear](#resize_bilinear), [resize_nearest_neighbor](#resize_nearest_neighbor).

Example:

```python
# Decode a JPG image and resize it to 299 by 299.
# decode_jpeg returns a single 3-D image, so use resize_images,
# which accepts 3-D input.
image = tf.image.decode_jpeg(...)
resized_image = tf.image.resize_images(image, 299, 299)
```

- - -

### `tf.image.resize_images(images, new_height, new_width, method=0)`

Resize `images` to `new_height`, `new_width` using the specified `method`.

Resized images will be distorted if their original aspect ratio is not the same as that of `new_height`, `new_width`. To avoid distortions see [resize_image_with_crop_or_pad](#resize_image_with_crop_or_pad).

`method` can be one of:

* ResizeMethod.BILINEAR: [Bilinear interpolation.](https://en.wikipedia.org/wiki/Bilinear_interpolation)
* ResizeMethod.NEAREST_NEIGHBOR: [Nearest neighbor interpolation.](https://en.wikipedia.org/wiki/Nearest-neighbor_interpolation)
* ResizeMethod.BICUBIC: [Bicubic interpolation.](https://en.wikipedia.org/wiki/Bicubic_interpolation)
* ResizeMethod.AREA: Area interpolation.

##### Args:

* `images`: 4-D Tensor of shape `[batch, height, width, channels]` or 3-D Tensor of shape `[height, width, channels]`.
* `new_height`: integer.
* `new_width`: integer.
* `method`: ResizeMethod. Defaults to `ResizeMethod.BILINEAR`.

##### Raises:

* `ValueError`: if the shape of `images` is incompatible with the shape arguments to this function.
* `ValueError`: if an unsupported resize method is specified.

##### Returns:

If `images` was 4-D, a 4-D float Tensor of shape `[batch, new_height, new_width, channels]`. If `images` was 3-D, a 3-D float Tensor of shape `[new_height, new_width, channels]`.

- - -

### `tf.image.resize_area(images, size, name=None)`

Resize `images` to `size` using area interpolation.

Input images can be of different types but output images are always float.

##### Args:

* `images`: A `Tensor`. Must be one of the following types: `uint8`, `int8`, `int32`, `float32`, `float64`. 4-D with shape `[batch, height, width, channels]`.
* `size`: A 1-D int32 Tensor of 2 elements: `new_height, new_width`. The new size for the images. * `name`: A name for the operation (optional). ##### Returns: A `Tensor` of type `float32`. 4-D with shape `[batch, new_height, new_width, channels]`. - - - ### `tf.image.resize_bicubic(images, size, name=None)` Resize `images` to `size` using bicubic interpolation. Input images can be of different types but output images are always float. ##### Args: * `images`: A `Tensor`. Must be one of the following types: `uint8`, `int8`, `int32`, `float32`, `float64`. 4-D with shape `[batch, height, width, channels]`. * `size`: A 1-D int32 Tensor of 2 elements: `new_height, new_width`. The new size for the images. * `name`: A name for the operation (optional). ##### Returns: A `Tensor` of type `float32`. 4-D with shape `[batch, new_height, new_width, channels]`. - - - ### `tf.image.resize_bilinear(images, size, name=None)` Resize `images` to `size` using bilinear interpolation. Input images can be of different types but output images are always float. ##### Args: * `images`: A `Tensor`. Must be one of the following types: `uint8`, `int8`, `int32`, `float32`, `float64`. 4-D with shape `[batch, height, width, channels]`. * `size`: A 1-D int32 Tensor of 2 elements: `new_height, new_width`. The new size for the images. * `name`: A name for the operation (optional). ##### Returns: A `Tensor` of type `float32`. 4-D with shape `[batch, new_height, new_width, channels]`. - - - ### `tf.image.resize_nearest_neighbor(images, size, name=None)` Resize `images` to `size` using nearest neighbor interpolation. Input images can be of different types but output images are always float. ##### Args: * `images`: A `Tensor`. Must be one of the following types: `uint8`, `int8`, `int32`, `float32`, `float64`. 4-D with shape `[batch, height, width, channels]`. * `size`: A 1-D int32 Tensor of 2 elements: `new_height, new_width`. The new size for the images. * `name`: A name for the operation (optional). 
##### Returns: A `Tensor`. Has the same type as `images`. 4-D with shape `[batch, new_height, new_width, channels]`. ## Cropping - - - ### `tf.image.resize_image_with_crop_or_pad(image, target_height, target_width)` Crops and/or pads an image to a target width and height. Resizes an image to a target width and height by either centrally cropping the image or padding it evenly with zeros. If `width` or `height` is greater than the specified `target_width` or `target_height` respectively, this op centrally crops along that dimension. If `width` or `height` is smaller than the specified `target_width` or `target_height` respectively, this op centrally pads with 0 along that dimension. ##### Args: * `image`: 3-D tensor of shape [height, width, channels] * `target_height`: Target height. * `target_width`: Target width. ##### Raises: * `ValueError`: if `target_height` or `target_width` are zero or negative. ##### Returns: Cropped and/or padded image of shape `[target_height, target_width, channels]` - - - ### `tf.image.pad_to_bounding_box(image, offset_height, offset_width, target_height, target_width)` Pad `image` with zeros to the specified `height` and `width`. Adds `offset_height` rows of zeros on top, `offset_width` columns of zeros on the left, and then pads the image on the bottom and right with zeros until it has dimensions `target_height`, `target_width`. This op does nothing if `offset_*` is zero and the image already has size `target_height` by `target_width`. ##### Args: * `image`: 3-D tensor with shape `[height, width, channels]` * `offset_height`: Number of rows of zeros to add on top. * `offset_width`: Number of columns of zeros to add on the left. * `target_height`: Height of output image. * `target_width`: Width of output image. 
##### Returns:

3-D tensor of shape `[target_height, target_width, channels]`.

##### Raises:

* `ValueError`: If the shape of `image` is incompatible with the `offset_*` or `target_*` arguments.

- - -

### `tf.image.crop_to_bounding_box(image, offset_height, offset_width, target_height, target_width)`

Crops an image to a specified bounding box.

This op cuts a rectangular part out of `image`. The top-left corner of the returned image is at `offset_height, offset_width` in `image`, and its lower-right corner is at `offset_height + target_height, offset_width + target_width`.

##### Args:

* `image`: 3-D tensor with shape `[height, width, channels]`
* `offset_height`: Vertical coordinate of the top-left corner of the result in the input.
* `offset_width`: Horizontal coordinate of the top-left corner of the result in the input.
* `target_height`: Height of the result.
* `target_width`: Width of the result.

##### Returns:

3-D tensor of image with shape `[target_height, target_width, channels]`.

##### Raises:

* `ValueError`: If the shape of `image` is incompatible with the `offset_*` or `target_*` arguments.

- - -

### `tf.image.random_crop(image, size, seed=None, name=None)`

Randomly crops `image` to size `[target_height, target_width]`.

The offset of the output within `image` is uniformly random. `image` always fully contains the result.

##### Args:

* `image`: 3-D tensor of shape `[height, width, channels]`
* `size`: 1-D tensor with two elements, specifying target `[height, width]`
* `seed`: A Python integer. Used to create a random seed. See [`set_random_seed`](../../api_docs/python/constant_op.md#set_random_seed) for behavior.
* `name`: A name for this operation (optional).

##### Returns:

A cropped 3-D tensor of shape `[target_height, target_width, channels]`.

- - -

### `tf.image.extract_glimpse(input, size, offsets, centered=None, normalized=None, uniform_noise=None, name=None)`

Extracts a glimpse from the input tensor.
Returns a set of windows called glimpses extracted at location `offsets` from the input tensor. If the windows only partially overlap the inputs, the non-overlapping areas will be filled with random noise.

The result is a 4-D tensor of shape `[batch_size, glimpse_height, glimpse_width, channels]`. The channels and batch dimensions are the same as those of the input tensor. The height and width of the output windows are specified in the `size` parameter.

The arguments `normalized` and `centered` control how the windows are built:

* If the coordinates are normalized but not centered, 0.0 and 1.0 correspond to the minimum and maximum of each height and width dimension.
* If the coordinates are both normalized and centered, they range from -1.0 to 1.0. The coordinates (-1.0, -1.0) correspond to the upper left corner, the lower right corner is located at (1.0, 1.0) and the center is at (0, 0).
* If the coordinates are not normalized they are interpreted as numbers of pixels.

##### Args:

* `input`: A `Tensor` of type `float32`. A 4-D float tensor of shape `[batch_size, height, width, channels]`.
* `size`: A `Tensor` of type `int32`. A 1-D tensor of 2 elements containing the size of the glimpses to extract. The glimpse height must be specified first, followed by the glimpse width.
* `offsets`: A `Tensor` of type `float32`. A 2-D integer tensor of shape `[batch_size, 2]` containing the x, y locations of the center of each window.
* `centered`: An optional `bool`. Defaults to `True`. Indicates if the offset coordinates are centered relative to the image, in which case the (0, 0) offset is relative to the center of the input images. If false, the (0, 0) offset corresponds to the upper left corner of the input images.
* `normalized`: An optional `bool`. Defaults to `True`. Indicates if the offset coordinates are normalized.
* `uniform_noise`: An optional `bool`. Defaults to `True`.
indicates if the noise should be generated using a uniform distribution or a Gaussian distribution.
* `name`: A name for the operation (optional).

##### Returns:

A `Tensor` of type `float32`. A tensor representing the glimpses `[batch_size, glimpse_height, glimpse_width, channels]`.

## Flipping and Transposing

- - -

### `tf.image.flip_up_down(image)`

Flip an image vertically (upside down).

Outputs the contents of `image` flipped along the first dimension, which is `height`.

See also `reverse()`.

##### Args:

* `image`: A 3-D tensor of shape `[height, width, channels]`.

##### Returns:

A 3-D tensor of the same type and shape as `image`.

##### Raises:

* `ValueError`: if the shape of `image` is not supported.

- - -

### `tf.image.random_flip_up_down(image, seed=None)`

Randomly flips an image vertically (upside down).

With a 1 in 2 chance, outputs the contents of `image` flipped along the first dimension, which is `height`. Otherwise outputs the image as-is.

##### Args:

* `image`: A 3-D tensor of shape `[height, width, channels]`.
* `seed`: A Python integer. Used to create a random seed. See [`set_random_seed`](../../api_docs/python/constant_op.md#set_random_seed) for behavior.

##### Returns:

A 3-D tensor of the same type and shape as `image`.

##### Raises:

* `ValueError`: if the shape of `image` is not supported.

- - -

### `tf.image.flip_left_right(image)`

Flip an image horizontally (left to right).

Outputs the contents of `image` flipped along the second dimension, which is `width`.

See also `reverse()`.

##### Args:

* `image`: A 3-D tensor of shape `[height, width, channels]`.

##### Returns:

A 3-D tensor of the same type and shape as `image`.

##### Raises:

* `ValueError`: if the shape of `image` is not supported.

- - -

### `tf.image.random_flip_left_right(image, seed=None)`

Randomly flip an image horizontally (left to right).

With a 1 in 2 chance, outputs the contents of `image` flipped along the second dimension, which is `width`. Otherwise outputs the image as-is.
##### Args:

* `image`: A 3-D tensor of shape `[height, width, channels]`.
* `seed`: A Python integer. Used to create a random seed. See [`set_random_seed`](../../api_docs/python/constant_op.md#set_random_seed) for behavior.

##### Returns:

A 3-D tensor of the same type and shape as `image`.

##### Raises:

* `ValueError`: if the shape of `image` is not supported.

- - -

### `tf.image.transpose_image(image)`

Transpose an image by swapping the first and second dimension.

See also `transpose()`.

##### Args:

* `image`: 3-D tensor of shape `[height, width, channels]`

##### Returns:

A 3-D tensor of shape `[width, height, channels]`

##### Raises:

* `ValueError`: if the shape of `image` is not supported.

## Image Adjustments

TensorFlow provides functions to adjust images in various ways: brightness, contrast, hue, and saturation. Each adjustment can be done with predefined parameters or with random parameters picked from predefined intervals. Random adjustments are often useful to expand a training set and reduce overfitting.

- - -

### `tf.image.adjust_brightness(image, delta, min_value=None, max_value=None)`

Adjust the brightness of RGB or Grayscale images.

The value `delta` is added to all components of the tensor `image`. `image` and `delta` are cast to `float` before adding, and the resulting values are clamped to `[min_value, max_value]`. Finally, the result is cast back to `image.dtype`.

If `min_value` or `max_value` are not given, they are set to the minimum and maximum allowed values for `image.dtype` respectively.

##### Args:

* `image`: A tensor.
* `delta`: A scalar. Amount to add to the pixel values.
* `min_value`: Minimum value for output.
* `max_value`: Maximum value for output.

##### Returns:

A tensor of the same shape and type as `image`.

- - -

### `tf.image.random_brightness(image, max_delta, seed=None)`

Adjust the brightness of images by a random factor.

Equivalent to `adjust_brightness()` using a `delta` randomly picked in the interval `[-max_delta, max_delta)`.
Note that `delta` is picked as a float. Because the brightness-adjusted result for integer images is rounded before casting, integer images may be modified by amounts in the range `[-max_delta, max_delta]`.

##### Args:

* `image`: 3-D tensor of shape `[height, width, channels]`.
* `max_delta`: float, must be non-negative.
* `seed`: A Python integer. Used to create a random seed. See [`set_random_seed`](../../api_docs/python/constant_op.md#set_random_seed) for behavior.

##### Returns:

3-D tensor of images of shape `[height, width, channels]`

##### Raises:

* `ValueError`: if max_delta is negative.

- - -

### `tf.image.adjust_contrast(images, contrast_factor, min_value=None, max_value=None)`

Adjust contrast of RGB or grayscale images.

`images` is a tensor of at least 3 dimensions. The last 3 dimensions are interpreted as `[height, width, channels]`. The other dimensions only represent a collection of images, such as `[batch, height, width, channels]`.

Contrast is adjusted independently for each channel of each image.

For each channel, this Op first computes the mean of the image pixels in the channel and then adjusts each component `x` of each pixel to `(x - mean) * contrast_factor + mean`.

The adjusted values are then clipped to fit in the `[min_value, max_value]` interval. If `min_value` or `max_value` is not given, it is replaced with the minimum or maximum value for the data type of `images` respectively.

The contrast-adjusted image is always computed as `float`, and it is cast back to its original type after clipping.

##### Args:

* `images`: Images to adjust. At least 3-D.
* `contrast_factor`: A float multiplier for adjusting contrast.
* `min_value`: Minimum value for clipping the adjusted pixels.
* `max_value`: Maximum value for clipping the adjusted pixels.

##### Returns:

The contrast-adjusted image or images.

##### Raises:

* `ValueError`: if the arguments are invalid.
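The per-channel formula used by `adjust_contrast` can be worked through on a handful of values. The sketch below is a pure-Python illustration of the arithmetic only, not the TensorFlow Op: the helper `adjust_contrast_channel` is hypothetical, a channel is modeled as a flat list of floats, and the clipping bounds of 0 and 255 are assumed for the example.

```python
def adjust_contrast_channel(pixels, contrast_factor,
                            min_value=0.0, max_value=255.0):
    """Apply (x - mean) * contrast_factor + mean to one channel, then clip."""
    # The mean is taken over all pixels of this single channel.
    mean = sum(pixels) / len(pixels)
    adjusted = [(x - mean) * contrast_factor + mean for x in pixels]
    # Clip every adjusted value into [min_value, max_value].
    return [min(max(v, min_value), max_value) for v in adjusted]


channel = [50.0, 100.0, 150.0]
print(adjust_contrast_channel(channel, 2.0))  # [0.0, 100.0, 200.0]
print(adjust_contrast_channel(channel, 4.0))  # [0.0, 100.0, 255.0]
```

With a factor of 2.0 the values spread away from the mean of 100 without leaving the range. With a factor of 4.0 the unclipped results would be -100 and 300, so clipping pulls them back to the assumed bounds.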
- - -

### `tf.image.random_contrast(image, lower, upper, seed=None)`

Adjust the contrast of an image by a random factor.

Equivalent to `adjust_contrast()` but uses a `contrast_factor` randomly picked in the interval `[lower, upper]`.

##### Args:

* `image`: 3-D tensor of shape `[height, width, channels]`.
* `lower`: float. Lower bound for the random contrast factor.
* `upper`: float. Upper bound for the random contrast factor.
* `seed`: A Python integer. Used to create a random seed. See [`set_random_seed`](../../api_docs/python/constant_op.md#set_random_seed) for behavior.

##### Returns:

3-D tensor of shape `[height, width, channels]`.

##### Raises:

* `ValueError`: if `upper <= lower` or if `lower < 0`.

- - -

### `tf.image.per_image_whitening(image)`

Linearly scales `image` to have zero mean and unit norm.

This op computes `(x - mean) / adjusted_stddev`, where `mean` is the average of all values in image, and `adjusted_stddev = max(stddev, 1.0/sqrt(image.NumElements()))`.

`stddev` is the standard deviation of all values in `image`. It is capped away from zero to protect against division by 0 when handling uniform images.

Note that this implementation is limited:

* It only whitens based on the statistics of an individual image.
* It does not take into account the covariance structure.

##### Args:

* `image`: 3-D tensor of shape `[height, width, channels]`.

##### Returns:

The whitened image with same shape as `image`.

##### Raises:

* `ValueError`: if the shape of `image` is incompatible with this function.
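The whitening formula, and in particular the role of the cap on `stddev`, can be checked with a short pure-Python sketch. This is an illustration under stated assumptions rather than the Op itself: the helper `whiten` is hypothetical, the image is modeled as a flat list of values, and `stddev` is assumed to be the population standard deviation (dividing by the number of elements).

```python
import math


def whiten(values):
    """Compute (x - mean) / adjusted_stddev over all values of one image."""
    n = len(values)
    mean = sum(values) / n
    # Assumption: population standard deviation over every element.
    stddev = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    # Cap stddev away from zero so uniform images do not divide by 0.
    adjusted_stddev = max(stddev, 1.0 / math.sqrt(n))
    return [(v - mean) / adjusted_stddev for v in values]


print(whiten([1.0, 3.0]))            # [-1.0, 1.0]
print(whiten([5.0, 5.0, 5.0, 5.0]))  # [0.0, 0.0, 0.0, 0.0]
```

The second call is the uniform-image case: `stddev` is 0, so the cap `1.0/sqrt(4) = 0.5` is used as the divisor and the result is all zeros instead of a division error.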
================================================ FILE: SOURCE/api_docs/python/index.md ================================================ # TensorFlow Python reference documentation * **[Building Graphs](../../api_docs/python/framework.md)**: * [`add_to_collection`](../../api_docs/python/framework.md#add_to_collection) * [`as_dtype`](../../api_docs/python/framework.md#as_dtype) * [`control_dependencies`](../../api_docs/python/framework.md#control_dependencies) * [`convert_to_tensor`](../../api_docs/python/framework.md#convert_to_tensor) * [`device`](../../api_docs/python/framework.md#device) * [`Dimension`](../../api_docs/python/framework.md#Dimension) * [`DType`](../../api_docs/python/framework.md#DType) * [`get_collection`](../../api_docs/python/framework.md#get_collection) * [`get_default_graph`](../../api_docs/python/framework.md#get_default_graph) * [`get_seed`](../../api_docs/python/framework.md#get_seed) * [`Graph`](../../api_docs/python/framework.md#Graph) * [`GraphKeys`](../../api_docs/python/framework.md#GraphKeys) * [`import_graph_def`](../../api_docs/python/framework.md#import_graph_def) * [`name_scope`](../../api_docs/python/framework.md#name_scope) * [`NoGradient`](../../api_docs/python/framework.md#NoGradient) * [`op_scope`](../../api_docs/python/framework.md#op_scope) * [`Operation`](../../api_docs/python/framework.md#Operation) * [`RegisterGradient`](../../api_docs/python/framework.md#RegisterGradient) * [`RegisterShape`](../../api_docs/python/framework.md#RegisterShape) * [`Tensor`](../../api_docs/python/framework.md#Tensor) * [`TensorShape`](../../api_docs/python/framework.md#TensorShape) * **[Constants, Sequences, and Random Values](../../api_docs/python/constant_op.md)**: * [`constant`](../../api_docs/python/constant_op.md#constant) * [`fill`](../../api_docs/python/constant_op.md#fill) * [`linspace`](../../api_docs/python/constant_op.md#linspace) * [`ones`](../../api_docs/python/constant_op.md#ones) * 
[`ones_like`](../../api_docs/python/constant_op.md#ones_like) * [`random_normal`](../../api_docs/python/constant_op.md#random_normal) * [`random_shuffle`](../../api_docs/python/constant_op.md#random_shuffle) * [`random_uniform`](../../api_docs/python/constant_op.md#random_uniform) * [`range`](../../api_docs/python/constant_op.md#range) * [`set_random_seed`](../../api_docs/python/constant_op.md#set_random_seed) * [`truncated_normal`](../../api_docs/python/constant_op.md#truncated_normal) * [`zeros`](../../api_docs/python/constant_op.md#zeros) * [`zeros_like`](../../api_docs/python/constant_op.md#zeros_like) * **[Variables](../../api_docs/python/state_ops.md)**: * [`all_variables`](../../api_docs/python/state_ops.md#all_variables) * [`assert_variables_initialized`](../../api_docs/python/state_ops.md#assert_variables_initialized) * [`assign`](../../api_docs/python/state_ops.md#assign) * [`assign_add`](../../api_docs/python/state_ops.md#assign_add) * [`assign_sub`](../../api_docs/python/state_ops.md#assign_sub) * [`constant_initializer`](../../api_docs/python/state_ops.md#constant_initializer) * [`count_up_to`](../../api_docs/python/state_ops.md#count_up_to) * [`device`](../../api_docs/python/state_ops.md#device) * [`get_checkpoint_state`](../../api_docs/python/state_ops.md#get_checkpoint_state) * [`get_variable`](../../api_docs/python/state_ops.md#get_variable) * [`get_variable_scope`](../../api_docs/python/state_ops.md#get_variable_scope) * [`IndexedSlices`](../../api_docs/python/state_ops.md#IndexedSlices) * [`initialize_all_variables`](../../api_docs/python/state_ops.md#initialize_all_variables) * [`initialize_variables`](../../api_docs/python/state_ops.md#initialize_variables) * [`latest_checkpoint`](../../api_docs/python/state_ops.md#latest_checkpoint) * [`random_normal_initializer`](../../api_docs/python/state_ops.md#random_normal_initializer) * [`random_uniform_initializer`](../../api_docs/python/state_ops.md#random_uniform_initializer) * 
[`Saver`](../../api_docs/python/state_ops.md#Saver) * [`scatter_add`](../../api_docs/python/state_ops.md#scatter_add) * [`scatter_sub`](../../api_docs/python/state_ops.md#scatter_sub) * [`scatter_update`](../../api_docs/python/state_ops.md#scatter_update) * [`sparse_mask`](../../api_docs/python/state_ops.md#sparse_mask) * [`trainable_variables`](../../api_docs/python/state_ops.md#trainable_variables) * [`truncated_normal_initializer`](../../api_docs/python/state_ops.md#truncated_normal_initializer) * [`uniform_unit_scaling_initializer`](../../api_docs/python/state_ops.md#uniform_unit_scaling_initializer) * [`update_checkpoint_state`](../../api_docs/python/state_ops.md#update_checkpoint_state) * [`Variable`](../../api_docs/python/state_ops.md#Variable) * [`variable_scope`](../../api_docs/python/state_ops.md#variable_scope) * [`zeros_initializer`](../../api_docs/python/state_ops.md#zeros_initializer) * **[Tensor Transformations](../../api_docs/python/array_ops.md)**: * [`cast`](../../api_docs/python/array_ops.md#cast) * [`concat`](../../api_docs/python/array_ops.md#concat) * [`dynamic_partition`](../../api_docs/python/array_ops.md#dynamic_partition) * [`dynamic_stitch`](../../api_docs/python/array_ops.md#dynamic_stitch) * [`expand_dims`](../../api_docs/python/array_ops.md#expand_dims) * [`gather`](../../api_docs/python/array_ops.md#gather) * [`pack`](../../api_docs/python/array_ops.md#pack) * [`pad`](../../api_docs/python/array_ops.md#pad) * [`rank`](../../api_docs/python/array_ops.md#rank) * [`reshape`](../../api_docs/python/array_ops.md#reshape) * [`reverse`](../../api_docs/python/array_ops.md#reverse) * [`reverse_sequence`](../../api_docs/python/array_ops.md#reverse_sequence) * [`shape`](../../api_docs/python/array_ops.md#shape) * [`size`](../../api_docs/python/array_ops.md#size) * [`slice`](../../api_docs/python/array_ops.md#slice) * [`split`](../../api_docs/python/array_ops.md#split) * [`squeeze`](../../api_docs/python/array_ops.md#squeeze) * 
[`string_to_number`](../../api_docs/python/array_ops.md#string_to_number) * [`tile`](../../api_docs/python/array_ops.md#tile) * [`to_bfloat16`](../../api_docs/python/array_ops.md#to_bfloat16) * [`to_double`](../../api_docs/python/array_ops.md#to_double) * [`to_float`](../../api_docs/python/array_ops.md#to_float) * [`to_int32`](../../api_docs/python/array_ops.md#to_int32) * [`to_int64`](../../api_docs/python/array_ops.md#to_int64) * [`transpose`](../../api_docs/python/array_ops.md#transpose) * [`unpack`](../../api_docs/python/array_ops.md#unpack) * **[Math](../../api_docs/python/math_ops.md)**: * [`abs`](../../api_docs/python/math_ops.md#abs) * [`accumulate_n`](../../api_docs/python/math_ops.md#accumulate_n) * [`add`](../../api_docs/python/math_ops.md#add) * [`add_n`](../../api_docs/python/math_ops.md#add_n) * [`argmax`](../../api_docs/python/math_ops.md#argmax) * [`argmin`](../../api_docs/python/math_ops.md#argmin) * [`batch_cholesky`](../../api_docs/python/math_ops.md#batch_cholesky) * [`batch_matmul`](../../api_docs/python/math_ops.md#batch_matmul) * [`batch_matrix_determinant`](../../api_docs/python/math_ops.md#batch_matrix_determinant) * [`batch_matrix_inverse`](../../api_docs/python/math_ops.md#batch_matrix_inverse) * [`ceil`](../../api_docs/python/math_ops.md#ceil) * [`cholesky`](../../api_docs/python/math_ops.md#cholesky) * [`complex`](../../api_docs/python/math_ops.md#complex) * [`complex_abs`](../../api_docs/python/math_ops.md#complex_abs) * [`conj`](../../api_docs/python/math_ops.md#conj) * [`cos`](../../api_docs/python/math_ops.md#cos) * [`diag`](../../api_docs/python/math_ops.md#diag) * [`div`](../../api_docs/python/math_ops.md#div) * [`edit_distance`](../../api_docs/python/math_ops.md#edit_distance) * [`exp`](../../api_docs/python/math_ops.md#exp) * [`floor`](../../api_docs/python/math_ops.md#floor) * [`imag`](../../api_docs/python/math_ops.md#imag) * [`inv`](../../api_docs/python/math_ops.md#inv) * 
[`invert_permutation`](../../api_docs/python/math_ops.md#invert_permutation) * [`listdiff`](../../api_docs/python/math_ops.md#listdiff) * [`log`](../../api_docs/python/math_ops.md#log) * [`matmul`](../../api_docs/python/math_ops.md#matmul) * [`matrix_determinant`](../../api_docs/python/math_ops.md#matrix_determinant) * [`matrix_inverse`](../../api_docs/python/math_ops.md#matrix_inverse) * [`maximum`](../../api_docs/python/math_ops.md#maximum) * [`minimum`](../../api_docs/python/math_ops.md#minimum) * [`mod`](../../api_docs/python/math_ops.md#mod) * [`mul`](../../api_docs/python/math_ops.md#mul) * [`neg`](../../api_docs/python/math_ops.md#neg) * [`pow`](../../api_docs/python/math_ops.md#pow) * [`real`](../../api_docs/python/math_ops.md#real) * [`reduce_all`](../../api_docs/python/math_ops.md#reduce_all) * [`reduce_any`](../../api_docs/python/math_ops.md#reduce_any) * [`reduce_max`](../../api_docs/python/math_ops.md#reduce_max) * [`reduce_mean`](../../api_docs/python/math_ops.md#reduce_mean) * [`reduce_min`](../../api_docs/python/math_ops.md#reduce_min) * [`reduce_prod`](../../api_docs/python/math_ops.md#reduce_prod) * [`reduce_sum`](../../api_docs/python/math_ops.md#reduce_sum) * [`round`](../../api_docs/python/math_ops.md#round) * [`rsqrt`](../../api_docs/python/math_ops.md#rsqrt) * [`segment_max`](../../api_docs/python/math_ops.md#segment_max) * [`segment_mean`](../../api_docs/python/math_ops.md#segment_mean) * [`segment_min`](../../api_docs/python/math_ops.md#segment_min) * [`segment_prod`](../../api_docs/python/math_ops.md#segment_prod) * [`segment_sum`](../../api_docs/python/math_ops.md#segment_sum) * [`sign`](../../api_docs/python/math_ops.md#sign) * [`sin`](../../api_docs/python/math_ops.md#sin) * [`sparse_segment_mean`](../../api_docs/python/math_ops.md#sparse_segment_mean) * [`sparse_segment_sum`](../../api_docs/python/math_ops.md#sparse_segment_sum) * [`sqrt`](../../api_docs/python/math_ops.md#sqrt) * [`square`](../../api_docs/python/math_ops.md#square) * 
[`sub`](../../api_docs/python/math_ops.md#sub) * [`transpose`](../../api_docs/python/math_ops.md#transpose) * [`unique`](../../api_docs/python/math_ops.md#unique) * [`unsorted_segment_sum`](../../api_docs/python/math_ops.md#unsorted_segment_sum) * [`where`](../../api_docs/python/math_ops.md#where) * **[Control Flow](../../api_docs/python/control_flow_ops.md)**: * [`add_check_numerics_ops`](../../api_docs/python/control_flow_ops.md#add_check_numerics_ops) * [`Assert`](../../api_docs/python/control_flow_ops.md#Assert) * [`check_numerics`](../../api_docs/python/control_flow_ops.md#check_numerics) * [`count_up_to`](../../api_docs/python/control_flow_ops.md#count_up_to) * [`equal`](../../api_docs/python/control_flow_ops.md#equal) * [`greater`](../../api_docs/python/control_flow_ops.md#greater) * [`greater_equal`](../../api_docs/python/control_flow_ops.md#greater_equal) * [`group`](../../api_docs/python/control_flow_ops.md#group) * [`identity`](../../api_docs/python/control_flow_ops.md#identity) * [`is_finite`](../../api_docs/python/control_flow_ops.md#is_finite) * [`is_inf`](../../api_docs/python/control_flow_ops.md#is_inf) * [`is_nan`](../../api_docs/python/control_flow_ops.md#is_nan) * [`less`](../../api_docs/python/control_flow_ops.md#less) * [`less_equal`](../../api_docs/python/control_flow_ops.md#less_equal) * [`logical_and`](../../api_docs/python/control_flow_ops.md#logical_and) * [`logical_not`](../../api_docs/python/control_flow_ops.md#logical_not) * [`logical_or`](../../api_docs/python/control_flow_ops.md#logical_or) * [`logical_xor`](../../api_docs/python/control_flow_ops.md#logical_xor) * [`no_op`](../../api_docs/python/control_flow_ops.md#no_op) * [`not_equal`](../../api_docs/python/control_flow_ops.md#not_equal) * [`Print`](../../api_docs/python/control_flow_ops.md#Print) * [`select`](../../api_docs/python/control_flow_ops.md#select) * [`tuple`](../../api_docs/python/control_flow_ops.md#tuple) * 
[`verify_tensor_all_finite`](../../api_docs/python/control_flow_ops.md#verify_tensor_all_finite) * [`where`](../../api_docs/python/control_flow_ops.md#where) * **[Images](../../api_docs/python/image.md)**: * [`adjust_brightness`](../../api_docs/python/image.md#adjust_brightness) * [`adjust_contrast`](../../api_docs/python/image.md#adjust_contrast) * [`crop_to_bounding_box`](../../api_docs/python/image.md#crop_to_bounding_box) * [`decode_jpeg`](../../api_docs/python/image.md#decode_jpeg) * [`decode_png`](../../api_docs/python/image.md#decode_png) * [`encode_jpeg`](../../api_docs/python/image.md#encode_jpeg) * [`encode_png`](../../api_docs/python/image.md#encode_png) * [`extract_glimpse`](../../api_docs/python/image.md#extract_glimpse) * [`flip_left_right`](../../api_docs/python/image.md#flip_left_right) * [`flip_up_down`](../../api_docs/python/image.md#flip_up_down) * [`pad_to_bounding_box`](../../api_docs/python/image.md#pad_to_bounding_box) * [`per_image_whitening`](../../api_docs/python/image.md#per_image_whitening) * [`random_brightness`](../../api_docs/python/image.md#random_brightness) * [`random_contrast`](../../api_docs/python/image.md#random_contrast) * [`random_crop`](../../api_docs/python/image.md#random_crop) * [`random_flip_left_right`](../../api_docs/python/image.md#random_flip_left_right) * [`random_flip_up_down`](../../api_docs/python/image.md#random_flip_up_down) * [`resize_area`](../../api_docs/python/image.md#resize_area) * [`resize_bicubic`](../../api_docs/python/image.md#resize_bicubic) * [`resize_bilinear`](../../api_docs/python/image.md#resize_bilinear) * [`resize_image_with_crop_or_pad`](../../api_docs/python/image.md#resize_image_with_crop_or_pad) * [`resize_images`](../../api_docs/python/image.md#resize_images) * [`resize_nearest_neighbor`](../../api_docs/python/image.md#resize_nearest_neighbor) * [`transpose_image`](../../api_docs/python/image.md#transpose_image) * **[Sparse Tensors](../../api_docs/python/sparse_ops.md)**: * 
[`shape`](../../api_docs/python/sparse_ops.md#shape) * [`sparse_concat`](../../api_docs/python/sparse_ops.md#sparse_concat) * [`sparse_fill_empty_rows`](../../api_docs/python/sparse_ops.md#sparse_fill_empty_rows) * [`sparse_reorder`](../../api_docs/python/sparse_ops.md#sparse_reorder) * [`sparse_retain`](../../api_docs/python/sparse_ops.md#sparse_retain) * [`sparse_tensor_to_dense`](../../api_docs/python/sparse_ops.md#sparse_tensor_to_dense) * [`sparse_to_dense`](../../api_docs/python/sparse_ops.md#sparse_to_dense) * [`sparse_to_indicator`](../../api_docs/python/sparse_ops.md#sparse_to_indicator) * [`SparseTensor`](../../api_docs/python/sparse_ops.md#SparseTensor) * [`SparseTensorValue`](../../api_docs/python/sparse_ops.md#SparseTensorValue) * **[Inputs and Readers](../../api_docs/python/io_ops.md)**: * [`batch`](../../api_docs/python/io_ops.md#batch) * [`batch_join`](../../api_docs/python/io_ops.md#batch_join) * [`decode_csv`](../../api_docs/python/io_ops.md#decode_csv) * [`decode_raw`](../../api_docs/python/io_ops.md#decode_raw) * [`FIFOQueue`](../../api_docs/python/io_ops.md#FIFOQueue) * [`FixedLengthRecordReader`](../../api_docs/python/io_ops.md#FixedLengthRecordReader) * [`IdentityReader`](../../api_docs/python/io_ops.md#IdentityReader) * [`limit_epochs`](../../api_docs/python/io_ops.md#limit_epochs) * [`match_filenames_once`](../../api_docs/python/io_ops.md#match_filenames_once) * [`matching_files`](../../api_docs/python/io_ops.md#matching_files) * [`parse_example`](../../api_docs/python/io_ops.md#parse_example) * [`parse_single_example`](../../api_docs/python/io_ops.md#parse_single_example) * [`placeholder`](../../api_docs/python/io_ops.md#placeholder) * [`QueueBase`](../../api_docs/python/io_ops.md#QueueBase) * [`RandomShuffleQueue`](../../api_docs/python/io_ops.md#RandomShuffleQueue) * [`range_input_producer`](../../api_docs/python/io_ops.md#range_input_producer) * [`read_file`](../../api_docs/python/io_ops.md#read_file) * 
[`ReaderBase`](../../api_docs/python/io_ops.md#ReaderBase) * [`shuffle_batch`](../../api_docs/python/io_ops.md#shuffle_batch) * [`shuffle_batch_join`](../../api_docs/python/io_ops.md#shuffle_batch_join) * [`size`](../../api_docs/python/io_ops.md#size) * [`slice_input_producer`](../../api_docs/python/io_ops.md#slice_input_producer) * [`string_input_producer`](../../api_docs/python/io_ops.md#string_input_producer) * [`TextLineReader`](../../api_docs/python/io_ops.md#TextLineReader) * [`TFRecordReader`](../../api_docs/python/io_ops.md#TFRecordReader) * [`WholeFileReader`](../../api_docs/python/io_ops.md#WholeFileReader) * **[Data IO (Python functions)](../../api_docs/python/python_io.md)**: * [`tf_record_iterator`](../../api_docs/python/python_io.md#tf_record_iterator) * [`TFRecordWriter`](../../api_docs/python/python_io.md#TFRecordWriter) * **[Neural Network](../../api_docs/python/nn.md)**: * [`avg_pool`](../../api_docs/python/nn.md#avg_pool) * [`bias_add`](../../api_docs/python/nn.md#bias_add) * [`compute_accidental_hits`](../../api_docs/python/nn.md#compute_accidental_hits) * [`conv2d`](../../api_docs/python/nn.md#conv2d) * [`depthwise_conv2d`](../../api_docs/python/nn.md#depthwise_conv2d) * [`dropout`](../../api_docs/python/nn.md#dropout) * [`embedding_lookup`](../../api_docs/python/nn.md#embedding_lookup) * [`fixed_unigram_candidate_sampler`](../../api_docs/python/nn.md#fixed_unigram_candidate_sampler) * [`in_top_k`](../../api_docs/python/nn.md#in_top_k) * [`l2_loss`](../../api_docs/python/nn.md#l2_loss) * [`l2_normalize`](../../api_docs/python/nn.md#l2_normalize) * [`learned_unigram_candidate_sampler`](../../api_docs/python/nn.md#learned_unigram_candidate_sampler) * [`local_response_normalization`](../../api_docs/python/nn.md#local_response_normalization) * [`log_uniform_candidate_sampler`](../../api_docs/python/nn.md#log_uniform_candidate_sampler) * [`max_pool`](../../api_docs/python/nn.md#max_pool) * 
[`max_pool_with_argmax`](../../api_docs/python/nn.md#max_pool_with_argmax) * [`moments`](../../api_docs/python/nn.md#moments) * [`nce_loss`](../../api_docs/python/nn.md#nce_loss) * [`relu`](../../api_docs/python/nn.md#relu) * [`relu6`](../../api_docs/python/nn.md#relu6) * [`sampled_softmax_loss`](../../api_docs/python/nn.md#sampled_softmax_loss) * [`separable_conv2d`](../../api_docs/python/nn.md#separable_conv2d) * [`sigmoid`](../../api_docs/python/nn.md#sigmoid) * [`sigmoid_cross_entropy_with_logits`](../../api_docs/python/nn.md#sigmoid_cross_entropy_with_logits) * [`softmax`](../../api_docs/python/nn.md#softmax) * [`softmax_cross_entropy_with_logits`](../../api_docs/python/nn.md#softmax_cross_entropy_with_logits) * [`softplus`](../../api_docs/python/nn.md#softplus) * [`tanh`](../../api_docs/python/nn.md#tanh) * [`top_k`](../../api_docs/python/nn.md#top_k) * [`uniform_candidate_sampler`](../../api_docs/python/nn.md#uniform_candidate_sampler) * **[Running Graphs](../../api_docs/python/client.md)**: * [`AbortedError`](../../api_docs/python/client.md#AbortedError) * [`AlreadyExistsError`](../../api_docs/python/client.md#AlreadyExistsError) * [`CancelledError`](../../api_docs/python/client.md#CancelledError) * [`DataLossError`](../../api_docs/python/client.md#DataLossError) * [`DeadlineExceededError`](../../api_docs/python/client.md#DeadlineExceededError) * [`FailedPreconditionError`](../../api_docs/python/client.md#FailedPreconditionError) * [`get_default_session`](../../api_docs/python/client.md#get_default_session) * [`InteractiveSession`](../../api_docs/python/client.md#InteractiveSession) * [`InternalError`](../../api_docs/python/client.md#InternalError) * [`InvalidArgumentError`](../../api_docs/python/client.md#InvalidArgumentError) * [`NotFoundError`](../../api_docs/python/client.md#NotFoundError) * [`OpError`](../../api_docs/python/client.md#OpError) * [`OutOfRangeError`](../../api_docs/python/client.md#OutOfRangeError) * 
[`PermissionDeniedError`](../../api_docs/python/client.md#PermissionDeniedError) * [`ResourceExhaustedError`](../../api_docs/python/client.md#ResourceExhaustedError) * [`Session`](../../api_docs/python/client.md#Session) * [`UnauthenticatedError`](../../api_docs/python/client.md#UnauthenticatedError) * [`UnavailableError`](../../api_docs/python/client.md#UnavailableError) * [`UnimplementedError`](../../api_docs/python/client.md#UnimplementedError) * [`UnknownError`](../../api_docs/python/client.md#UnknownError) * **[Training](../../api_docs/python/train.md)**: * [`AdagradOptimizer`](../../api_docs/python/train.md#AdagradOptimizer) * [`AdamOptimizer`](../../api_docs/python/train.md#AdamOptimizer) * [`add_queue_runner`](../../api_docs/python/train.md#add_queue_runner) * [`AggregationMethod`](../../api_docs/python/train.md#AggregationMethod) * [`clip_by_average_norm`](../../api_docs/python/train.md#clip_by_average_norm) * [`clip_by_global_norm`](../../api_docs/python/train.md#clip_by_global_norm) * [`clip_by_norm`](../../api_docs/python/train.md#clip_by_norm) * [`clip_by_value`](../../api_docs/python/train.md#clip_by_value) * [`Coordinator`](../../api_docs/python/train.md#Coordinator) * [`exponential_decay`](../../api_docs/python/train.md#exponential_decay) * [`ExponentialMovingAverage`](../../api_docs/python/train.md#ExponentialMovingAverage) * [`FtrlOptimizer`](../../api_docs/python/train.md#FtrlOptimizer) * [`global_norm`](../../api_docs/python/train.md#global_norm) * [`global_step`](../../api_docs/python/train.md#global_step) * [`GradientDescentOptimizer`](../../api_docs/python/train.md#GradientDescentOptimizer) * [`gradients`](../../api_docs/python/train.md#gradients) * [`histogram_summary`](../../api_docs/python/train.md#histogram_summary) * [`image_summary`](../../api_docs/python/train.md#image_summary) * [`merge_all_summaries`](../../api_docs/python/train.md#merge_all_summaries) * [`merge_summary`](../../api_docs/python/train.md#merge_summary) * 
[`MomentumOptimizer`](../../api_docs/python/train.md#MomentumOptimizer) * [`Optimizer`](../../api_docs/python/train.md#Optimizer) * [`QueueRunner`](../../api_docs/python/train.md#QueueRunner) * [`RMSPropOptimizer`](../../api_docs/python/train.md#RMSPropOptimizer) * [`scalar_summary`](../../api_docs/python/train.md#scalar_summary) * [`start_queue_runners`](../../api_docs/python/train.md#start_queue_runners) * [`stop_gradient`](../../api_docs/python/train.md#stop_gradient) * [`summary_iterator`](../../api_docs/python/train.md#summary_iterator) * [`SummaryWriter`](../../api_docs/python/train.md#SummaryWriter) * [`write_graph`](../../api_docs/python/train.md#write_graph) * [`zero_fraction`](../../api_docs/python/train.md#zero_fraction) ================================================ FILE: SOURCE/api_docs/python/io_ops.md ================================================ # Inputs and Readers Note: Functions taking `Tensor` arguments can also take anything accepted by [`tf.convert_to_tensor`](../../api_docs/python/framework.md#convert_to_tensor). 
## Contents

### [Inputs and Readers](#AUTOGENERATED-inputs-and-readers)

* [Placeholders](#AUTOGENERATED-placeholders)
  * [`tf.placeholder(dtype, shape=None, name=None)`](#placeholder)
* [Readers](#AUTOGENERATED-readers)
  * [`class tf.ReaderBase`](#ReaderBase)
  * [`class tf.TextLineReader`](#TextLineReader)
  * [`class tf.WholeFileReader`](#WholeFileReader)
  * [`class tf.IdentityReader`](#IdentityReader)
  * [`class tf.TFRecordReader`](#TFRecordReader)
  * [`class tf.FixedLengthRecordReader`](#FixedLengthRecordReader)
* [Converting](#AUTOGENERATED-converting)
  * [`tf.decode_csv(records, record_defaults, field_delim=None, name=None)`](#decode_csv)
  * [`tf.decode_raw(bytes, out_type, little_endian=None, name=None)`](#decode_raw)
  * [Example protocol buffer](#AUTOGENERATED-example-protocol-buffer)
    * [`tf.parse_example(serialized, names=None, sparse_keys=None, sparse_types=None, dense_keys=None, dense_types=None, dense_defaults=None, dense_shapes=None, name='ParseExample')`](#parse_example)
    * [`tf.parse_single_example(serialized, names=None, sparse_keys=None, sparse_types=None, dense_keys=None, dense_types=None, dense_defaults=None, dense_shapes=None, name='ParseSingleExample')`](#parse_single_example)
* [Queues](#AUTOGENERATED-queues)
  * [`class tf.QueueBase`](#QueueBase)
  * [`class tf.FIFOQueue`](#FIFOQueue)
  * [`class tf.RandomShuffleQueue`](#RandomShuffleQueue)
* [Dealing with the filesystem](#AUTOGENERATED-dealing-with-the-filesystem)
  * [`tf.matching_files(pattern, name=None)`](#matching_files)
  * [`tf.read_file(filename, name=None)`](#read_file)
* [Input pipeline](#AUTOGENERATED-input-pipeline)
  * [Beginning of an input pipeline](#AUTOGENERATED-beginning-of-an-input-pipeline)
    * [`tf.train.match_filenames_once(pattern, name=None)`](#match_filenames_once)
    * [`tf.train.limit_epochs(tensor, num_epochs=None, name=None)`](#limit_epochs)
    * [`tf.train.range_input_producer(limit, num_epochs=None, shuffle=True, seed=None, capacity=32, name=None)`](#range_input_producer)
    * [`tf.train.slice_input_producer(tensor_list, num_epochs=None, shuffle=True, seed=None, capacity=32, name=None)`](#slice_input_producer)
    * [`tf.train.string_input_producer(string_tensor, num_epochs=None, shuffle=True, seed=None, capacity=32, name=None)`](#string_input_producer)
  * [Batching at the end of an input pipeline](#AUTOGENERATED-batching-at-the-end-of-an-input-pipeline)
    * [`tf.train.batch(tensor_list, batch_size, num_threads=1, capacity=32, enqueue_many=False, shapes=None, name=None)`](#batch)
    * [`tf.train.batch_join(tensor_list_list, batch_size, capacity=32, enqueue_many=False, shapes=None, name=None)`](#batch_join)
    * [`tf.train.shuffle_batch(tensor_list, batch_size, capacity, min_after_dequeue, num_threads=1, seed=None, enqueue_many=False, shapes=None, name=None)`](#shuffle_batch)
    * [`tf.train.shuffle_batch_join(tensor_list_list, batch_size, capacity, min_after_dequeue, seed=None, enqueue_many=False, shapes=None, name=None)`](#shuffle_batch_join)

## Placeholders

TensorFlow provides a placeholder operation that must be fed with data on execution. For more info, see the section on [Feeding data](../../how_tos/reading_data/index.md#feeding).

- - -

### `tf.placeholder(dtype, shape=None, name=None)`

Inserts a placeholder for a tensor that must always be fed.

**Important**: This tensor will produce an error if evaluated. Its value must be fed using the `feed_dict` optional argument to `Session.run()`, `Tensor.eval()`, or `Operation.run()`.

For example:

```python
import numpy as np
import tensorflow as tf

x = tf.placeholder(tf.float32, shape=(1024, 1024))
y = tf.matmul(x, x)

with tf.Session() as sess:
  print(sess.run(y))  # ERROR: will fail because x was not fed.

  rand_array = np.random.rand(1024, 1024)
  print(sess.run(y, feed_dict={x: rand_array}))  # Will succeed.
```

##### Args:

* `dtype`: The type of elements in the tensor to be fed.
* `shape`: The shape of the tensor to be fed (optional). If the shape is not specified, you can feed a tensor of any shape.
* `name`: A name for the operation (optional).
##### Returns:

A `Tensor` that may be used as a handle for feeding a value, but not evaluated directly.

## Readers

TensorFlow provides a set of Reader classes for reading data formats. For more information on inputs and readers, see [Reading data](../../how_tos/reading_data/index.md).

- - -

### `class tf.ReaderBase`

Base class for the different Reader types, each of which produces one record per step.

Conceptually, Readers convert string 'work units' into records (key, value pairs). Typically the 'work units' are filenames and the records are extracted from the contents of those files. We want a single record produced per step, but a work unit can correspond to many records.

Therefore we introduce some decoupling using a queue. The queue holds the work units, and the Reader dequeues the next one when it is asked to produce a record (via Read()) but has already finished its current work unit.

- - -

#### `tf.ReaderBase.__init__(reader_ref, supports_serialize=False)`

Creates a new ReaderBase.

##### Args:

* `reader_ref`: The operation that implements the reader.
* `supports_serialize`: True if the reader implementation can serialize its state.

- - -

#### `tf.ReaderBase.num_records_produced(name=None)`

Returns the number of records this reader has produced. This is the same as the number of Read executions that have succeeded.

##### Args:

* `name`: A name for the operation (optional).

##### Returns:

An int64 Tensor.

- - -

#### `tf.ReaderBase.num_work_units_completed(name=None)`

Returns the number of work units this reader has finished processing.

##### Args:

* `name`: A name for the operation (optional).

##### Returns:

An int64 Tensor.

- - -

#### `tf.ReaderBase.read(queue, name=None)`

Returns the next record (key, value pair) produced by a reader. Will dequeue a work unit from queue if necessary (e.g. when the Reader needs to start reading from a new file since it has finished with the previous file).
##### Args:

* `queue`: A Queue or a mutable string Tensor representing a handle to a Queue, with string work items.
* `name`: A name for the operation (optional).

##### Returns:

A tuple of Tensors (key, value).

* `key`: A string scalar Tensor.
* `value`: A string scalar Tensor.

- - -

#### `tf.ReaderBase.reader_ref`

Op that implements the reader.

- - -

#### `tf.ReaderBase.reset(name=None)`

Restore a reader to its initial clean state.

##### Args:

* `name`: A name for the operation (optional).

##### Returns:

The created Operation.

- - -

#### `tf.ReaderBase.restore_state(state, name=None)`

Restore a reader to a previously saved state. Not all Readers support being restored, so this can produce an Unimplemented error.

##### Args:

* `state`: A string Tensor. Result of a SerializeState of a Reader with matching type.
* `name`: A name for the operation (optional).

##### Returns:

The created Operation.

- - -

#### `tf.ReaderBase.serialize_state(name=None)`

Produce a string tensor that encodes the state of a reader. Not all Readers support being serialized, so this can produce an Unimplemented error.

##### Args:

* `name`: A name for the operation (optional).

##### Returns:

A string Tensor.

- - -

#### `tf.ReaderBase.supports_serialize`

Whether the Reader implementation can serialize its state.

- - -

### `class tf.TextLineReader`

A Reader that outputs the lines of a file delimited by newlines. Newlines are stripped from the output. See ReaderBase for supported methods.

- - -

#### `tf.TextLineReader.__init__(skip_header_lines=None, name=None)`

Create a TextLineReader.

##### Args:

* `skip_header_lines`: An optional int. Defaults to 0. Number of lines to skip from the beginning of every file.
* `name`: A name for the operation (optional).

- - -

#### `tf.TextLineReader.num_records_produced(name=None)`

Returns the number of records this reader has produced. This is the same as the number of Read executions that have succeeded.
##### Args:

* `name`: A name for the operation (optional).

##### Returns:

An int64 Tensor.

- - -

#### `tf.TextLineReader.num_work_units_completed(name=None)`

Returns the number of work units this reader has finished processing.

##### Args:

* `name`: A name for the operation (optional).

##### Returns:

An int64 Tensor.

- - -

#### `tf.TextLineReader.read(queue, name=None)`

Returns the next record (key, value pair) produced by a reader. Will dequeue a work unit from queue if necessary (e.g. when the Reader needs to start reading from a new file since it has finished with the previous file).

##### Args:

* `queue`: A Queue or a mutable string Tensor representing a handle to a Queue, with string work items.
* `name`: A name for the operation (optional).

##### Returns:

A tuple of Tensors (key, value).

* `key`: A string scalar Tensor.
* `value`: A string scalar Tensor.

- - -

#### `tf.TextLineReader.reader_ref`

Op that implements the reader.

- - -

#### `tf.TextLineReader.reset(name=None)`

Restore a reader to its initial clean state.

##### Args:

* `name`: A name for the operation (optional).

##### Returns:

The created Operation.

- - -

#### `tf.TextLineReader.restore_state(state, name=None)`

Restore a reader to a previously saved state. Not all Readers support being restored, so this can produce an Unimplemented error.

##### Args:

* `state`: A string Tensor. Result of a SerializeState of a Reader with matching type.
* `name`: A name for the operation (optional).

##### Returns:

The created Operation.

- - -

#### `tf.TextLineReader.serialize_state(name=None)`

Produce a string tensor that encodes the state of a reader. Not all Readers support being serialized, so this can produce an Unimplemented error.

##### Args:

* `name`: A name for the operation (optional).

##### Returns:

A string Tensor.

- - -

#### `tf.TextLineReader.supports_serialize`

Whether the Reader implementation can serialize its state.
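
The interplay between a work-unit queue and a line-oriented reader can be sketched in plain Python. This is only a conceptual model, not TensorFlow code: `LineReaderModel`, the in-memory `files` dictionary that stands in for files on disk, and the `name:line_number` key format are all assumptions made for illustration.

```python
from collections import deque


class LineReaderModel(object):
    """Toy model of a line reader fed by a queue; not part of TensorFlow."""

    def __init__(self, files, skip_header_lines=0):
        # `files` maps a work-unit name to its text (hypothetical stand-in
        # for reading real files).
        self._files = files
        self._skip = skip_header_lines
        self._pending = deque()  # records left in the current work unit
        self.num_records_produced = 0

    def read(self, queue):
        """Return one (key, value) record, dequeuing work units as needed."""
        while not self._pending:
            # The current work unit is exhausted, so take the next one.
            name = queue.popleft()  # raises IndexError if the queue is empty
            lines = self._files[name].splitlines()  # strips the newlines
            for number, line in enumerate(lines[self._skip:], self._skip + 1):
                # Assumed key format: "<work unit>:<1-based line number>".
                self._pending.append(("%s:%d" % (name, number), line))
        self.num_records_produced += 1
        return self._pending.popleft()


files = {"a.csv": "id,name\n1,x\n2,y\n", "b.csv": "id,name\n3,z\n"}
queue = deque(["a.csv", "b.csv"])
reader = LineReaderModel(files, skip_header_lines=1)
records = [reader.read(queue) for _ in range(3)]
```

Each call to `read` yields exactly one record, while a single work unit (`a.csv`) supplies two of them. The second work unit is only dequeued once the first has been used up, which is the decoupling described for `ReaderBase`.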
- - -

### `class tf.WholeFileReader`

A Reader that outputs the entire contents of a file as a value. To use, enqueue filenames in a Queue. The output of Read will be a filename (key) and the contents of that file (value). See ReaderBase for supported methods.

- - -

#### `tf.WholeFileReader.__init__(name=None)`

Create a WholeFileReader.

##### Args:

* `name`: A name for the operation (optional).

- - -

#### `tf.WholeFileReader.num_records_produced(name=None)`

Returns the number of records this reader has produced. This is the same as the number of Read executions that have succeeded.

##### Args:

* `name`: A name for the operation (optional).

##### Returns:

An int64 Tensor.

- - -

#### `tf.WholeFileReader.num_work_units_completed(name=None)`

Returns the number of work units this reader has finished processing.

##### Args:

* `name`: A name for the operation (optional).

##### Returns:

An int64 Tensor.

- - -

#### `tf.WholeFileReader.read(queue, name=None)`

Returns the next record (key, value pair) produced by a reader. Will dequeue a work unit from queue if necessary (e.g. when the Reader needs to start reading from a new file since it has finished with the previous file).

##### Args:

* `queue`: A Queue or a mutable string Tensor representing a handle to a Queue, with string work items.
* `name`: A name for the operation (optional).

##### Returns:

A tuple of Tensors (key, value).

* `key`: A string scalar Tensor.
* `value`: A string scalar Tensor.

- - -

#### `tf.WholeFileReader.reader_ref`

Op that implements the reader.

- - -

#### `tf.WholeFileReader.reset(name=None)`

Restore a reader to its initial clean state.

##### Args:

* `name`: A name for the operation (optional).

##### Returns:

The created Operation.

- - -

#### `tf.WholeFileReader.restore_state(state, name=None)`

Restore a reader to a previously saved state. Not all Readers support being restored, so this can produce an Unimplemented error.

##### Args:

* `state`: A string Tensor.
Result of a SerializeState of a Reader with matching type.
* `name`: A name for the operation (optional).

##### Returns:

The created Operation.

- - -

#### `tf.WholeFileReader.serialize_state(name=None)`

Produce a string tensor that encodes the state of a reader. Not all Readers support being serialized, so this can produce an Unimplemented error.

##### Args:

* `name`: A name for the operation (optional).

##### Returns:

A string Tensor.

- - -

#### `tf.WholeFileReader.supports_serialize`

Whether the Reader implementation can serialize its state.

- - -

### `class tf.IdentityReader`

A Reader that outputs the queued work as both the key and value. To use, enqueue strings in a Queue. Read will take the front work string and output (work, work). See ReaderBase for supported methods.

- - -

#### `tf.IdentityReader.__init__(name=None)`

Create an IdentityReader.

##### Args:

* `name`: A name for the operation (optional).

- - -

#### `tf.IdentityReader.num_records_produced(name=None)`

Returns the number of records this reader has produced. This is the same as the number of Read executions that have succeeded.

##### Args:

* `name`: A name for the operation (optional).

##### Returns:

An int64 Tensor.

- - -

#### `tf.IdentityReader.num_work_units_completed(name=None)`

Returns the number of work units this reader has finished processing.

##### Args:

* `name`: A name for the operation (optional).

##### Returns:

An int64 Tensor.

- - -

#### `tf.IdentityReader.read(queue, name=None)`

Returns the next record (key, value pair) produced by a reader. Will dequeue a work unit from queue if necessary (e.g. when the Reader needs to start reading from a new file since it has finished with the previous file).

##### Args:

* `queue`: A Queue or a mutable string Tensor representing a handle to a Queue, with string work items.
* `name`: A name for the operation (optional).

##### Returns:

A tuple of Tensors (key, value).

* `key`: A string scalar Tensor.
* `value`: A string scalar Tensor.
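
As a rough illustration of the "(work, work)" behaviour of `IdentityReader`, the following pure-Python sketch models a single read. It does not use TensorFlow. `identity_read` and `work_queue` are hypothetical names, and a `collections.deque` stands in for the TensorFlow queue.

```python
from collections import deque


def identity_read(queue):
    """Toy model of an identity read: one record per work unit."""
    work = queue.popleft()  # take the front work string
    return work, work       # the work string is both the key and the value


work_queue = deque(["first", "second"])
first_record = identity_read(work_queue)
second_record = identity_read(work_queue)
```

In this model every work unit yields exactly one record, so each read consumes one queue entry.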
- - - #### `tf.IdentityReader.reader_ref` Op that implements the reader. - - - #### `tf.IdentityReader.reset(name=None)` Restore a reader to its initial clean state. ##### Args: * `name`: A name for the operation (optional). ##### Returns: The created Operation. - - - #### `tf.IdentityReader.restore_state(state, name=None)` Restore a reader to a previously saved state. Not all Readers support being restored, so this can produce an Unimplemented error. ##### Args: * `state`: A string Tensor. Result of a SerializeState of a Reader with matching type. * `name`: A name for the operation (optional). ##### Returns: The created Operation. - - - #### `tf.IdentityReader.serialize_state(name=None)` Produce a string tensor that encodes the state of a reader. Not all Readers support being serialized, so this can produce an Unimplemented error. ##### Args: * `name`: A name for the operation (optional). ##### Returns: A string Tensor. - - - #### `tf.IdentityReader.supports_serialize` Whether the Reader implementation can serialize its state. - - - ### `class tf.TFRecordReader` A Reader that outputs the records from a TFRecords file. See ReaderBase for supported methods. - - - #### `tf.TFRecordReader.__init__(name=None)` Create a TFRecordReader. ##### Args: * `name`: A name for the operation (optional). - - - #### `tf.TFRecordReader.num_records_produced(name=None)` Returns the number of records this reader has produced. This is the same as the number of Read executions that have succeeded. ##### Args: * `name`: A name for the operation (optional). ##### Returns: An int64 Tensor. - - - #### `tf.TFRecordReader.num_work_units_completed(name=None)` Returns the number of work units this reader has finished processing. ##### Args: * `name`: A name for the operation (optional). ##### Returns: An int64 Tensor. - - - #### `tf.TFRecordReader.read(queue, name=None)` Returns the next record (key, value pair) produced by a reader. Will dequeue a work unit from queue if necessary (e.g. 
when the Reader needs to start reading from a new file since it has finished with the previous file). ##### Args: * `queue`: A Queue or a mutable string Tensor representing a handle to a Queue, with string work items. * `name`: A name for the operation (optional). ##### Returns: A tuple of Tensors (key, value). * `key`: A string scalar Tensor. * `value`: A string scalar Tensor. - - - #### `tf.TFRecordReader.reader_ref` Op that implements the reader. - - - #### `tf.TFRecordReader.reset(name=None)` Restore a reader to its initial clean state. ##### Args: * `name`: A name for the operation (optional). ##### Returns: The created Operation. - - - #### `tf.TFRecordReader.restore_state(state, name=None)` Restore a reader to a previously saved state. Not all Readers support being restored, so this can produce an Unimplemented error. ##### Args: * `state`: A string Tensor. Result of a SerializeState of a Reader with matching type. * `name`: A name for the operation (optional). ##### Returns: The created Operation. - - - #### `tf.TFRecordReader.serialize_state(name=None)` Produce a string tensor that encodes the state of a reader. Not all Readers support being serialized, so this can produce an Unimplemented error. ##### Args: * `name`: A name for the operation (optional). ##### Returns: A string Tensor. - - - #### `tf.TFRecordReader.supports_serialize` Whether the Reader implementation can serialize its state. - - - ### `class tf.FixedLengthRecordReader` A Reader that outputs fixed-length records from a file. See ReaderBase for supported methods. - - - #### `tf.FixedLengthRecordReader.__init__(record_bytes, header_bytes=None, footer_bytes=None, name=None)` Create a FixedLengthRecordReader. ##### Args: * `record_bytes`: An int. * `header_bytes`: An optional int. Defaults to 0. * `footer_bytes`: An optional int. Defaults to 0. * `name`: A name for the operation (optional). 
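The way `record_bytes`, `header_bytes`, and `footer_bytes` partition a file can be sketched in plain Python. This is a minimal model under stated assumptions, not the reader's actual implementation: `split_fixed_length_records` is a hypothetical helper, and raising `ValueError` on a trailing partial record is a choice made for this sketch only.

```python
def split_fixed_length_records(data, record_bytes, header_bytes=0, footer_bytes=0):
    """Slice `data` into fixed-length records, skipping header and footer bytes."""
    body = data[header_bytes:len(data) - footer_bytes]
    if len(body) % record_bytes != 0:
        # Assumption of this sketch: a partial trailing record is an error.
        raise ValueError("body length is not a multiple of record_bytes")
    return [body[i:i + record_bytes] for i in range(0, len(body), record_bytes)]


# A 2-byte header, three 3-byte records, and a 1-byte footer.
file_contents = b"HH" + b"aaabbbccc" + b"F"
records = split_fixed_length_records(
    file_contents, record_bytes=3, header_bytes=2, footer_bytes=1)
```

Each entry of `records` corresponds to one `value` that successive `read` calls would produce for that file.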
- - - #### `tf.FixedLengthRecordReader.num_records_produced(name=None)` Returns the number of records this reader has produced. This is the same as the number of Read executions that have succeeded. ##### Args: * `name`: A name for the operation (optional). ##### Returns: An int64 Tensor. - - - #### `tf.FixedLengthRecordReader.num_work_units_completed(name=None)` Returns the number of work units this reader has finished processing. ##### Args: * `name`: A name for the operation (optional). ##### Returns: An int64 Tensor. - - - #### `tf.FixedLengthRecordReader.read(queue, name=None)` Returns the next record (key, value pair) produced by a reader. Will dequeue a work unit from queue if necessary (e.g. when the Reader needs to start reading from a new file since it has finished with the previous file). ##### Args: * `queue`: A Queue or a mutable string Tensor representing a handle to a Queue, with string work items. * `name`: A name for the operation (optional). ##### Returns: A tuple of Tensors (key, value). * `key`: A string scalar Tensor. * `value`: A string scalar Tensor. - - - #### `tf.FixedLengthRecordReader.reader_ref` Op that implements the reader. - - - #### `tf.FixedLengthRecordReader.reset(name=None)` Restore a reader to its initial clean state. ##### Args: * `name`: A name for the operation (optional). ##### Returns: The created Operation. - - - #### `tf.FixedLengthRecordReader.restore_state(state, name=None)` Restore a reader to a previously saved state. Not all Readers support being restored, so this can produce an Unimplemented error. ##### Args: * `state`: A string Tensor. Result of a SerializeState of a Reader with matching type. * `name`: A name for the operation (optional). ##### Returns: The created Operation. - - - #### `tf.FixedLengthRecordReader.serialize_state(name=None)` Produce a string tensor that encodes the state of a reader. Not all Readers support being serialized, so this can produce an Unimplemented error. 
##### Args: * `name`: A name for the operation (optional). ##### Returns: A string Tensor. - - - #### `tf.FixedLengthRecordReader.supports_serialize` Whether the Reader implementation can serialize its state. ## Converting TensorFlow provides several operations that you can use to convert various data formats into tensors. - - - ### `tf.decode_csv(records, record_defaults, field_delim=None, name=None)` Convert CSV records to tensors. Each column maps to one tensor. RFC 4180 format is expected for the CSV records. (https://tools.ietf.org/html/rfc4180) Note that we allow leading and trailing spaces with int or float field. ##### Args: * `records`: A `Tensor` of type `string`. Each string is a record/row in the csv and all records should have the same format. * `record_defaults`: A list of `Tensor` objects with types from: `float32`, `int32`, `int64`, `string`. One tensor per column of the input record, with either a scalar default value for that column or empty if the column is required. * `field_delim`: An optional `string`. Defaults to `","`. delimiter to separate fields in a record. * `name`: A name for the operation (optional). ##### Returns: A list of `Tensor` objects. Has the same type as `record_defaults`. Each tensor will have the same shape as records. - - - ### `tf.decode_raw(bytes, out_type, little_endian=None, name=None)` Reinterpret the bytes of a string as a vector of numbers. ##### Args: * `bytes`: A `Tensor` of type `string`. All the elements must have the same length. * `out_type`: A `tf.DType` from: `tf.float32, tf.float64, tf.int32, tf.uint8, tf.int16, tf.int8, tf.int64`. * `little_endian`: An optional `bool`. Defaults to `True`. Whether the input bytes are in little-endian order. Ignored for out_types that are stored in a single byte like uint8. * `name`: A name for the operation (optional). ##### Returns: A `Tensor` of type `out_type`. A Tensor with one more dimension than the input bytes. 
The added dimension will have size equal to the length of the elements of bytes divided by the number of bytes to represent out_type. - - - ### Example protocol buffer TensorFlow's [recommended format for training examples](../../how_tos/reading_data/index.md#standard-tensorflow-format) is serialized `Example` protocol buffers, [described here](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/example/example.proto). They contain `Features`, [described here](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/example/feature.proto). - - - ### `tf.parse_example(serialized, names=None, sparse_keys=None, sparse_types=None, dense_keys=None, dense_types=None, dense_defaults=None, dense_shapes=None, name='ParseExample')` Parses `Example` protos. Parses a number of serialized [`Example`] (https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/example/example.proto) protos given in `serialized`. `names` may contain descriptive names for the corresponding serialized protos. These may be useful for debugging purposes, but they have no effect on the output. If not `None`, `names` must be the same length as `serialized`. This op parses serialized examples into a dictionary mapping keys to `Tensor` and `SparseTensor` objects respectively, depending on whether the keys appear in `sparse_keys` or `dense_keys`. The key `dense_keys[j]` is mapped to a `Tensor` of type `dense_types[j]` and of shape `(serialized.size(),) + dense_shapes[j]`. `dense_defaults` provides defaults for values referenced using `dense_keys`. If a key is not present in this dictionary, the corresponding dense `Feature` is required in all elements of `serialized`. `dense_shapes[j]` provides the shape of each `Feature` entry referenced by `dense_keys[j]`. The number of elements in the `Feature` corresponding to `dense_key[j]` must always have `np.prod(dense_shapes[j])` entries. 
The returned `Tensor` for `dense_keys[j]` has shape `[N] + dense_shapes[j]`,
where `N` is the number of `Example`s in `serialized`.

The key `sparse_keys[j]` is mapped to a `SparseTensor` of type
`sparse_types[j]`. The `SparseTensor` represents a ragged matrix.
Its indices are `[batch, index]` where `batch` is the batch entry the value
is from, and `index` is the value's index in the list of values associated
with that feature and example.

Examples:

If one expects a `tf.float32` sparse feature `ft` and three
serialized `Example`s are provided:

```
serialized = [
  features:
    { feature: [ key: { "ft" value: float_list: { value: [1.0, 2.0] } } ] },
  features:
    { feature: [] },
  features:
    { feature: [ key: { "ft" value: float_list: { value: [3.0] } } ] }
]
```

then the output will look like:

```
{"ft": SparseTensor(indices=[[0, 0], [0, 1], [2, 0]],
                    values=[1.0, 2.0, 3.0],
                    shape=(3, 2)) }
```

Given two `Example` input protos in `serialized`:

```
[
  features: {
    feature: { key: "kw" value: { bytes_list: { value: [ "knit", "big" ] } } }
    feature: { key: "gps" value: { float_list: { value: [] } } }
  },
  features: {
    feature: { key: "kw" value: { bytes_list: { value: [ "emmy" ] } } }
    feature: { key: "dank" value: { int64_list: { value: [ 42 ] } } }
    feature: { key: "gps" value: { } }
  }
]
```

And arguments

```
  names: ["input0", "input1"],
  sparse_keys: ["kw", "dank", "gps"]
  sparse_types: [DT_STRING, DT_INT64, DT_FLOAT]
```

Then the output is a dictionary:

```python
{
  "kw": SparseTensor(
      indices=[[0, 0], [0, 1], [1, 0]],
      values=["knit", "big", "emmy"],
      shape=[2, 2]),
  "dank": SparseTensor(
      indices=[[1, 0]],
      values=[42],
      shape=[2, 1]),
  "gps": SparseTensor(
      indices=[],
      values=[],
      shape=[2, 0]),
}
```

For dense results in two serialized `Example`s:

```
[
  features: {
    feature: { key: "age" value: { int64_list: { value: [ 0 ] } } }
    feature: { key: "gender" value: { bytes_list: { value: [ "f" ] } } }
   },
   features: {
    feature: { key: "age" value: { int64_list: { value: [] } } }
    feature: { key: "gender" value: { bytes_list: { value: [ "f" ] } } }
  }
]
```

We can use arguments:

```
names: ["input0", "input1"],
dense_keys: np.array(["age", "gender"]),
dense_types: [tf.int64, tf.string],
dense_defaults: {
  "age": -1  # "age" defaults to -1 if missing
             # "gender" has no specified default so it's required
}
dense_shapes: [(1,), (1,)],  # age, gender
```

And the expected output is:

```python
{
  "age": [[0], [-1]],
  "gender": [["f"], ["f"]],
}
```

##### Args:

* `serialized`: A list of strings, a batch of binary serialized `Example` protos.
* `names`: A list of strings, the names of the serialized protos.
* `sparse_keys`: A list of string keys in the examples' features. The results for these keys will be returned as `SparseTensor` objects.
* `sparse_types`: A list of `DTypes` of the same length as `sparse_keys`. Only `tf.float32` (`FloatList`), `tf.int64` (`Int64List`), and `tf.string` (`BytesList`) are supported.
* `dense_keys`: A list of string keys in the examples' features. The results for these keys will be returned as `Tensor`s.
* `dense_types`: A list of `DTypes` of the same length as `dense_keys`. Only `tf.float32` (`FloatList`), `tf.int64` (`Int64List`), and `tf.string` (`BytesList`) are supported.
* `dense_defaults`: A dict mapping string keys to `Tensor`s. The keys of the dict must match the `dense_keys` of the feature.
* `dense_shapes`: A list of tuples with the same length as `dense_keys`. The shape of the data for each dense feature referenced by `dense_keys`.
* `name`: A name for this operation (optional).

##### Returns:

A `dict` mapping keys to `Tensor`s and `SparseTensor`s.

##### Raises:

* `ValueError`: If sparse and dense key sets intersect, or input lengths do not match up.

- - -

### `tf.parse_single_example(serialized, names=None, sparse_keys=None, sparse_types=None, dense_keys=None, dense_types=None, dense_defaults=None, dense_shapes=None, name='ParseSingleExample')`

Parses a single `Example` proto.
Similar to `parse_example`, except:

For dense tensors, the returned `Tensor` is identical to the output of
`parse_example`, except there is no batch dimension; the output shape is the
same as the shape given in `dense_shapes`.

For `SparseTensor`s, the first (batch) column of the indices matrix is removed
(the indices matrix is a column vector), the values vector is unchanged, and
the first (batch_size) entry of the shape vector is removed (it is now a
single element vector).

See also `parse_example`.

##### Args:

* `serialized`: A scalar string, a single serialized `Example`. See `parse_example` documentation for more details.
* `names`: (Optional) A scalar string, the associated name. See `parse_example` documentation for more details.
* `sparse_keys`: See `parse_example` documentation for more details.
* `sparse_types`: See `parse_example` documentation for more details.
* `dense_keys`: See `parse_example` documentation for more details.
* `dense_types`: See `parse_example` documentation for more details.
* `dense_defaults`: See `parse_example` documentation for more details.
* `dense_shapes`: See `parse_example` documentation for more details.
* `name`: A name for this operation (optional).

##### Returns:

A dictionary mapping keys to `Tensor`s and `SparseTensor`s.

##### Raises:

* `ValueError`: If `serialized` or `names` have known shapes and are not scalars.

## Queues

TensorFlow provides several implementations of 'Queues', which are
structures within the TensorFlow computation graph to stage pipelines
of tensors together. The following sections describe the basic Queue interface
and some implementations. For an example of their use, see
[Threading and Queues](../../how_tos/threading_and_queues/index.md).

- - -

### `class tf.QueueBase`

Base class for queue implementations.

A queue is a TensorFlow data structure that stores tensors across
multiple steps, and exposes operations that enqueue and dequeue
tensors.
Each queue element is a tuple of one or more tensors, where each
tuple component has a static dtype, and may have a static shape. The
queue implementations support versions of enqueue and dequeue that
handle single elements, and versions that support enqueuing and
dequeuing a batch of elements at once.

See [`tf.FIFOQueue`](#FIFOQueue) and
[`tf.RandomShuffleQueue`](#RandomShuffleQueue) for concrete
implementations of this class, and instructions on how to create
them.

- - -

#### `tf.QueueBase.enqueue(vals, name=None)`

Enqueues one element to this queue.

If the queue is full when this operation executes, it will block
until the element has been enqueued.

##### Args:

* `vals`: The tuple of `Tensor` objects to be enqueued.
* `name`: A name for the operation (optional).

##### Returns:

The operation that enqueues a new tuple of tensors to the queue.

- - -

#### `tf.QueueBase.enqueue_many(vals, name=None)`

Enqueues zero or more elements to this queue.

This operation slices each component tensor along the 0th dimension to
make multiple queue elements. All of the tensors in `vals` must have the
same size in the 0th dimension.

If the queue is full when this operation executes, it will block
until all of the elements have been enqueued.

##### Args:

* `vals`: The tensor or tuple of tensors from which the queue elements are taken.
* `name`: A name for the operation (optional).

##### Returns:

The operation that enqueues a batch of tuples of tensors to the queue.

- - -

#### `tf.QueueBase.dequeue(name=None)`

Dequeues one element from this queue.

If the queue is empty when this operation executes, it will block
until there is an element to dequeue.

##### Args:

* `name`: A name for the operation (optional).

##### Returns:

The tuple of tensors that was dequeued.

- - -

#### `tf.QueueBase.dequeue_many(n, name=None)`

Dequeues and concatenates `n` elements from this queue.

This operation concatenates queue-element component tensors along the
0th dimension to make a single component tensor.
All of the components in the dequeued tuple will have size `n` in the 0th dimension. If the queue contains fewer than `n` elements when this operation executes, it will block until `n` elements have been dequeued. ##### Args: * `n`: A scalar `Tensor` containing the number of elements to dequeue. * `name`: A name for the operation (optional). ##### Returns: The tuple of concatenated tensors that was dequeued. - - - #### `tf.QueueBase.size(name=None)` Compute the number of elements in this queue. ##### Args: * `name`: A name for the operation (optional). ##### Returns: A scalar tensor containing the number of elements in this queue. - - - #### `tf.QueueBase.close(cancel_pending_enqueues=False, name=None)` Closes this queue. This operation signals that no more elements will be enqueued in the given queue. Subsequent `enqueue` and `enqueue_many` operations will fail. Subsequent `dequeue` and `dequeue_many` operations will continue to succeed if sufficient elements remain in the queue. Subsequent `dequeue` and `dequeue_many` operations that would block will fail immediately. If `cancel_pending_enqueues` is `True`, all pending requests will also be cancelled. ##### Args: * `cancel_pending_enqueues`: (Optional.) A boolean, defaulting to `False` (described above). * `name`: A name for the operation (optional). ##### Returns: The operation that closes the queue. #### Other Methods - - - #### `tf.QueueBase.__init__(dtypes, shapes, queue_ref)` Constructs a queue object from a queue reference. ##### Args: * `dtypes`: A list of types. The length of dtypes must equal the number of tensors in each element. * `shapes`: Constraints on the shapes of tensors in an element: A list of shape tuples or None. This list is the same length as dtypes. If the shape of any tensors in the element are constrained, all must be; shapes can be None if the shapes should not be constrained. * `queue_ref`: The queue reference, i.e. the output of the queue op. 
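The slicing and concatenation rules of `enqueue_many` and `dequeue_many` described above can be sketched with plain Python lists. This is a simplified model, not the TensorFlow implementation: `ListQueueModel` is a hypothetical class, lists stand in for tensors, and capacity limits and blocking are left out (an empty queue raises `IndexError` where a real queue would block).

```python
from collections import deque


class ListQueueModel(object):
    """Plain-Python model of the QueueBase slicing rules (no blocking, no capacity)."""

    def __init__(self, num_components):
        self.num_components = num_components
        self.elements = deque()

    def enqueue(self, vals):
        # One element is a tuple with one entry per component.
        assert len(vals) == self.num_components
        self.elements.append(tuple(vals))

    def enqueue_many(self, vals):
        # Slice every component along its 0th dimension; sizes must agree.
        sizes = set(len(component) for component in vals)
        assert len(sizes) == 1, "components differ in the 0th dimension"
        for row in zip(*vals):
            self.enqueue(row)

    def dequeue(self):
        return self.elements.popleft()

    def dequeue_many(self, n):
        # Concatenate n elements so each component has size n in dimension 0.
        rows = [self.dequeue() for _ in range(n)]
        return tuple(list(column) for column in zip(*rows))

    def size(self):
        return len(self.elements)


queue = ListQueueModel(num_components=2)
queue.enqueue_many(([1, 2, 3], ["a", "b", "c"]))
first = queue.dequeue()
rest = queue.dequeue_many(2)
```

Here `enqueue_many` turns two length-3 components into three queue elements, and `dequeue_many(2)` returns one tuple whose components each have size 2 in the 0th dimension.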
- - - #### `tf.QueueBase.dtypes` The list of dtypes for each component of a queue element. - - - #### `tf.QueueBase.name` The name of the underlying queue. - - - #### `tf.QueueBase.queue_ref` The underlying queue reference. - - - ### `class tf.FIFOQueue` A queue implementation that dequeues elements in first-in-first out order. See [`tf.QueueBase`](#QueueBase) for a description of the methods on this class. - - - #### `tf.FIFOQueue.__init__(capacity, dtypes, shapes=None, shared_name=None, name='fifo_queue')` Creates a queue that dequeues elements in a first-in first-out order. A `FIFOQueue` has bounded capacity; supports multiple concurrent producers and consumers; and provides exactly-once delivery. A `FIFOQueue` holds a list of up to `capacity` elements. Each element is a fixed-length tuple of tensors whose dtypes are described by `dtypes`, and whose shapes are optionally described by the `shapes` argument. If the `shapes` argument is specified, each component of a queue element must have the respective fixed shape. If it is unspecified, different queue elements may have different shapes, but the use of `dequeue_many` is disallowed. ##### Args: * `capacity`: An integer. The upper bound on the number of elements that may be stored in this queue. * `dtypes`: A list of `DType` objects. The length of `dtypes` must equal the number of tensors in each queue element. * `shapes`: (Optional.) A list of fully-defined `TensorShape` objects, with the same length as `dtypes` or `None`. * `shared_name`: (Optional.) If non-empty, this queue will be shared under the given name across multiple sessions. * `name`: Optional name for the queue operation. - - - ### `class tf.RandomShuffleQueue` A queue implementation that dequeues elements in a random order. See [`tf.QueueBase`](#QueueBase) for a description of the methods on this class. 
- - - #### `tf.RandomShuffleQueue.__init__(capacity, min_after_dequeue, dtypes, shapes=None, seed=None, shared_name=None, name='random_shuffle_queue')` Create a queue that dequeues elements in a random order. A `RandomShuffleQueue` has bounded capacity; supports multiple concurrent producers and consumers; and provides exactly-once delivery. A `RandomShuffleQueue` holds a list of up to `capacity` elements. Each element is a fixed-length tuple of tensors whose dtypes are described by `dtypes`, and whose shapes are optionally described by the `shapes` argument. If the `shapes` argument is specified, each component of a queue element must have the respective fixed shape. If it is unspecified, different queue elements may have different shapes, but the use of `dequeue_many` is disallowed. The `min_after_dequeue` argument allows the caller to specify a minimum number of elements that will remain in the queue after a `dequeue` or `dequeue_many` operation completes, to ensure a minimum level of mixing of elements. This invariant is maintained by blocking those operations until sufficient elements have been enqueued. The `min_after_dequeue` argument is ignored after the queue has been closed. ##### Args: * `capacity`: An integer. The upper bound on the number of elements that may be stored in this queue. * `min_after_dequeue`: An integer (described above). * `dtypes`: A list of `DType` objects. The length of `dtypes` must equal the number of tensors in each queue element. * `shapes`: (Optional.) A list of fully-defined `TensorShape` objects, with the same length as `dtypes` or `None`. * `seed`: A Python integer. Used to create a random seed. See [`set_random_seed`](../../api_docs/python/constant_op.md#set_random_seed) for behavior. * `shared_name`: (Optional.) If non-empty, this queue will be shared under the given name across multiple sessions. * `name`: Optional name for the queue operation. 
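The `min_after_dequeue` invariant described above can be sketched in plain Python. This is a model under stated assumptions, not TensorFlow code: `ShuffleQueueModel` is a hypothetical class, `random.Random` stands in for the queue's own random source, and a `RuntimeError` is raised where a real queue would block or fail.

```python
import random


class ShuffleQueueModel(object):
    """Plain-Python model of the min_after_dequeue rule (not TensorFlow)."""

    def __init__(self, capacity, min_after_dequeue, seed=None):
        self.capacity = capacity
        self.min_after_dequeue = min_after_dequeue
        self.closed = False
        self.elements = []
        self.rng = random.Random(seed)

    def can_enqueue(self):
        return not self.closed and len(self.elements) < self.capacity

    def can_dequeue(self, n=1):
        # While open, n elements may leave only if min_after_dequeue remain.
        # After close, min_after_dequeue is ignored.
        floor = 0 if self.closed else self.min_after_dequeue
        return len(self.elements) - n >= floor

    def enqueue(self, value):
        if not self.can_enqueue():
            raise RuntimeError("a real queue would block or fail here")
        self.elements.append(value)

    def dequeue(self):
        if not self.can_dequeue():
            raise RuntimeError("a real queue would block or fail here")
        # Remove a uniformly random element.
        index = self.rng.randrange(len(self.elements))
        return self.elements.pop(index)


shuffle_queue = ShuffleQueueModel(capacity=10, min_after_dequeue=3, seed=0)
for value in range(4):
    shuffle_queue.enqueue(value)
taken = shuffle_queue.dequeue()            # allowed: 3 elements remain
blocked = not shuffle_queue.can_dequeue()  # 3 - 1 < 3, so a dequeue would wait
shuffle_queue.closed = True                # closing lifts the minimum
drained = sorted(shuffle_queue.dequeue() for _ in range(3))
```

A larger `min_after_dequeue` gives better mixing at the cost of more memory and a longer wait before the first dequeue can succeed.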
## Dealing with the filesystem - - - ### `tf.matching_files(pattern, name=None)` Returns the set of files matching a pattern. Note that this routine only supports wildcard characters in the basename portion of the pattern, not in the directory portion. ##### Args: * `pattern`: A `Tensor` of type `string`. A (scalar) shell wildcard pattern. * `name`: A name for the operation (optional). ##### Returns: A `Tensor` of type `string`. A vector of matching filenames. - - - ### `tf.read_file(filename, name=None)` Reads and outputs the entire contents of the input filename. ##### Args: * `filename`: A `Tensor` of type `string`. * `name`: A name for the operation (optional). ##### Returns: A `Tensor` of type `string`. ## Input pipeline TensorFlow functions for setting up an input-prefetching pipeline. Please see the [reading data how-to](../../how_tos/reading_data/index.md) for context. ### Beginning of an input pipeline The "producer" functions add a queue to the graph and a corresponding `QueueRunner` for running the subgraph that fills that queue. - - - ### `tf.train.match_filenames_once(pattern, name=None)` Save the list of files matching pattern, so it is only computed once. ##### Args: * `pattern`: A file pattern (glob). * `name`: A name for the operations (optional). ##### Returns: A variable that is initialized to the list of files matching pattern. - - - ### `tf.train.limit_epochs(tensor, num_epochs=None, name=None)` Returns tensor num_epochs times and then raises an OutOfRange error. ##### Args: * `tensor`: Any Tensor. * `num_epochs`: An integer (optional). If specified, limits the number of steps the output tensor may be evaluated. * `name`: A name for the operations (optional). ##### Returns: tensor or OutOfRange. - - - ### `tf.train.range_input_producer(limit, num_epochs=None, shuffle=True, seed=None, capacity=32, name=None)` Produces the integers from 0 to limit-1 in a queue. ##### Args: * `limit`: An int32 scalar tensor. * `num_epochs`: An integer (optional). 
If specified, `range_input_producer` produces each integer `num_epochs` times before generating an OutOfRange error. If not specified, `range_input_producer` can cycle through the integers an unlimited number of times.
* `shuffle`: Boolean. If true, the integers are randomly shuffled within each epoch.
* `seed`: An integer (optional). Seed used if shuffle == True.
* `capacity`: An integer. Sets the queue capacity.
* `name`: A name for the operations (optional).

##### Returns:

A Queue with the output integers. A QueueRunner for the Queue
is added to the current Graph's QUEUE_RUNNER collection.

- - -

### `tf.train.slice_input_producer(tensor_list, num_epochs=None, shuffle=True, seed=None, capacity=32, name=None)`

Produces a slice of each Tensor in tensor_list.

Implemented using a Queue -- a QueueRunner for the Queue
is added to the current Graph's QUEUE_RUNNER collection.

##### Args:

* `tensor_list`: A list of Tensors. Every Tensor in tensor_list must have the same size in the first dimension.
* `num_epochs`: An integer (optional). If specified, `slice_input_producer` produces each slice `num_epochs` times before generating an OutOfRange error. If not specified, `slice_input_producer` can cycle through the slices an unlimited number of times.
* `shuffle`: Boolean. If true, the slices are randomly shuffled within each epoch.
* `seed`: An integer (optional). Seed used if shuffle == True.
* `capacity`: An integer. Sets the queue capacity.
* `name`: A name for the operations (optional).

##### Returns:

A list of tensors, one for each element of tensor_list. If a tensor
in tensor_list has shape [N, a, b, ..., z], then the corresponding output
tensor will have shape [a, b, ..., z].

- - -

### `tf.train.string_input_producer(string_tensor, num_epochs=None, shuffle=True, seed=None, capacity=32, name=None)`

Output strings (e.g. filenames) to a queue for an input pipeline.

##### Args:

* `string_tensor`: A 1-D string tensor with the strings to produce.
* `num_epochs`: An integer (optional).
If specified, `string_input_producer` produces each string from `string_tensor` `num_epochs` times before generating an OutOfRange error. If not specified, `string_input_producer` can cycle through the strings in `string_tensor` an unlimited number of times. * `shuffle`: Boolean. If true, the strings are randomly shuffled within each epoch. * `seed`: An integer (optional). Seed used if shuffle == True. * `capacity`: An integer. Sets the queue capacity. * `name`: A name for the operations (optional). ##### Returns: A queue with the output strings. A QueueRunner for the Queue is added to the current Graph's QUEUE_RUNNER collection. ### Batching at the end of an input pipeline These functions add a queue to the graph to assemble a batch of examples, with possible shuffling. They also add a `QueueRunner` for running the subgraph that fills that queue. Use [batch](#batch) or [batch_join](#batch_join) for batching examples that have already been well shuffled. Use [shuffle_batch](#shuffle_batch) or [shuffle_batch_join](#shuffle_batch_join) for examples that would benefit from additional shuffling. Use [batch](#batch) or [shuffle_batch](#shuffle_batch) if you want a single thread producing examples to batch, or if you have a single subgraph producing examples but you want to run it in N threads (where you increase N until it can keep the queue full). Use [batch_join](#batch_join) or [shuffle_batch_join](#shuffle_batch_join) if you have N different subgraphs producing examples to batch and you want them run by N threads. - - - ### `tf.train.batch(tensor_list, batch_size, num_threads=1, capacity=32, enqueue_many=False, shapes=None, name=None)` Creates batches of tensors in `tensor_list`. This function is implemented using a queue. A `QueueRunner` for the queue is added to the current `Graph`'s `QUEUE_RUNNER` collection. If `enqueue_many` is `False`, `tensor_list` is assumed to represent a single example. 
An input tensor with shape `[x, y, z]` will be output as a tensor with shape
`[batch_size, x, y, z]`.

If `enqueue_many` is `True`, `tensor_list` is assumed to represent a
batch of examples, where the first dimension is indexed by example,
and all members of `tensor_list` should have the same size in the
first dimension. If an input tensor has shape `[*, x, y, z]`, the
output will have shape `[batch_size, x, y, z]`. The `capacity` argument
controls how long the prefetching is allowed to grow the queues.

##### Args:

* `tensor_list`: The list of tensors to enqueue.
* `batch_size`: The new batch size pulled from the queue.
* `num_threads`: The number of threads enqueuing `tensor_list`.
* `capacity`: An integer. The maximum number of elements in the queue.
* `enqueue_many`: Whether each tensor in `tensor_list` is a single example.
* `shapes`: (Optional) The shapes for each example. Defaults to the inferred shapes for `tensor_list`.
* `name`: (Optional) A name for the operations.

##### Returns:

A list of tensors with the same number and types as `tensor_list`.

- - -

### `tf.train.batch_join(tensor_list_list, batch_size, capacity=32, enqueue_many=False, shapes=None, name=None)`

Runs a list of tensors to fill a queue to create batches of examples.

Enqueues a different list of tensors in different threads.
Implemented using a queue -- a `QueueRunner` for the queue
is added to the current `Graph`'s `QUEUE_RUNNER` collection.

`len(tensor_list_list)` threads will be started,
with thread `i` enqueuing the tensors from
`tensor_list_list[i]`. `tensor_list_list[i1][j]` must match
`tensor_list_list[i2][j]` in type and shape, except in the first
dimension if `enqueue_many` is true.

If `enqueue_many` is `False`, each `tensor_list_list[i]` is assumed
to represent a single example. An input tensor `x` will be output as a
tensor with shape `[batch_size] + x.shape`.
If `enqueue_many` is `True`, `tensor_list_list[i]` is assumed to
represent a batch of examples, where the first dimension is indexed
by example, and all members of `tensor_list_list[i]` should have the
same size in the first dimension. The slices of any input tensor
`x` are treated as examples, and the output tensors will have shape
`[batch_size] + x.shape[1:]`.

The `capacity` argument controls how long the prefetching is allowed to
grow the queues.

##### Args:

* `tensor_list_list`: A list of tuples of tensors to enqueue.
* `batch_size`: An integer. The new batch size pulled from the queue.
* `capacity`: An integer. The maximum number of elements in the queue.
* `enqueue_many`: Whether each tensor in `tensor_list_list` is a single example.
* `shapes`: (Optional) The shapes for each example. Defaults to the inferred shapes for `tensor_list_list[i]`.
* `name`: (Optional) A name for the operations.

##### Returns:

A list of tensors with the same number and types as
`tensor_list_list[i]`.

- - -

### `tf.train.shuffle_batch(tensor_list, batch_size, capacity, min_after_dequeue, num_threads=1, seed=None, enqueue_many=False, shapes=None, name=None)`

Creates batches by randomly shuffling tensors.

This function adds the following to the current `Graph`:

* A shuffling queue into which tensors from `tensor_list` are enqueued.
* A `dequeue_many` operation to create batches from the queue.
* A `QueueRunner`, added to the `QUEUE_RUNNER` collection, to enqueue the tensors from `tensor_list`.

If `enqueue_many` is `False`, `tensor_list` is assumed to represent a
single example. An input tensor with shape `[x, y, z]` will be output
as a tensor with shape `[batch_size, x, y, z]`.

If `enqueue_many` is `True`, `tensor_list` is assumed to represent a
batch of examples, where the first dimension is indexed by example,
and all members of `tensor_list` should have the same size in the
first dimension. If an input tensor has shape `[*, x, y, z]`, the
output will have shape `[batch_size, x, y, z]`.
The `capacity` argument controls how long the prefetching is allowed to grow the queues.

For example:

```python
# Creates batches of 32 images and 32 labels.
image_batch, label_batch = tf.train.shuffle_batch(
    [single_image, single_label],
    batch_size=32,
    num_threads=4,
    capacity=50000,
    min_after_dequeue=10000)
```

##### Args:

* `tensor_list`: The list of tensors to enqueue.
* `batch_size`: The new batch size pulled from the queue.
* `capacity`: An integer. The maximum number of elements in the queue.
* `min_after_dequeue`: Minimum number of elements in the queue after a dequeue, used to ensure a level of mixing of elements.
* `num_threads`: The number of threads enqueuing `tensor_list`.
* `seed`: Seed for the random shuffling within the queue.
* `enqueue_many`: Whether each tensor in `tensor_list` is a single example.
* `shapes`: (Optional) The shapes for each example. Defaults to the inferred shapes for `tensor_list`.
* `name`: (Optional) A name for the operations.

##### Returns:

A list of tensors with the same number and types as `tensor_list`.

- - -

### `tf.train.shuffle_batch_join(tensor_list_list, batch_size, capacity, min_after_dequeue, seed=None, enqueue_many=False, shapes=None, name=None)`

Creates batches by randomly shuffling tensors.

This version enqueues a different list of tensors in different threads. It adds the following to the current `Graph`:

* A shuffling queue into which tensors from `tensor_list_list` are enqueued.
* A `dequeue_many` operation to create batches from the queue.
* A `QueueRunner` added to the `QUEUE_RUNNER` collection, to enqueue the tensors from `tensor_list_list`.

`len(tensor_list_list)` threads will be started, with thread `i` enqueuing the tensors from `tensor_list_list[i]`. `tensor_list_list[i1][j]` must match `tensor_list_list[i2][j]` in type and shape, except in the first dimension if `enqueue_many` is true.

If `enqueue_many` is `False`, each `tensor_list_list[i]` is assumed to represent a single example.
An input tensor with shape `[x, y, z]` will be output as a tensor with shape `[batch_size, x, y, z]`.

If `enqueue_many` is `True`, `tensor_list_list[i]` is assumed to represent a batch of examples, where the first dimension is indexed by example, and all members of `tensor_list_list[i]` should have the same size in the first dimension. If an input tensor has shape `[*, x, y, z]`, the output will have shape `[batch_size, x, y, z]`.

The `capacity` argument controls how long the prefetching is allowed to grow the queues.

##### Args:

* `tensor_list_list`: A list of tuples of tensors to enqueue.
* `batch_size`: An integer. The new batch size pulled from the queue.
* `capacity`: An integer. The maximum number of elements in the queue.
* `min_after_dequeue`: Minimum number of elements in the queue after a dequeue, used to ensure a level of mixing of elements.
* `seed`: Seed for the random shuffling within the queue.
* `enqueue_many`: Whether each tensor in `tensor_list_list` is a single example.
* `shapes`: (Optional) The shapes for each example. Defaults to the inferred shapes for `tensor_list_list[i]`.
* `name`: (Optional) A name for the operations.

##### Returns:

A list of tensors with the same number and types as `tensor_list_list[i]`.

================================================
FILE: SOURCE/api_docs/python/math_ops.md
================================================
# Math

Note: Functions taking `Tensor` arguments can also take anything accepted by [`tf.convert_to_tensor`](../../api_docs/python/framework.md#convert_to_tensor).
## Contents

### [Math](#AUTOGENERATED-math)

* [Arithmetic Operators](#AUTOGENERATED-arithmetic-operators)
  * [`tf.add(x, y, name=None)`](#add)
  * [`tf.sub(x, y, name=None)`](#sub)
  * [`tf.mul(x, y, name=None)`](#mul)
  * [`tf.div(x, y, name=None)`](#div)
  * [`tf.mod(x, y, name=None)`](#mod)
* [Basic Math Functions](#AUTOGENERATED-basic-math-functions)
  * [`tf.add_n(inputs, name=None)`](#add_n)
  * [`tf.abs(x, name=None)`](#abs)
  * [`tf.neg(x, name=None)`](#neg)
  * [`tf.sign(x, name=None)`](#sign)
  * [`tf.inv(x, name=None)`](#inv)
  * [`tf.square(x, name=None)`](#square)
  * [`tf.round(x, name=None)`](#round)
  * [`tf.sqrt(x, name=None)`](#sqrt)
  * [`tf.rsqrt(x, name=None)`](#rsqrt)
  * [`tf.pow(x, y, name=None)`](#pow)
  * [`tf.exp(x, name=None)`](#exp)
  * [`tf.log(x, name=None)`](#log)
  * [`tf.ceil(x, name=None)`](#ceil)
  * [`tf.floor(x, name=None)`](#floor)
  * [`tf.maximum(x, y, name=None)`](#maximum)
  * [`tf.minimum(x, y, name=None)`](#minimum)
  * [`tf.cos(x, name=None)`](#cos)
  * [`tf.sin(x, name=None)`](#sin)
* [Matrix Math Functions](#AUTOGENERATED-matrix-math-functions)
  * [`tf.diag(diagonal, name=None)`](#diag)
  * [`tf.transpose(a, perm=None, name='transpose')`](#transpose)
  * [`tf.matmul(a, b, transpose_a=False, transpose_b=False, a_is_sparse=False, b_is_sparse=False, name=None)`](#matmul)
  * [`tf.batch_matmul(x, y, adj_x=None, adj_y=None, name=None)`](#batch_matmul)
  * [`tf.matrix_determinant(input, name=None)`](#matrix_determinant)
  * [`tf.batch_matrix_determinant(input, name=None)`](#batch_matrix_determinant)
  * [`tf.matrix_inverse(input, name=None)`](#matrix_inverse)
  * [`tf.batch_matrix_inverse(input, name=None)`](#batch_matrix_inverse)
  * [`tf.cholesky(input, name=None)`](#cholesky)
  * [`tf.batch_cholesky(input, name=None)`](#batch_cholesky)
* [Complex Number Functions](#AUTOGENERATED-complex-number-functions)
  * [`tf.complex(real, imag, name=None)`](#complex)
  * [`tf.complex_abs(x, name=None)`](#complex_abs)
  * [`tf.conj(in_, name=None)`](#conj)
  * [`tf.imag(in_, name=None)`](#imag)
  * [`tf.real(in_, name=None)`](#real)
* [Reduction](#AUTOGENERATED-reduction)
  * [`tf.reduce_sum(input_tensor, reduction_indices=None, keep_dims=False, name=None)`](#reduce_sum)
  * [`tf.reduce_prod(input_tensor, reduction_indices=None, keep_dims=False, name=None)`](#reduce_prod)
  * [`tf.reduce_min(input_tensor, reduction_indices=None, keep_dims=False, name=None)`](#reduce_min)
  * [`tf.reduce_max(input_tensor, reduction_indices=None, keep_dims=False, name=None)`](#reduce_max)
  * [`tf.reduce_mean(input_tensor, reduction_indices=None, keep_dims=False, name=None)`](#reduce_mean)
  * [`tf.reduce_all(input_tensor, reduction_indices=None, keep_dims=False, name=None)`](#reduce_all)
  * [`tf.reduce_any(input_tensor, reduction_indices=None, keep_dims=False, name=None)`](#reduce_any)
  * [`tf.accumulate_n(inputs, shape=None, tensor_dtype=None, name=None)`](#accumulate_n)
* [Segmentation](#AUTOGENERATED-segmentation)
  * [`tf.segment_sum(data, segment_ids, name=None)`](#segment_sum)
  * [`tf.segment_prod(data, segment_ids, name=None)`](#segment_prod)
  * [`tf.segment_min(data, segment_ids, name=None)`](#segment_min)
  * [`tf.segment_max(data, segment_ids, name=None)`](#segment_max)
  * [`tf.segment_mean(data, segment_ids, name=None)`](#segment_mean)
  * [`tf.unsorted_segment_sum(data, segment_ids, num_segments, name=None)`](#unsorted_segment_sum)
  * [`tf.sparse_segment_sum(data, indices, segment_ids, name=None)`](#sparse_segment_sum)
  * [`tf.sparse_segment_mean(data, indices, segment_ids, name=None)`](#sparse_segment_mean)
* [Sequence Comparison and Indexing](#AUTOGENERATED-sequence-comparison-and-indexing)
  * [`tf.argmin(input, dimension, name=None)`](#argmin)
  * [`tf.argmax(input, dimension, name=None)`](#argmax)
  * [`tf.listdiff(x, y, name=None)`](#listdiff)
  * [`tf.where(input, name=None)`](#where)
  * [`tf.unique(x, name=None)`](#unique)
  * [`tf.edit_distance(hypothesis, truth, normalize=True, name='edit_distance')`](#edit_distance)
  * [`tf.invert_permutation(x, name=None)`](#invert_permutation)

## Arithmetic Operators

TensorFlow provides several operations that you can use to add basic arithmetic operators to your graph.

- - -

### `tf.add(x, y, name=None)`

Returns x + y element-wise.

*NOTE*: Add supports broadcasting. AddN does not.

##### Args:

* `x`: A `Tensor`. Must be one of the following types: `float32`, `float64`, `int8`, `int16`, `int32`, `complex64`, `int64`.
* `y`: A `Tensor`. Must have the same type as `x`.
* `name`: A name for the operation (optional).

##### Returns:

A `Tensor`. Has the same type as `x`.

- - -

### `tf.sub(x, y, name=None)`

Returns x - y element-wise.

##### Args:

* `x`: A `Tensor`. Must be one of the following types: `float32`, `float64`, `int32`, `complex64`, `int64`.
* `y`: A `Tensor`. Must have the same type as `x`.
* `name`: A name for the operation (optional).

##### Returns:

A `Tensor`. Has the same type as `x`.

- - -

### `tf.mul(x, y, name=None)`

Returns x * y element-wise.

##### Args:

* `x`: A `Tensor`. Must be one of the following types: `float32`, `float64`, `int8`, `int16`, `int32`, `complex64`, `int64`.
* `y`: A `Tensor`. Must have the same type as `x`.
* `name`: A name for the operation (optional).

##### Returns:

A `Tensor`. Has the same type as `x`.

- - -

### `tf.div(x, y, name=None)`

Returns x / y element-wise.

##### Args:

* `x`: A `Tensor`. Must be one of the following types: `float32`, `float64`, `int32`, `complex64`, `int64`.
* `y`: A `Tensor`. Must have the same type as `x`.
* `name`: A name for the operation (optional).

##### Returns:

A `Tensor`. Has the same type as `x`.

- - -

### `tf.mod(x, y, name=None)`

Returns element-wise remainder of division.

##### Args:

* `x`: A `Tensor`. Must be one of the following types: `int32`, `int64`, `float32`, `float64`.
* `y`: A `Tensor`. Must have the same type as `x`.
* `name`: A name for the operation (optional).

##### Returns:

A `Tensor`. Has the same type as `x`.
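
The note under `tf.add` says the op supports broadcasting but does not spell out how shapes combine. The following is a minimal pure-Python sketch of the NumPy-style rule that TensorFlow's broadcasting follows. The helper name `broadcast_shape` is hypothetical and is not part of the TensorFlow API; it only computes the result shape and performs no arithmetic.

```python
def broadcast_shape(shape_x, shape_y):
    """Return the shape two operands broadcast to (hypothetical helper).

    Shapes are aligned at their trailing dimensions. Two dimensions are
    compatible when they are equal or when one of them is 1.
    """
    result = []
    for i in range(1, max(len(shape_x), len(shape_y)) + 1):
        # A missing leading dimension behaves like a dimension of size 1.
        dx = shape_x[-i] if i <= len(shape_x) else 1
        dy = shape_y[-i] if i <= len(shape_y) else 1
        if dx != dy and dx != 1 and dy != 1:
            raise ValueError(
                "incompatible shapes: %r vs %r" % (shape_x, shape_y))
        # The dimension of size 1 is stretched to match the other one.
        result.append(dy if dx == 1 else dx)
    return list(reversed(result))


print(broadcast_shape([2, 3], [3]))     # a row vector added to each row
print(broadcast_shape([2, 1], [1, 4]))  # both operands are stretched
```

Under this rule a `[2, 3]` tensor plus a `[3]` tensor yields a `[2, 3]` result, while `[2, 3]` plus `[4]` is rejected.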
## Basic Math Functions

TensorFlow provides several operations that you can use to add basic mathematical functions to your graph.

- - -

### `tf.add_n(inputs, name=None)`

Adds all input tensors element-wise.

##### Args:

* `inputs`: A list of at least 1 `Tensor` objects of the same type in: `float32`, `float64`, `int64`, `int32`, `uint8`, `int16`, `int8`, `complex64`, `qint8`, `quint8`, `qint32`. Must all be the same size and shape.
* `name`: A name for the operation (optional).

##### Returns:

A `Tensor`. Has the same type as `inputs`.

- - -

### `tf.abs(x, name=None)`

Computes the absolute value of a tensor.

Given a tensor of real numbers `x`, this operation returns a tensor containing the absolute value of each element in `x`. For example, if x is an input element and y is an output element, this operation computes \\(y = |x|\\).

See [`tf.complex_abs()`](#complex_abs) to compute the absolute value of a complex number.

##### Args:

* `x`: A `Tensor` of type `float`, `double`, `int32`, or `int64`.
* `name`: A name for the operation (optional).

##### Returns:

A `Tensor` the same size and type as `x` with absolute values.

- - -

### `tf.neg(x, name=None)`

Computes numerical negative value element-wise.

I.e., \\(y = -x\\).

##### Args:

* `x`: A `Tensor`. Must be one of the following types: `float32`, `float64`, `int32`, `complex64`, `int64`.
* `name`: A name for the operation (optional).

##### Returns:

A `Tensor`. Has the same type as `x`.

- - -

### `tf.sign(x, name=None)`

Returns an element-wise indication of the sign of a number.

y = sign(x) = -1 if x < 0; 0 if x == 0; 1 if x > 0.

##### Args:

* `x`: A `Tensor`. Must be one of the following types: `float32`, `float64`, `int32`, `int64`.
* `name`: A name for the operation (optional).

##### Returns:

A `Tensor`. Has the same type as `x`.

- - -

### `tf.inv(x, name=None)`

Computes the reciprocal of x element-wise.

I.e., \\(y = 1 / x\\).

##### Args:

* `x`: A `Tensor`.
Must be one of the following types: `float32`, `float64`, `int32`, `complex64`, `int64`.
* `name`: A name for the operation (optional).

##### Returns:

A `Tensor`. Has the same type as `x`.

- - -

### `tf.square(x, name=None)`

Computes square of x element-wise.

I.e., \\(y = x * x = x^2\\).

##### Args:

* `x`: A `Tensor`. Must be one of the following types: `float32`, `float64`, `int32`, `complex64`, `int64`.
* `name`: A name for the operation (optional).

##### Returns:

A `Tensor`. Has the same type as `x`.

- - -

### `tf.round(x, name=None)`

Rounds the values of a tensor to the nearest integer, element-wise.

For example:

```python
# 'a' is [0.9, 2.5, 2.3, -4.4]
tf.round(a) ==> [ 1.0, 3.0, 2.0, -4.0 ]
```

##### Args:

* `x`: A `Tensor` of type `float` or `double`.
* `name`: A name for the operation (optional).

##### Returns:

A `Tensor` of same shape and type as `x`.

- - -

### `tf.sqrt(x, name=None)`

Computes square root of x element-wise.

I.e., \\(y = \sqrt{x} = x^{1/2}\\).

##### Args:

* `x`: A `Tensor`. Must be one of the following types: `float32`, `float64`, `int32`, `complex64`, `int64`.
* `name`: A name for the operation (optional).

##### Returns:

A `Tensor`. Has the same type as `x`.

- - -

### `tf.rsqrt(x, name=None)`

Computes reciprocal of square root of x element-wise.

I.e., \\(y = 1 / \sqrt{x}\\).

##### Args:

* `x`: A `Tensor`. Must be one of the following types: `float32`, `float64`, `int32`, `complex64`, `int64`.
* `name`: A name for the operation (optional).

##### Returns:

A `Tensor`. Has the same type as `x`.

- - -

### `tf.pow(x, y, name=None)`

Computes the power of one value to another.

Given a tensor `x` and a tensor `y`, this operation computes \\(x^y\\) for corresponding elements in `x` and `y`. For example:

```
# tensor 'x' is [[2, 2], [3, 3]]
# tensor 'y' is [[8, 16], [2, 3]]
tf.pow(x, y) ==> [[256, 65536], [9, 27]]
```

##### Args:

* `x`: A `Tensor` of type `float`, `double`, `int32`, `complex64`, or `int64`.
* `y`: A `Tensor` of type `float`, `double`, `int32`, `complex64`, or `int64`.
* `name`: A name for the operation (optional).

##### Returns:

A `Tensor`.

- - -

### `tf.exp(x, name=None)`

Computes exponential of x element-wise. \\(y = e^x\\).

##### Args:

* `x`: A `Tensor`. Must be one of the following types: `float32`, `float64`, `int32`, `complex64`, `int64`.
* `name`: A name for the operation (optional).

##### Returns:

A `Tensor`. Has the same type as `x`.

- - -

### `tf.log(x, name=None)`

Computes natural logarithm of x element-wise.

I.e., \\(y = \log_e x\\).

##### Args:

* `x`: A `Tensor`. Must be one of the following types: `float32`, `float64`, `int32`, `complex64`, `int64`.
* `name`: A name for the operation (optional).

##### Returns:

A `Tensor`. Has the same type as `x`.

- - -

### `tf.ceil(x, name=None)`

Returns element-wise smallest integer not less than x.

##### Args:

* `x`: A `Tensor`. Must be one of the following types: `float32`, `float64`.
* `name`: A name for the operation (optional).

##### Returns:

A `Tensor`. Has the same type as `x`.

- - -

### `tf.floor(x, name=None)`

Returns element-wise largest integer not greater than x.

##### Args:

* `x`: A `Tensor`. Must be one of the following types: `float32`, `float64`.
* `name`: A name for the operation (optional).

##### Returns:

A `Tensor`. Has the same type as `x`.

- - -

### `tf.maximum(x, y, name=None)`

Returns the max of x and y (i.e. x > y ? x : y) element-wise, broadcasts.

##### Args:

* `x`: A `Tensor`. Must be one of the following types: `float32`, `float64`, `int32`, `int64`.
* `y`: A `Tensor`. Must have the same type as `x`.
* `name`: A name for the operation (optional).

##### Returns:

A `Tensor`. Has the same type as `x`.

- - -

### `tf.minimum(x, y, name=None)`

Returns the min of x and y (i.e. x < y ? x : y) element-wise, broadcasts.

##### Args:

* `x`: A `Tensor`. Must be one of the following types: `float32`, `float64`, `int32`, `int64`.
* `y`: A `Tensor`. Must have the same type as `x`.
* `name`: A name for the operation (optional).

##### Returns:

A `Tensor`. Has the same type as `x`.

- - -

### `tf.cos(x, name=None)`

Computes cos of x element-wise.

##### Args:

* `x`: A `Tensor`. Must be one of the following types: `float32`, `float64`, `int32`, `complex64`, `int64`.
* `name`: A name for the operation (optional).

##### Returns:

A `Tensor`. Has the same type as `x`.

- - -

### `tf.sin(x, name=None)`

Computes sin of x element-wise.

##### Args:

* `x`: A `Tensor`. Must be one of the following types: `float32`, `float64`, `int32`, `complex64`, `int64`.
* `name`: A name for the operation (optional).

##### Returns:

A `Tensor`. Has the same type as `x`.

## Matrix Math Functions

TensorFlow provides several operations that you can use to add basic mathematical functions for matrices to your graph.

- - -

### `tf.diag(diagonal, name=None)`

Returns a diagonal tensor with the given diagonal values.

Given a `diagonal`, this operation returns a tensor with the `diagonal` and everything else padded with zeros. The diagonal is computed as follows:

Assume `diagonal` has dimensions [D1,..., Dk], then the output is a tensor of rank 2k with dimensions [D1,..., Dk, D1,..., Dk] where:

`output[i1,..., ik, i1,..., ik] = diagonal[i1, ..., ik]` and 0 everywhere else.

For example:

```prettyprint
# 'diagonal' is [1, 2, 3, 4]
tf.diag(diagonal) ==> [[1, 0, 0, 0]
                       [0, 2, 0, 0]
                       [0, 0, 3, 0]
                       [0, 0, 0, 4]]
```

##### Args:

* `diagonal`: A `Tensor`. Must be one of the following types: `float32`, `float64`, `int32`, `int64`. Rank k tensor where k is at most 3.
* `name`: A name for the operation (optional).

##### Returns:

A `Tensor`. Has the same type as `diagonal`.

- - -

### `tf.transpose(a, perm=None, name='transpose')`

Transposes `a`. Permutes the dimensions according to `perm`.

The returned tensor's dimension i will correspond to the input dimension `perm[i]`. If `perm` is not given, it is set to (n-1...0), where n is the rank of the input tensor.
Hence by default, this operation performs a regular matrix transpose on 2-D input Tensors.

For example:

```python
# 'x' is [[1 2 3]
#         [4 5 6]]
tf.transpose(x) ==> [[1 4]
                     [2 5]
                     [3 6]]

# Equivalently
tf.transpose(x, perm=[1, 0]) ==> [[1 4]
                                  [2 5]
                                  [3 6]]

# 'perm' is more useful for n-dimensional tensors, for n > 2
# 'x' is   [[[1  2  3]
#            [4  5  6]]
#           [[7  8  9]
#            [10 11 12]]]
# Take the transpose of the matrices in dimension-0
tf.transpose(x, perm=[0, 2, 1]) ==> [[[1  4]
                                      [2  5]
                                      [3  6]]

                                     [[7 10]
                                      [8 11]
                                      [9 12]]]
```

##### Args:

* `a`: A `Tensor`.
* `perm`: A permutation of the dimensions of `a`.
* `name`: A name for the operation (optional).

##### Returns:

A transposed `Tensor`.

- - -

### `tf.matmul(a, b, transpose_a=False, transpose_b=False, a_is_sparse=False, b_is_sparse=False, name=None)`

Multiplies matrix `a` by matrix `b`, producing `a` * `b`.

The inputs must be two-dimensional matrices, with matching inner dimensions, possibly after transposition.

Both matrices must be of the same type. The supported types are: `float`, `double`, `int32`, `complex64`.

Either matrix can be transposed on the fly by setting the corresponding flag to `True`. This is `False` by default.

If one or both of the matrices contain a lot of zeros, a more efficient multiplication algorithm can be used by setting the corresponding `a_is_sparse` or `b_is_sparse` flag to `True`. These are `False` by default.

For example:

```python
# 2-D tensor `a`
a = tf.constant([1, 2, 3, 4, 5, 6], shape=[2, 3]) => [[1. 2. 3.]
                                                      [4. 5. 6.]]
# 2-D tensor `b`
b = tf.constant([7, 8, 9, 10, 11, 12], shape=[3, 2]) => [[7. 8.]
                                                         [9. 10.]
                                                         [11. 12.]]
c = tf.matmul(a, b) => [[58 64]
                        [139 154]]
```

##### Args:

* `a`: `Tensor` of type `float`, `double`, `int32` or `complex64`.
* `b`: `Tensor` with same type as `a`.
* `transpose_a`: If `True`, `a` is transposed before multiplication.
* `transpose_b`: If `True`, `b` is transposed before multiplication.
* `a_is_sparse`: If `True`, `a` is treated as a sparse matrix.
* `b_is_sparse`: If `True`, `b` is treated as a sparse matrix.
* `name`: Name for the operation (optional).

##### Returns:

A `Tensor` of the same type as `a`.

- - -

### `tf.batch_matmul(x, y, adj_x=None, adj_y=None, name=None)`

Multiplies slices of two tensors in batches.

Multiplies all slices of `Tensor` `x` and `y` (each slice can be viewed as an element of a batch), and arranges the individual results in a single output tensor of the same batch size. Each of the individual slices can optionally be adjointed (to adjoint a matrix means to transpose and conjugate it) before multiplication by setting the `adj_x` or `adj_y` flag to `True`; both flags are `False` by default.

The input tensors `x` and `y` are 3-D or higher with shape `[..., r_x, c_x]` and `[..., r_y, c_y]`.

The output tensor is 3-D or higher with shape `[..., r_o, c_o]`, where:

    r_o = c_x if adj_x else r_x
    c_o = r_y if adj_y else c_y

It is computed as:

    out[..., :, :] = matrix(x[..., :, :]) * matrix(y[..., :, :])

##### Args:

* `x`: A `Tensor`. Must be one of the following types: `float32`, `float64`, `int32`, `complex64`. 3-D or higher with shape `[..., r_x, c_x]`.
* `y`: A `Tensor`. Must have the same type as `x`. 3-D or higher with shape `[..., r_y, c_y]`.
* `adj_x`: An optional `bool`. Defaults to `False`. If `True`, adjoint the slices of `x`.
* `adj_y`: An optional `bool`. Defaults to `False`. If `True`, adjoint the slices of `y`.
* `name`: A name for the operation (optional).

##### Returns:

A `Tensor`. Has the same type as `x`. 3-D or higher with shape `[..., r_o, c_o]`.

- - -

### `tf.matrix_determinant(input, name=None)`

Calculates the determinant of a square matrix.

##### Args:

* `input`: A `Tensor`. Must be one of the following types: `float32`, `float64`. A tensor of shape `[M, M]`.
* `name`: A name for the operation (optional).

##### Returns:

A `Tensor`. Has the same type as `input`. A scalar, equal to the determinant of the input.
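
The output-shape rule given for `tf.batch_matmul` above can be sketched in a few lines of pure Python. The helper name `batch_matmul_shape` is hypothetical and not part of TensorFlow. The checks that the batch dimensions are equal and that the inner dimensions agree are assumptions based on ordinary matrix multiplication, not something the text above states explicitly.

```python
def batch_matmul_shape(shape_x, shape_y, adj_x=False, adj_y=False):
    """Return the output shape [..., r_o, c_o] (hypothetical helper)."""
    if len(shape_x) < 3 or len(shape_y) < 3:
        raise ValueError("inputs must be 3-D or higher")
    batch_x, (r_x, c_x) = list(shape_x[:-2]), shape_x[-2:]
    batch_y, (r_y, c_y) = list(shape_y[:-2]), shape_y[-2:]
    # Assumption: both operands carry identical batch dimensions.
    if batch_x != batch_y:
        raise ValueError("batch dimensions differ")
    # The rule quoted in the documentation above.
    r_o = c_x if adj_x else r_x
    c_o = r_y if adj_y else c_y
    # Assumption: the contracted (inner) dimensions must agree.
    inner_x = r_x if adj_x else c_x
    inner_y = c_y if adj_y else r_y
    if inner_x != inner_y:
        raise ValueError("inner dimensions differ")
    return batch_x + [r_o, c_o]


print(batch_matmul_shape([10, 2, 3], [10, 3, 4]))
print(batch_matmul_shape([10, 3, 2], [10, 3, 4], adj_x=True))
```

Both calls describe ten products of a 2x3 matrix with a 3x4 matrix; in the second call the slices of `x` are stored transposed and `adj_x` flips them back.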
- - -

### `tf.batch_matrix_determinant(input, name=None)`

Calculates the determinants for a batch of square matrices.

The input is a tensor of shape `[..., M, M]` whose inner-most 2 dimensions form square matrices. The output is a 1-D tensor containing the determinants for all input submatrices `[..., :, :]`.

##### Args:

* `input`: A `Tensor`. Must be one of the following types: `float32`, `float64`. Shape is `[..., M, M]`.
* `name`: A name for the operation (optional).

##### Returns:

A `Tensor`. Has the same type as `input`. Shape is `[...]`.

- - -

### `tf.matrix_inverse(input, name=None)`

Calculates the inverse of a square invertible matrix. Checks for invertibility.

##### Args:

* `input`: A `Tensor`. Must be one of the following types: `float32`, `float64`. Shape is `[M, M]`.
* `name`: A name for the operation (optional).

##### Returns:

A `Tensor`. Has the same type as `input`. Shape is `[M, M]` containing the matrix inverse of the input.

- - -

### `tf.batch_matrix_inverse(input, name=None)`

Calculates the inverse of square invertible matrices. Checks for invertibility.

The input is a tensor of shape `[..., M, M]` whose inner-most 2 dimensions form square matrices. The output is a tensor of the same shape as the input containing the inverse for all input submatrices `[..., :, :]`.

##### Args:

* `input`: A `Tensor`. Must be one of the following types: `float32`, `float64`. Shape is `[..., M, M]`.
* `name`: A name for the operation (optional).

##### Returns:

A `Tensor`. Has the same type as `input`. Shape is `[..., M, M]`.

- - -

### `tf.cholesky(input, name=None)`

Calculates the Cholesky decomposition of a square matrix.

The input has to be symmetric and positive definite. Only the lower-triangular part of the input will be used for this operation. The upper-triangular part will not be read.

The result is the lower-triangular matrix of the Cholesky decomposition of the input.

##### Args:

* `input`: A `Tensor`.
Must be one of the following types: `float64`, `float32`. Shape is `[M, M]`.
* `name`: A name for the operation (optional).

##### Returns:

A `Tensor`. Has the same type as `input`. Shape is `[M, M]`.

- - -

### `tf.batch_cholesky(input, name=None)`

Calculates the Cholesky decomposition of a batch of square matrices.

The input is a tensor of shape `[..., M, M]` whose inner-most 2 dimensions form square matrices, with the same constraints as the single matrix Cholesky decomposition above. The output is a tensor of the same shape as the input containing the Cholesky decompositions for all input submatrices `[..., :, :]`.

##### Args:

* `input`: A `Tensor`. Must be one of the following types: `float64`, `float32`. Shape is `[..., M, M]`.
* `name`: A name for the operation (optional).

##### Returns:

A `Tensor`. Has the same type as `input`. Shape is `[..., M, M]`.

## Complex Number Functions

TensorFlow provides several operations that you can use to add complex number functions to your graph.

- - -

### `tf.complex(real, imag, name=None)`

Converts two real numbers to a complex number.

Given a tensor `real` representing the real part of a complex number, and a tensor `imag` representing the imaginary part of a complex number, this operation computes complex numbers elementwise of the form \\(a + bj\\), where *a* represents the `real` part and *b* represents the `imag` part.

The input tensors `real` and `imag` must be the same shape.

For example:

```
# tensor 'real' is [2.25, 3.25]
# tensor 'imag' is [4.75, 5.75]
tf.complex(real, imag) ==> [[2.25 + 4.75j], [3.25 + 5.75j]]
```

##### Args:

* `real`: A `Tensor` of type `float`.
* `imag`: A `Tensor` of type `float`.
* `name`: A name for the operation (optional).

##### Returns:

A `Tensor` of type `complex64`.

- - -

### `tf.complex_abs(x, name=None)`

Computes the complex absolute value of a tensor.
Given a tensor `x` of complex numbers, this operation returns a tensor of type `float` that is the absolute value of each element in `x`. All elements in `x` must be complex numbers of the form \\(a + bj\\). The absolute value is computed as \\( \sqrt{a^2 + b^2}\\).

For example:

```
# tensor 'x' is [[-2.25 + 4.75j], [-3.25 + 5.75j]]
tf.complex_abs(x) ==> [5.25594902, 6.60492229]
```

##### Args:

* `x`: A `Tensor` of type `complex64`.
* `name`: A name for the operation (optional).

##### Returns:

A `Tensor` of type `float32`.

- - -

### `tf.conj(in_, name=None)`

Returns the complex conjugate of a complex number.

Given a tensor `in` of complex numbers, this operation returns a tensor of complex numbers that are the complex conjugate of each element in `in`. The complex numbers in `in` must be of the form \\(a + bj\\), where *a* is the real part and *b* is the imaginary part.

The complex conjugate returned by this operation is of the form \\(a - bj\\).

For example:

```
# tensor 'in' is [-2.25 + 4.75j, 3.25 + 5.75j]
tf.conj(in) ==> [-2.25 - 4.75j, 3.25 - 5.75j]
```

##### Args:

* `in_`: A `Tensor` of type `complex64`.
* `name`: A name for the operation (optional).

##### Returns:

A `Tensor` of type `complex64`.

- - -

### `tf.imag(in_, name=None)`

Returns the imaginary part of a complex number.

Given a tensor `in` of complex numbers, this operation returns a tensor of type `float` that is the imaginary part of each element in `in`. All elements in `in` must be complex numbers of the form \\(a + bj\\), where *a* is the real part and *b* is the imaginary part returned by this operation.

For example:

```
# tensor 'in' is [-2.25 + 4.75j, 3.25 + 5.75j]
tf.imag(in) ==> [4.75, 5.75]
```

##### Args:

* `in_`: A `Tensor` of type `complex64`.
* `name`: A name for the operation (optional).

##### Returns:

A `Tensor` of type `float32`.

- - -

### `tf.real(in_, name=None)`

Returns the real part of a complex number.
Given a tensor `in` of complex numbers, this operation returns a tensor of type `float` that is the real part of each element in `in`. All elements in `in` must be complex numbers of the form \\(a + bj\\), where *a* is the real part returned by this operation and *b* is the imaginary part.

For example:

```
# tensor 'in' is [-2.25 + 4.75j, 3.25 + 5.75j]
tf.real(in) ==> [-2.25, 3.25]
```

##### Args:

* `in_`: A `Tensor` of type `complex64`.
* `name`: A name for the operation (optional).

##### Returns:

A `Tensor` of type `float32`.

## Reduction

TensorFlow provides several operations that you can use to perform common math computations that reduce various dimensions of a tensor.

- - -

### `tf.reduce_sum(input_tensor, reduction_indices=None, keep_dims=False, name=None)`

Computes the sum of elements across dimensions of a tensor.

Reduces `input_tensor` along the dimensions given in `reduction_indices`. Unless `keep_dims` is true, the rank of the tensor is reduced by 1 for each entry in `reduction_indices`. If `keep_dims` is true, the reduced dimensions are retained with length 1.

If `reduction_indices` has no entries, all dimensions are reduced, and a tensor with a single element is returned.

For example:

```python
# 'x' is [[1, 1, 1]
#         [1, 1, 1]]
tf.reduce_sum(x) ==> 6
tf.reduce_sum(x, 0) ==> [2, 2, 2]
tf.reduce_sum(x, 1) ==> [3, 3]
tf.reduce_sum(x, 1, keep_dims=True) ==> [[3], [3]]
tf.reduce_sum(x, [0, 1]) ==> 6
```

##### Args:

* `input_tensor`: The tensor to reduce. Should have numeric type.
* `reduction_indices`: The dimensions to reduce. If `None` (the default), reduces all dimensions.
* `keep_dims`: If true, retains reduced dimensions with length 1.
* `name`: A name for the operation (optional).

##### Returns:

The reduced tensor.

- - -

### `tf.reduce_prod(input_tensor, reduction_indices=None, keep_dims=False, name=None)`

Computes the product of elements across dimensions of a tensor.

Reduces `input_tensor` along the dimensions given in `reduction_indices`.
Unless `keep_dims` is true, the rank of the tensor is reduced by 1 for each entry in `reduction_indices`. If `keep_dims` is true, the reduced dimensions are retained with length 1.

If `reduction_indices` has no entries, all dimensions are reduced, and a tensor with a single element is returned.

##### Args:

* `input_tensor`: The tensor to reduce. Should have numeric type.
* `reduction_indices`: The dimensions to reduce. If `None` (the default), reduces all dimensions.
* `keep_dims`: If true, retains reduced dimensions with length 1.
* `name`: A name for the operation (optional).

##### Returns:

The reduced tensor.

- - -

### `tf.reduce_min(input_tensor, reduction_indices=None, keep_dims=False, name=None)`

Computes the minimum of elements across dimensions of a tensor.

Reduces `input_tensor` along the dimensions given in `reduction_indices`. Unless `keep_dims` is true, the rank of the tensor is reduced by 1 for each entry in `reduction_indices`. If `keep_dims` is true, the reduced dimensions are retained with length 1.

If `reduction_indices` has no entries, all dimensions are reduced, and a tensor with a single element is returned.

##### Args:

* `input_tensor`: The tensor to reduce. Should have numeric type.
* `reduction_indices`: The dimensions to reduce. If `None` (the default), reduces all dimensions.
* `keep_dims`: If true, retains reduced dimensions with length 1.
* `name`: A name for the operation (optional).

##### Returns:

The reduced tensor.

- - -

### `tf.reduce_max(input_tensor, reduction_indices=None, keep_dims=False, name=None)`

Computes the maximum of elements across dimensions of a tensor.

Reduces `input_tensor` along the dimensions given in `reduction_indices`. Unless `keep_dims` is true, the rank of the tensor is reduced by 1 for each entry in `reduction_indices`. If `keep_dims` is true, the reduced dimensions are retained with length 1.

If `reduction_indices` has no entries, all dimensions are reduced, and a tensor with a single element is returned.
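
To make the effect of `reduction_indices` and `keep_dims` concrete, here is a minimal pure-Python sketch of a max-reduction over a 2-D nested list. The function name `reduce_max_2d` and its `axis` parameter are hypothetical stand-ins chosen for illustration; this is not how TensorFlow implements the op, and it handles only the 2-D case.

```python
def reduce_max_2d(rows, axis=None, keep_dims=False):
    """Max-reduce a non-empty 2-D list (hypothetical helper).

    axis=None reduces every dimension, axis=0 reduces down the columns,
    and axis=1 reduces across each row.
    """
    if axis is None:
        value = max(max(row) for row in rows)
        # With keep_dims, both reduced dimensions survive with length 1.
        return [[value]] if keep_dims else value
    if axis == 0:
        out = [max(col) for col in zip(*rows)]
        return [out] if keep_dims else out
    if axis == 1:
        out = [max(row) for row in rows]
        return [[v] for v in out] if keep_dims else out
    raise ValueError("axis must be None, 0, or 1")


x = [[1, 5, 3],
     [4, 2, 6]]
print(reduce_max_2d(x))                          # scalar
print(reduce_max_2d(x, axis=0))                  # one value per column
print(reduce_max_2d(x, axis=1, keep_dims=True))  # shape [2, 1]
```

Without `keep_dims` the reduced dimension disappears from the result; with it, the result keeps rank 2 and can still be combined with the original input by broadcasting.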
##### Args:

* `input_tensor`: The tensor to reduce. Should have numeric type.
* `reduction_indices`: The dimensions to reduce. If `None` (the default), reduces all dimensions.
* `keep_dims`: If true, retains reduced dimensions with length 1.
* `name`: A name for the operation (optional).

##### Returns:

The reduced tensor.

- - -

### `tf.reduce_mean(input_tensor, reduction_indices=None, keep_dims=False, name=None)`

Computes the mean of elements across dimensions of a tensor.

Reduces `input_tensor` along the dimensions given in `reduction_indices`. Unless `keep_dims` is true, the rank of the tensor is reduced by 1 for each entry in `reduction_indices`. If `keep_dims` is true, the reduced dimensions are retained with length 1.

If `reduction_indices` has no entries, all dimensions are reduced, and a tensor with a single element is returned.

For example:

```python
# 'x' is [[1., 1.]
#         [2., 2.]]
tf.reduce_mean(x) ==> 1.5
tf.reduce_mean(x, 0) ==> [1.5, 1.5]
tf.reduce_mean(x, 1) ==> [1., 2.]
```

##### Args:

* `input_tensor`: The tensor to reduce. Should have numeric type.
* `reduction_indices`: The dimensions to reduce. If `None` (the default), reduces all dimensions.
* `keep_dims`: If true, retains reduced dimensions with length 1.
* `name`: A name for the operation (optional).

##### Returns:

The reduced tensor.

- - -

### `tf.reduce_all(input_tensor, reduction_indices=None, keep_dims=False, name=None)`

Computes the "logical and" of elements across dimensions of a tensor.

Reduces `input_tensor` along the dimensions given in `reduction_indices`. Unless `keep_dims` is true, the rank of the tensor is reduced by 1 for each entry in `reduction_indices`. If `keep_dims` is true, the reduced dimensions are retained with length 1.

If `reduction_indices` has no entries, all dimensions are reduced, and a tensor with a single element is returned.
For example:

```python
# 'x' is [[True,  True]
#         [False, False]]
tf.reduce_all(x) ==> False
tf.reduce_all(x, 0) ==> [False, False]
tf.reduce_all(x, 1) ==> [True, False]
```

##### Args:

* `input_tensor`: The boolean tensor to reduce.
* `reduction_indices`: The dimensions to reduce. If `None` (the default), reduces all dimensions.
* `keep_dims`: If true, retains reduced dimensions with length 1.
* `name`: A name for the operation (optional).

##### Returns:

The reduced tensor.

- - -

### `tf.reduce_any(input_tensor, reduction_indices=None, keep_dims=False, name=None)`

Computes the "logical or" of elements across dimensions of a tensor.

Reduces `input_tensor` along the dimensions given in `reduction_indices`. Unless `keep_dims` is true, the rank of the tensor is reduced by 1 for each entry in `reduction_indices`. If `keep_dims` is true, the reduced dimensions are retained with length 1.

If `reduction_indices` has no entries, all dimensions are reduced, and a tensor with a single element is returned.

For example:

```python
# 'x' is [[True,  True]
#         [False, False]]
tf.reduce_any(x) ==> True
tf.reduce_any(x, 0) ==> [True, True]
tf.reduce_any(x, 1) ==> [True, False]
```

##### Args:

* `input_tensor`: The boolean tensor to reduce.
* `reduction_indices`: The dimensions to reduce. If `None` (the default), reduces all dimensions.
* `keep_dims`: If true, retains reduced dimensions with length 1.
* `name`: A name for the operation (optional).

##### Returns:

The reduced tensor.

- - -

### `tf.accumulate_n(inputs, shape=None, tensor_dtype=None, name=None)`

Returns the element-wise sum of a list of tensors.

Optionally, pass `shape` and `tensor_dtype` for shape and type checking; otherwise, these are inferred.
For example:

```python
# tensor 'a' is [[1, 2], [3, 4]]
# tensor 'b' is [[5, 0], [0, 6]]
tf.accumulate_n([a, b, a]) ==> [[7, 4], [6, 14]]

# Explicitly pass shape and type
tf.accumulate_n([a, b, a], shape=[2, 2], tensor_dtype=tf.int32)
  ==> [[7, 4], [6, 14]]
```

##### Args:

* `inputs`: A list of `Tensor` objects, each with the same shape and type.
* `shape`: Shape of elements of `inputs`.
* `tensor_dtype`: The type of `inputs`.
* `name`: A name for the operation (optional).

##### Returns:

A `Tensor` of the same shape and type as the elements of `inputs`.

##### Raises:

* `ValueError`: If `inputs` don't all have the same shape and dtype, or if the shape cannot be inferred.

## Segmentation

TensorFlow provides several operations that you can use to perform common math computations on tensor segments. Here a segmentation is a partitioning of a tensor along the first dimension, i.e. it defines a mapping from the first dimension onto `segment_ids`. The `segment_ids` tensor should be the size of the first dimension, `d0`, with consecutive IDs in the range `0` to `k`, where `k < d0`. In particular, a segmentation of a matrix tensor is a mapping of rows to segments.

For example:

```python
c = tf.constant([[1,2,3,4], [-1,-2,-3,-4], [5,6,7,8]])
tf.segment_sum(c, tf.constant([0, 0, 1]))
  ==> [[0 0 0 0]
       [5 6 7 8]]
```

- - -

### `tf.segment_sum(data, segment_ids, name=None)`

Computes the sum along segments of a tensor.

Read [the section on Segmentation](../../api_docs/python/math_ops.md#segmentation) for an explanation of segments.

Computes a tensor such that \\(output_i = \sum_j data_j\\) where the sum is over `j` such that `segment_ids[j] == i`.
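The segment sum can be sketched in plain Python as a reference for the semantics. This is not TensorFlow's kernel; it assumes `data` is a list of equal-length rows and that `segment_ids` is sorted with one ID per row.

```python
def segment_sum(data, segment_ids):
    """Reference sketch: sum the rows of `data` that share a segment ID."""
    if len(data) != len(segment_ids):
        raise ValueError("need exactly one segment ID per row of data")
    if not segment_ids:
        return []
    # IDs are sorted, so the last one is the largest.
    num_segments = segment_ids[-1] + 1
    width = len(data[0])
    output = [[0] * width for _ in range(num_segments)]
    for row, seg in zip(data, segment_ids):
        for col, value in enumerate(row):
            output[seg][col] += value
    return output


c = [[1, 2, 3, 4], [-1, -2, -3, -4], [5, 6, 7, 8]]
print(segment_sum(c, [0, 0, 1]))  # [[0, 0, 0, 0], [5, 6, 7, 8]]
```

Rows 0 and 1 fall into segment 0 and cancel out, while row 2 forms segment 1 on its own.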
##### Args:

* `data`: A `Tensor`. Must be one of the following types: `float32`, `float64`, `int32`, `int64`, `uint8`, `int16`, `int8`.
* `segment_ids`: A `Tensor`. Must be one of the following types: `int32`, `int64`. A 1-D tensor whose size is equal to the size of `data`'s first dimension. Values should be sorted and can be repeated.
* `name`: A name for the operation (optional).

##### Returns:

A `Tensor`. Has the same type as `data`. Has the same shape as `data`, except for dimension_0 which has size `k`, the number of segments.

- - -

### `tf.segment_prod(data, segment_ids, name=None)`

Computes the product along segments of a tensor.

Read [the section on Segmentation](../../api_docs/python/math_ops.md#segmentation) for an explanation of segments.

Computes a tensor such that \\(output_i = \prod_j data_j\\) where the product is over `j` such that `segment_ids[j] == i`.
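As a hedged illustration, the product reduction can be written as a generic fold over rows that share a segment ID; the same fold also covers the `min` and `max` variants. The helper `segment_reduce` is hypothetical and assumes sorted, consecutive segment IDs with no gaps.

```python
from functools import reduce
from itertools import groupby
import operator


def segment_reduce(data, segment_ids, combine):
    """Sketch: fold rows sharing a (sorted) segment ID with `combine`."""
    output = []
    paired = zip(segment_ids, data)
    for _, group in groupby(paired, key=lambda pair: pair[0]):
        rows = [row for _, row in group]
        # Combine the rows of this segment column by column.
        output.append([reduce(combine, column) for column in zip(*rows)])
    return output


c = [[1, 2, 3, 4], [-1, -2, -3, -4], [5, 6, 7, 8]]
ids = [0, 0, 1]
print(segment_reduce(c, ids, operator.mul))  # [[-1, -4, -9, -16], [5, 6, 7, 8]]
print(segment_reduce(c, ids, min))           # [[-1, -2, -3, -4], [5, 6, 7, 8]]
print(segment_reduce(c, ids, max))           # [[1, 2, 3, 4], [5, 6, 7, 8]]
```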
##### Args:

* `data`: A `Tensor`. Must be one of the following types: `float32`, `float64`, `int32`, `int64`, `uint8`, `int16`, `int8`.
* `segment_ids`: A `Tensor`. Must be one of the following types: `int32`, `int64`. A 1-D tensor whose size is equal to the size of `data`'s first dimension. Values should be sorted and can be repeated.
* `name`: A name for the operation (optional).

##### Returns:

A `Tensor`. Has the same type as `data`. Has the same shape as `data`, except for dimension_0 which has size `k`, the number of segments.

- - -

### `tf.segment_min(data, segment_ids, name=None)`

Computes the minimum along segments of a tensor.

Read [the section on Segmentation](../../api_docs/python/math_ops.md#segmentation) for an explanation of segments.

Computes a tensor such that \\(output_i = \min_j(data_j)\\) where `min` is over `j` such that `segment_ids[j] == i`.
##### Args:

* `data`: A `Tensor`. Must be one of the following types: `float32`, `float64`, `int32`, `int64`, `uint8`, `int16`, `int8`.
* `segment_ids`: A `Tensor`. Must be one of the following types: `int32`, `int64`. A 1-D tensor whose size is equal to the size of `data`'s first dimension. Values should be sorted and can be repeated.
* `name`: A name for the operation (optional).

##### Returns:

A `Tensor`. Has the same type as `data`. Has the same shape as `data`, except for dimension_0 which has size `k`, the number of segments.

- - -

### `tf.segment_max(data, segment_ids, name=None)`

Computes the maximum along segments of a tensor.

Read [the section on Segmentation](../../api_docs/python/math_ops.md#segmentation) for an explanation of segments.

Computes a tensor such that \\(output_i = \max_j(data_j)\\) where `max` is over `j` such that `segment_ids[j] == i`.
##### Args:

* `data`: A `Tensor`. Must be one of the following types: `float32`, `float64`, `int32`, `int64`, `uint8`, `int16`, `int8`.
* `segment_ids`: A `Tensor`. Must be one of the following types: `int32`, `int64`. A 1-D tensor whose size is equal to the size of `data`'s first dimension. Values should be sorted and can be repeated.
* `name`: A name for the operation (optional).

##### Returns:

A `Tensor`. Has the same type as `data`. Has the same shape as `data`, except for dimension_0 which has size `k`, the number of segments.

- - -

### `tf.segment_mean(data, segment_ids, name=None)`

Computes the mean along segments of a tensor.

Read [the section on Segmentation](../../api_docs/python/math_ops.md#segmentation) for an explanation of segments.

Computes a tensor such that \\(output_i = \frac{\sum_j data_j}{N}\\) where the mean is over `j` such that `segment_ids[j] == i` and `N` is the number of values summed in that segment.
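A minimal sketch of the segment mean, assuming `N` counts the rows that fall into each segment. It is a plain-Python reference for the formula rather than TensorFlow's implementation, and it assumes sorted, consecutive segment IDs so that every segment holds at least one row.

```python
def segment_mean(data, segment_ids):
    """Reference sketch: per-segment mean of the rows of `data`."""
    num_segments = segment_ids[-1] + 1  # IDs are sorted
    width = len(data[0])
    totals = [[0.0] * width for _ in range(num_segments)]
    counts = [0] * num_segments
    for row, seg in zip(data, segment_ids):
        counts[seg] += 1
        for col, value in enumerate(row):
            totals[seg][col] += value
    # Divide each segment's column sums by the number of rows in it.
    return [[t / n for t in row] for row, n in zip(totals, counts)]


data = [[1.0, 2.0], [3.0, 6.0], [5.0, 5.0]]
print(segment_mean(data, [0, 0, 1]))  # [[2.0, 4.0], [5.0, 5.0]]
```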
##### Args:

* `data`: A `Tensor`. Must be one of the following types: `float32`, `float64`, `int32`, `int64`, `uint8`, `int16`, `int8`.
* `segment_ids`: A `Tensor`. Must be one of the following types: `int32`, `int64`. A 1-D tensor whose size is equal to the size of `data`'s first dimension. Values should be sorted and can be repeated.
* `name`: A name for the operation (optional).

##### Returns:

A `Tensor`. Has the same type as `data`. Has the same shape as `data`, except for dimension_0 which has size `k`, the number of segments.

- - -

### `tf.unsorted_segment_sum(data, segment_ids, num_segments, name=None)`

Computes the sum along segments of a tensor.

Read [the section on Segmentation](../../api_docs/python/math_ops.md#segmentation) for an explanation of segments.

Computes a tensor such that \\(output_i = \sum_j data_j\\) where the sum is over `j` such that `segment_ids[j] == i`. Unlike `SegmentSum`, `segment_ids` need not be sorted and need not cover all values in the full range of valid values.

If the sum is empty for a given segment ID `i`, `output[i] = 0`.

`num_segments` should equal the number of distinct segment IDs.
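The unsorted variant can be sketched as follows. This is a plain-Python reference rather than TensorFlow's implementation; it preallocates `num_segments` output rows so that IDs may arrive in any order and an ID that never occurs simply leaves a row of zeros.

```python
def unsorted_segment_sum(data, segment_ids, num_segments):
    """Reference sketch: segment sum without requiring sorted IDs."""
    width = len(data[0])
    output = [[0] * width for _ in range(num_segments)]
    for row, seg in zip(data, segment_ids):
        if not 0 <= seg < num_segments:
            raise ValueError("segment ID %d is out of range" % seg)
        for col, value in enumerate(row):
            output[seg][col] += value
    return output


c = [[1, 2, 3, 4], [-1, -2, -3, -4], [5, 6, 7, 8]]
# Rows 0 and 2 go to segment 1, row 1 goes to segment 0, segment 2 is empty.
print(unsorted_segment_sum(c, [1, 0, 1], 3))
# [[-1, -2, -3, -4], [6, 8, 10, 12], [0, 0, 0, 0]]
```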
##### Args:

* `data`: A `Tensor`. Must be one of the following types: `float32`, `float64`, `int32`, `int64`, `uint8`, `int16`, `int8`.
* `segment_ids`: A `Tensor`. Must be one of the following types: `int32`, `int64`. A 1-D tensor whose size is equal to the size of `data`'s first dimension.
* `num_segments`: A `Tensor` of type `int32`.
* `name`: A name for the operation (optional).

##### Returns:

A `Tensor`. Has the same type as `data`. Has the same shape as `data`, except for dimension_0 which has size `num_segments`.

- - -

### `tf.sparse_segment_sum(data, indices, segment_ids, name=None)`

Computes the sum along sparse segments of a tensor.

Read [the section on Segmentation](../../api_docs/python/math_ops.md#segmentation) for an explanation of segments.

Like `SegmentSum`, but `segment_ids` can have fewer entries than `data`'s first dimension, selecting a subset of dimension_0, specified by `indices`.

For example:

```prettyprint
c = tf.constant([[1,2,3,4], [-1,-2,-3,-4], [5,6,7,8]])

# Select two rows, one segment.
tf.sparse_segment_sum(c, tf.constant([0, 1]), tf.constant([0, 0]))
  ==> [[0 0 0 0]]

# Select two rows, two segments.
tf.sparse_segment_sum(c, tf.constant([0, 1]), tf.constant([0, 1]))
  ==> [[ 1  2  3  4]
       [-1 -2 -3 -4]]

# Select all rows, two segments.
tf.sparse_segment_sum(c, tf.constant([0, 1, 2]), tf.constant([0, 0, 1]))
  ==> [[0 0 0 0]
       [5 6 7 8]]

# Which is equivalent to:
tf.segment_sum(c, tf.constant([0, 0, 1]))
```

##### Args:

* `data`: A `Tensor`. Must be one of the following types: `float32`, `float64`, `int32`, `int64`, `uint8`, `int16`, `int8`.
* `indices`: A `Tensor` of type `int32`. A 1-D tensor. Has the same length as `segment_ids`.
* `segment_ids`: A `Tensor` of type `int32`. A 1-D tensor. Values should be sorted and can be repeated.
* `name`: A name for the operation (optional).

##### Returns:

A `Tensor`. Has the same type as `data`. Has the same shape as `data`, except for dimension_0 which has size `k`, the number of segments.
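One way to read `tf.sparse_segment_sum` is as "gather the rows named by `indices`, then take a segment sum". The plain-Python sketch below follows that reading. It is an illustration of the semantics, not the actual kernel, and it assumes sorted `segment_ids`.

```python
def sparse_segment_sum(data, indices, segment_ids):
    """Reference sketch: sum selected rows of `data` into segments."""
    if len(indices) != len(segment_ids):
        raise ValueError("indices and segment_ids must have the same length")
    num_segments = segment_ids[-1] + 1  # IDs are sorted
    width = len(data[0])
    output = [[0] * width for _ in range(num_segments)]
    for idx, seg in zip(indices, segment_ids):
        # Only the rows listed in `indices` contribute to the result.
        for col, value in enumerate(data[idx]):
            output[seg][col] += value
    return output


c = [[1, 2, 3, 4], [-1, -2, -3, -4], [5, 6, 7, 8]]
print(sparse_segment_sum(c, [0, 1], [0, 0]))        # [[0, 0, 0, 0]]
print(sparse_segment_sum(c, [0, 1], [0, 1]))        # [[1, 2, 3, 4], [-1, -2, -3, -4]]
print(sparse_segment_sum(c, [0, 1, 2], [0, 0, 1]))  # [[0, 0, 0, 0], [5, 6, 7, 8]]
```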
- - - ### `tf.sparse_segment_mean(data, indices, segment_ids, name=None)` Computes the mean along sparse segments of a tensor. Read [the section on Segmentation](../../api_docs/python/math_ops.md#segmentation) for an explanation of segments. Like `SegmentMean`, but `segment_ids` can have rank less than `data`'s first dimension, selecting a subset of dimension_0, specified by `indices`. ##### Args: * `data`: A `Tensor`. Must be one of the following types: `float32`, `float64`. * `indices`: A `Tensor` of type `int32`. A 1-D tensor. Has same rank as `segment_ids`. * `segment_ids`: A `Tensor` of type `int32`. A 1-D tensor. Values should be sorted and can be repeated. * `name`: A name for the operation (optional). ##### Returns: A `Tensor`. Has the same type as `data`. Has same shape as data, except for dimension_0 which has size `k`, the number of segments. ## Sequence Comparison and Indexing TensorFlow provides several operations that you can use to add sequence comparison and index extraction to your graph. You can use these operations to determine sequence differences and determine the indexes of specific values in a tensor. - - - ### `tf.argmin(input, dimension, name=None)` Returns the index with the smallest value across dimensions of a tensor. ##### Args: * `input`: A `Tensor`. Must be one of the following types: `float32`, `float64`, `int64`, `int32`, `uint8`, `int16`, `int8`, `complex64`, `qint8`, `quint8`, `qint32`. * `dimension`: A `Tensor` of type `int32`. int32, 0 <= dimension < rank(input). Describes which dimension of the input Tensor to reduce across. For vectors, use dimension = 0. * `name`: A name for the operation (optional). ##### Returns: A `Tensor` of type `int64`. - - - ### `tf.argmax(input, dimension, name=None)` Returns the index with the largest value across dimensions of a tensor. ##### Args: * `input`: A `Tensor`. 
Must be one of the following types: `float32`, `float64`, `int64`, `int32`, `uint8`, `int16`, `int8`, `complex64`, `qint8`, `quint8`, `qint32`. * `dimension`: A `Tensor` of type `int32`. int32, 0 <= dimension < rank(input). Describes which dimension of the input Tensor to reduce across. For vectors, use dimension = 0. * `name`: A name for the operation (optional). ##### Returns: A `Tensor` of type `int64`. - - - ### `tf.listdiff(x, y, name=None)` Computes the difference between two lists of numbers. Given a list `x` and a list `y`, this operation returns a list `out` that represents all numbers that are in `x` but not in `y`. The returned list `out` is sorted in the same order that the numbers appear in `x` (duplicates are preserved). This operation also returns a list `idx` that represents the position of each `out` element in `x`. In other words: `out[i] = x[idx[i]] for i in [0, 1, ..., len(out) - 1]` For example, given this input: ```prettyprint x = [1, 2, 3, 4, 5, 6] y = [1, 3, 5] ``` This operation would return: ```prettyprint out ==> [2, 4, 6] idx ==> [1, 3, 5] ``` ##### Args: * `x`: A `Tensor`. 1-D. Values to keep. * `y`: A `Tensor`. Must have the same type as `x`. 1-D. Values to remove. * `name`: A name for the operation (optional). ##### Returns: A tuple of `Tensor` objects (out, idx). * `out`: A `Tensor`. Has the same type as `x`. 1-D. Values present in `x` but not in `y`. * `idx`: A `Tensor` of type `int32`. 1-D. Positions of `x` values preserved in `out`. - - - ### `tf.where(input, name=None)` Returns locations of true values in a boolean tensor. This operation returns the coordinates of true elements in `input`. The coordinates are returned in a 2-D tensor where the first dimension (rows) represents the number of true elements, and the second dimension (columns) represents the coordinates of the true elements. Keep in mind, the shape of the output tensor can vary depending on how many true values there are in `input`. 
Indices are output in row-major order.

For example:

```prettyprint
# 'input' tensor is [[True, False]
#                    [True, False]]
# 'input' has two true values, so output has two coordinates.
# 'input' has rank of 2, so coordinates have two indices.
where(input) ==> [[0, 0],
                  [1, 0]]

# 'input' tensor is [[[True, False]
#                     [True, False]]
#                    [[False, True]
#                     [False, True]]
#                    [[False, False]
#                     [False, True]]]
# 'input' has 5 true values, so output has 5 coordinates.
# 'input' has rank of 3, so coordinates have three indices.
where(input) ==> [[0, 0, 0],
                  [0, 1, 0],
                  [1, 0, 1],
                  [1, 1, 1],
                  [2, 1, 1]]
```

##### Args:

* `input`: A `Tensor` of type `bool`.
* `name`: A name for the operation (optional).

##### Returns:

A `Tensor` of type `int64`.

- - -

### `tf.unique(x, name=None)`

Finds unique elements in a 1-D tensor.

This operation returns a tensor `y` containing all of the unique elements of `x` sorted in the same order that they occur in `x`. This operation also returns a tensor `idx` the same size as `x` that contains the index of each value of `x` in the unique output `y`. In other words:

`y[idx[i]] = x[i] for i in [0, 1, ..., len(x) - 1]`

For example:

```prettyprint
# tensor 'x' is [1, 1, 2, 4, 4, 4, 7, 8, 8]
y, idx = unique(x)
y ==> [1, 2, 4, 7, 8]
idx ==> [0, 0, 1, 2, 2, 2, 3, 4, 4]
```

##### Args:

* `x`: A `Tensor`. 1-D.
* `name`: A name for the operation (optional).

##### Returns:

A tuple of `Tensor` objects (y, idx).

* `y`: A `Tensor`. Has the same type as `x`. 1-D.
* `idx`: A `Tensor` of type `int32`. 1-D.

- - -

### `tf.edit_distance(hypothesis, truth, normalize=True, name='edit_distance')`

Computes the Levenshtein distance between sequences.

This operation takes variable-length sequences (`hypothesis` and `truth`), each provided as a `SparseTensor`, and computes the Levenshtein distance. You can normalize the edit distance by the length of `truth` by setting `normalize` to true.
For example, given the following input:

```python
# 'hypothesis' is a tensor of shape `[2, 1]` with variable-length values:
#   (0,0) = ["a"]
#   (1,0) = ["b"]
hypothesis = tf.SparseTensor(
    [[0, 0, 0],
     [1, 0, 0]],
    ["a", "b"],
    (2, 1, 1))

# 'truth' is a tensor of shape `[2, 2]` with variable-length values:
#   (0,0) = []
#   (0,1) = ["a"]
#   (1,0) = ["b", "c"]
#   (1,1) = ["a"]
truth = tf.SparseTensor(
    [[0, 1, 0],
     [1, 0, 0],
     [1, 0, 1],
     [1, 1, 0]],
    ["a", "b", "c", "a"],
    (2, 2, 2))

normalize = True
```

This operation would return the following:

```python
# 'output' is a tensor of shape `[2, 2]` with edit distances normalized
# by 'truth' lengths.
output ==> [[inf, 1.0],  # (0,0): no truth, (0,1): no hypothesis
            [0.5, 1.0]]  # (1,0): addition, (1,1): no hypothesis
```

##### Args:

* `hypothesis`: A `SparseTensor` containing hypothesis sequences.
* `truth`: A `SparseTensor` containing truth sequences.
* `normalize`: A `bool`. If `True`, normalizes the Levenshtein distance by the length of `truth`.
* `name`: A name for the operation (optional).

##### Returns:

A dense `Tensor` with rank `R - 1`, where R is the rank of the `SparseTensor` inputs `hypothesis` and `truth`.

##### Raises:

* `TypeError`: If either `hypothesis` or `truth` is not a `SparseTensor`.

- - -

### `tf.invert_permutation(x, name=None)`

Computes the inverse permutation of a tensor.

This operation computes the inverse of an index permutation. It takes a 1-D integer tensor `x`, which represents the indices of a zero-based array, and swaps each value with its index position. In other words, for an output tensor `y` and an input tensor `x`, this operation computes the following:

`y[x[i]] = i for i in [0, 1, ..., len(x) - 1]`

The values must include 0. There can be no duplicate values or negative values.

For example:

```prettyprint
# tensor `x` is [3, 4, 0, 2, 1]
invert_permutation(x) ==> [2, 4, 3, 0, 1]
```

##### Args:

* `x`: A `Tensor` of type `int32`. 1-D.
* `name`: A name for the operation (optional).
##### Returns: A `Tensor` of type `int32`. 1-D. ================================================ FILE: SOURCE/api_docs/python/nn.md ================================================ # Neural Network Note: Functions taking `Tensor` arguments can also take anything accepted by [`tf.convert_to_tensor`](../../api_docs/python/framework.md#convert_to_tensor). ## Contents ### [Neural Network](#AUTOGENERATED-neural-network) * [Activation Functions](#AUTOGENERATED-activation-functions) * [`tf.nn.relu(features, name=None)`](#relu) * [`tf.nn.relu6(features, name=None)`](#relu6) * [`tf.nn.softplus(features, name=None)`](#softplus) * [`tf.nn.dropout(x, keep_prob, noise_shape=None, seed=None, name=None)`](#dropout) * [`tf.nn.bias_add(value, bias, name=None)`](#bias_add) * [`tf.sigmoid(x, name=None)`](#sigmoid) * [`tf.tanh(x, name=None)`](#tanh) * [Convolution](#AUTOGENERATED-convolution) * [`tf.nn.conv2d(input, filter, strides, padding, use_cudnn_on_gpu=None, name=None)`](#conv2d) * [`tf.nn.depthwise_conv2d(input, filter, strides, padding, name=None)`](#depthwise_conv2d) * [`tf.nn.separable_conv2d(input, depthwise_filter, pointwise_filter, strides, padding, name=None)`](#separable_conv2d) * [Pooling](#AUTOGENERATED-pooling) * [`tf.nn.avg_pool(value, ksize, strides, padding, name=None)`](#avg_pool) * [`tf.nn.max_pool(value, ksize, strides, padding, name=None)`](#max_pool) * [`tf.nn.max_pool_with_argmax(input, ksize, strides, padding, Targmax=None, name=None)`](#max_pool_with_argmax) * [Normalization](#AUTOGENERATED-normalization) * [`tf.nn.l2_normalize(x, dim, epsilon=1e-12, name=None)`](#l2_normalize) * [`tf.nn.local_response_normalization(input, depth_radius=None, bias=None, alpha=None, beta=None, name=None)`](#local_response_normalization) * [`tf.nn.moments(x, axes, name=None)`](#moments) * [Losses](#AUTOGENERATED-losses) * [`tf.nn.l2_loss(t, name=None)`](#l2_loss) * [Classification](#AUTOGENERATED-classification) * [`tf.nn.sigmoid_cross_entropy_with_logits(logits, targets, 
name=None)`](#sigmoid_cross_entropy_with_logits) * [`tf.nn.softmax(logits, name=None)`](#softmax) * [`tf.nn.softmax_cross_entropy_with_logits(logits, labels, name=None)`](#softmax_cross_entropy_with_logits) * [Embeddings](#AUTOGENERATED-embeddings) * [`tf.nn.embedding_lookup(params, ids, name=None)`](#embedding_lookup) * [Evaluation](#AUTOGENERATED-evaluation) * [`tf.nn.top_k(input, k, name=None)`](#top_k) * [`tf.nn.in_top_k(predictions, targets, k, name=None)`](#in_top_k) * [Candidate Sampling](#AUTOGENERATED-candidate-sampling) * [Sampled Loss Functions](#AUTOGENERATED-sampled-loss-functions) * [`tf.nn.nce_loss(weights, biases, inputs, labels, num_sampled, num_classes, num_true=1, sampled_values=None, remove_accidental_hits=False, name='nce_loss')`](#nce_loss) * [`tf.nn.sampled_softmax_loss(weights, biases, inputs, labels, num_sampled, num_classes, num_true=1, sampled_values=None, remove_accidental_hits=True, name='sampled_softmax_loss')`](#sampled_softmax_loss) * [Candidate Samplers](#AUTOGENERATED-candidate-samplers) * [`tf.nn.uniform_candidate_sampler(true_classes, num_true, num_sampled, unique, range_max, seed=None, name=None)`](#uniform_candidate_sampler) * [`tf.nn.log_uniform_candidate_sampler(true_classes, num_true, num_sampled, unique, range_max, seed=None, name=None)`](#log_uniform_candidate_sampler) * [`tf.nn.learned_unigram_candidate_sampler(true_classes, num_true, num_sampled, unique, range_max, seed=None, name=None)`](#learned_unigram_candidate_sampler) * [`tf.nn.fixed_unigram_candidate_sampler(true_classes, num_true, num_sampled, unique, range_max, vocab_file='', distortion=0.0, num_reserved_ids=0, num_shards=1, shard=0, unigrams=[], seed=None, name=None)`](#fixed_unigram_candidate_sampler) * [Miscellaneous candidate sampling utilities](#AUTOGENERATED-miscellaneous-candidate-sampling-utilities) * [`tf.nn.compute_accidental_hits(true_classes, sampled_candidates, num_true, seed=None, name=None)`](#compute_accidental_hits) ## Activation Functions The 
activation ops provide different types of nonlinearities for use in neural networks. These include smooth nonlinearities (`sigmoid`, `tanh`, and `softplus`), continuous but not everywhere differentiable functions (`relu`, `relu6`, and `relu_x`), and random regularization (`dropout`). All activation ops apply componentwise, and produce a tensor of the same shape as the input tensor. - - - ### `tf.nn.relu(features, name=None)` Computes rectified linear: `max(features, 0)`. ##### Args: * `features`: A `Tensor`. Must be one of the following types: `float32`, `float64`, `int32`, `int64`, `uint8`, `int16`, `int8`. * `name`: A name for the operation (optional). ##### Returns: A `Tensor`. Has the same type as `features`. - - - ### `tf.nn.relu6(features, name=None)` Computes Rectified Linear 6: `min(max(features, 0), 6)`. ##### Args: * `features`: A `Tensor` with type `float`, `double`, `int32`, `int64`, `uint8`, `int16`, or `int8`. * `name`: A name for the operation (optional). ##### Returns: A `Tensor` with the same type as `features`. - - - ### `tf.nn.softplus(features, name=None)` Computes softplus: `log(exp(features) + 1)`. ##### Args: * `features`: A `Tensor`. Must be one of the following types: `float32`, `float64`, `int32`, `int64`, `uint8`, `int16`, `int8`. * `name`: A name for the operation (optional). ##### Returns: A `Tensor`. Has the same type as `features`. - - - ### `tf.nn.dropout(x, keep_prob, noise_shape=None, seed=None, name=None)` Computes dropout. With probability `keep_prob`, outputs the input element scaled up by `1 / keep_prob`, otherwise outputs `0`. The scaling is so that the expected sum is unchanged. By default, each element is kept or dropped independently. If `noise_shape` is specified, it must be [broadcastable](http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html) to the shape of `x`, and only dimensions with `noise_shape[i] == shape(x)[i]` will make independent decisions. 
For example, if `shape(x) = [k, l, m, n]` and `noise_shape = [k, 1, 1, n]`, each batch and channel component will be kept independently and each row and column will be kept or not kept together. ##### Args: * `x`: A tensor. * `keep_prob`: A Python float. The probability that each element is kept. * `noise_shape`: A 1-D `Tensor` of type `int32`, representing the shape for randomly generated keep/drop flags. * `seed`: A Python integer. Used to create random seeds. See [`set_random_seed`](../../api_docs/python/constant_op.md#set_random_seed) for behavior. * `name`: A name for this operation (optional). ##### Returns: A Tensor of the same shape of `x`. ##### Raises: * `ValueError`: If `keep_prob` is not in `(0, 1]`. - - - ### `tf.nn.bias_add(value, bias, name=None)` Adds `bias` to `value`. This is (mostly) a special case of `tf.add` where `bias` is restricted to 1-D. Broadcasting is supported, so `value` may have any number of dimensions. Unlike `tf.add`, the type of `bias` is allowed to differ from `value` in the case where both types are quantized. ##### Args: * `value`: A `Tensor` with type `float`, `double`, `int64`, `int32`, `uint8`, `int16`, `int8`, or `complex64`. * `bias`: A 1-D `Tensor` with size matching the last dimension of `value`. Must be the same type as `value` unless `value` is a quantized type, in which case a different quantized type may be used. * `name`: A name for the operation (optional). ##### Returns: A `Tensor` with the same type as `value`. - - - ### `tf.sigmoid(x, name=None)` Computes sigmoid of `x` element-wise. Specifically, `y = 1 / (1 + exp(-x))`. ##### Args: * `x`: A Tensor with type `float`, `double`, `int32`, `complex64`, `int64`, or `qint32`. * `name`: A name for the operation (optional). ##### Returns: A Tensor with the same type as `x` if `x.dtype != qint32` otherwise the return type is `quint8`. - - - ### `tf.tanh(x, name=None)` Computes hyperbolic tangent of `x` element-wise. 
##### Args: * `x`: A Tensor with type `float`, `double`, `int32`, `complex64`, `int64`, or `qint32`. * `name`: A name for the operation (optional). ##### Returns: A Tensor with the same type as `x` if `x.dtype != qint32` otherwise the return type is `quint8`. ## Convolution The convolution ops sweep a 2-D filter over a batch of images, applying the filter to each window of each image of the appropriate size. The different ops trade off between generic vs. specific filters: * `conv2d`: Arbitrary filters that can mix channels together. * `depthwise_conv2d`: Filters that operate on each channel independently. * `separable_conv2d`: A depthwise spatial filter followed by a pointwise filter. Note that although these ops are called "convolution", they are strictly speaking "cross-correlation" since the filter is combined with an input window without reversing the filter. For details, see [the properties of cross-correlation](https://en.wikipedia.org/wiki/Cross-correlation#Properties). The filter is applied to image patches of the same size as the filter and strided according to the `strides` argument. `strides = [1, 1, 1, 1]` applies the filter to a patch at every offset, `strides = [1, 2, 2, 1]` applies the filter to every other image patch in each dimension, etc. Ignoring channels for the moment, the spatial semantics of the convolution ops are as follows. If the 4-D `input` has shape `[batch, in_height, in_width, ...]` and the 4-D `filter` has shape `[filter_height, filter_width, ...]`, then shape(output) = [batch, (in_height - filter_height + 1) / strides[1], (in_width - filter_width + 1) / strides[2], ...] output[b, i, j, :] = sum_{di, dj} input[b, strides[1] * i + di, strides[2] * j + dj, ...] * filter[di, dj, ...] Since `input` is 4-D, each `input[b, i, j, :]` is a vector. For `conv2d`, these vectors are multiplied by the `filter[di, dj, :, :]` matrices to produce new vectors. 
For `depthwise_conv2d`, each scalar component `input[b, i, j, k]` is multiplied by a vector `filter[di, dj, k]`, and all the vectors are concatenated.

In the formula for `shape(output)`, the result depends on padding:

* `padding = 'VALID'`: No padding is added and only full-size windows are considered. The quotient in the formula is rounded up.
* `padding = 'SAME'`: The input is zero-padded so that partial windows at the borders are included. Each spatial output size is the input size divided by the stride, rounded up, regardless of the filter size.

- - -

### `tf.nn.conv2d(input, filter, strides, padding, use_cudnn_on_gpu=None, name=None)`

Computes a 2-D convolution given 4-D `input` and `filter` tensors.

Given an input tensor of shape `[batch, in_height, in_width, in_channels]` and a filter / kernel tensor of shape `[filter_height, filter_width, in_channels, out_channels]`, this op performs the following:

1. Flattens the filter to a 2-D matrix with shape `[filter_height * filter_width * in_channels, out_channels]`.
2. Extracts image patches from the input tensor to form a *virtual* tensor of shape `[batch, out_height, out_width, filter_height * filter_width * in_channels]`.
3. For each patch, right-multiplies the filter matrix and the image patch vector.

In detail,

    output[b, i, j, k] =
        sum_{di, dj, q} input[b, strides[1] * i + di, strides[2] * j + dj, q] *
                        filter[di, dj, q, k]

Must have `strides[0] = strides[3] = 1`. For the most common case of the same horizontal and vertical strides, `strides = [1, stride, stride, 1]`.

##### Args:

* `input`: A `Tensor`. Must be one of the following types: `float32`, `float64`.
* `filter`: A `Tensor`. Must have the same type as `input`.
* `strides`: A list of `ints`. 1-D of length 4. The stride of the sliding window for each dimension of `input`.
* `padding`: A `string` from: `"SAME", "VALID"`. The type of padding algorithm to use.
* `use_cudnn_on_gpu`: An optional `bool`. Defaults to `True`.
* `name`: A name for the operation (optional).

##### Returns:

A `Tensor`. Has the same type as `input`.

- - -

### `tf.nn.depthwise_conv2d(input, filter, strides, padding, name=None)`

Depthwise 2-D convolution.
Given an input tensor of shape `[batch, in_height, in_width, in_channels]` and a filter tensor of shape `[filter_height, filter_width, in_channels, channel_multiplier]` containing `in_channels` convolutional filters of depth 1, `depthwise_conv2d` applies a different filter to each input channel (expanding from 1 channel to `channel_multiplier` channels for each), then concatenates the results together. The output has `in_channels * channel_multiplier` channels.

In detail,

    output[b, i, j, k * channel_multiplier + q] =
        sum_{di, dj} input[b, strides[1] * i + di, strides[2] * j + dj, k] *
                     filter[di, dj, k, q]

Must have `strides[0] = strides[3] = 1`. For the most common case of the same horizontal and vertical strides, `strides = [1, stride, stride, 1]`.

##### Args:

* `input`: 4-D with shape `[batch, in_height, in_width, in_channels]`.
* `filter`: 4-D with shape `[filter_height, filter_width, in_channels, channel_multiplier]`.
* `strides`: 1-D of size 4. The stride of the sliding window for each dimension of `input`.
* `padding`: A string, either `'VALID'` or `'SAME'`. The padding algorithm.
* `name`: A name for this operation (optional).

##### Returns:

A 4-D `Tensor` of shape `[batch, out_height, out_width, in_channels * channel_multiplier]`.

- - -

### `tf.nn.separable_conv2d(input, depthwise_filter, pointwise_filter, strides, padding, name=None)`

2-D convolution with separable filters.

Performs a depthwise convolution that acts separately on channels followed by a pointwise convolution that mixes channels. Note that this is separability between dimensions `[1, 2]` and `3`, not spatial separability between dimensions `1` and `2`.

In detail,

    output[b, i, j, k] = sum_{di, dj, q, r}
        input[b, strides[1] * i + di, strides[2] * j + dj, q] *
        depthwise_filter[di, dj, q, r] *
        pointwise_filter[0, 0, q * channel_multiplier + r, k]

`strides` controls the strides for the depthwise convolution only, since the pointwise convolution has implicit strides of `[1, 1, 1, 1]`.
Must have `strides[0] = strides[3] = 1`. For the most common case of the same horizontal and vertical strides, `strides = [1, stride, stride, 1]`.

##### Args:

* `input`: 4-D `Tensor` with shape `[batch, in_height, in_width, in_channels]`.
* `depthwise_filter`: 4-D `Tensor` with shape `[filter_height, filter_width, in_channels, channel_multiplier]`. Contains `in_channels` convolutional filters of depth 1.
* `pointwise_filter`: 4-D `Tensor` with shape `[1, 1, channel_multiplier * in_channels, out_channels]`. Pointwise filter to mix channels after `depthwise_filter` has convolved spatially.
* `strides`: 1-D of size 4. The strides for the depthwise convolution for each dimension of `input`.
* `padding`: A string, either `'VALID'` or `'SAME'`. The padding algorithm.
* `name`: A name for this operation (optional).

##### Returns:

A 4-D `Tensor` of shape `[batch, out_height, out_width, out_channels]`.

## Pooling

The pooling ops sweep a rectangular window over the input tensor, computing a reduction operation for each window (average, max, or max with argmax). Each pooling op uses rectangular windows of size `ksize` separated by offset `strides`. For example, if `strides` is all ones every window is used, if `strides` is all twos every other window is used in each dimension, etc.

In detail, the output is

    output[i] = reduce(value[strides * i:strides * i + ksize])

for each tuple of indices `i`. The output shape is

    shape(output) = (shape(value) - ksize + 1) / strides

where the result depends on padding:

* `padding = 'VALID'`: No padding is added and only full-size windows are considered. The quotient above is rounded up.
* `padding = 'SAME'`: The input is padded so that partial windows at the borders are included. The output size is instead `shape(value) / strides`, rounded up, regardless of `ksize`.

- - -

### `tf.nn.avg_pool(value, ksize, strides, padding, name=None)`

Performs the average pooling on the input.

Each entry in `output` is the mean of the corresponding size `ksize` window in `value`.
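A one-dimensional sketch may help make the window and padding arithmetic concrete. It is a simplification and not TensorFlow's implementation: the helper `avg_pool_1d` is hypothetical, it pools a flat list instead of a 4-D tensor, and it assumes that under `'SAME'` padding a partial window is averaged over the elements that actually exist.

```python
import math


def avg_pool_1d(values, ksize, stride, padding):
    """Sketch of average pooling over a flat list of numbers."""
    n = len(values)
    if padding == "VALID":
        # Only windows that fit entirely inside the input.
        out_len = math.ceil((n - ksize + 1) / stride)
        pad_before = 0
    elif padding == "SAME":
        # Output length depends only on the input length and the stride.
        out_len = math.ceil(n / stride)
        total_pad = max((out_len - 1) * stride + ksize - n, 0)
        pad_before = total_pad // 2
    else:
        raise ValueError("padding must be 'VALID' or 'SAME'")
    output = []
    for i in range(out_len):
        start = i * stride - pad_before
        # Clip the window to the real data (assumption: padding is ignored).
        window = values[max(start, 0):min(start + ksize, n)]
        output.append(sum(window) / len(window))
    return output


x = [1.0, 2.0, 3.0, 4.0, 5.0]
print(avg_pool_1d(x, ksize=2, stride=2, padding="VALID"))  # [1.5, 3.5]
print(avg_pool_1d(x, ksize=2, stride=2, padding="SAME"))   # [1.5, 3.5, 5.0]
```

With `'VALID'` the trailing element is dropped because no full window covers it, while `'SAME'` keeps it as a partial window.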
##### Args: * `value`: A 4-D `Tensor` of shape `[batch, height, width, channels]` and type `float32`, `float64`, `qint8`, `quint8`, or `qint32`. * `ksize`: A list of ints that has length >= 4. The size of the window for each dimension of the input tensor. * `strides`: A list of ints that has length >= 4. The stride of the sliding window for each dimension of the input tensor. * `padding`: A string, either `'VALID'` or `'SAME'`. The padding algorithm. * `name`: Optional name for the operation. ##### Returns: A `Tensor` with the same type as `value`. The average pooled output tensor. - - - ### `tf.nn.max_pool(value, ksize, strides, padding, name=None)` Performs the max pooling on the input. ##### Args: * `value`: A 4-D `Tensor` with shape `[batch, height, width, channels]` and type `float32`, `float64`, `qint8`, `quint8`, `qint32`. * `ksize`: A list of ints that has length >= 4. The size of the window for each dimension of the input tensor. * `strides`: A list of ints that has length >= 4. The stride of the sliding window for each dimension of the input tensor. * `padding`: A string, either `'VALID'` or `'SAME'`. The padding algorithm. * `name`: Optional name for the operation. ##### Returns: A `Tensor` with the same type as `value`. The max pooled output tensor. - - - ### `tf.nn.max_pool_with_argmax(input, ksize, strides, padding, Targmax=None, name=None)` Performs max pooling on the input and outputs both max values and indices. The indices in `argmax` are flattened, so that a maximum value at position `[b, y, x, c]` becomes flattened index `((b * height + y) * width + x) * channels + c`. ##### Args: * `input`: A `Tensor` of type `float32`. 4-D with shape `[batch, height, width, channels]`. Input to pool over. * `ksize`: A list of `ints` that has length `>= 4`. The size of the window for each dimension of the input tensor. * `strides`: A list of `ints` that has length `>= 4`. The stride of the sliding window for each dimension of the input tensor. 
* `padding`: A `string` from: `"SAME", "VALID"`. The type of padding algorithm to use. * `Targmax`: An optional `tf.DType` from: `tf.int32, tf.int64`. Defaults to `tf.int64`. * `name`: A name for the operation (optional). ##### Returns: A tuple of `Tensor` objects (output, argmax). * `output`: A `Tensor` of type `float32`. The max pooled output tensor. * `argmax`: A `Tensor` of type `Targmax`. 4-D. The flattened indices of the max values chosen for each output. ## Normalization Normalization is useful to prevent neurons from saturating when inputs may have varying scale, and to aid generalization. - - - ### `tf.nn.l2_normalize(x, dim, epsilon=1e-12, name=None)` Normalizes along dimension `dim` using an L2 norm. For a 1-D tensor with `dim = 0`, computes `output = x / sqrt(max(sum(x**2), epsilon))` For `x` with more dimensions, independently normalizes each 1-D slice along dimension `dim`. ##### Args: * `x`: A `Tensor`. * `dim`: Dimension along which to normalize. * `epsilon`: A lower bound value for the norm. Will use `sqrt(epsilon)` as the divisor if `norm < sqrt(epsilon)`. * `name`: A name for this operation (optional). ##### Returns: A `Tensor` with the same shape as `x`. - - - ### `tf.nn.local_response_normalization(input, depth_radius=None, bias=None, alpha=None, beta=None, name=None)` Local Response Normalization. The 4-D `input` tensor is treated as a 3-D array of 1-D vectors (along the last dimension), and each vector is normalized independently. Within a given vector, each component is divided by a power of the weighted sum of squared inputs within `depth_radius`. In detail, `sqr_sum[a, b, c, d] = sum(input[a, b, c, d - depth_radius : d + depth_radius + 1] ** 2)` `output = input / (bias + alpha * sqr_sum) ** beta` For details, see [Krizhevsky et al., ImageNet classification with deep convolutional neural networks (NIPS 2012)](http://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks). 
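
The normalization of a single depth vector can be sketched in plain Python as follows. This is only an illustration under two assumptions: the exponent `beta` is applied to the whole term `bias + alpha * sqr_sum`, as in the cited Krizhevsky et al. paper, and the window is clipped at the ends of the vector. The function name `local_response_normalize` is hypothetical.

```python
def local_response_normalize(vector, depth_radius=5, bias=1.0, alpha=1.0, beta=0.5):
    """Normalize one 1-D depth vector (a plain list of floats).

    Hypothetical helper; assumes the window is clipped at the vector edges
    and that beta applies to the whole (bias + alpha * sqr_sum) term.
    """
    output = []
    for d, x in enumerate(vector):
        # Window vector[d - depth_radius : d + depth_radius + 1], clipped.
        lo = max(0, d - depth_radius)
        hi = min(len(vector), d + depth_radius + 1)
        sqr_sum = sum(v * v for v in vector[lo:hi])
        output.append(x / (bias + alpha * sqr_sum) ** beta)
    return output


print(local_response_normalize([3.0, 4.0], depth_radius=1, bias=0.0))
```

With `bias=0.0`, `alpha=1.0` and `beta=0.5`, every component is divided by the Euclidean norm of its window, so the example vector `[3.0, 4.0]` is scaled by `1 / 5`.
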
##### Args: * `input`: A `Tensor` of type `float32`. 4-D. * `depth_radius`: An optional `int`. Defaults to `5`. 0-D. Half-width of the 1-D normalization window. * `bias`: An optional `float`. Defaults to `1`. An offset (usually positive to avoid dividing by 0). * `alpha`: An optional `float`. Defaults to `1`. A scale factor, usually positive. * `beta`: An optional `float`. Defaults to `0.5`. An exponent. * `name`: A name for the operation (optional). ##### Returns: A `Tensor` of type `float32`. - - - ### `tf.nn.moments(x, axes, name=None)` Calculate the mean and variance of `x`. The mean and variance are calculated by aggregating the contents of `x` across `axes`. If `x` is 1-D and `axes = [0]` this is just the mean and variance of a vector. For so-called "global normalization" needed for convolutional filters pass `axes=[0, 1, 2]` (batch, height, width). For batch normalization pass `axes=[0]` (batch). ##### Args: * `x`: A `Tensor`. * `axes`: array of ints. Axes along which to compute mean and variance. * `name`: Name used to scope the operations that compute the moments. ##### Returns: Two `Tensors`: `mean` and `variance`. ## Losses The loss ops measure error between two tensors, or between a tensor and zero. These can be used for measuring accuracy of a network in a regression task or for regularization purposes (weight decay). - - - ### `tf.nn.l2_loss(t, name=None)` L2 Loss. Computes half the L2 norm of a tensor without the `sqrt`: output = sum(t ** 2) / 2 ##### Args: * `t`: A `Tensor`. Must be one of the following types: `float32`, `float64`, `int64`, `int32`, `uint8`, `int16`, `int8`, `complex64`, `qint8`, `quint8`, `qint32`. Typically 2-D, but may have any dimensions. * `name`: A name for the operation (optional). ##### Returns: A `Tensor`. Has the same type as `t`. 0-D. ## Classification TensorFlow provides several operations that help you perform classification. 
- - - ### `tf.nn.sigmoid_cross_entropy_with_logits(logits, targets, name=None)` Computes sigmoid cross entropy given `logits`. Measures the probability error in discrete classification tasks in which each class is independent and not mutually exclusive. For instance, one could perform multilabel classification where a picture can contain both an elephant and a dog at the same time. For brevity, let `x = logits`, `z = targets`. The logistic loss is x - x * z + log(1 + exp(-x)) To ensure stability and avoid overflow, the implementation uses max(x, 0) - x * z + log(1 + exp(-abs(x))) `logits` and `targets` must have the same type and shape. ##### Args: * `logits`: A `Tensor` of type `float32` or `float64`. * `targets`: A `Tensor` of the same type and shape as `logits`. * `name`: A name for the operation (optional). ##### Returns: A `Tensor` of the same shape as `logits` with the componentwise logistic losses. - - - ### `tf.nn.softmax(logits, name=None)` Computes softmax activations. For each batch `i` and class `j` we have softmax[i, j] = exp(logits[i, j]) / sum(exp(logits[i])) ##### Args: * `logits`: A `Tensor`. Must be one of the following types: `float32`, `float64`. 2-D with shape `[batch_size, num_classes]`. * `name`: A name for the operation (optional). ##### Returns: A `Tensor`. Has the same type as `logits`. Same shape as `logits`. - - - ### `tf.nn.softmax_cross_entropy_with_logits(logits, labels, name=None)` Computes softmax cross entropy between `logits` and `labels`. Measures the probability error in discrete classification tasks in which the classes are mutually exclusive (each entry is in exactly one class). For example, each CIFAR-10 image is labeled with one and only one label: an image can be a dog or a truck, but not both. **WARNING:** This op expects unscaled logits, since it performs a `softmax` on `logits` internally for efficiency. Do not call this op with the output of `softmax`, as it will produce incorrect results. 
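
To make the warning above concrete, here is a minimal pure-Python sketch of softmax cross entropy computed directly from unscaled logits. It is not TensorFlow's implementation; the name `softmax_cross_entropy_sketch` is hypothetical, and the max-subtraction trick is a common way to keep the exponentials from overflowing.

```python
import math


def softmax_cross_entropy_sketch(logits, labels):
    """Per-row cross entropy from unscaled logits and label distributions.

    Hypothetical stand-in for illustration; inputs are lists of rows.
    """
    losses = []
    for row_logits, row_labels in zip(logits, labels):
        # Subtract the row maximum before exponentiating to avoid overflow.
        shift = max(row_logits)
        log_norm = shift + math.log(sum(math.exp(x - shift) for x in row_logits))
        # log(softmax[j]) = logits[j] - log_norm
        losses.append(-sum(p * (x - log_norm) for x, p in zip(row_logits, row_labels)))
    return losses


logits = [[2.0, 1.0, 0.1], [1000.0, 0.0, 0.0]]
labels = [[1.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
print(softmax_cross_entropy_sketch(logits, labels))
```

Because the softmax is folded into the loss, the second row with a logit of `1000.0` is handled without overflow. Passing already-normalized probabilities in place of `logits` would apply the softmax a second time and give a different, incorrect loss.
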
`logits` and `labels` must have the same shape `[batch_size, num_classes]` and the same dtype (either `float32` or `float64`). ##### Args: * `logits`: Unscaled log probabilities. * `labels`: Each row `labels[i]` must be a valid probability distribution. * `name`: A name for the operation (optional). ##### Returns: A 1-D `Tensor` of length `batch_size` of the same type as `logits` with the softmax cross entropy loss. ## Embeddings TensorFlow provides library support for looking up values in embedding tensors. - - - ### `tf.nn.embedding_lookup(params, ids, name=None)` Looks up `ids` in a list of embedding tensors. This function is used to perform parallel lookups on the list of tensors in `params`. It is a generalization of [`tf.gather()`](../../api_docs/python/array_ops.md#gather), where `params` is interpreted as a partition of a larger embedding tensor. If `len(params) > 1`, each element `id` of `ids` is partitioned between the elements of `params` by computing `p = id % len(params)`, and is then used to look up the slice `params[p][id // len(params), ...]`. The results of the lookup are then concatenated into a dense tensor. The returned tensor has shape `shape(ids) + shape(params)[1:]`. ##### Args: * `params`: A list of tensors with the same shape and type. * `ids`: A `Tensor` with type `int32` containing the ids to be looked up in `params`. * `name`: A name for the operation (optional). ##### Returns: A `Tensor` with the same type as the tensors in `params`. ##### Raises: * `ValueError`: If `params` is empty. ## Evaluation The evaluation ops are useful for measuring the performance of a network. Since they are nondifferentiable, they are typically used at evaluation time. - - - ### `tf.nn.top_k(input, k, name=None)` Returns the values and indices of the k largest elements for each row. \\(values_{i, j}\\) represents the j-th largest element in \\(input_i\\). 
\\(indices_{i, j}\\) gives the column index of the corresponding element, such that \\(input_{i, indices_{i, j}} = values_{i, j}\\). If two elements are equal, the lower-index element appears first. ##### Args: * `input`: A `Tensor`. Must be one of the following types: `float32`, `float64`, `int32`, `int64`, `uint8`, `int16`, `int8`. A batch_size x classes tensor * `k`: An `int` that is `>= 1`. Number of top elements to look for within each row * `name`: A name for the operation (optional). ##### Returns: A tuple of `Tensor` objects (values, indices). * `values`: A `Tensor`. Has the same type as `input`. A batch_size x k tensor with the k largest elements for each row, sorted in descending order * `indices`: A `Tensor` of type `int32`. A batch_size x k tensor with the index of each value within each row - - - ### `tf.nn.in_top_k(predictions, targets, k, name=None)` Says whether the targets are in the top K predictions. This outputs a batch_size bool array, an entry out[i] is true if the prediction for the target class is among the top k predictions among all predictions for example i. Note that the behavior of InTopK differs from the TopK op in its handling of ties; if multiple classes have the same prediction value and straddle the top-k boundary, all of those classes are considered to be in the top k. More formally, let \\(predictions_i\\) be the predictions for all classes for example i, \\(targets_i\\) be the target class for example i, \\(out_i\\) be the output for example i, $$out_i = predictions_{i, targets_i} \in TopKIncludingTies(predictions_i)$$ ##### Args: * `predictions`: A `Tensor` of type `float32`. A batch_size x classes tensor * `targets`: A `Tensor` of type `int32`. A batch_size vector of class ids * `k`: An `int`. Number of top elements to look at for computing precision * `name`: A name for the operation (optional). ##### Returns: A `Tensor` of type `bool`. 
Computed Precision at k as a bool Tensor ## Candidate Sampling Do you want to train a multiclass or multilabel model with thousands or millions of output classes (for example, a language model with a large vocabulary)? Training with a full Softmax is slow in this case, since all of the classes are evaluated for every training example. Candidate Sampling training algorithms can speed up your step times by only considering a small randomly-chosen subset of contrastive classes (called candidates) for each batch of training examples. See our [Candidate Sampling Algorithms Reference] (../../extras/candidate_sampling.pdf) ### Sampled Loss Functions TensorFlow provides the following sampled loss functions for faster training. - - - ### `tf.nn.nce_loss(weights, biases, inputs, labels, num_sampled, num_classes, num_true=1, sampled_values=None, remove_accidental_hits=False, name='nce_loss')` Computes and returns the noise-contrastive estimation training loss. See [Noise-contrastive estimation: A new estimation principle for unnormalized statistical models] (http://www.jmlr.org/proceedings/papers/v9/gutmann10a/gutmann10a.pdf). Also see our [Candidate Sampling Algorithms Reference] (http://www.tensorflow.org/extras/candidate_sampling.pdf) Note: In the case where num_true > 1, we assign to each target class the target probability 1 / num_true so that the target probabilities sum to 1 per-example. Note: It would be useful to allow a variable number of target classes per example. We hope to provide this functionality in a future release. For now, if you have a variable number of target classes, you can pad them out to a constant number by either repeating them or by padding with an otherwise unused class. ##### Args: * `weights`: A `Tensor` of shape [num_classes, dim]. The class embeddings. * `biases`: A `Tensor` of shape [num_classes]. The class biases. * `inputs`: A `Tensor` of shape [batch_size, dim]. The forward activations of the input network. 
* `labels`: A `Tensor` of type `int64` and shape `[batch_size, num_true]`. The target classes. * `num_sampled`: An `int`. The number of classes to randomly sample per batch. * `num_classes`: An `int`. The number of possible classes. * `num_true`: An `int`. The number of target classes per training example. * `sampled_values`: a tuple of `(sampled_candidates, true_expected_count, sampled_expected_count)` returned by a *_candidate_sampler function. (if None, we default to LogUniformCandidateSampler) * `remove_accidental_hits`: A `bool`. Whether to remove "accidental hits" where a sampled class equals one of the target classes. If set to `True`, this is a "Sampled Logistic" loss instead of NCE, and we are learning to generate log-odds instead of log probabilities. See our [Candidate Sampling Algorithms Reference] (http://www.tensorflow.org/extras/candidate_sampling.pdf). Default is False. * `name`: A name for the operation (optional). ##### Returns: A batch_size 1-D tensor of per-example NCE losses. - - - ### `tf.nn.sampled_softmax_loss(weights, biases, inputs, labels, num_sampled, num_classes, num_true=1, sampled_values=None, remove_accidental_hits=True, name='sampled_softmax_loss')` Computes and returns the sampled softmax training loss. This is a faster way to train a softmax classifier over a huge number of classes. This operation is for training only. It is generally an underestimate of the full softmax loss. At inference time, you can compute full softmax probabilities with the expression `tf.nn.softmax(tf.matmul(inputs, weights) + biases)`. See our [Candidate Sampling Algorithms Reference] (http://www.tensorflow.org/extras/candidate_sampling.pdf) Also see Section 3 of http://arxiv.org/abs/1412.2007 for the math. ##### Args: * `weights`: A `Tensor` of shape [num_classes, dim]. The class embeddings. * `biases`: A `Tensor` of shape [num_classes]. The class biases. * `inputs`: A `Tensor` of shape [batch_size, dim]. The forward activations of the input network. 
* `labels`: A `Tensor` of type `int64` and shape `[batch_size, num_true]`. The target classes. Note that this format differs from the `labels` argument of `nn.softmax_cross_entropy_with_logits`. * `num_sampled`: An `int`. The number of classes to randomly sample per batch. * `num_classes`: An `int`. The number of possible classes. * `num_true`: An `int`. The number of target classes per training example. * `sampled_values`: a tuple of `(sampled_candidates, true_expected_count, sampled_expected_count)` returned by a *_candidate_sampler function. (if None, we default to LogUniformCandidateSampler) * `remove_accidental_hits`: A `bool`. Whether to remove "accidental hits" where a sampled class equals one of the target classes. Default is True. * `name`: A name for the operation (optional). ##### Returns: A batch_size 1-D tensor of per-example sampled softmax losses. ### Candidate Samplers TensorFlow provides the following samplers for randomly sampling candidate classes when using one of the sampled loss functions above. - - - ### `tf.nn.uniform_candidate_sampler(true_classes, num_true, num_sampled, unique, range_max, seed=None, name=None)` Samples a set of classes using a uniform base distribution. This operation randomly samples a tensor of sampled classes (`sampled_candidates`) from the range of integers `[0, range_max)`. The elements of `sampled_candidates` are drawn without replacement (if `unique=True`) or with replacement (if `unique=False`) from the base distribution. The base distribution for this operation is the uniform distribution over the range of integers `[0, range_max)`. In addition, this operation returns tensors `true_expected_count` and `sampled_expected_count` representing the number of times each of the target classes (`true_classes`) and the sampled classes (`sampled_candidates`) is expected to occur in an average tensor of sampled classes. 
These values correspond to `Q(y|x)` defined in [this document](http://www.tensorflow.org/extras/candidate_sampling.pdf). If `unique=True`, then these are post-rejection probabilities and we compute them approximately. ##### Args: * `true_classes`: A `Tensor` of type `int64` and shape `[batch_size, num_true]`. The target classes. * `num_true`: An `int`. The number of target classes per training example. * `num_sampled`: An `int`. The number of classes to randomly sample per batch. * `unique`: A `bool`. Determines whether all sampled classes in a batch are unique. * `range_max`: An `int`. The number of possible classes. * `seed`: An `int`. An operation-specific seed. Default is 0. * `name`: A name for the operation (optional). ##### Returns: * `sampled_candidates`: A tensor of type `int64` and shape `[num_sampled]`. The sampled classes. * `true_expected_count`: A tensor of type `float`. Same shape as `true_classes`. The expected counts under the sampling distribution of each of `true_classes`. * `sampled_expected_count`: A tensor of type `float`. Same shape as `sampled_candidates`. The expected counts under the sampling distribution of each of `sampled_candidates`. - - - ### `tf.nn.log_uniform_candidate_sampler(true_classes, num_true, num_sampled, unique, range_max, seed=None, name=None)` Samples a set of classes using a log-uniform (Zipfian) base distribution. This operation randomly samples a tensor of sampled classes (`sampled_candidates`) from the range of integers `[0, range_max)`. The elements of `sampled_candidates` are drawn without replacement (if `unique=True`) or with replacement (if `unique=False`) from the base distribution. 
The base distribution for this operation is an approximately log-uniform or Zipfian distribution: `P(class) = (log(class + 2) - log(class + 1)) / log(range_max + 1)` This sampler is useful when the target classes approximately follow such a distribution - for example, if the classes represent words in a lexicon sorted in decreasing order of frequency. If your classes are not ordered by decreasing frequency, do not use this op. In addition, this operation returns tensors `true_expected_count` and `sampled_expected_count` representing the number of times each of the target classes (`true_classes`) and the sampled classes (`sampled_candidates`) is expected to occur in an average tensor of sampled classes. These values correspond to `Q(y|x)` defined in [this document](http://www.tensorflow.org/extras/candidate_sampling.pdf). If `unique=True`, then these are post-rejection probabilities and we compute them approximately. ##### Args: * `true_classes`: A `Tensor` of type `int64` and shape `[batch_size, num_true]`. The target classes. * `num_true`: An `int`. The number of target classes per training example. * `num_sampled`: An `int`. The number of classes to randomly sample per batch. * `unique`: A `bool`. Determines whether all sampled classes in a batch are unique. * `range_max`: An `int`. The number of possible classes. * `seed`: An `int`. An operation-specific seed. Default is 0. * `name`: A name for the operation (optional). ##### Returns: * `sampled_candidates`: A tensor of type `int64` and shape `[num_sampled]`. The sampled classes. * `true_expected_count`: A tensor of type `float`. Same shape as `true_classes`. The expected counts under the sampling distribution of each of `true_classes`. * `sampled_expected_count`: A tensor of type `float`. Same shape as `sampled_candidates`. The expected counts under the sampling distribution of each of `sampled_candidates`. 
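
The log-uniform base distribution given above for `tf.nn.log_uniform_candidate_sampler` can be checked numerically with a short pure-Python sketch. It assumes class ids run over `0, 1, ..., range_max - 1`; the helper name `log_uniform_probability` is hypothetical and only evaluates the formula, it does not sample.

```python
import math


def log_uniform_probability(class_id, range_max):
    """P(class) = (log(class + 2) - log(class + 1)) / log(range_max + 1)."""
    return (math.log(class_id + 2) - math.log(class_id + 1)) / math.log(range_max + 1)


range_max = 1000
# Assumption: valid class ids are 0 .. range_max - 1.
probs = [log_uniform_probability(c, range_max) for c in range(range_max)]

# The logarithms telescope, so the probabilities sum to 1 (up to rounding).
print(abs(sum(probs) - 1.0) < 1e-9)
# Lower ids (more frequent words) are more likely to be sampled.
print(probs[0] > probs[1] > probs[-1])
```

The monotonic decrease is the reason the classes should be sorted by decreasing frequency before using this sampler.
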
- - - ### `tf.nn.learned_unigram_candidate_sampler(true_classes, num_true, num_sampled, unique, range_max, seed=None, name=None)` Samples a set of classes from a distribution learned during training. This operation randomly samples a tensor of sampled classes (`sampled_candidates`) from the range of integers `[0, range_max)`. The elements of `sampled_candidates` are drawn without replacement (if `unique=True`) or with replacement (if `unique=False`) from the base distribution. The base distribution for this operation is constructed on the fly during training. It is a unigram distribution over the target classes seen so far during training. Every integer in `[0, range_max)` begins with a weight of 1, and its weight is incremented by 1 each time it is seen as a target class. The base distribution is not saved to checkpoints, so it is reset when the model is reloaded. In addition, this operation returns tensors `true_expected_count` and `sampled_expected_count` representing the number of times each of the target classes (`true_classes`) and the sampled classes (`sampled_candidates`) is expected to occur in an average tensor of sampled classes. These values correspond to `Q(y|x)` defined in [this document](http://www.tensorflow.org/extras/candidate_sampling.pdf). If `unique=True`, then these are post-rejection probabilities and we compute them approximately. ##### Args: * `true_classes`: A `Tensor` of type `int64` and shape `[batch_size, num_true]`. The target classes. * `num_true`: An `int`. The number of target classes per training example. * `num_sampled`: An `int`. The number of classes to randomly sample per batch. * `unique`: A `bool`. Determines whether all sampled classes in a batch are unique. * `range_max`: An `int`. The number of possible classes. * `seed`: An `int`. An operation-specific seed. Default is 0. * `name`: A name for the operation (optional). ##### Returns: * `sampled_candidates`: A tensor of type `int64` and shape `[num_sampled]`. The sampled classes. 
* `true_expected_count`: A tensor of type `float`. Same shape as `true_classes`. The expected counts under the sampling distribution of each of `true_classes`. * `sampled_expected_count`: A tensor of type `float`. Same shape as `sampled_candidates`. The expected counts under the sampling distribution of each of `sampled_candidates`. - - - ### `tf.nn.fixed_unigram_candidate_sampler(true_classes, num_true, num_sampled, unique, range_max, vocab_file='', distortion=0.0, num_reserved_ids=0, num_shards=1, shard=0, unigrams=[], seed=None, name=None)` Samples a set of classes using the provided (fixed) base distribution. This operation randomly samples a tensor of sampled classes (`sampled_candidates`) from the range of integers `[0, range_max)`. The elements of `sampled_candidates` are drawn without replacement (if `unique=True`) or with replacement (if `unique=False`) from the base distribution. The base distribution is read from a file or passed in as an in-memory array. There is also an option to skew the distribution by applying a distortion power to the weights. In addition, this operation returns tensors `true_expected_count` and `sampled_expected_count` representing the number of times each of the target classes (`true_classes`) and the sampled classes (`sampled_candidates`) is expected to occur in an average tensor of sampled classes. These values correspond to `Q(y|x)` defined in [this document](http://www.tensorflow.org/extras/candidate_sampling.pdf). If `unique=True`, then these are post-rejection probabilities and we compute them approximately. ##### Args: * `true_classes`: A `Tensor` of type `int64` and shape `[batch_size, num_true]`. The target classes. * `num_true`: An `int`. The number of target classes per training example. * `num_sampled`: An `int`. The number of classes to randomly sample per batch. * `unique`: A `bool`. Determines whether all sampled classes in a batch are unique. * `range_max`: An `int`. The number of possible classes. 
* `vocab_file`: Each valid line in this file (which should have a CSV-like format) corresponds to a valid word ID. IDs are in sequential order, starting from `num_reserved_ids`. The last entry in each line is expected to be a value corresponding to the count or relative probability. Exactly one of `vocab_file` and `unigrams` needs to be passed to this operation. * `distortion`: The distortion is used to skew the unigram probability distribution. Each weight is first raised to the distortion's power before adding to the internal unigram distribution. As a result, `distortion = 1.0` gives regular unigram sampling (as defined by the vocab file), and `distortion = 0.0` gives a uniform distribution. * `num_reserved_ids`: Optionally some reserved IDs can be added in the range `[0, num_reserved_ids)` by the users. One use case is that a special unknown word token is used as ID 0. These IDs will have a sampling probability of 0. * `num_shards`: A sampler can be used to sample from a subset of the original range in order to speed up the whole computation through parallelism. This parameter (together with `shard`) indicates the number of partitions that are being used in the overall computation. * `shard`: A sampler can be used to sample from a subset of the original range in order to speed up the whole computation through parallelism. This parameter (together with `num_shards`) indicates the particular partition number of the operation, when partitioning is being used. * `unigrams`: A list of unigram counts or probabilities, one per ID in sequential order. Exactly one of `vocab_file` and `unigrams` should be passed to this operation. * `seed`: An `int`. An operation-specific seed. Default is 0. * `name`: A name for the operation (optional). ##### Returns: * `sampled_candidates`: A tensor of type `int64` and shape `[num_sampled]`. The sampled classes. * `true_expected_count`: A tensor of type `float`. Same shape as `true_classes`. 
The expected counts under the sampling distribution of each of `true_classes`. * `sampled_expected_count`: A tensor of type `float`. Same shape as `sampled_candidates`. The expected counts under the sampling distribution of each of `sampled_candidates`. ### Miscellaneous candidate sampling utilities - - - ### `tf.nn.compute_accidental_hits(true_classes, sampled_candidates, num_true, seed=None, name=None)` Compute the ids of positions in sampled_candidates matching true_classes. In Candidate Sampling, this operation facilitates virtually removing sampled classes which happen to match target classes. This is done in Sampled Softmax and Sampled Logistic. See our [Candidate Sampling Algorithms Reference](http://www.tensorflow.org/extras/candidate_sampling.pdf). We presuppose that the `sampled_candidates` are unique. We call it an 'accidental hit' when one of the target classes matches one of the sampled classes. This operation reports accidental hits as triples `(index, id, weight)`, where `index` represents the row number in `true_classes`, `id` represents the position in `sampled_candidates`, and weight is `-FLOAT_MAX`. The result of this op should be passed through a `sparse_to_dense` operation, then added to the logits of the sampled classes. This removes the contradictory effect of accidentally sampling the true target classes as noise classes for the same example. ##### Args: * `true_classes`: A `Tensor` of type `int64` and shape `[batch_size, num_true]`. The target classes. * `sampled_candidates`: A tensor of type `int64` and shape `[num_sampled]`. The sampled_candidates output of CandidateSampler. * `num_true`: An `int`. The number of target classes per training example. * `seed`: An `int`. An operation-specific seed. Default is 0. * `name`: A name for the operation (optional). ##### Returns: * `indices`: A `Tensor` of type `int32` and shape `[num_accidental_hits]`. Values indicate rows in `true_classes`. 
* `ids`: A `Tensor` of type `int64` and shape `[num_accidental_hits]`. Values indicate positions in `sampled_candidates`. * `weights`: A `Tensor` of type `float` and shape `[num_accidental_hits]`. Each value is `-FLOAT_MAX`. ================================================ FILE: SOURCE/api_docs/python/python_io.md ================================================ # Data IO (Python functions) ## Contents ### [Data IO (Python functions)](#AUTOGENERATED-data-io--python-functions-) * [Data IO (Python Functions)](#AUTOGENERATED-data-io--python-functions-) * [`class tf.python_io.TFRecordWriter`](#TFRecordWriter) * [`tf.python_io.tf_record_iterator(path)`](#tf_record_iterator) * [TFRecords Format Details](#AUTOGENERATED-tfrecords-format-details) ## Data IO (Python Functions) A TFRecords file represents a sequence of (binary) strings. The format is not random access, so it is suitable for streaming large amounts of data but not suitable if fast sharding or other non-sequential access is desired. - - - ### `class tf.python_io.TFRecordWriter` A class to write records to a TFRecords file. This class implements `__enter__` and `__exit__`, and can be used in `with` blocks like a normal file. - - - #### `tf.python_io.TFRecordWriter.__init__(path)` Opens file `path` and creates a `TFRecordWriter` writing to it. ##### Args: * `path`: The path to the TFRecords file. ##### Raises: * `IOError`: If `path` cannot be opened for writing. - - - #### `tf.python_io.TFRecordWriter.write(record)` Writes a string record to the file. ##### Args: * `record`: str - - - #### `tf.python_io.TFRecordWriter.close()` Closes the file. - - - ### `tf.python_io.tf_record_iterator(path)` An iterator that reads the records from a TFRecords file. ##### Args: * `path`: The path to the TFRecords file. ##### Yields: Strings. ##### Raises: * `IOError`: If `path` cannot be opened for reading. - - - ### TFRecords Format Details A TFRecords file contains a sequence of strings with CRC hashes. 
Each record has the format uint64 length uint32 masked_crc32_of_length byte data[length] uint32 masked_crc32_of_data and the records are concatenated together to produce the file. The CRC32s are [described here](https://en.wikipedia.org/wiki/Cyclic_redundancy_check), and the mask of a CRC is masked_crc = ((crc >> 15) | (crc << 17)) + 0xa282ead8ul ================================================ FILE: SOURCE/api_docs/python/sparse_ops.md ================================================ # Sparse Tensors Note: Functions taking `Tensor` arguments can also take anything accepted by [`tf.convert_to_tensor`](../../api_docs/python/framework.md#convert_to_tensor). ## Contents ### [Sparse Tensors](#AUTOGENERATED-sparse-tensors) * [Sparse Tensor Representation](#AUTOGENERATED-sparse-tensor-representation) * [`class tf.SparseTensor`](#SparseTensor) * [`class tf.SparseTensorValue`](#SparseTensorValue) * [Sparse to Dense Conversion](#AUTOGENERATED-sparse-to-dense-conversion) * [`tf.sparse_to_dense(sparse_indices, output_shape, sparse_values, default_value, name=None)`](#sparse_to_dense) * [`tf.sparse_tensor_to_dense(sp_input, default_value, name=None)`](#sparse_tensor_to_dense) * [`tf.sparse_to_indicator(sp_input, vocab_size, name=None)`](#sparse_to_indicator) * [Manipulation](#AUTOGENERATED-manipulation) * [`tf.sparse_concat(concat_dim, sp_inputs, name=None)`](#sparse_concat) * [`tf.sparse_reorder(sp_input, name=None)`](#sparse_reorder) * [`tf.sparse_retain(sp_input, to_retain)`](#sparse_retain) * [`tf.sparse_fill_empty_rows(sp_input, default_value, name=None)`](#sparse_fill_empty_rows) ## Sparse Tensor Representation Tensorflow supports a `SparseTensor` representation for data that is sparse in multiple dimensions. Contrast this representation with `IndexedSlices`, which is efficient for representing tensors that are sparse in their first dimension, and dense along all other dimensions. - - - ### `class tf.SparseTensor` Represents a sparse tensor. 
TensorFlow represents a sparse tensor as three separate dense tensors: `indices`, `values`, and `dense_shape`. In Python, the three tensors are collected into a `SparseTensor` class for ease of use. If you have separate `indices`, `values`, and `dense_shape` tensors, wrap them in a `SparseTensor` object before passing them to the Ops below.

Concretely, the sparse tensor `SparseTensor(indices, values, dense_shape)` consists of

* `indices`: A 2-D int64 tensor of shape `[N, ndims]`.
* `values`: A 1-D tensor of any type and shape `[N]`.
* `dense_shape`: A 1-D int64 tensor of shape `[ndims]`.

where `N` and `ndims` are the number of values and the number of dimensions in the `SparseTensor`, respectively.

The corresponding dense tensor satisfies

```python
dense.shape = dense_shape
dense[tuple(indices[i])] = values[i]
```

By convention, `indices` should be sorted in row-major order (or equivalently lexicographic order on the tuples `indices[i]`). This is not enforced when `SparseTensor` objects are constructed, but most Ops assume correct ordering. If the ordering is wrong, it can be fixed by calling `sparse_reorder` on the misordered `SparseTensor`.

Example: The sparse tensor

```python
SparseTensor(values=[1, 2], indices=[[0, 0], [1, 2]], shape=[3, 4])
```

represents the dense tensor

```python
[[1, 0, 0, 0]
 [0, 0, 2, 0]
 [0, 0, 0, 0]]
```

- - -

#### `tf.SparseTensor.__init__(indices, values, shape)`

Creates a `SparseTensor`.

##### Args:

* `indices`: A 2-D int64 tensor of shape `[N, ndims]`.
* `values`: A 1-D tensor of any type and shape `[N]`.
* `shape`: A 1-D int64 tensor of shape `[ndims]` (the `dense_shape` of the description above).

##### Returns:

A `SparseTensor`

- - -

#### `tf.SparseTensor.indices`

The indices of non-zero values in the represented dense tensor.

##### Returns:

A 2-D Tensor of int64 with shape `[N, ndims]`, where `N` is the number of non-zero values in the tensor, and `ndims` is the rank.

- - -

#### `tf.SparseTensor.values`

The non-zero values in the represented dense tensor.
##### Returns: A 1-D Tensor of any data type. - - - #### `tf.SparseTensor.dtype` The `DType` of elements in this tensor. - - - #### `tf.SparseTensor.shape` A 1-D Tensor of int64 representing the shape of the dense tensor. - - - #### `tf.SparseTensor.graph` The `Graph` that contains the index, value, and shape tensors. - - - ### `class tf.SparseTensorValue` SparseTensorValue(indices, values, shape) - - - #### `tf.SparseTensorValue.indices` Alias for field number 0 - - - #### `tf.SparseTensorValue.shape` Alias for field number 2 - - - #### `tf.SparseTensorValue.values` Alias for field number 1 ## Sparse to Dense Conversion - - - ### `tf.sparse_to_dense(sparse_indices, output_shape, sparse_values, default_value, name=None)` Converts a sparse representation into a dense tensor. Builds an array `dense` with shape `output_shape` such that ```prettyprint # If sparse_indices is scalar dense[i] = (i == sparse_indices ? sparse_values : default_value) # If sparse_indices is a vector, then for each i dense[sparse_indices[i]] = sparse_values[i] # If sparse_indices is an n by d matrix, then for each i in [0, n) dense[sparse_indices[i][0], ..., sparse_indices[i][d-1]] = sparse_values[i] ``` All other values in `dense` are set to `default_value`. If `sparse_values` is a scalar, all sparse indices are set to this single value. ##### Args: * `sparse_indices`: A `Tensor`. Must be one of the following types: `int32`, `int64`. 0-D, 1-D, or 2-D. `sparse_indices[i]` contains the complete index where `sparse_values[i]` will be placed. * `output_shape`: A `Tensor`. Must have the same type as `sparse_indices`. 1-D. Shape of the dense output tensor. * `sparse_values`: A `Tensor`. 1-D. Values corresponding to each row of `sparse_indices`, or a scalar value to be used for all sparse indices. * `default_value`: A `Tensor`. Must have the same type as `sparse_values`. Scalar value to set for indices not specified in `sparse_indices`. * `name`: A name for the operation (optional). 
##### Returns: A `Tensor`. Has the same type as `sparse_values`. Dense output tensor of shape `output_shape`. - - - ### `tf.sparse_tensor_to_dense(sp_input, default_value, name=None)` Converts a `SparseTensor` into a dense tensor. This op is a convenience wrapper around `sparse_to_dense` for `SparseTensor`s. For example, if `sp_input` has shape `[3, 5]` and non-empty string values: [0, 1]: a [0, 3]: b [2, 0]: c and `default_value` is `x`, then the output will be a dense `[3, 5]` string tensor with values: [[x a x b x] [x x x x x] [c x x x x]] ##### Args: * `sp_input`: The input `SparseTensor`. * `default_value`: Scalar value to set for indices not specified in `sp_input`. * `name`: A name prefix for the returned tensors (optional). ##### Returns: A dense tensor with shape `sp_input.shape` and values specified by the non-empty values in `sp_input`. Indices not in `sp_input` are assigned `default_value`. ##### Raises: * `TypeError`: If `sp_input` is not a `SparseTensor`. - - - ### `tf.sparse_to_indicator(sp_input, vocab_size, name=None)` Converts a `SparseTensor` of ids into a dense bool indicator tensor. The last dimension of `sp_input` is discarded and replaced with the values of `sp_input`. If `sp_input.shape = [D0, D1, ..., Dn, K]`, then `output.shape = [D0, D1, ..., Dn, vocab_size]`, where output[d_0, d_1, ..., d_n, sp_input[d_0, d_1, ..., d_n, k]] = True and False elsewhere in `output`. For example, if `sp_input.shape = [2, 3, 4]` with non-empty values: [0, 0, 0]: 0 [0, 1, 0]: 10 [1, 0, 3]: 103 [1, 1, 2]: 112 [1, 1, 3]: 113 [1, 2, 1]: 121 and `vocab_size = 200`, then the output will be a `[2, 3, 200]` dense bool tensor with False everywhere except at positions (0, 0, 0), (0, 1, 10), (1, 0, 103), (1, 1, 112), (1, 1, 113), (1, 2, 121). This op is useful for converting `SparseTensor`s into dense formats for compatibility with ops that expect dense tensors. The input `SparseTensor` must be in row-major order. 
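The index mapping described above can be illustrated without TensorFlow. The following is a minimal plain-Python sketch, not the library's implementation: `sparse_to_indicator_sketch` is a hypothetical helper that takes the `indices`, `values`, and `shape` of the sparse input as ordinary lists and returns the set of positions that would be `True`, together with the output shape.

```python
def sparse_to_indicator_sketch(indices, values, shape, vocab_size):
    """Hypothetical helper: list the True positions of the indicator tensor.

    The last coordinate of every index is dropped and replaced by the
    value stored at that index.
    """
    output_shape = list(shape[:-1]) + [vocab_size]
    true_positions = set()
    for index, value in zip(indices, values):
        if not 0 <= value < vocab_size:
            raise ValueError("value %r is outside [0, vocab_size)" % (value,))
        true_positions.add(tuple(index[:-1]) + (value,))
    return true_positions, output_shape


# The example from the text: sp_input.shape = [2, 3, 4], vocab_size = 200.
example_indices = [[0, 0, 0], [0, 1, 0], [1, 0, 3],
                   [1, 1, 2], [1, 1, 3], [1, 2, 1]]
example_values = [0, 10, 103, 112, 113, 121]
positions, out_shape = sparse_to_indicator_sketch(
    example_indices, example_values, [2, 3, 4], 200)
```

Every position not in `positions` would be `False` in the dense output.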
##### Args:

* `sp_input`: A `SparseTensor` of type `int32` or `int64`.
* `vocab_size`: The new size of the last dimension, with `all(0 <= sp_input.values < vocab_size)`.
* `name`: A name prefix for the returned tensors (optional)

##### Returns:

A dense bool indicator tensor representing the indices with specified value.

##### Raises:

* `TypeError`: If `sp_input` is not a `SparseTensor`.

## Manipulation

- - -

### `tf.sparse_concat(concat_dim, sp_inputs, name=None)`

Concatenates a list of `SparseTensor` along the specified dimension.

Concatenation is with respect to the dense versions of each sparse input. It is assumed that each input is a `SparseTensor` whose elements are ordered along increasing dimension number.

All inputs' shapes must match, except for the concat dimension. The `indices`, `values`, and `shapes` lists must have the same length.

The output shape is identical to the inputs', except along the concat dimension, where it is the sum of the inputs' sizes along that dimension.

The output elements will be resorted to preserve the sort order along increasing dimension number.

This op runs in `O(M log M)` time, where `M` is the total number of non-empty values across all inputs. This is due to the need for an internal sort in order to concatenate efficiently across an arbitrary dimension.

For example, if `concat_dim = 1` and the inputs are

    sp_inputs[0]: shape = [2, 3]
    [0, 2]: "a"
    [1, 0]: "b"
    [1, 1]: "c"

    sp_inputs[1]: shape = [2, 4]
    [0, 1]: "d"
    [0, 2]: "e"

then the output will be

    shape = [2, 7]
    [0, 2]: "a"
    [0, 4]: "d"
    [0, 5]: "e"
    [1, 0]: "b"
    [1, 1]: "c"

Graphically this is equivalent to doing

    [    a] concat [  d e  ] = [    a   d e  ]
    [b c  ]        [       ]   [b c          ]

##### Args:

* `concat_dim`: Dimension to concatenate along.
* `sp_inputs`: List of `SparseTensor` to concatenate.
* `name`: A name prefix for the returned tensors (optional).

##### Returns:

A `SparseTensor` with the concatenated output.

##### Raises:

* `TypeError`: If `sp_inputs` is not a list of `SparseTensor`.
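The concatenation rule for `sparse_concat` (shift each input's indices along `concat_dim` by the combined size of the inputs before it, then re-sort) can be sketched in plain Python. This is an illustration of the semantics only, under the assumption that each sparse input is given as an `(indices, values, shape)` triple of lists; `sparse_concat_sketch` is a hypothetical name, not a TensorFlow function.

```python
def sparse_concat_sketch(concat_dim, sp_inputs):
    """Hypothetical helper: concatenate (indices, values, shape) triples."""
    entries = []
    offset = 0
    out_shape = list(sp_inputs[0][2])
    for indices, values, shape in sp_inputs:
        for index, value in zip(indices, values):
            shifted = list(index)
            # Move the entry past everything contributed by earlier inputs.
            shifted[concat_dim] += offset
            entries.append((shifted, value))
        offset += shape[concat_dim]
    out_shape[concat_dim] = offset
    # Re-sort into row-major (lexicographic) order; this is the O(M log M) step.
    entries.sort(key=lambda entry: entry[0])
    out_indices = [entry[0] for entry in entries]
    out_values = [entry[1] for entry in entries]
    return out_indices, out_values, out_shape


# The example from the text, with concat_dim = 1.
first = ([[0, 2], [1, 0], [1, 1]], ["a", "b", "c"], [2, 3])
second = ([[0, 1], [0, 2]], ["d", "e"], [2, 4])
cat_indices, cat_values, cat_shape = sparse_concat_sketch(1, [first, second])
```

The sketch does not check that the non-concat dimensions of the inputs match, which the real op requires.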
- - - ### `tf.sparse_reorder(sp_input, name=None)` Reorders a `SparseTensor` into the canonical, row-major ordering. Note that by convention, all sparse ops preserve the canonical ordering along increasing dimension number. The only time ordering can be violated is during manual manipulation of the indices and values to add entries. Reordering does not affect the shape of the `SparseTensor`. For example, if sp_input has shape `[4, 5]` and `indices` / `values`: [0, 3]: b [0, 1]: a [3, 1]: d [2, 0]: c then the output will be a `SparseTensor` of shape `[4, 5]` and `indices` / `values`: [0, 1]: a [0, 3]: b [2, 0]: c [3, 1]: d ##### Args: * `sp_input`: The input `SparseTensor`. * `name`: A name prefix for the returned tensors (optional) ##### Returns: A `SparseTensor` with the same shape and non-empty values, but in canonical ordering. ##### Raises: * `TypeError`: If `sp_input` is not a `SparseTensor`. - - - ### `tf.sparse_retain(sp_input, to_retain)` Retains specified non-empty values within a `SparseTensor`. For example, if `sp_input` has shape `[4, 5]` and 4 non-empty string values: [0, 1]: a [0, 3]: b [2, 0]: c [3, 1]: d and `to_retain = [True, False, False, True]`, then the output will be a `SparseTensor` of shape `[4, 5]` with 2 non-empty values: [0, 1]: a [3, 1]: d ##### Args: * `sp_input`: The input `SparseTensor` with `N` non-empty elements. * `to_retain`: A bool vector of length `N` with `M` true values. ##### Returns: A `SparseTensor` with the same shape as the input and `M` non-empty elements corresponding to the true positions in `to_retain`. ##### Raises: * `TypeError`: If `sp_input` is not a `SparseTensor`. - - - ### `tf.sparse_fill_empty_rows(sp_input, default_value, name=None)` Fills empty rows in the input 2-D `SparseTensor` with a default value. This op adds entries with the specified `default_value` at index `[row, 0]` for any row in the input that does not already have a value. 
For example, suppose `sp_input` has shape `[5, 6]` and non-empty values: [0, 1]: a [0, 3]: b [2, 0]: c [3, 1]: d Rows 1 and 4 are empty, so the output will be of shape `[5, 6]` with values: [0, 1]: a [0, 3]: b [1, 0]: default_value [2, 0]: c [3, 1]: d [4, 0]: default_value Note that the input may have empty columns at the end, with no effect on this op. The output `SparseTensor` will be in row-major order and will have the same shape as the input. This op also returns an indicator vector such that empty_row_indicator[i] = True iff row i was an empty row. ##### Args: * `sp_input`: A `SparseTensor` with shape `[N, M]`. * `default_value`: The value to fill for empty rows, with the same type as `sp_input.` * `name`: A name prefix for the returned tensors (optional) ##### Returns: * `sp_ordered_output`: A `SparseTensor` with shape `[N, M]`, and with all empty rows filled in with `default_value`. * `empty_row_indicator`: A bool vector of length `N` indicating whether each input row was empty. ##### Raises: * `TypeError`: If `sp_input` is not a `SparseTensor`. ================================================ FILE: SOURCE/api_docs/python/state_ops.md ================================================ # Variables Note: Functions taking `Tensor` arguments can also take anything accepted by [`tf.convert_to_tensor`](../../api_docs/python/framework.md#convert_to_tensor). 
## Contents ### [Variables](#AUTOGENERATED-variables) * [Variables](#AUTOGENERATED-variables) * [`class tf.Variable`](#Variable) * [Variable helper functions](#AUTOGENERATED-variable-helper-functions) * [`tf.all_variables()`](#all_variables) * [`tf.trainable_variables()`](#trainable_variables) * [`tf.initialize_all_variables()`](#initialize_all_variables) * [`tf.initialize_variables(var_list, name='init')`](#initialize_variables) * [`tf.assert_variables_initialized(var_list=None)`](#assert_variables_initialized) * [Saving and Restoring Variables](#AUTOGENERATED-saving-and-restoring-variables) * [`class tf.train.Saver`](#Saver) * [`tf.train.latest_checkpoint(checkpoint_dir, latest_filename=None)`](#latest_checkpoint) * [`tf.train.get_checkpoint_state(checkpoint_dir, latest_filename=None)`](#get_checkpoint_state) * [`tf.train.update_checkpoint_state(save_dir, model_checkpoint_path, all_model_checkpoint_paths=None, latest_filename=None)`](#update_checkpoint_state) * [Sharing Variables](#AUTOGENERATED-sharing-variables) * [`tf.get_variable(name, shape=None, dtype=tf.float32, initializer=None, trainable=True, collections=None)`](#get_variable) * [`tf.get_variable_scope()`](#get_variable_scope) * [`tf.variable_scope(name_or_scope, reuse=None, initializer=None)`](#variable_scope) * [`tf.constant_initializer(value=0.0)`](#constant_initializer) * [`tf.random_normal_initializer(mean=0.0, stddev=1.0, seed=None)`](#random_normal_initializer) * [`tf.truncated_normal_initializer(mean=0.0, stddev=1.0, seed=None)`](#truncated_normal_initializer) * [`tf.random_uniform_initializer(minval=0.0, maxval=1.0, seed=None)`](#random_uniform_initializer) * [`tf.uniform_unit_scaling_initializer(factor=1.0, seed=None)`](#uniform_unit_scaling_initializer) * [`tf.zeros_initializer(shape, dtype=tf.float32)`](#zeros_initializer) * [Sparse Variable Updates](#AUTOGENERATED-sparse-variable-updates) * [`tf.scatter_update(ref, indices, updates, use_locking=None, name=None)`](#scatter_update) * 
[`tf.scatter_add(ref, indices, updates, use_locking=None, name=None)`](#scatter_add)
* [`tf.scatter_sub(ref, indices, updates, use_locking=None, name=None)`](#scatter_sub)
* [`tf.sparse_mask(a, mask_indices, name=None)`](#sparse_mask)
* [`class tf.IndexedSlices`](#IndexedSlices)

## Variables

- - -

### `class tf.Variable`

See the [Variables How To](../../how_tos/variables/index.md) for a high level overview.

A variable maintains state in the graph across calls to `run()`. You add a variable to the graph by constructing an instance of the class `Variable`.

The `Variable()` constructor requires an initial value for the variable, which can be a `Tensor` of any type and shape. The initial value defines the type and shape of the variable. After construction, the type and shape of the variable are fixed. The value can be changed using one of the assign methods.

If you want to change the shape of a variable later you have to use an `assign` Op with `validate_shape=False`.

Just like any `Tensor`, variables created with `Variable()` can be used as inputs for other Ops in the graph. Additionally, all the operators overloaded for the `Tensor` class are carried over to variables, so you can also add nodes to the graph by just doing arithmetic on variables.

```python
import tensorflow as tf

# Create a variable.
w = tf.Variable(<initial-value>, name=<optional-name>)

# Use the variable in the graph like any Tensor.
y = tf.matmul(w, ...another variable or tensor...)

# The overloaded operators are available too.
z = tf.sigmoid(w + b)

# Assign a new value to the variable with `assign()` or a related method.
w.assign(w + 1.0)
w.assign_add(1.0)
```

When you launch the graph, variables have to be explicitly initialized before you can run Ops that use their value. You can initialize a variable by running its *initializer op*, restoring the variable from a save file, or simply running an `assign` Op that assigns a value to the variable.
In fact, the variable *initializer op* is just an `assign` Op that assigns the variable's initial value to the variable itself.

```python
# Launch the graph in a session.
with tf.Session() as sess:
    # Run the variable initializer.
    sess.run(w.initializer)
    # ...you now can run ops that use the value of 'w'...
```

The most common initialization pattern is to use the convenience function `initialize_all_variables()` to add an Op to the graph that initializes all the variables. You then run that Op after launching the graph.

```python
# Add an Op to initialize all variables.
init_op = tf.initialize_all_variables()

# Launch the graph in a session.
with tf.Session() as sess:
    # Run the Op that initializes all variables.
    sess.run(init_op)
    # ...you can now run any Op that uses variable values...
```

If you need to create a variable with an initial value dependent on another variable, use the other variable's `initialized_value()`. This ensures that variables are initialized in the right order.

All variables are automatically collected in the graph where they are created. By default, the constructor adds the new variable to the graph collection `GraphKeys.VARIABLES`. The convenience function `all_variables()` returns the contents of that collection.

When building a machine learning model it is often convenient to distinguish between variables holding the trainable model parameters and other variables such as a `global step` variable used to count training steps. To make this easier, the variable constructor supports a `trainable=<bool>` parameter. If `True`, the new variable is also added to the graph collection `GraphKeys.TRAINABLE_VARIABLES`. The convenience function `trainable_variables()` returns the contents of this collection. The various `Optimizer` classes use this collection as the default list of variables to optimize.

Creating a variable.
- - - #### `tf.Variable.__init__(initial_value, trainable=True, collections=None, validate_shape=True, name=None)` Creates a new variable with value `initial_value`. The new variable is added to the graph collections listed in `collections`, which defaults to `[GraphKeys.VARIABLES]`. If `trainable` is `True` the variable is also added to the graph collection `GraphKeys.TRAINABLE_VARIABLES`. This constructor creates both a `variable` Op and an `assign` Op to set the variable to its initial value. ##### Args: * `initial_value`: A `Tensor`, or Python object convertible to a `Tensor`. The initial value for the Variable. Must have a shape specified unless `validate_shape` is set to False. * `trainable`: If `True`, the default, also adds the variable to the graph collection `GraphKeys.TRAINABLE_VARIABLES`. This collection is used as the default list of variables to use by the `Optimizer` classes. * `collections`: List of graph collections keys. The new variable is added to these collections. Defaults to `[GraphKeys.VARIABLES]`. * `validate_shape`: If `False`, allows the variable to be initialized with a value of unknown shape. If `True`, the default, the shape of `initial_value` must be known. * `name`: Optional name for the variable. Defaults to `'Variable'` and gets uniquified automatically. ##### Returns: A Variable. ##### Raises: * `ValueError`: If the initial value does not have a shape and `validate_shape` is `True`. - - - #### `tf.Variable.initialized_value()` Returns the value of the initialized variable. You should use this instead of the variable itself to initialize another variable with a value that depends on the value of this variable. ```python # Initialize 'v' with a random tensor. v = tf.Variable(tf.truncated_normal([10, 40])) # Use `initialized_value` to guarantee that `v` has been # initialized before its value is used to initialize `w`. # The random values are picked only once. 
w = tf.Variable(v.initialized_value() * 2.0) ``` ##### Returns: A `Tensor` holding the value of this variable after its initializer has run. Changing a variable value. - - - #### `tf.Variable.assign(value, use_locking=False)` Assigns a new value to the variable. This is essentially a shortcut for `assign(self, value)`. ##### Args: * `value`: A `Tensor`. The new value for this variable. * `use_locking`: If `True`, use locking during the assignment. ##### Returns: A `Tensor` that will hold the new value of this variable after the assignment has completed. - - - #### `tf.Variable.assign_add(delta, use_locking=False)` Adds a value to this variable. This is essentially a shortcut for `assign_add(self, delta)`. ##### Args: * `delta`: A `Tensor`. The value to add to this variable. * `use_locking`: If `True`, use locking during the operation. ##### Returns: A `Tensor` that will hold the new value of this variable after the addition has completed. - - - #### `tf.Variable.assign_sub(delta, use_locking=False)` Subtracts a value from this variable. This is essentially a shortcut for `assign_sub(self, delta)`. ##### Args: * `delta`: A `Tensor`. The value to subtract from this variable. * `use_locking`: If `True`, use locking during the operation. ##### Returns: A `Tensor` that will hold the new value of this variable after the subtraction has completed. - - - #### `tf.Variable.scatter_sub(sparse_delta, use_locking=False)` Subtracts `IndexedSlices` from this variable. This is essentially a shortcut for `scatter_sub(self, sparse_delta.indices, sparse_delta.values)`. ##### Args: * `sparse_delta`: `IndexedSlices` to be subtracted from this variable. * `use_locking`: If `True`, use locking during the operation. ##### Returns: A `Tensor` that will hold the new value of this variable after the scattered subtraction has completed. ##### Raises: * `ValueError`: if `sparse_delta` is not an `IndexedSlices`. 
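The effect of a scattered subtraction can be shown on plain Python lists. The sketch below is an illustration of the update rule only, assuming the variable is a 2-D list of rows and the `IndexedSlices` is given as separate `indices` and `values` lists; `scatter_sub_sketch` is a hypothetical name and does not call TensorFlow.

```python
def scatter_sub_sketch(ref, indices, values):
    """Hypothetical helper: subtract values[i] from the row ref[indices[i]]."""
    result = [list(row) for row in ref]  # Work on a copy of the rows.
    for row_index, delta in zip(indices, values):
        result[row_index] = [r - d for r, d in zip(result[row_index], delta)]
    return result


variable_rows = [[10, 10], [20, 20], [30, 30]]
# Row 0 appears twice, so both deltas are subtracted from it.
updated_rows = scatter_sub_sketch(
    variable_rows, indices=[0, 2, 0], values=[[1, 1], [2, 2], [3, 3]])
```

Rows whose index does not appear in `indices` are left unchanged, which is what makes this form of update useful for large variables that are only touched sparsely.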
- - - #### `tf.Variable.count_up_to(limit)` Increments this variable until it reaches `limit`. When that Op is run it tries to increment the variable by `1`. If incrementing the variable would bring it above `limit` then the Op raises the exception `OutOfRangeError`. If no error is raised, the Op outputs the value of the variable before the increment. This is essentially a shortcut for `count_up_to(self, limit)`. ##### Args: * `limit`: value at which incrementing the variable raises an error. ##### Returns: A `Tensor` that will hold the variable value before the increment. If no other Op modifies this variable, the values produced will all be distinct. - - - #### `tf.Variable.eval(session=None)` In a session, computes and returns the value of this variable. This is not a graph construction method, it does not add ops to the graph. This convenience method requires a session where the graph containing this variable has been launched. If no session is passed, the default session is used. See the [Session class](../../api_docs/python/client.md#Session) for more information on launching a graph and on sessions. ```python v = tf.Variable([1, 2]) init = tf.initialize_all_variables() with tf.Session() as sess: sess.run(init) # Usage passing the session explicitly. print v.eval(sess) # Usage with the default session. The 'with' block # above makes 'sess' the default session. print v.eval() ``` ##### Args: * `session`: The session to use to evaluate this variable. If none, the default session is used. ##### Returns: A numpy `ndarray` with a copy of the value of this variable. Properties. - - - #### `tf.Variable.name` The name of this variable. - - - #### `tf.Variable.dtype` The `DType` of this variable. - - - #### `tf.Variable.get_shape()` The `TensorShape` of this variable. ##### Returns: A `TensorShape`. - - - #### `tf.Variable.device` The device of this variable. - - - #### `tf.Variable.initializer` The initializer operation for this variable. 
- - -

#### `tf.Variable.graph`

The `Graph` of this variable.

- - -

#### `tf.Variable.op`

The `Operation` of this variable.

## Variable helper functions

TensorFlow provides a set of functions to help manage the set of variables collected in the graph.

- - -

### `tf.all_variables()`

Returns all variables collected in the graph.

The `Variable()` constructor automatically adds new variables to the graph collection `GraphKeys.VARIABLES`. This convenience function returns the contents of that collection.

##### Returns:

A list of `Variable` objects.

- - -

### `tf.trainable_variables()`

Returns all variables created with `trainable=True`.

When passed `trainable=True`, the `Variable()` constructor automatically adds new variables to the graph collection `GraphKeys.TRAINABLE_VARIABLES`. This convenience function returns the contents of that collection.

##### Returns:

A list of `Variable` objects.

- - -

### `tf.initialize_all_variables()`

Returns an Op that initializes all variables.

This is just a shortcut for `initialize_variables(all_variables())`.

##### Returns:

An Op that initializes all variables in the graph.

- - -

### `tf.initialize_variables(var_list, name='init')`

Returns an Op that initializes a list of variables.

After you launch the graph in a session, you can run the returned Op to initialize all the variables in `var_list`. This Op runs all the initializers of the variables in `var_list` in parallel.

Calling `initialize_variables()` is equivalent to passing the list of initializers to `Group()`.

If `var_list` is empty, however, the function still returns an Op that can be run. That Op just has no effect.

##### Args:

* `var_list`: List of `Variable` objects to initialize.
* `name`: Optional name for the returned operation.

##### Returns:

An Op that runs the initializers of all the specified variables.

- - -

### `tf.assert_variables_initialized(var_list=None)`

Returns an Op to check if variables are initialized.
When run, the returned Op will raise the exception `FailedPreconditionError` if any of the variables has not yet been initialized. Note: This function is implemented by trying to fetch the values of the variables. If one of the variables is not initialized a message may be logged by the C++ runtime. This is expected. ##### Args: * `var_list`: List of `Variable` objects to check. Defaults to the value of `all_variables().` ##### Returns: An Op, or None if there are no variables. ## Saving and Restoring Variables - - - ### `class tf.train.Saver` Saves and restores variables. See [Variables](../../how_tos/variables/index.md) for an overview of variables, saving and restoring. The `Saver` class adds ops to save and restore variables to and from *checkpoints*. It also provides convenience methods to run these ops. Checkpoints are binary files in a proprietary format which map variable names to tensor values. The best way to examine the contents of a checkpoint is to load it using a `Saver`. Savers can automatically number checkpoint filenames with a provided counter. This lets you keep multiple checkpoints at different steps while training a model. For example you can number the checkpoint filenames with the training step number. To avoid filling up disks, savers manage checkpoint files automatically. For example, they can keep only the N most recent files, or one checkpoint for every N hours of training. You number checkpoint filenames by passing a value to the optional `global_step` argument to `save()`: ```python saver.save(sess, 'my-model', global_step=0) ==> filename: 'my-model-0' ... saver.save(sess, 'my-model', global_step=1000) ==> filename: 'my-model-1000' ``` Additionally, optional arguments to the `Saver()` constructor let you control the proliferation of checkpoint files on disk: * `max_to_keep` indicates the maximum number of recent checkpoint files to keep. As new files are created, older files are deleted. If None or 0, all checkpoint files are kept. 
Defaults to 5 (that is, the 5 most recent checkpoint files are kept.) * `keep_checkpoint_every_n_hours`: In addition to keeping the most recent `max_to_keep` checkpoint files, you might want to keep one checkpoint file for every N hours of training. This can be useful if you want to later analyze how a model progressed during a long training session. For example, passing `keep_checkpoint_every_n_hours=2` ensures that you keep one checkpoint file for every 2 hours of training. The default value of 10,000 hours effectively disables the feature. Note that you still have to call the `save()` method to save the model. Passing these arguments to the constructor will not save variables automatically for you. A training program that saves regularly looks like: ```python ... # Create a saver. saver = tf.train.Saver(...variables...) # Launch the graph and train, saving the model every 1,000 steps. sess = tf.Session() for step in xrange(1000000): sess.run(..training_op..) if step % 1000 == 0: # Append the step number to the checkpoint name: saver.save(sess, 'my-model', global_step=step) ``` In addition to checkpoint files, savers keep a protocol buffer on disk with the list of recent checkpoints. This is used to manage numbered checkpoint files and by `latest_checkpoint()`, which makes it easy to discover the path to the most recent checkpoint. That protocol buffer is stored in a file named 'checkpoint' next to the checkpoint files. If you create several savers, you can specify a different filename for the protocol buffer file in the call to `save()`. - - - #### `tf.train.Saver.__init__(var_list=None, reshape=False, sharded=False, max_to_keep=5, keep_checkpoint_every_n_hours=10000.0, name=None, restore_sequentially=False, saver_def=None, builder=None)` Creates a `Saver`. The constructor adds ops to save and restore variables. `var_list` specifies the variables that will be saved and restored. 
It can be passed as a `dict` or a list:

* A `dict` of names to variables: The keys are the names that will be used to save or restore the variables in the checkpoint files.
* A list of variables: The variables will be keyed with their op name in the checkpoint files.

For example:

```python
v1 = tf.Variable(..., name='v1')
v2 = tf.Variable(..., name='v2')

# Pass the variables as a dict:
saver = tf.train.Saver({'v1': v1, 'v2': v2})

# Or pass them as a list.
saver = tf.train.Saver([v1, v2])
# Passing a list is equivalent to passing a dict with the variable op names
# as keys:
saver = tf.train.Saver({v.op.name: v for v in [v1, v2]})
```

The optional `reshape` argument, if True, allows restoring a variable from a save file where the variable had a different shape, but the same number of elements and type. This is useful if you have reshaped a variable and want to reload it from an older checkpoint.

The optional `sharded` argument, if True, instructs the saver to shard checkpoints per device.

##### Args:

* `var_list`: A list of Variables or a dictionary mapping names to Variables. If None, defaults to the list of all variables.
* `reshape`: If True, allows restoring parameters from a checkpoint where the variables have a different shape.
* `sharded`: If True, shard the checkpoints, one per device.
* `max_to_keep`: Maximum number of recent checkpoints to keep. Defaults to 5.
* `keep_checkpoint_every_n_hours`: How often to keep checkpoints. Defaults to 10,000 hours.
* `name`: string. Optional name to use as a prefix when adding operations.
* `restore_sequentially`: A Bool, which if true, causes restore of different variables to happen sequentially within each device. This can lower memory usage when restoring very large models.
* `saver_def`: Optional SaverDef proto to use instead of running the builder. This is only useful for specialty code that wants to recreate a Saver object for a previously built Graph that had a Saver.
The saver_def proto should be the one returned by the as_saver_def() call of the Saver that was created for that Graph.
* `builder`: Optional SaverBuilder to use if a saver_def was not provided. Defaults to BaseSaverBuilder().

##### Raises:

* `TypeError`: If `var_list` is invalid.
* `ValueError`: If any of the keys or values in `var_list` is not unique.

- - -

#### `tf.train.Saver.save(sess, save_path, global_step=None, latest_filename=None)`

Saves variables.

This method runs the ops added by the constructor for saving variables. It requires a session in which the graph was launched. The variables to save must also have been initialized.

The method returns the path of the newly created checkpoint file. This path can be passed directly to a call to `restore()`.

##### Args:

* `sess`: A Session to use to save the variables.
* `save_path`: string. Path to the checkpoint filename. If the saver is `sharded`, this is the prefix of the sharded checkpoint filename.
* `global_step`: If provided, the global step number is appended to `save_path` to create the checkpoint filename. The optional argument can be a Tensor, a Tensor name or an integer.
* `latest_filename`: Optional name for the protocol buffer file that will contain the list of most recent checkpoint filenames. That file, kept in the same directory as the checkpoint files, is automatically managed by the saver to keep track of recent checkpoints. Defaults to 'checkpoint'.

##### Returns:

A string: path at which the variables were saved. If the saver is sharded, this string ends with: '-?????-of-nnnnn' where 'nnnnn' is the number of shards created.

##### Raises:

* `TypeError`: If `sess` is not a Session.

- - -

#### `tf.train.Saver.restore(sess, save_path)`

Restores previously saved variables.

This method runs the ops added by the constructor for restoring variables. It requires a session in which the graph was launched.
The variables to restore do not have to have been initialized, as restoring is itself a way to initialize variables. The `save_path` argument is typically a value previously returned from a `save()` call, or a call to `latest_checkpoint()`. ##### Args: * `sess`: A Session to use to restore the parameters. * `save_path`: Path where parameters were previously saved. Other utility methods. - - - #### `tf.train.Saver.last_checkpoints` List of not-yet-deleted checkpoint filenames. You can pass any of the returned values to `restore()`. ##### Returns: A list of checkpoint filenames, sorted from oldest to newest. - - - #### `tf.train.Saver.set_last_checkpoints(last_checkpoints)` Sets the list of not-yet-deleted checkpoint filenames. ##### Args: * `last_checkpoints`: a list of checkpoint filenames. ##### Raises: * `AssertionError`: if the list of checkpoint filenames has already been set. - - - #### `tf.train.Saver.as_saver_def()` Generates a `SaverDef` representation of this saver. ##### Returns: A `SaverDef` proto. - - - ### `tf.train.latest_checkpoint(checkpoint_dir, latest_filename=None)` Finds the filename of latest saved checkpoint file. ##### Args: * `checkpoint_dir`: Directory where the variables were saved. * `latest_filename`: Optional name for the protocol buffer file that contains the list of most recent checkpoint filenames. See the corresponding argument to `Saver.save()`. ##### Returns: The full path to the latest checkpoint or None if no checkpoint was found. - - - ### `tf.train.get_checkpoint_state(checkpoint_dir, latest_filename=None)` Returns CheckpointState proto from the "checkpoint" file. If the "checkpoint" file contains a valid CheckpointState proto, returns it. ##### Args: * `checkpoint_dir`: The directory of checkpoints. * `latest_filename`: Optional name of the checkpoint file. Default to 'checkpoint'. ##### Returns: A CheckpointState if the state was available, None otherwise. 
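The bookkeeping behind `last_checkpoints`, `max_to_keep` and `latest_checkpoint()` can be illustrated without TensorFlow. The following is only a sketch: `CheckpointLedger` is a hypothetical class invented for illustration, and the `path-step` naming is an assumption about how the global step is appended to `save_path`.

```python
class CheckpointLedger:
    """Hypothetical stand-in for the saver's bookkeeping; not a TensorFlow class."""

    def __init__(self, max_to_keep):
        self.max_to_keep = max_to_keep
        self.last_checkpoints = []  # sorted from oldest to newest

    def record(self, save_path, global_step=None):
        # Assumed naming scheme: the step number is appended after a dash.
        if global_step is None:
            path = save_path
        else:
            path = "%s-%d" % (save_path, global_step)
        self.last_checkpoints.append(path)
        # Drop the oldest entries once more than max_to_keep are tracked.
        deleted = []
        while len(self.last_checkpoints) > self.max_to_keep:
            deleted.append(self.last_checkpoints.pop(0))
        return path, deleted

    def latest(self):
        # Mirrors the "path or None" contract described for latest_checkpoint().
        return self.last_checkpoints[-1] if self.last_checkpoints else None


ledger = CheckpointLedger(max_to_keep=2)
for step in (100, 200, 300):
    ledger.record("/tmp/model.ckpt", global_step=step)

print(ledger.last_checkpoints)
print(ledger.latest())
```

With `max_to_keep=2`, only the two newest paths remain in the list after the third save, and the newest one is what a `latest_checkpoint()`-style lookup would hand to `restore()`.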
- - -

### `tf.train.update_checkpoint_state(save_dir, model_checkpoint_path, all_model_checkpoint_paths=None, latest_filename=None)`

Updates the content of the 'checkpoint' file.

This updates the checkpoint file containing a CheckpointState proto.

##### Args:

* `save_dir`: Directory where the model was saved.
* `model_checkpoint_path`: The checkpoint file.
* `all_model_checkpoint_paths`: list of strings. Paths to all not-yet-deleted checkpoints, sorted from oldest to newest. If this is a non-empty list, the last element must be equal to model_checkpoint_path. These paths are also saved in the CheckpointState proto.
* `latest_filename`: Optional name of the checkpoint file. Defaults to 'checkpoint'.

##### Raises:

* `RuntimeError`: If the save paths conflict.

## Sharing Variables

TensorFlow provides several classes and operations that you can use to create variables contingent on certain conditions.

- - -

### `tf.get_variable(name, shape=None, dtype=tf.float32, initializer=None, trainable=True, collections=None)`

Gets an existing variable with these parameters or creates a new one.

This function prefixes the name with the current variable scope and performs reuse checks. See the [Variable Scope How To](../../how_tos/variable_scope/index.md) for an extensive description of how reusing works. Here is a basic example:

```python
with tf.variable_scope("foo"):
    v = tf.get_variable("v", [1])  # v.name == "foo/v:0"
    w = tf.get_variable("w", [1])  # w.name == "foo/w:0"
with tf.variable_scope("foo", reuse=True):
    v1 = tf.get_variable("v")  # The same as v above.
```

If initializer is `None` (the default), the default initializer passed in the constructor is used. If that one is `None` too, a `UniformUnitScalingInitializer` will be used.

##### Args:

* `name`: the name of the new or existing variable.
* `shape`: shape of the new or existing variable.
* `dtype`: type of the new or existing variable (defaults to `DT_FLOAT`).
* `initializer`: initializer for the variable if one is created.
* `trainable`: If `True` also add the variable to the graph collection `GraphKeys.TRAINABLE_VARIABLES` (see variables.Variable).
* `collections`: List of graph collections keys to add the Variable to. Defaults to `[GraphKeys.VARIABLES]` (see variables.Variable).

##### Returns:

The created or existing variable.

##### Raises:

* `ValueError`: when creating a new variable and shape is not declared, or when violating reuse during variable creation. Reuse is set inside `variable_scope`.

- - -

### `tf.get_variable_scope()`

Returns the current variable scope.

- - -

### `tf.variable_scope(name_or_scope, reuse=None, initializer=None)`

Returns a context for variable scope.

Variable scope allows you to create new variables and to share already created ones, while providing checks against creating or sharing by accident. For details, see the [Variable Scope How To](../../how_tos/variable_scope/index.md); here we present only a few basic examples.

Simple example of how to create a new variable:

```python
with tf.variable_scope("foo"):
    with tf.variable_scope("bar"):
        v = tf.get_variable("v", [1])
        assert v.name == "foo/bar/v:0"
```

Basic example of sharing a variable:

```python
with tf.variable_scope("foo"):
    v = tf.get_variable("v", [1])
with tf.variable_scope("foo", reuse=True):
    v1 = tf.get_variable("v", [1])
assert v1 == v
```

Sharing a variable by capturing a scope and setting reuse:

```python
with tf.variable_scope("foo") as scope:
    v = tf.get_variable("v", [1])
    scope.reuse_variables()
    v1 = tf.get_variable("v", [1])
assert v1 == v
```

To prevent accidental sharing of variables, we raise an exception when getting an existing variable in a non-reusing scope.

```python
with tf.variable_scope("foo") as scope:
    v = tf.get_variable("v", [1])
    v1 = tf.get_variable("v", [1])
    #  Raises ValueError("... v already exists ...").
```

Similarly, we raise an exception when trying to get a variable that does not exist in reuse mode.
```python
with tf.variable_scope("foo", reuse=True):
    v = tf.get_variable("v", [1])
    #  Raises ValueError("... v does not exist ...").
```

Note that the `reuse` flag is inherited: if we open a reusing scope, then all its sub-scopes become reusing as well.

##### Args:

* `name_or_scope`: `string` or `VariableScope`: the scope to open.
* `reuse`: `True` or `None`; if `True`, we go into reuse mode for this scope as well as all sub-scopes; if `None`, we just inherit the parent scope reuse.
* `initializer`: default initializer for variables within this scope.

##### Yields:

A scope that can be captured and reused.

##### Raises:

* `ValueError`: when trying to reuse within a create scope, or create within a reuse scope, or if reuse is not `None` or `True`.
* `TypeError`: when the types of some arguments are not appropriate.

- - -

### `tf.constant_initializer(value=0.0)`

Returns an initializer that generates Tensors with a single value.

##### Args:

* `value`: A Python scalar. All elements of the initialized variable will be set to this value.

##### Returns:

An initializer that generates Tensors with a single value.

- - -

### `tf.random_normal_initializer(mean=0.0, stddev=1.0, seed=None)`

Returns an initializer that generates Tensors with a normal distribution.

##### Args:

* `mean`: a python scalar or a scalar tensor. Mean of the random values to generate.
* `stddev`: a python scalar or a scalar tensor. Standard deviation of the random values to generate.
* `seed`: A Python integer. Used to create random seeds. See [`set_random_seed`](../../api_docs/python/constant_op.md#set_random_seed) for behavior.

##### Returns:

An initializer that generates Tensors with a normal distribution.

- - -

### `tf.truncated_normal_initializer(mean=0.0, stddev=1.0, seed=None)`

Returns an initializer that generates a truncated normal distribution.
These values are similar to values from a random_normal_initializer except that values more than two standard deviations from the mean are discarded and re-drawn. This is the recommended initializer for neural network weights and filters. ##### Args: * `mean`: a python scalar or a scalar tensor. Mean of the random values to generate. * `stddev`: a python scalar or a scalar tensor. Standard deviation of the random values to generate. * `seed`: A Python integer. Used to create random seeds. See [`set_random_seed`](../../api_docs/python/constant_op.md#set_random_seed) for behavior. ##### Returns: An initializer that generates Tensors with a truncated normal distribution. - - - ### `tf.random_uniform_initializer(minval=0.0, maxval=1.0, seed=None)` Returns an initializer that generates Tensors with a uniform distribution. ##### Args: * `minval`: a python scalar or a scalar tensor. lower bound of the range of random values to generate. * `maxval`: a python scalar or a scalar tensor. upper bound of the range of random values to generate. * `seed`: A Python integer. Used to create random seeds. See [`set_random_seed`](../../api_docs/python/constant_op.md#set_random_seed) for behavior. ##### Returns: An initializer that generates Tensors with a uniform distribution. - - - ### `tf.uniform_unit_scaling_initializer(factor=1.0, seed=None)` Returns an initializer that generates tensors without scaling variance. When initializing a deep network, it is in principle advantageous to keep the scale of the input variance constant, so it does not explode or diminish by reaching the final layer. If the input is `x` and the operation `x * W`, and we want to initialize `W` uniformly at random, we need to pick `W` from [-sqrt(3) / sqrt(dim), sqrt(3) / sqrt(dim)] to keep the scale intact, where `dim = W.shape[0]` (the size of the input). A similar calculation for convolutional networks gives an analogous result with `dim` equal to the product of the first 3 dimensions. 
When nonlinearities are present, we need to multiply this by a constant `factor`. See for deeper motivation, experiments and the calculation of constants. In section 2.3 there, the constants were numerically computed: for a linear layer it's 1.0, relu: ~1.43, tanh: ~1.15. ##### Args: * `factor`: Float. A multiplicative factor by which the values will be scaled. * `seed`: A Python integer. Used to create random seeds. See [`set_random_seed`](../../api_docs/python/constant_op.md#set_random_seed) for behavior. ##### Returns: An initializer that generates tensors with unit variance. - - - ### `tf.zeros_initializer(shape, dtype=tf.float32)` An adaptor for zeros() to match the Initializer spec. ## Sparse Variable Updates The sparse update ops modify a subset of the entries in a dense `Variable`, either overwriting the entries or adding / subtracting a delta. These are useful for training embedding models and similar lookup-based networks, since only a small subset of embedding vectors change in any given step. Since a sparse update of a large tensor may be generated automatically during gradient computation (as in the gradient of [`tf.gather`](../../api_docs/python/array_ops.md#gather)), an [`IndexedSlices`](#IndexedSlices) class is provided that encapsulates a set of sparse indices and values. `IndexedSlices` objects are detected and handled automatically by the optimizers in most cases. - - - ### `tf.scatter_update(ref, indices, updates, use_locking=None, name=None)` Applies sparse updates to a variable reference. This operation computes # Scalar indices ref[indices, ...] = updates[...] # Vector indices (for each i) ref[indices[i], ...] = updates[i, ...] # High rank indices (for each i, ..., j) ref[indices[i, ..., j], ...] = updates[i, ..., j, ...] This operation outputs `ref` after the update is done. This makes it easier to chain operations that need to use the reset value. 
If `indices` contains duplicate entries, lexicographically later entries override earlier entries. Requires `updates.shape = indices.shape + ref.shape[1:]`.
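The vector-indices case and the duplicate rule can be modelled on nested Python lists. This is a minimal sketch, not the TensorFlow kernel: the local `scatter_update` function below is a stand-in written for illustration, and it assumes "later" means later in iteration order over a 1-D `indices` list.

```python
def scatter_update(ref, indices, updates):
    """Pure-Python model of ref[indices[i], ...] = updates[i, ...].

    Mutates `ref` in place and returns it, like the op returns `ref`.
    """
    if len(indices) != len(updates):
        raise ValueError("updates must provide one row per index")
    for i, row in zip(indices, updates):
        ref[i] = list(row)  # a later duplicate index overwrites an earlier one
    return ref


ref = [[0, 0], [0, 0], [0, 0], [0, 0]]
result = scatter_update(ref, [2, 0, 2], [[1, 1], [5, 5], [9, 9]])
print(result)  # [[5, 5], [0, 0], [9, 9], [0, 0]]
```

Row 2 ends up as `[9, 9]` because the second write to index 2 replaces the first; rows that are not indexed are left untouched.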
##### Args: * `ref`: A mutable `Tensor`. Should be from a `Variable` node. * `indices`: A `Tensor`. Must be one of the following types: `int32`, `int64`. A tensor of indices into the first dimension of `ref`. * `updates`: A `Tensor`. Must have the same type as `ref`. A tensor of updated values to store in `ref`. * `use_locking`: An optional `bool`. Defaults to `True`. If True, the assignment will be protected by a lock; otherwise the behavior is undefined, but may exhibit less contention. * `name`: A name for the operation (optional). ##### Returns: Same as `ref`. Returned as a convenience for operations that want to use the updated values after the update is done. - - - ### `tf.scatter_add(ref, indices, updates, use_locking=None, name=None)` Adds sparse updates to a variable reference. This operation computes # Scalar indices ref[indices, ...] += updates[...] # Vector indices (for each i) ref[indices[i], ...] += updates[i, ...] # High rank indices (for each i, ..., j) ref[indices[i, ..., j], ...] += updates[i, ..., j, ...] This operation outputs `ref` after the update is done. This makes it easier to chain operations that need to use the reset value. Duplicate entries are handled correctly: if multiple `indices` reference the same location, their contributions add. Requires `updates.shape = indices.shape + ref.shape[1:]`.
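The additive variant differs from a plain overwrite only in how duplicates behave. The sketch below models it on nested Python lists; the local `scatter_add` function is an illustrative stand-in for 1-D `indices`, not the TensorFlow op itself.

```python
def scatter_add(ref, indices, updates):
    """Pure-Python model of ref[indices[i], ...] += updates[i, ...]."""
    if len(indices) != len(updates):
        raise ValueError("updates must provide one row per index")
    for i, row in zip(indices, updates):
        # Contributions to the same row accumulate instead of overwriting.
        ref[i] = [old + delta for old, delta in zip(ref[i], row)]
    return ref


ref = [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]
scatter_add(ref, [0, 2, 0], [[0.5, 0.5], [2.0, 2.0], [0.25, 0.25]])
print(ref)  # [[1.75, 1.75], [1.0, 1.0], [3.0, 3.0]]
```

Index 0 appears twice, so row 0 receives both deltas (`0.5 + 0.25`). This is the behaviour an embedding gradient needs when the same row is looked up more than once in a batch.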
##### Args:

* `ref`: A mutable `Tensor`. Must be one of the following types: `float32`, `float64`, `int64`, `int32`, `uint8`, `int16`, `int8`, `complex64`, `qint8`, `quint8`, `qint32`. Should be from a `Variable` node.
* `indices`: A `Tensor`. Must be one of the following types: `int32`, `int64`. A tensor of indices into the first dimension of `ref`.
* `updates`: A `Tensor`. Must have the same type as `ref`. A tensor of updated values to add to `ref`.
* `use_locking`: An optional `bool`. Defaults to `False`. If True, the addition will be protected by a lock; otherwise the behavior is undefined, but may exhibit less contention.
* `name`: A name for the operation (optional).

##### Returns:

Same as `ref`. Returned as a convenience for operations that want to use the updated values after the update is done.

- - -

### `tf.scatter_sub(ref, indices, updates, use_locking=None, name=None)`

Subtracts sparse updates from a variable reference.

    # Scalar indices
    ref[indices, ...] -= updates[...]

    # Vector indices (for each i)
    ref[indices[i], ...] -= updates[i, ...]

    # High rank indices (for each i, ..., j)
    ref[indices[i, ..., j], ...] -= updates[i, ..., j, ...]

This operation outputs `ref` after the update is done. This makes it easier to chain operations that need to use the reset value.

Duplicate entries are handled correctly: if multiple `indices` reference the same location, their (negated) contributions add.

Requires `updates.shape = indices.shape + ref.shape[1:]`.
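The shape requirement shared by the scatter ops is easy to check by hand. The helpers below are hypothetical (they are not part of TensorFlow) and work on plain Python tuples; they only spell out the rule `updates.shape = indices.shape + ref.shape[1:]`.

```python
def required_updates_shape(indices_shape, ref_shape):
    """Shape `updates` must have: indices.shape + ref.shape[1:]."""
    return tuple(indices_shape) + tuple(ref_shape[1:])


def check_updates_shape(indices_shape, ref_shape, updates_shape):
    """Raise ValueError if `updates_shape` violates the rule; else return it."""
    expected = required_updates_shape(indices_shape, ref_shape)
    if tuple(updates_shape) != expected:
        raise ValueError(
            "updates has shape %r, expected %r" % (tuple(updates_shape), expected))
    return expected


# Scalar index into a [1000, 10] variable: one row of 10 values.
print(required_updates_shape((), (1000, 10)))            # (10,)
# Vector of 4 indices: 4 rows of 10 values.
print(required_updates_shape((4,), (1000, 10)))          # (4, 10)
# Rank-2 indices of shape [2, 3] into a [1000, 10, 5] variable.
print(required_updates_shape((2, 3), (1000, 10, 5)))     # (2, 3, 10, 5)
```

The three calls correspond to the scalar, vector and high-rank index cases listed in the pseudocode.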
##### Args:

* `ref`: A mutable `Tensor`. Must be one of the following types: `float32`, `float64`, `int64`, `int32`, `uint8`, `int16`, `int8`, `complex64`, `qint8`, `quint8`, `qint32`. Should be from a `Variable` node.
* `indices`: A `Tensor`. Must be one of the following types: `int32`, `int64`. A tensor of indices into the first dimension of `ref`.
* `updates`: A `Tensor`. Must have the same type as `ref`. A tensor of updated values to subtract from `ref`.
* `use_locking`: An optional `bool`. Defaults to `False`. If True, the subtraction will be protected by a lock; otherwise the behavior is undefined, but may exhibit less contention.
* `name`: A name for the operation (optional).

##### Returns:

Same as `ref`. Returned as a convenience for operations that want to use the updated values after the update is done.

- - -

### `tf.sparse_mask(a, mask_indices, name=None)`

Masks elements of `IndexedSlices`.

Given an `IndexedSlices` instance `a`, returns another `IndexedSlices` that contains a subset of the slices of `a`. The slices at the indices listed in `mask_indices` are removed; all other slices are returned.

This is useful when you need to extract a subset of slices in an `IndexedSlices` object.

For example:

```python
# `a` contains slices at indices [12, 26, 37, 45] from a large tensor
# with shape [1000, 10]
a.indices => [12, 26, 37, 45]
tf.shape(a.values) => [4, 10]

# `b` will be the subset of `a` slices at its second and third indices, so
# we want to mask out its first and last indices (which are at absolute
# indices 12, 45)
b = tf.sparse_mask(a, [12, 45])

b.indices => [26, 37]
tf.shape(b.values) => [2, 10]
```

##### Args:

* `a`: An `IndexedSlices` instance.
* `mask_indices`: Indices of elements to mask.
* `name`: A name for the operation (optional).

##### Returns:

The masked `IndexedSlices` instance.

- - -

### `class tf.IndexedSlices`

A sparse representation of a set of tensor slices at given indices.
This class is a simple wrapper for a pair of `Tensor` objects: * `values`: A `Tensor` of any dtype with shape `[D0, D1, ..., Dn]`. * `indices`: A 1-D integer `Tensor` with shape `[D0]`. An `IndexedSlices` is typically used to represent a subset of a larger tensor `dense` of shape `[LARGE0, D1, .. , DN]` where `LARGE0 >> D0`. The values in `indices` are the indices in the first dimension of the slices that have been extracted from the larger tensor. The dense tensor `dense` represented by an `IndexedSlices` `slices` has ```python dense[slices.indices[i], :, :, :, ...] = slices.values[i, :, :, :, ...] ``` The `IndexedSlices` class is used principally in the definition of gradients for operations that have sparse gradients (e.g. [`tf.gather`](../../api_docs/python/array_ops.md#gather)). Contrast this representation with [`SparseTensor`](../../api_docs/python/sparse_ops.md#SparseTensor), which uses multi-dimensional indices and scalar values. - - - #### `tf.IndexedSlices.__init__(values, indices, dense_shape=None)` Creates an `IndexedSlices`. - - - #### `tf.IndexedSlices.values` A `Tensor` containing the values of the slices. - - - #### `tf.IndexedSlices.indices` A 1-D `Tensor` containing the indices of the slices. - - - #### `tf.IndexedSlices.dense_shape` A 1-D `Tensor` containing the shape of the corresponding dense tensor. - - - #### `tf.IndexedSlices.name` The name of this `IndexedSlices`. - - - #### `tf.IndexedSlices.dtype` The `DType` of elements in this tensor. - - - #### `tf.IndexedSlices.device` The name of the device on which `values` will be produced, or `None`. - - - #### `tf.IndexedSlices.op` The `Operation` that produces `values` as an output. 
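The relation `dense[slices.indices[i], ...] = slices.values[i, ...]` given for `IndexedSlices` can be illustrated with plain lists. This is a sketch under two assumptions that the text does not state: rows not named in `indices` are taken to be zero, and `indices` contains no duplicates. The function `to_dense` is hypothetical, not a TensorFlow API.

```python
def to_dense(indices, values, num_rows):
    """Expand (indices, values) pairs into a dense 2-D list of `num_rows` rows.

    Assumes unique indices and zero for every row that is not listed.
    """
    if len(indices) != len(values):
        raise ValueError("need exactly one values row per index")
    if not values:
        raise ValueError("cannot infer the row width from empty values")
    width = len(values[0])
    dense = [[0] * width for _ in range(num_rows)]
    for i, row in zip(indices, values):
        dense[i] = list(row)
    return dense


dense = to_dense(indices=[1, 3], values=[[1, 2], [3, 4]], num_rows=5)
print(dense)  # [[0, 0], [1, 2], [0, 0], [3, 4], [0, 0]]
```

Here `values` has shape `[2, 2]` while the dense tensor has shape `[5, 2]`, which is the `LARGE0 >> D0` situation on a small scale: only the rows that changed are stored.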
================================================ FILE: SOURCE/api_docs/python/train.md ================================================ # Training ## Contents ### [Training](#AUTOGENERATED-training) * [Optimizers](#AUTOGENERATED-optimizers) * [`class tf.train.Optimizer`](#Optimizer) * [Usage](#AUTOGENERATED-usage) * [Processing gradients before applying them.](#AUTOGENERATED-processing-gradients-before-applying-them.) * [Gating Gradients](#AUTOGENERATED-gating-gradients) * [Slots](#AUTOGENERATED-slots) * [`class tf.train.GradientDescentOptimizer`](#GradientDescentOptimizer) * [`class tf.train.AdagradOptimizer`](#AdagradOptimizer) * [`class tf.train.MomentumOptimizer`](#MomentumOptimizer) * [`class tf.train.AdamOptimizer`](#AdamOptimizer) * [`class tf.train.FtrlOptimizer`](#FtrlOptimizer) * [`class tf.train.RMSPropOptimizer`](#RMSPropOptimizer) * [Gradient Computation](#AUTOGENERATED-gradient-computation) * [`tf.gradients(ys, xs, grad_ys=None, name='gradients', colocate_gradients_with_ops=False, gate_gradients=False, aggregation_method=None)`](#gradients) * [`class tf.AggregationMethod`](#AggregationMethod) * [`tf.stop_gradient(input, name=None)`](#stop_gradient) * [Gradient Clipping](#AUTOGENERATED-gradient-clipping) * [`tf.clip_by_value(t, clip_value_min, clip_value_max, name=None)`](#clip_by_value) * [`tf.clip_by_norm(t, clip_norm, name=None)`](#clip_by_norm) * [`tf.clip_by_average_norm(t, clip_norm, name=None)`](#clip_by_average_norm) * [`tf.clip_by_global_norm(t_list, clip_norm, use_norm=None, name=None)`](#clip_by_global_norm) * [`tf.global_norm(t_list, name=None)`](#global_norm) * [Decaying the learning rate](#AUTOGENERATED-decaying-the-learning-rate) * [`tf.train.exponential_decay(learning_rate, global_step, decay_steps, decay_rate, staircase=False, name=None)`](#exponential_decay) * [Moving Averages](#AUTOGENERATED-moving-averages) * [`class tf.train.ExponentialMovingAverage`](#ExponentialMovingAverage) * [Coordinator and 
QueueRunner](#AUTOGENERATED-coordinator-and-queuerunner) * [`class tf.train.Coordinator`](#Coordinator) * [`class tf.train.QueueRunner`](#QueueRunner) * [`tf.train.add_queue_runner(qr, collection='queue_runners')`](#add_queue_runner) * [`tf.train.start_queue_runners(sess=None, coord=None, daemon=True, start=True, collection='queue_runners')`](#start_queue_runners) * [Summary Operations](#AUTOGENERATED-summary-operations) * [`tf.scalar_summary(tags, values, collections=None, name=None)`](#scalar_summary) * [`tf.image_summary(tag, tensor, max_images=None, collections=None, name=None)`](#image_summary) * [`tf.histogram_summary(tag, values, collections=None, name=None)`](#histogram_summary) * [`tf.nn.zero_fraction(value, name=None)`](#zero_fraction) * [`tf.merge_summary(inputs, collections=None, name=None)`](#merge_summary) * [`tf.merge_all_summaries(key='summaries')`](#merge_all_summaries) * [Adding Summaries to Event Files](#AUTOGENERATED-adding-summaries-to-event-files) * [`class tf.train.SummaryWriter`](#SummaryWriter) * [`tf.train.summary_iterator(path)`](#summary_iterator) * [Training utilities](#AUTOGENERATED-training-utilities) * [`tf.train.global_step(sess, global_step_tensor)`](#global_step) * [`tf.train.write_graph(graph_def, logdir, name, as_text=True)`](#write_graph) This library provides a set of classes and functions that helps train models. ## Optimizers The Optimizer base class provides methods to compute gradients for a loss and apply gradients to variables. A collection of subclasses implement classic optimization algorithms such as GradientDescent and Adagrad. You never instantiate the Optimizer class itself, but instead instantiate one of the subclasses. - - - ### `class tf.train.Optimizer` Base class for optimizers. This class defines the API to add Ops to train a model. You never use this class directly, but instead instantiate one of its subclasses such as `GradientDescentOptimizer`, `AdagradOptimizer`, or `MomentumOptimizer`. 
### Usage

```
# Create an optimizer with the desired parameters.
opt = tf.train.GradientDescentOptimizer(learning_rate=0.1)
# Add Ops to the graph to minimize a cost by updating a list of variables.
# "cost" is a Tensor, and the list of variables contains variables.Variable
# objects.
opt_op = opt.minimize(cost, var_list=<list of variables>)
```

In the training program you will just have to run the returned Op.

```
# Execute opt_op to do one step of training:
opt_op.run()
```

### Processing gradients before applying them.

Calling `minimize()` takes care of both computing the gradients and applying them to the variables. If you want to process the gradients before applying them you can instead use the optimizer in three steps:

1. Compute the gradients with `compute_gradients()`.
2. Process the gradients as you wish.
3. Apply the processed gradients with `apply_gradients()`.

Example:

```
# Create an optimizer.
opt = tf.train.GradientDescentOptimizer(learning_rate=0.1)

# Compute the gradients for a list of variables.
grads_and_vars = opt.compute_gradients(loss, <list of variables>)

# grads_and_vars is a list of tuples (gradient, variable). Do whatever you
# need to the 'gradient' part, for example cap them, etc.
capped_grads_and_vars = [(MyCapper(gv[0]), gv[1]) for gv in grads_and_vars]

# Ask the optimizer to apply the capped gradients.
opt.apply_gradients(capped_grads_and_vars)
```

- - -

#### `tf.train.Optimizer.__init__(use_locking, name)`

Create a new Optimizer.

This must be called by the constructors of subclasses.

##### Args:

* `use_locking`: Bool. If True, use locks to prevent concurrent updates to variables.
* `name`: A non-empty string. The name to use for accumulators created for the optimizer.

##### Raises:

* `ValueError`: if name is malformed.

- - -

#### `tf.train.Optimizer.minimize(loss, global_step=None, var_list=None, gate_gradients=1, name=None)`

Add operations to minimize 'loss' by updating 'var_list'.

This method simply combines calls to compute_gradients() and apply_gradients().
If you want to process the gradients before applying them, call compute_gradients() and apply_gradients() explicitly instead of using this function.

##### Args:

* `loss`: A Tensor containing the value to minimize.
* `global_step`: Optional Variable to increment by one after the variables have been updated.
* `var_list`: Optional list of variables.Variable to update to minimize 'loss'. Defaults to the list of variables collected in the graph under the key GraphKeys.TRAINABLE_VARIABLES.
* `gate_gradients`: How to gate the computation of gradients. Can be GATE_NONE, GATE_OP, or GATE_GRAPH.
* `name`: Optional name for the returned operation.

##### Returns:

An Operation that updates the variables in 'var_list'. If 'global_step' was not None, that operation also increments global_step.

##### Raises:

* `ValueError`: if some of the variables are not variables.Variable objects.

- - -

#### `tf.train.Optimizer.compute_gradients(loss, var_list=None, gate_gradients=1)`

Compute gradients of "loss" for the variables in "var_list".

This is the first part of minimize(). It returns a list of (gradient, variable) pairs where "gradient" is the gradient for "variable". Note that "gradient" can be a Tensor, an IndexedSlices, or None if there is no gradient for the given variable.

##### Args:

* `loss`: A Tensor containing the value to minimize.
* `var_list`: Optional list of variables.Variable to update to minimize "loss". Defaults to the list of variables collected in the graph under the key GraphKeys.TRAINABLE_VARIABLES.
* `gate_gradients`: How to gate the computation of gradients. Can be GATE_NONE, GATE_OP, or GATE_GRAPH.

##### Returns:

A list of (gradient, variable) pairs.

##### Raises:

* `TypeError`: If var_list contains anything other than variables.Variable.
* `ValueError`: If some arguments are invalid.

- - -

#### `tf.train.Optimizer.apply_gradients(grads_and_vars, global_step=None, name=None)`

Apply gradients to variables.

This is the second part of minimize().
It returns an Operation that applies gradients.

##### Args:

* `grads_and_vars`: List of (gradient, variable) pairs as returned by compute_gradients().
* `global_step`: Optional Variable to increment by one after the variables have been updated.
* `name`: Optional name for the returned operation. Defaults to the name passed to the Optimizer constructor.

##### Returns:

An Operation that applies the specified gradients. If 'global_step' was not None, that operation also increments global_step.

##### Raises:

* `TypeError`: if grads_and_vars is malformed.

### Gating Gradients

Both `minimize()` and `compute_gradients()` accept a `gate_gradients` argument that controls the degree of parallelism during the application of the gradients.

The possible values are: `GATE_NONE`, `GATE_OP`, and `GATE_GRAPH`.

GATE_NONE: Compute and apply gradients in parallel. This provides the maximum parallelism in execution, at the cost of some non-reproducibility in the results. For example the two gradients of MatMul depend on the input values: With `GATE_NONE` one of the gradients could be applied to one of the inputs _before_ the other gradient is computed, resulting in non-reproducible results.

GATE_OP: For each Op, make sure all gradients are computed before they are used. This prevents race conditions for Ops that generate gradients for multiple inputs where the gradients depend on the inputs.

GATE_GRAPH: Make sure all gradients for all variables are computed before any one of them is used. This provides the least parallelism but can be useful if you want to process all gradients before applying any of them.

### Slots

Some optimizer subclasses, such as `MomentumOptimizer` and `AdagradOptimizer`, allocate and manage additional variables associated with the variables to train. These are called Slots. Slots have names and you can ask the optimizer for the names of the slots that it uses. Once you have a slot name you can ask the optimizer for the variable it created to hold the slot value.
This can be useful if you want to debug a training algorithm, report stats about the slots, etc.

- - -

#### `tf.train.Optimizer.get_slot_names()`

Return a list of the names of slots created by the Optimizer.

See get_slot().

##### Returns:

A list of strings.

- - -

#### `tf.train.Optimizer.get_slot(var, name)`

Return a slot named "name" created for "var" by the Optimizer.

Some Optimizer subclasses use additional variables. For example Momentum and Adagrad use variables to accumulate updates. This method gives access to these Variables if for some reason you need them.

Use get_slot_names() to get the list of slot names created by the Optimizer.

##### Args:

* `var`: A variable passed to minimize() or apply_gradients().
* `name`: A string.

##### Returns:

The Variable for the slot if it was created, None otherwise.

- - -

### `class tf.train.GradientDescentOptimizer`

Optimizer that implements the gradient descent algorithm.

- - -

#### `tf.train.GradientDescentOptimizer.__init__(learning_rate, use_locking=False, name='GradientDescent')`

Construct a new gradient descent optimizer.

##### Args:

* `learning_rate`: A Tensor or a floating point value. The learning rate to use.
* `use_locking`: If True use locks for update operations.
* `name`: Optional name prefix for the operations created when applying gradients. Defaults to "GradientDescent".

- - -

### `class tf.train.AdagradOptimizer`

Optimizer that implements the Adagrad algorithm.

- - -

#### `tf.train.AdagradOptimizer.__init__(learning_rate, initial_accumulator_value=0.1, use_locking=False, name='Adagrad')`

Construct a new Adagrad optimizer.

##### Args:

* `learning_rate`: A `Tensor` or a floating point value. The learning rate.
* `initial_accumulator_value`: A floating point value. Starting value for the accumulators, must be positive.
* `use_locking`: If `True` use locks for update operations.
* `name`: Optional name prefix for the operations created when applying gradients. Defaults to "Adagrad".
##### Raises:

* `ValueError`: If the initial_accumulator_value is invalid.

- - -

### `class tf.train.MomentumOptimizer`

Optimizer that implements the Momentum algorithm.

- - -

#### `tf.train.MomentumOptimizer.__init__(learning_rate, momentum, use_locking=False, name='Momentum')`

Construct a new Momentum optimizer.

##### Args:

* `learning_rate`: A `Tensor` or a floating point value. The learning rate.
* `momentum`: A `Tensor` or a floating point value. The momentum.
* `use_locking`: If `True` use locks for update operations.
* `name`: Optional name prefix for the operations created when applying gradients. Defaults to "Momentum".

- - -

### `class tf.train.AdamOptimizer`

Optimizer that implements the Adam algorithm.

- - -

#### `tf.train.AdamOptimizer.__init__(learning_rate=0.001, beta1=0.9, beta2=0.999, epsilon=1e-08, use_locking=False, name='Adam')`

Construct a new Adam optimizer.

Implementation is based on: http://arxiv.org/pdf/1412.6980v7.pdf

Initialization:

```
m_0 <- 0 (Initialize initial 1st moment vector)
v_0 <- 0 (Initialize initial 2nd moment vector)
t <- 0 (Initialize timestep)
```

The update rule for `variable` with gradient `g` uses an optimization described at the end of section 2 of the paper:

```
t <- t + 1
lr_t <- learning_rate * sqrt(1 - beta2^t) / (1 - beta1^t)

m_t <- beta1 * m_{t-1} + (1 - beta1) * g
v_t <- beta2 * v_{t-1} + (1 - beta2) * g * g
variable <- variable - lr_t * m_t / (sqrt(v_t) + epsilon)
```

The default value of 1e-8 for epsilon might not be a good default in general. For example, when training an Inception network on ImageNet a current good choice is 1.0 or 0.1.

##### Args:

* `learning_rate`: A Tensor or a floating point value. The learning rate.
* `beta1`: A float value or a constant float tensor. The exponential decay rate for the 1st moment estimates.
* `beta2`: A float value or a constant float tensor. The exponential decay rate for the 2nd moment estimates.
* `epsilon`: A small constant for numerical stability.
* `use_locking`: If True use locks for update operations.
* `name`: Optional name for the operations created when applying gradients. Defaults to "Adam".

- - -

### `class tf.train.FtrlOptimizer`

Optimizer that implements the FTRL algorithm.

- - -

#### `tf.train.FtrlOptimizer.__init__(learning_rate, learning_rate_power=-0.5, initial_accumulator_value=0.1, l1_regularization_strength=0.0, l2_regularization_strength=0.0, use_locking=False, name='Ftrl')`

Construct a new FTRL optimizer.

The Ftrl-proximal algorithm, short for Follow-the-regularized-leader, is described in the paper [Ad Click Prediction: a View from the Trenches](https://www.eecs.tufts.edu/~dsculley/papers/ad-click-prediction.pdf).

It can give a good performance vs. sparsity tradeoff.

Ftrl-proximal uses its own global base learning rate and can behave like Adagrad with `learning_rate_power=-0.5`, or like gradient descent with `learning_rate_power=0.0`.

The effective learning rate is adjusted per parameter, relative to this base learning rate as:

```
effective_learning_rate_i = (learning_rate /
    pow(k + summed_squared_gradients_for_i, -learning_rate_power));
```

where k is the small constant `initial_accumulator_value`.

Note that the real regularization coefficient of `|w|^2` for objective function is `1 / lambda_2` if specifying `l2 = lambda_2` as argument when using this function.

##### Args:

* `learning_rate`: A float value or a constant float `Tensor`.
* `learning_rate_power`: A float value, must be less than or equal to zero.
* `initial_accumulator_value`: The starting value for accumulators. Only positive values are allowed.
* `l1_regularization_strength`: A float value, must be greater than or equal to zero.
* `l2_regularization_strength`: A float value, must be greater than or equal to zero.
* `use_locking`: If `True` use locks for update operations.
* `name`: Optional name prefix for the operations created when applying gradients. Defaults to "Ftrl".
##### Raises:

*  `ValueError`: if one of the arguments is invalid.

- - -

### `class tf.train.RMSPropOptimizer`

Optimizer that implements the RMSProp algorithm.

- - -

#### `tf.train.RMSPropOptimizer.__init__(learning_rate, decay, momentum=0.0, epsilon=1e-10, use_locking=False, name='RMSProp')`

Construct a new RMSProp optimizer.

##### Args:

*  `learning_rate`: A Tensor or a floating point value. The learning rate.
*  `decay`: Discounting factor for the history/coming gradient.
*  `momentum`: A scalar tensor.
*  `epsilon`: Small value to avoid a zero denominator.
*  `use_locking`: If True use locks for update operations.
*  `name`: Optional name prefix for the operations created when applying gradients. Defaults to "RMSProp".

## Gradient Computation

TensorFlow provides functions to compute the derivatives for a given TensorFlow computation graph, adding operations to the graph. The optimizer classes automatically compute derivatives on your graph, but creators of new Optimizers or expert users can call the lower-level functions below.

- - -

### `tf.gradients(ys, xs, grad_ys=None, name='gradients', colocate_gradients_with_ops=False, gate_gradients=False, aggregation_method=None)`

Constructs symbolic partial derivatives of `ys` w.r.t. x in `xs`.

`ys` and `xs` are each a `Tensor` or a list of tensors. `grad_ys` is a list of `Tensor`, holding the gradients received by the `ys`. The list must be the same length as `ys`.

`gradients()` adds ops to the graph to output the partial derivatives of `ys` with respect to `xs`. It returns a list of `Tensor` of length `len(xs)` where each tensor is the `sum(dy/dx)` for y in `ys`.

`grad_ys` is a list of tensors of the same length as `ys` that holds the initial gradients for each y in `ys`. When `grad_ys` is None, we fill in a tensor of '1's of the shape of y for each y in `ys`.
A user can provide their own initial `grad_ys` to compute the derivatives using a different initial gradient for each y (e.g., if one wanted to weight the gradient differently for each value in each y).

##### Args:

*  `ys`: A `Tensor` or list of tensors to be differentiated.
*  `xs`: A `Tensor` or list of tensors to be used for differentiation.
*  `grad_ys`: Optional. A `Tensor` or list of tensors the same size as `ys` and holding the gradients computed for each y in `ys`.
*  `name`: Optional name to use for grouping all the gradient ops together. Defaults to 'gradients'.
*  `colocate_gradients_with_ops`: If True, try colocating gradients with the corresponding op.
*  `gate_gradients`: If True, add a tuple around the gradients returned for an operation. This avoids some race conditions.
*  `aggregation_method`: Specifies the method used to combine gradient terms. Accepted values are constants defined in the class `AggregationMethod`.

##### Returns:

A list of `sum(dy/dx)` for each x in `xs`.

##### Raises:

*  `LookupError`: if one of the operations between `x` and `y` does not have a registered gradient function.
*  `ValueError`: if the arguments are invalid.

- - -

### `class tf.AggregationMethod`

A class listing aggregation methods used to combine gradients.

Computing partial derivatives can require aggregating gradient contributions. This class lists the various methods that can be used to combine gradients in the graph:

*  `ADD_N`: All of the gradient terms are summed as part of one operation using the "AddN" op. It has the property that all gradients must be ready before any aggregation is performed.
*  `DEFAULT`: The system-chosen default aggregation method.

- - -

### `tf.stop_gradient(input, name=None)`

Stops gradient computation.

When executed in a graph, this op outputs its input tensor as-is.

When building ops to compute gradients, this op prevents the contribution of its inputs from being taken into account.
Normally, the gradient generator adds ops to a graph to compute the derivatives of a specified 'loss' by recursively finding out inputs that contributed to its computation. If you insert this op in the graph its inputs are masked from the gradient generator. They are not taken into account for computing gradients.

This is useful any time you want to compute a value with TensorFlow but need to pretend that the value was a constant. Some examples include:

*  The *EM* algorithm where the *M-step* should not involve backpropagation through the output of the *E-step*.
*  Contrastive divergence training of Boltzmann machines where, when differentiating the energy function, the training must not backpropagate through the graph that generated the samples from the model.
*  Adversarial training, where no backprop should happen through the adversarial example generation process.

##### Args:

*  `input`: A `Tensor`.
*  `name`: A name for the operation (optional).

##### Returns:

A `Tensor`. Has the same type as `input`.

## Gradient Clipping

TensorFlow provides several operations that you can use to add clipping functions to your graph. You can use these functions to perform general data clipping, but they're particularly useful for handling exploding or vanishing gradients.

- - -

### `tf.clip_by_value(t, clip_value_min, clip_value_max, name=None)`

Clips tensor values to a specified min and max.

Given a tensor `t`, this operation returns a tensor of the same type and shape as `t` with its values clipped to `clip_value_min` and `clip_value_max`. Any values less than `clip_value_min` are set to `clip_value_min`. Any values greater than `clip_value_max` are set to `clip_value_max`.

##### Args:

*  `t`: A `Tensor`.
*  `clip_value_min`: A 0-D (scalar) `Tensor`. The minimum value to clip by.
*  `clip_value_max`: A 0-D (scalar) `Tensor`. The maximum value to clip by.
*  `name`: A name for the operation (optional).

##### Returns:

A clipped `Tensor`.
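The element-wise rule that `tf.clip_by_value` describes can be illustrated with plain Python arithmetic. The sketch below is not TensorFlow code: `clip_by_value_sketch` is a hypothetical helper that works on a flat Python list rather than a `Tensor`, and is only meant to show the clamping behaviour.

```python
def clip_by_value_sketch(values, clip_value_min, clip_value_max):
    """Hypothetical helper: clamp each element into [clip_value_min, clip_value_max]."""
    # Values below the minimum are raised to it; values above the maximum
    # are lowered to it; everything in between is left unchanged.
    return [min(max(v, clip_value_min), clip_value_max) for v in values]


clipped = clip_by_value_sketch([-2.0, 0.5, 3.0], 0.0, 1.0)
print(clipped)  # prints [0.0, 0.5, 1.0]
```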
- - -

### `tf.clip_by_norm(t, clip_norm, name=None)`

Clips tensor values to a maximum L2-norm.

Given a tensor `t`, and a maximum clip value `clip_norm`, this operation normalizes `t` so that its L2-norm is less than or equal to `clip_norm`. Specifically, if the L2-norm is already less than or equal to `clip_norm`, then `t` is not modified. If the L2-norm is greater than `clip_norm`, then this operation returns a tensor of the same type and shape as `t` with its values set to:

`t * clip_norm / l2norm(t)`

In this case, the L2-norm of the output tensor is `clip_norm`.

This operation is typically used to clip gradients before applying them with an optimizer.

##### Args:

*  `t`: A `Tensor`.
*  `clip_norm`: A 0-D (scalar) `Tensor` > 0. A maximum clipping value.
*  `name`: A name for the operation (optional).

##### Returns:

A clipped `Tensor`.

- - -

### `tf.clip_by_average_norm(t, clip_norm, name=None)`

Clips tensor values to a maximum average L2-norm.

Given a tensor `t`, and a maximum clip value `clip_norm`, this operation normalizes `t` so that its average L2-norm is less than or equal to `clip_norm`. Specifically, if the average L2-norm is already less than or equal to `clip_norm`, then `t` is not modified. If the average L2-norm is greater than `clip_norm`, then this operation returns a tensor of the same type and shape as `t` with its values set to:

`t * clip_norm / l2norm_avg(t)`

In this case, the average L2-norm of the output tensor is `clip_norm`.

This operation is typically used to clip gradients before applying them with an optimizer.

##### Args:

*  `t`: A `Tensor`.
*  `clip_norm`: A 0-D (scalar) `Tensor` > 0. A maximum clipping value.
*  `name`: A name for the operation (optional).

##### Returns:

A clipped `Tensor`.

- - -

### `tf.clip_by_global_norm(t_list, clip_norm, use_norm=None, name=None)`

Clips values of multiple tensors by the ratio of the sum of their norms.
Given a tuple or list of tensors `t_list`, and a clipping ratio `clip_norm`, this operation returns a list of clipped tensors `list_clipped` and the global norm (`global_norm`) of all tensors in `t_list`. Optionally, if you've already computed the global norm for `t_list`, you can specify the global norm with `use_norm`.

To perform the clipping, the values `t_list[i]` are set to:

`t_list[i] * clip_norm / max(global_norm, clip_norm)`

where:

`global_norm = sqrt(sum([l2norm(t)**2 for t in t_list]))`

If `clip_norm > global_norm` then the entries in `t_list` remain as they are, otherwise they're all shrunk by the global ratio.

Any of the entries of `t_list` that are of type None are ignored.

This is the correct way to perform gradient clipping (for example, see R. Pascanu, T. Mikolov, and Y. Bengio, "On the difficulty of training Recurrent Neural Networks". http://arxiv.org/abs/1211.5063).

However, it is slower than `clip_by_norm()` because all the parameters must be ready before the clipping operation can be performed.

##### Args:

*  `t_list`: A tuple or list of mixed `Tensors`, `IndexedSlices`, or None.
*  `clip_norm`: A 0-D (scalar) `Tensor` > 0. The clipping ratio.
*  `use_norm`: A 0-D (scalar) `Tensor` of type `float` (optional). The global norm to use. If not provided, `global_norm()` is used to compute the norm.
*  `name`: A name for the operation (optional).

##### Returns:

*  `list_clipped`: A list of `Tensors` of the same type as `t_list`.
*  `global_norm`: A 0-D (scalar) `Tensor` representing the global norm.

##### Raises:

*  `TypeError`: If `t_list` is not a sequence.

- - -

### `tf.global_norm(t_list, name=None)`

Computes the global norm of multiple tensors.

Given a tuple or list of tensors `t_list`, this operation returns the global norm of the elements in all tensors in `t_list`. The global norm is computed as:

`global_norm = sqrt(sum([l2norm(t)**2 for t in t_list]))`

Any entries in `t_list` that are of type None are ignored.
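The global-norm formula, and the scaling that `clip_by_global_norm()` applies with it, can be checked with a small plain-Python sketch. This is not TensorFlow code: `l2norm`, `global_norm_sketch` and `clip_by_global_norm_sketch` are hypothetical helpers that treat each "tensor" as a flat list of floats, and the example values are made up for illustration.

```python
import math


def l2norm(t):
    """L2-norm of one flat list of floats."""
    return math.sqrt(sum(x * x for x in t))


def global_norm_sketch(t_list):
    """sqrt(sum(l2norm(t)**2)) over all entries, ignoring None entries."""
    return math.sqrt(sum(l2norm(t) ** 2 for t in t_list if t is not None))


def clip_by_global_norm_sketch(t_list, clip_norm):
    """Scale every entry by clip_norm / max(global_norm, clip_norm)."""
    norm = global_norm_sketch(t_list)
    scale = clip_norm / max(norm, clip_norm)
    clipped = [None if t is None else [x * scale for x in t] for t in t_list]
    return clipped, norm


grads = [[3.0, 0.0], None, [0.0, 4.0]]
clipped, norm = clip_by_global_norm_sketch(grads, clip_norm=2.5)
print(norm)     # prints 5.0
print(clipped)  # prints [[1.5, 0.0], None, [0.0, 2.0]]
```

Because the scale factor is `clip_norm / max(global_norm, clip_norm)`, a list whose global norm is already below `clip_norm` is multiplied by 1 and comes back unchanged.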
##### Args:

*  `t_list`: A tuple or list of mixed `Tensors`, `IndexedSlices`, or None.
*  `name`: A name for the operation (optional).

##### Returns:

A 0-D (scalar) `Tensor` of type `float`.

##### Raises:

*  `TypeError`: If `t_list` is not a sequence.

## Decaying the learning rate

- - -

### `tf.train.exponential_decay(learning_rate, global_step, decay_steps, decay_rate, staircase=False, name=None)`

Applies exponential decay to the learning rate.

When training a model, it is often recommended to lower the learning rate as the training progresses. This function applies an exponential decay function to a provided initial learning rate. It requires a `global_step` value to compute the decayed learning rate. You can just pass a TensorFlow variable that you increment at each training step.

The function returns the decayed learning rate. It is computed as:

```python
decayed_learning_rate = learning_rate *
                        decay_rate ^ (global_step / decay_steps)
```

If the argument `staircase` is `True`, then `global_step / decay_steps` is an integer division and the decayed learning rate follows a staircase function.

Example: decay every 100000 steps with a base of 0.96:

```python
...
global_step = tf.Variable(0, trainable=False)
starter_learning_rate = 0.1
learning_rate = tf.train.exponential_decay(starter_learning_rate, global_step,
                                           100000, 0.96, staircase=True)
optimizer = tf.train.GradientDescentOptimizer(learning_rate)
# Passing global_step to minimize() will increment it at each step.
optimizer.minimize(...my loss..., global_step=global_step)
```

##### Args:

*  `learning_rate`: A scalar `float32` or `float64` `Tensor` or a Python number. The initial learning rate.
*  `global_step`: A scalar `int32` or `int64` `Tensor` or a Python number. Global step to use for the decay computation. Must not be negative.
*  `decay_steps`: A scalar `int32` or `int64` `Tensor` or a Python number. Must be positive. See the decay computation above.
*  `decay_rate`: A scalar `float32` or `float64` `Tensor` or a Python number.
   The decay rate.
*  `staircase`: Boolean. If `True` decay the learning rate at discrete intervals.
*  `name`: string. Optional name of the operation. Defaults to 'ExponentialDecay'.

##### Returns:

A scalar `Tensor` of the same type as `learning_rate`. The decayed learning rate.

## Moving Averages

Some training algorithms, such as GradientDescent and Momentum, often benefit from maintaining a moving average of variables during optimization. Using the moving averages for evaluations often improves results significantly.

- - -

### `class tf.train.ExponentialMovingAverage`

Maintains moving averages of variables by employing an exponential decay.

When training a model, it is often beneficial to maintain moving averages of the trained parameters. Evaluations that use averaged parameters sometimes produce significantly better results than the final trained values.

The `apply()` method adds shadow copies of trained variables and adds ops that maintain a moving average of the trained variables in their shadow copies. It is used when building the training model. The ops that maintain moving averages are typically run after each training step. The `average()` and `average_name()` methods give access to the shadow variables and their names. They are useful when building an evaluation model, or when restoring a model from a checkpoint file. They help use the moving averages in place of the last trained values for evaluations.

The moving averages are computed using exponential decay. You specify the decay value when creating the `ExponentialMovingAverage` object. The shadow variables are initialized with the same initial values as the trained variables.
When you run the ops to maintain the moving averages, each shadow variable is updated with the formula:

`shadow_variable -= (1 - decay) * (shadow_variable - variable)`

This is mathematically equivalent to the classic formula below, but the use of an `assign_sub` op (the `"-="` in the formula) allows concurrent lockless updates to the variables:

`shadow_variable = decay * shadow_variable + (1 - decay) * variable`

Reasonable values for `decay` are close to 1.0, typically in the multiple-nines range: 0.999, 0.9999, etc.

Example usage when creating a training model:

```python
# Create variables.
var0 = tf.Variable(...)
var1 = tf.Variable(...)
# ... use the variables to build a training model...
...
# Create an op that applies the optimizer.  This is what we usually
# would use as a training op.
opt_op = opt.minimize(my_loss, var_list=[var0, var1])

# Create an ExponentialMovingAverage object
ema = tf.train.ExponentialMovingAverage(decay=0.9999)

# Create the shadow variables, and add ops to maintain moving averages
# of var0 and var1.
maintain_averages_op = ema.apply([var0, var1])

# Create an op that will update the moving averages after each training
# step.  This is what we will use in place of the usual training op.
with tf.control_dependencies([opt_op]):
    training_op = tf.group(maintain_averages_op)

...train the model by running training_op...
```

There are two ways to use the moving averages for evaluations:

*  Build a model that uses the shadow variables instead of the variables. For this, use the `average()` method which returns the shadow variable for a given variable.
*  Build a model normally but load the checkpoint files to evaluate by using the shadow variable names. For this use the `average_name()` method. See the [Saver class](../../api_docs/python/train.md#Saver) for more information on restoring saved variables.

Example of restoring the shadow variable values:

```python
# Create a Saver that loads variables from their saved shadow values.
shadow_var0_name = ema.average_name(var0)
shadow_var1_name = ema.average_name(var1)
saver = tf.train.Saver({shadow_var0_name: var0, shadow_var1_name: var1})
saver.restore(...checkpoint filename...)
# var0 and var1 now hold the moving average values
```

- - -

#### `tf.train.ExponentialMovingAverage.__init__(decay, num_updates=None, name='ExponentialMovingAverage')`

Creates a new ExponentialMovingAverage object.

The `apply()` method has to be called to create shadow variables and add ops to maintain moving averages.

The optional `num_updates` parameter allows one to tweak the decay rate dynamically. It is typical to pass the count of training steps, usually kept in a variable that is incremented at each step, in which case the decay rate is lower at the start of training. This makes moving averages move faster. If passed, the actual decay rate used is:

`min(decay, (1 + num_updates) / (10 + num_updates))`

##### Args:

*  `decay`: Float. The decay to use.
*  `num_updates`: Optional count of number of updates applied to variables.
*  `name`: String. Optional prefix name to use for the name of ops added in `apply()`.

- - -

#### `tf.train.ExponentialMovingAverage.apply(var_list=None)`

Maintains moving averages of variables.

`var_list` must be a list of `Variable` or `Tensor` objects. This method creates shadow variables for all elements of `var_list`. Shadow variables for `Variable` objects are initialized to the variable's initial value. For `Tensor` objects, the shadow variables are initialized to 0.

Shadow variables are created with `trainable=False` and added to the `GraphKeys.ALL_VARIABLES` collection. They will be returned by calls to `tf.all_variables()`.

Returns an op that updates all shadow variables as described above.

Note that `apply()` can be called multiple times with different lists of variables.

##### Args:

*  `var_list`: A list of Variable or Tensor objects. The variables and Tensors must be of types float32 or float64.
##### Returns:

An Operation that updates the moving averages.

##### Raises:

*  `TypeError`: If the arguments are not all float32 or float64.
*  `ValueError`: If the moving average of one of the variables is already being computed.

- - -

#### `tf.train.ExponentialMovingAverage.average_name(var)`

Returns the name of the `Variable` holding the average for `var`.

The typical scenario for `ExponentialMovingAverage` is to compute moving averages of variables during training, and restore the variables from the computed moving averages during evaluations.

To restore variables, you have to know the name of the shadow variables. That name and the original variable can then be passed to a `Saver()` object to restore the variable from the moving average value with:

`saver = tf.train.Saver({ema.average_name(var): var})`

`average_name()` can be called whether or not `apply()` has been called.

##### Args:

*  `var`: A `Variable` object.

##### Returns:

A string: the name of the variable that will be used or was used by the `ExponentialMovingAverage` class to hold the moving average of `var`.

- - -

#### `tf.train.ExponentialMovingAverage.average(var)`

Returns the `Variable` holding the average of `var`.

##### Args:

*  `var`: A `Variable` object.

##### Returns:

A `Variable` object or `None` if the moving average of `var` is not maintained.

## Coordinator and QueueRunner

See [Threading and Queues](../../how_tos/threading_and_queues/index.md) for how to use threads and queues. For documentation on the Queue API, see [Queues](../../api_docs/python/io_ops.md#queues).

- - -

### `class tf.train.Coordinator`

A coordinator for threads.

This class implements a simple mechanism to coordinate the termination of a set of threads.

#### Usage:

```python
# Create a coordinator.
coord = Coordinator()
# Start a number of threads, passing the coordinator to each of them.
...start thread 1...(coord, ...)
...start thread N...(coord, ...)
# Wait for all the threads to terminate.
coord.join(threads)
```

Any of the threads can call `coord.request_stop()` to ask for all the threads to stop. To cooperate with the requests, each thread must check for `coord.should_stop()` on a regular basis. `coord.should_stop()` returns `True` as soon as `coord.request_stop()` has been called.

A typical thread running with a Coordinator will do something like:

```python
while not coord.should_stop():
  ...do some work...
```

#### Exception handling:

A thread can report an exception to the Coordinator as part of the `request_stop()` call. The exception will be re-raised from the `coord.join()` call.

Thread code:

```python
try:
  while not coord.should_stop():
    ...do some work...
except Exception as e:
  coord.request_stop(e)
```

Main code:

```python
try:
  ...
  coord = Coordinator()
  # Start a number of threads, passing the coordinator to each of them.
  ...start thread 1...(coord, ...)
  ...start thread N...(coord, ...)
  # Wait for all the threads to terminate.
  coord.join(threads)
except Exception as e:
  ...exception that was passed to coord.request_stop()
```

#### Grace period for stopping:

After a thread has called `coord.request_stop()` the other threads have a fixed time to stop. This is called the 'stop grace period' and defaults to 2 minutes. If any of the threads is still alive after the grace period expires, `coord.join()` raises a `RuntimeError` reporting the laggards.

```python
try:
  ...
  coord = Coordinator()
  # Start a number of threads, passing the coordinator to each of them.
  ...start thread 1...(coord, ...)
  ...start thread N...(coord, ...)
  # Wait for all the threads to terminate, give them 10s grace period
  coord.join(threads, stop_grace_period_secs=10)
except RuntimeError:
  ...one of the threads took more than 10s to stop after request_stop()
  ...was called.
except Exception:
  ...exception that was passed to coord.request_stop()
```

- - -

#### `tf.train.Coordinator.__init__()`

Create a new Coordinator.
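The cooperative stop protocol described above (`request_stop()` sets a flag, every thread polls `should_stop()`, and `join()` waits for the threads) can be sketched with the standard `threading` module alone. `MiniCoordinator` and `worker` below are hypothetical stand-ins written for illustration. They are not the TensorFlow implementation, and they leave out exception forwarding and the stop grace period.

```python
import threading


class MiniCoordinator(object):
    """Hypothetical stand-in showing only the shared stop flag."""

    def __init__(self):
        self._stop_event = threading.Event()

    def request_stop(self):
        # Any thread may call this; all threads then see should_stop() == True.
        self._stop_event.set()

    def should_stop(self):
        return self._stop_event.is_set()

    def join(self, threads):
        # Block until every thread has left its work loop.
        for t in threads:
            t.join()


def worker(coord, counts, index, limit):
    # Poll the flag on every iteration, as the usage pattern above requires.
    while not coord.should_stop():
        counts[index] += 1
        if counts[index] >= limit:
            coord.request_stop()


coord = MiniCoordinator()
counts = [0, 0, 0]
threads = [threading.Thread(target=worker, args=(coord, counts, i, 1000))
           for i in range(3)]
for t in threads:
    t.start()
coord.join(threads)
```

Once the first worker reaches its limit it requests a stop, and the other workers exit at their next `should_stop()` check, so `join()` returns without any thread being killed from outside.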
- - -

#### `tf.train.Coordinator.join(threads, stop_grace_period_secs=120)`

Wait for threads to terminate.

Blocks until all 'threads' have terminated or request_stop() is called.

After the threads stop, if an 'exc_info' was passed to request_stop, that exception is re-raised.

Grace period handling: When request_stop() is called, threads are given 'stop_grace_period_secs' seconds to terminate. If any of them is still alive after that period expires, a RuntimeError is raised. Note that if an 'exc_info' was passed to request_stop() then it is raised instead of that RuntimeError.

##### Args:

*  `threads`: List of threading.Threads. The started threads to join.
*  `stop_grace_period_secs`: Number of seconds given to threads to stop after request_stop() has been called.

##### Raises:

*  `RuntimeError`: If any thread is still alive after request_stop() is called and the grace period expires.

- - -

#### `tf.train.Coordinator.request_stop(ex=None)`

Request that the threads stop.

After this is called, calls to should_stop() will return True.

##### Args:

*  `ex`: Optional Exception, or Python 'exc_info' tuple as returned by sys.exc_info(). If this is the first call to request_stop() the corresponding exception is recorded and re-raised from join().

- - -

#### `tf.train.Coordinator.should_stop()`

Check if stop was requested.

##### Returns:

True if a stop was requested.

- - -

#### `tf.train.Coordinator.wait_for_stop(timeout=None)`

Wait till the Coordinator is told to stop.

##### Args:

*  `timeout`: float. Sleep for up to that many seconds waiting for should_stop() to become True.

##### Returns:

True if the Coordinator is told to stop, False if the timeout expired.

- - -

### `class tf.train.QueueRunner`

Holds a list of enqueue operations for a queue, each to be run in a thread.

Queues are a convenient TensorFlow mechanism to compute tensors asynchronously using multiple threads.
For example in the canonical 'Input Reader' setup one set of threads generates filenames in a queue; a second set of threads read records from the files, processes them, and enqueues tensors on a second queue; a third set of threads dequeues these input records to construct batches and runs them through training operations. There are several delicate issues when running multiple threads that way: closing the queues in sequence as the input is exhausted, correctly catching and reporting exceptions, etc. The `QueueRunner`, combined with the `Coordinator`, helps handle these issues. - - - #### `tf.train.QueueRunner.__init__(queue, enqueue_ops)` Create a QueueRunner. On construction the `QueueRunner` adds an op to close the queue. That op will be run if the enqueue ops raise exceptions. When you later call the `create_threads()` method, the `QueueRunner` will create one thread for each op in `enqueue_ops`. Each thread will run its enqueue op in parallel with the other threads. The enqueue ops do not have to all be the same op, but it is expected that they all enqueue tensors in `queue`. ##### Args: * `queue`: A `Queue`. * `enqueue_ops`: List of enqueue ops to run in threads later. - - - #### `tf.train.QueueRunner.create_threads(sess, coord=None, daemon=False, start=False)` Create threads to run the enqueue ops. This method requires a session in which the graph was launched. It creates a list of threads, optionally starting them. There is one thread for each op passed in `enqueue_ops`. The `coord` argument is an optional coordinator, that the threads will use to terminate together and report exceptions. If a coordinator is given, this method starts an additional thread to close the queue when the coordinator requests a stop. This method may be called again as long as all threads from a previous call have stopped. ##### Args: * `sess`: A `Session`. * `coord`: Optional `Coordinator` object for reporting errors and checking stop conditions. * `daemon`: Boolean. 
If `True` make the threads daemon threads. * `start`: Boolean. If `True` starts the threads. If `False` the caller must call the `start()` method of the returned threads. ##### Returns: A list of threads. ##### Raises: * `RuntimeError`: If threads from a previous call to `create_threads()` are still running. - - - #### `tf.train.QueueRunner.exceptions_raised` Exceptions raised but not handled by the `QueueRunner` threads. Exceptions raised in queue runner threads are handled in one of two ways depending on whether or not a `Coordinator` was passed to `create_threads()`: * With a `Coordinator`, exceptions are reported to the coordinator and forgotten by the `QueueRunner`. * Without a `Coordinator`, exceptions are captured by the `QueueRunner` and made available in this `exceptions_raised` property. ##### Returns: A list of Python `Exception` objects. The list is empty if no exception was captured. (No exceptions are captured when using a Coordinator.) - - - ### `tf.train.add_queue_runner(qr, collection='queue_runners')` Adds a `QueueRunner` to a collection in the graph. When building a complex model that uses many queues it is often difficult to gather all the queue runners that need to be run. This convenience function allows you to add a queue runner to a well known collection in the graph. The companion method `start_queue_runners()` can be used to start threads for all the collected queue runners. ##### Args: * `qr`: A `QueueRunner`. * `collection`: A `GraphKey` specifying the graph collection to add the queue runner to. Defaults to `GraphKeys.QUEUE_RUNNERS`. - - - ### `tf.train.start_queue_runners(sess=None, coord=None, daemon=True, start=True, collection='queue_runners')` Starts all queue runners collected in the graph. This is a companion method to `add_queue_runner()`. It just starts threads for all queue runners collected in the graph. It returns the list of all threads. ##### Args: * `sess`: `Session` used to run the queue ops. 
Defaults to the default session. * `coord`: Optional `Coordinator` for coordinating the started threads. * `daemon`: Whether the threads should be marked as `daemons`, meaning they don't block program exit. * `start`: Set to `False` to only create the threads, not start them. * `collection`: A `GraphKey` specifying the graph collection to get the queue runners from. Defaults to `GraphKeys.QUEUE_RUNNERS`. ##### Returns: A list of threads. ## Summary Operations The following ops output [`Summary`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/summary.proto) protocol buffers as serialized string tensors. You can fetch the output of a summary op in a session, and pass it to a [SummaryWriter](../../api_docs/python/train.md#SummaryWriter) to append it to an event file. Event files contain [`Event`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/util/event.proto) protos that can contain `Summary` protos along with the timestamp and step. You can then use TensorBoard to visualize the contents of the event files. See [TensorBoard and Summaries](../../how_tos/summaries_and_tensorboard/index.md) for more details. - - - ### `tf.scalar_summary(tags, values, collections=None, name=None)` Outputs a `Summary` protocol buffer with scalar values. The input `tags` and `values` must have the same shape. The generated summary has a summary value for each tag-value pair in `tags` and `values`. ##### Args: * `tags`: A 1-D `string` `Tensor`. Tags for the summaries. * `values`: A 1-D `float32` or `float64` Tensor. Values for the summaries. * `collections`: Optional list of graph collections keys. The new summary op is added to these collections. Defaults to `[GraphKeys.SUMMARIES]`. * `name`: A name for the operation (optional). ##### Returns: A scalar `Tensor` of type `string`. The serialized `Summary` protocol buffer. 
- - - ### `tf.image_summary(tag, tensor, max_images=None, collections=None, name=None)` Outputs a `Summary` protocol buffer with images. The summary has up to `max_images` summary values containing images. The images are built from `tensor` which must be 4-D with shape `[batch_size, height, width, channels]` and where `channels` can be: * 1: `tensor` is interpreted as Grayscale. * 3: `tensor` is interpreted as RGB. * 4: `tensor` is interpreted as RGBA. The images have the same number of channels as the input tensor. Their values are normalized, one image at a time, to fit in the range `[0, 255]`. The op uses two different normalization algorithms: * If the input values are all positive, they are rescaled so the largest one is 255. * If any input value is negative, the values are shifted so input value 0.0 is at 127. They are then rescaled so that either the smallest value is 0, or the largest one is 255. The `tag` argument is a scalar `Tensor` of type `string`. It is used to build the `tag` of the summary values: * If `max_images` is 1, the summary value tag is '*tag*/image'. * If `max_images` is greater than 1, the summary value tags are generated sequentially as '*tag*/image/0', '*tag*/image/1', etc. ##### Args: * `tag`: A scalar `Tensor` of type `string`. Used to build the `tag` of the summary values. * `tensor`: A 4-D `float32` `Tensor` of shape `[batch_size, height, width, channels]` where `channels` is 1, 3, or 4. * `max_images`: Max number of batch elements to generate images for. * `collections`: Optional list of ops.GraphKeys. The collections to add the summary to. Defaults to [ops.GraphKeys.SUMMARIES] * `name`: A name for the operation (optional). ##### Returns: A scalar `Tensor` of type `string`. The serialized `Summary` protocol buffer. - - - ### `tf.histogram_summary(tag, values, collections=None, name=None)` Outputs a `Summary` protocol buffer with a histogram. 
The generated [`Summary`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/summary.proto) has one summary value containing a histogram for `values`.

This op reports an `OutOfRange` error if any value is not finite.

##### Args:

*  `tag`: A `string` `Tensor`. 0-D. Tag to use for the summary value.
*  `values`: A `float32` `Tensor`. Any shape. Values to use to build the histogram.
*  `collections`: Optional list of graph collections keys. The new summary op is added to these collections. Defaults to `[GraphKeys.SUMMARIES]`.
*  `name`: A name for the operation (optional).

##### Returns:

A scalar `Tensor` of type `string`. The serialized `Summary` protocol buffer.

- - -

### `tf.nn.zero_fraction(value, name=None)`

Returns the fraction of zeros in `value`.

If `value` is empty, the result is `nan`.

This is useful in summaries to measure and report sparsity. For example,

    z = tf.nn.relu(...)
    summ = tf.scalar_summary('sparsity', tf.nn.zero_fraction(z))

##### Args:

*  `value`: A tensor of numeric type.
*  `name`: A name for the operation (optional).

##### Returns:

The fraction of zeros in `value`, with type `float32`.

- - -

### `tf.merge_summary(inputs, collections=None, name=None)`

Merges summaries.

This op creates a [`Summary`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/summary.proto) protocol buffer that contains the union of all the values in the input summaries.

When the Op is run, it reports an `InvalidArgument` error if multiple values in the summaries to merge use the same tag.

##### Args:

*  `inputs`: A list of `string` `Tensor` objects containing serialized `Summary` protocol buffers.
*  `collections`: Optional list of graph collections keys. The new summary op is added to these collections. Defaults to `[GraphKeys.SUMMARIES]`.
*  `name`: A name for the operation (optional).

##### Returns:

A scalar `Tensor` of type `string`. The serialized `Summary` protocol buffer resulting from the merging.
- - -

### `tf.merge_all_summaries(key='summaries')`

Merges all summaries collected in the default graph.

##### Args:

* `key`: `GraphKey` used to collect the summaries. Defaults to `GraphKeys.SUMMARIES`.

##### Returns:

If no summaries were collected, returns None. Otherwise returns a scalar `Tensor` of type `string` containing the serialized `Summary` protocol buffer resulting from the merging.

## Adding Summaries to Event Files

See [Summaries and TensorBoard](../../how_tos/summaries_and_tensorboard/index.md) for an overview of summaries, event files, and visualization in TensorBoard.

- - -

### `class tf.train.SummaryWriter`

Writes `Summary` protocol buffers to event files.

The `SummaryWriter` class provides a mechanism to create an event file in a given directory and add summaries and events to it. The class updates the file contents asynchronously. This allows a training program to call methods to add data to the file directly from the training loop, without slowing down training.

- - -

#### `tf.train.SummaryWriter.__init__(logdir, graph_def=None, max_queue=10, flush_secs=120)`

Creates a `SummaryWriter` and an event file.

On construction the summary writer creates a new event file in `logdir`. This event file will contain `Event` protocol buffers constructed when you call one of the following functions: `add_summary()`, `add_event()`, or `add_graph()`.

If you pass a `graph_def` protocol buffer to the constructor it is added to the event file. (This is equivalent to calling `add_graph()` later.)

TensorBoard will pick the graph from the file and display it graphically so you can interactively explore the graph you built. You will usually pass the graph from the session in which you launched it:

```python
...create a graph...
# Launch the graph in a session.
sess = tf.Session()
# Create a summary writer, add the 'graph_def' to the event file.
writer = tf.train.SummaryWriter(<some-directory>, sess.graph_def)
```

The other arguments to the constructor control the asynchronous writes to the event file:

* `flush_secs`: How often, in seconds, to flush the added summaries and events to disk.
* `max_queue`: Maximum number of summaries or events pending to be written to disk before one of the 'add' calls blocks.

##### Args:

* `logdir`: A string. Directory where event file will be written.
* `graph_def`: A `GraphDef` protocol buffer.
* `max_queue`: Integer. Size of the queue for pending events and summaries.
* `flush_secs`: Number. How often, in seconds, to flush the pending events and summaries to disk.

- - -

#### `tf.train.SummaryWriter.add_summary(summary, global_step=None)`

Adds a `Summary` protocol buffer to the event file.

This method wraps the provided summary in an `Event` protocol buffer and adds it to the event file.

You can pass the output of any summary op, as-is, to this function. You can also pass a `Summary` protocol buffer that you manufacture with your own data. This is commonly done to report evaluation results in event files.

##### Args:

* `summary`: A `Summary` protocol buffer, optionally serialized as a string.
* `global_step`: Number. Optional global step value to record with the summary.

- - -

#### `tf.train.SummaryWriter.add_event(event)`

Adds an event to the event file.

##### Args:

* `event`: An `Event` protocol buffer.

- - -

#### `tf.train.SummaryWriter.add_graph(graph_def, global_step=None)`

Adds a `GraphDef` protocol buffer to the event file.

The graph described by the protocol buffer will be displayed by TensorBoard. Most users pass a graph in the constructor instead.

##### Args:

* `graph_def`: A `GraphDef` protocol buffer.
* `global_step`: Number. Optional global step counter to record with the graph.

- - -

#### `tf.train.SummaryWriter.flush()`

Flushes the event file to disk.

Call this method to make sure that all pending events have been written to disk.
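To make the roles of `max_queue` and `flush()` more concrete, here is a minimal plain-Python sketch of a writer that queues events and drains them on a background thread. It is only an illustration under stated assumptions, not how `SummaryWriter` is implemented: the class name `AsyncEventWriter` and the in-memory `sink` list are hypothetical, and the periodic flushing controlled by `flush_secs` is left out.

```python
import queue
import threading


class AsyncEventWriter(object):
    """Hypothetical writer: 'add' calls enqueue, a worker thread writes."""

    def __init__(self, sink, max_queue=10):
        self._sink = sink  # stands in for the event file
        # A bounded queue: put() blocks once max_queue events are pending.
        self._queue = queue.Queue(maxsize=max_queue)
        worker = threading.Thread(target=self._drain, daemon=True)
        worker.start()

    def add_event(self, event):
        # Returns quickly unless the queue is full.
        self._queue.put(event)

    def _drain(self):
        while True:
            event = self._queue.get()
            self._sink.append(event)
            self._queue.task_done()

    def flush(self):
        # Wait until every queued event has been handed to the sink.
        self._queue.join()


written = []
writer = AsyncEventWriter(written, max_queue=10)
for step in range(25):
    writer.add_event(step)
writer.flush()
print(len(written))
```

The bounded queue is the design choice that keeps a training loop from running arbitrarily far ahead of the disk: once `max_queue` events are pending, the next 'add' call waits for the worker.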
- - -

#### `tf.train.SummaryWriter.close()`

Flushes the event file to disk and closes the file.

Call this method when you do not need the summary writer anymore.

- - -

### `tf.train.summary_iterator(path)`

An iterator for reading `Event` protocol buffers from an event file.

You can use this function to read events written to an event file. It returns a Python iterator that yields `Event` protocol buffers.

Example: Print the contents of an events file.

```python
# 'path_to_events_file' is a string holding the path to an events file.
for e in tf.train.summary_iterator(path_to_events_file):
    print e
```

Example: Print selected summary values.

```python
# This example supposes that the events file contains summaries with a
# summary value tag 'loss'.  These could have been added by calling
# `add_summary()`, passing the output of a scalar summary op created
# with: `tf.scalar_summary(['loss'], loss_tensor)`.
for e in tf.train.summary_iterator(path_to_events_file):
    for v in e.summary.value:
        if v.tag == 'loss':
            print v.simple_value
```

See the protocol buffer definitions of
[Event](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/util/event.proto)
and
[Summary](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/summary.proto)
for more information about their attributes.

##### Args:

* `path`: The path to an event file created by a `SummaryWriter`.

##### Yields:

`Event` protocol buffers.

## Training utilities

- - -

### `tf.train.global_step(sess, global_step_tensor)`

Small helper to get the global step.

```python
# Creates a variable to hold the global_step.
global_step_tensor = tf.Variable(10, trainable=False, name='global_step')
# Creates a session.
sess = tf.Session()
# Initializes the variable.
sess.run(global_step_tensor.initializer)
print 'global_step:', tf.train.global_step(sess, global_step_tensor)

# Output:
# global_step: 10
```

##### Args:

* `sess`: A TensorFlow `Session` object.
* `global_step_tensor`: `Tensor` or the `name` of the operation that contains the global step.
##### Returns:

The global step value.

- - -

### `tf.train.write_graph(graph_def, logdir, name, as_text=True)`

Writes a graph proto on disk.

The graph is written as a binary proto unless `as_text` is `True`.

```python
v = tf.Variable(0, name='my_variable')
sess = tf.Session()
tf.train.write_graph(sess.graph_def, '/tmp/my-model', 'train.pbtxt')
```

##### Args:

* `graph_def`: A `GraphDef` protocol buffer.
* `logdir`: Directory where to write the graph.
* `name`: Filename for the graph.
* `as_text`: If `True`, writes the graph as an ASCII proto.

================================================
FILE: SOURCE/extras/README.txt
================================================
This directory holds extra files we'd like to be able to link to and serve from within tensorflow.org

================================================
FILE: SOURCE/faq.md
================================================
# Frequently Asked Questions

## About

This chapter is not a translation of the official documentation. It collects problems that TensorFlow enthusiasts ran into while installing and using TensorFlow.

## Questions and Answers

### (Example) Where is the official website?

[http://www.tensorflow.org/](http://www.tensorflow.org/)

### How do I install pip?

* Ubuntu (14.04): `sudo apt-get update && sudo apt-get install -y python python-dev python-pip`
* CentOS 7: `yum update -y && yum install -y python python-devel epel-release.noarch python-pip`
* Mac OS X: `sudo easy_install pip`

### `docker run -it b.gcr.io/tensorflow/tensorflow` fails

The registry that hosts this image is blocked in mainland China, so you need a proxy or VPN to reach it. An exported copy of the image is available [here](http://pan.baidu.com/s/1bnyVrMR) (password: v9ts).

To use it:

    docker load < sensorflow.tar.gz

### SSLError: SSL_VERIFY_FAILED

When installing with pip from a URL, you may hit the following error:

    SSLError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:590)

Fix: download the wheel to your machine first, then run `pip install` on the local file.

================================================
FILE: SOURCE/get_started/basic_usage.md
================================================
# Basic Usage

To use TensorFlow you need to understand how TensorFlow:

* Represents computations as graphs.
* Executes graphs in the context of a `Session`.
* Represents data as tensors.
* Maintains state with `Variable`s.
* Uses feeds and fetches to get data into and out of arbitrary operations.
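## Overview

TensorFlow is a programming system in which computations are represented as graphs. Nodes in the graph are called *ops* (short for operations). An op takes zero or more `Tensor`s, performs some computation, and produces zero or more `Tensor`s. Each Tensor is a typed multi-dimensional array. For example, you can represent a mini-batch of images as a 4-D array of floating point numbers with dimensions `[batch, height, width, channels]`.

A TensorFlow graph is a *description* of computations. To compute anything, a graph must be launched in a `Session`. A `Session` places the graph ops onto `Devices`, such as CPUs or GPUs, and provides methods to execute them. These methods return the tensors produced by the ops. In Python the returned tensors are [numpy](http://www.numpy.org) `ndarray` objects; in C and C++ they are `tensorflow::Tensor` instances.

## The computation graph

TensorFlow programs are usually structured into a construction phase and an execution phase. In the construction phase, the steps the ops will perform are described as a graph. In the execution phase, a session is used to execute the ops in the graph.

For example, it is common to create a graph that represents and trains a neural network in the construction phase, and then repeatedly execute the training ops in the graph in the execution phase.

TensorFlow can be used from C, C++, and Python programs. At present the Python library is much easier to use: it provides a large set of helper functions that simplify building graphs, and these are not yet available in the C and C++ libraries.

The session libraries are equivalent across the three languages.

### Building the graph

The first step in building a graph is to create source ops. A source op, such as `Constant`, does not need any input. The output of a source op is passed to other ops that do computation.

In the Python library, the return value of an op constructor stands for the output of the constructed op. These return values can be passed to other op constructors as inputs.

The TensorFlow Python library has a *default graph* to which op constructors add nodes. The default graph is sufficient for many applications. See the [Graph class](../api_docs/python/framework.md#Graph) documentation for how to manage multiple graphs.

```python
import tensorflow as tf

# Create a Constant op that produces a 1x2 matrix.  The op is
# added as a node to the default graph.
#
# The value returned by the constructor represents the output
# of the Constant op.
matrix1 = tf.constant([[3., 3.]])

# Create another Constant that produces a 2x1 matrix.
matrix2 = tf.constant([[2.],[2.]])

# Create a Matmul op that takes 'matrix1' and 'matrix2' as inputs.
# The returned value, 'product', represents the result of the matrix
# multiplication.
product = tf.matmul(matrix1, matrix2)
```

The default graph now has three nodes: two `constant()` ops and one `matmul()` op. To actually multiply the matrices and get the result of the multiplication, you must launch the graph in a session.

### Launching the graph in a session

Launching follows construction. To launch a graph, first create a `Session` object. Without arguments the session constructor launches the default graph.

See the [Session class](../api_docs/python/client.md#session-management) for the complete session API.

```python
# Launch the default graph.
sess = tf.Session()

# To run the matmul op we call the session 'run()' method, passing 'product'.
# As noted above, 'product' represents the output of the matmul op.  Passing
# it tells the call that we want to get the output of the matmul op back.
#
# The whole execution is automatic: the session supplies all inputs needed
# by the op.  Ops are typically run in parallel.
#
# The call 'run(product)' thus causes the execution of three ops in the
# graph: the two constants and matmul.
#
# The returned value 'result' is a numpy `ndarray` object.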
result = sess.run(product)
print result
# ==> [[ 12.]]

# Close the Session when we're done.
sess.close()
```

Sessions should be closed after use to release resources. Besides calling `close()` explicitly, you can also enter a `Session` with a "with" block, which closes the session automatically.

```python
with tf.Session() as sess:
  result = sess.run([product])
  print result
```

The TensorFlow implementation translates the graph definition into executable operations distributed across the available compute resources, such as the CPU or GPUs. In general you do not have to specify CPUs or GPUs explicitly; TensorFlow detects them automatically. If a GPU is detected, TensorFlow uses the first GPU it finds for as many operations as possible.

If you have more than one GPU available on your machine, the GPUs beyond the first are not used by default. To have TensorFlow use them, you must explicitly assign ops to them. Use `with tf.device(...)` statements to specify which CPU or GPU runs the operations:

```python
with tf.Session() as sess:
  with tf.device("/gpu:1"):
    matrix1 = tf.constant([[3., 3.]])
    matrix2 = tf.constant([[2.],[2.]])
    product = tf.matmul(matrix1, matrix2)
    ...
```

Devices are specified with strings. The currently supported devices are:

* `"/cpu:0"`: The CPU of your machine.
* `"/gpu:0"`: The first GPU of your machine, if you have one.
* `"/gpu:1"`: The second GPU of your machine, and so on.

See [Using GPUs](tensorflow-zh/SOURCE/how_tos/using_gpu.md) for more information about GPUs and TensorFlow.

## Interactive Usage

The Python examples in the documentation launch the graph with a [`Session`](../api_docs/python/client.md#Session) and use the [`Session.run()`](../api_docs/python/client.md#Session.run) method to execute operations.

For ease of use in interactive Python environments such as [IPython](http://ipython.org), you can instead use the [`InteractiveSession`](../api_docs/python/client.md#InteractiveSession) class, together with the [`Tensor.eval()`](../api_docs/python/framework.md#Tensor.eval) and [`Operation.run()`](../api_docs/python/framework.md#Operation.run) methods in place of `Session.run()`. This avoids having to keep a variable holding the session.

```python
# Enter an interactive TensorFlow Session.
import tensorflow as tf
sess = tf.InteractiveSession()

x = tf.Variable([1.0, 2.0])
a = tf.constant([3.0, 3.0])

# Initialize 'x' using the run() method of its initializer op.
x.initializer.run()

# Add an op to subtract 'a' from 'x'.  Run it and print the result.
sub = tf.sub(x, a)
print sub.eval()
# ==> [-2. -1.]
```

## Tensors

TensorFlow programs use a tensor data structure to represent all data; only tensors are passed between operations in the computation graph. You can think of a TensorFlow tensor as an n-dimensional array or list. A tensor has a static type, a rank, and a shape. To learn how TensorFlow handles these concepts, see [Rank, Shape, and Type](../resources/dims_types.md).

## Variables
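Variables maintain state across executions of the graph. The following example shows a variable serving as a simple counter. See the [Variables](tensorflow-zh/how_tos/variables.md) chapter for more details.

```python
# Create a Variable, initialized to the scalar value 0.
state = tf.Variable(0, name="counter")

# Create an op that adds one to `state`.
one = tf.constant(1)
new_value = tf.add(state, one)
update = tf.assign(state, new_value)

# Variables must be initialized by running an `init` op after the graph
# has been launched.  We first have to add the `init` op to the graph.
init_op = tf.initialize_all_variables()

# Launch the graph and run the ops.
with tf.Session() as sess:
  # Run the 'init' op.
  sess.run(init_op)
  # Print the initial value of 'state'.
  print sess.run(state)
  # Run the op that updates 'state' and print 'state'.
  for _ in range(3):
    sess.run(update)
    print sess.run(state)

# output:

# 0
# 1
# 2
# 3
```

The `assign()` operation in this code is part of the expression graph, just like the `add()` operation, so it does not actually perform the assignment until `run()` executes the expression.

You typically represent the parameters of a statistical model as a set of variables. For example, you would store the weights of a neural network as a tensor in a variable. During training you update this tensor by running the training graph repeatedly.

## Fetches

To fetch the outputs of operations, execute the graph with a `run()` call on the `Session` object and pass in the tensors to retrieve. In the previous example we fetched the single node `state`, but you can also fetch multiple tensors:

```python
input1 = tf.constant(3.0)
input2 = tf.constant(2.0)
input3 = tf.constant(5.0)
intermed = tf.add(input2, input3)
mul = tf.mul(input1, intermed)

with tf.Session() as sess:
  result = sess.run([mul, intermed])
  print result

# output:
# [array([ 21.], dtype=float32), array([ 7.], dtype=float32)]
```

All the ops needed to produce the requested tensor values are run once, not once per requested tensor.

## Feeds

The examples above introduce tensors into the computation graph by storing them in constants and variables. TensorFlow also provides a feed mechanism for patching a tensor directly into any operation in the graph.

A feed temporarily replaces the output of an operation with a tensor value. You supply feed data as an argument to a `run()` call. The feed is only used for the run call to which it is passed; when the call ends, the feed is gone. The most common use case is to designate specific operations as "feed" operations by using `tf.placeholder()` to create them:

```python
input1 = tf.placeholder(tf.float32)
input2 = tf.placeholder(tf.float32)
output = tf.mul(input1, input2)

with tf.Session() as sess:
  print sess.run([output], feed_dict={input1:[7.], input2:[2.]})

# output:
# [array([ 14.], dtype=float32)]
```

A `placeholder()` operation generates an error if you do not supply a feed for it.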
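The feed rule above (a fed value temporarily replaces the output of an operation, and an unfed placeholder is an error) can be sketched without TensorFlow. The tiny evaluator below is a toy written for illustration only; `Node`, `placeholder`, `constant`, `mul`, and `run` are hypothetical helpers, not TensorFlow APIs, and `ValueError` stands in for the error a real placeholder would raise.

```python
class Node(object):
    """A toy graph node: a function plus the nodes that feed it."""

    def __init__(self, fn=None, inputs=()):
        self.fn = fn  # None marks a placeholder
        self.inputs = tuple(inputs)


def placeholder():
    return Node()


def constant(value):
    return Node(fn=lambda: value)


def mul(a, b):
    return Node(fn=lambda x, y: x * y, inputs=(a, b))


def run(fetch, feed_dict=None):
    """Evaluate `fetch`; fed nodes use the supplied value for this call only."""
    feed_dict = feed_dict or {}
    if fetch in feed_dict:
        # The feed replaces this node's output, whatever kind of node it is.
        return feed_dict[fetch]
    if fetch.fn is None:
        raise ValueError("placeholder was not fed")
    args = [run(node, feed_dict) for node in fetch.inputs]
    return fetch.fn(*args)


input1 = placeholder()
input2 = placeholder()
output = mul(input1, input2)
print(run(output, feed_dict={input1: 7.0, input2: 2.0}))
```

Because the feed dictionary is only an argument to `run`, nothing in the toy graph is modified; a later call without the feed sees the original nodes again, which mirrors the "only for this run call" behaviour described above.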
The [MNIST fully-connected feed tutorial](http://wiki.jikexueyuan.com/project/tensorflow-zh/tutorials/mnist_tf.html) ([source code](https://raw.githubusercontent.com/tensorflow/tensorflow/master/tensorflow/examples/tutorials/mnist/fully_connected_feed.py)) gives a larger-scale example of feeds.

> Original: [Basic Usage](http://tensorflow.org/get_started/basic_usage.md) Translation: [@doc001](https://github.com/PFZheng) Proofreading: [@yangtze](https://github.com/sstruct)

================================================
FILE: SOURCE/get_started/introduction.md
================================================
# Introduction

The goal of this chapter is to get you up and running with TensorFlow!

Before we start, let's look at a piece of TensorFlow code written with the Python API, to give you a first impression of what you are about to learn.

This short Python program generates some data in three dimensions and then fits a plane to it.

```python
import tensorflow as tf
import numpy as np

# Use NumPy to generate phony data, 100 points in total.
x_data = np.float32(np.random.rand(2, 100))  # Random input
y_data = np.dot([0.100, 0.200], x_data) + 0.300

# Construct a linear model.
#
b = tf.Variable(tf.zeros([1]))
W = tf.Variable(tf.random_uniform([1, 2], -1.0, 1.0))
y = tf.matmul(W, x_data) + b

# Minimize the mean squared error.
loss = tf.reduce_mean(tf.square(y - y_data))
optimizer = tf.train.GradientDescentOptimizer(0.5)
train = optimizer.minimize(loss)

# Initialize the variables.
init = tf.initialize_all_variables()

# Launch the graph.
sess = tf.Session()
sess.run(init)

# Fit the plane.
for step in xrange(0, 201):
    sess.run(train)
    if step % 20 == 0:
        print step, sess.run(W), sess.run(b)

# The best fit found is W: [[0.100 0.200]], b: [0.300]
```

To whet your appetite further, we suggest you check out how TensorFlow handles a classic machine learning problem. In the land of neural networks the most classic problem is MNIST handwritten digit classification. We offer two tutorials here, one for newcomers to machine learning and one for experts. If you have already trained dozens of MNIST models with other software packages, read the advanced tutorial (the red pill link). If you have never heard of MNIST, read the beginner tutorial (the blue pill link). If you are somewhere in between, we suggest skimming the beginner tutorial and then reading the advanced one.

Images licensed CC BY-SA 4.0; original by W. Carter

If you have already made up your mind to learn and install TensorFlow, you can skip this text and go straight to the following chapters. Don't worry, you will still see MNIST -- we also use MNIST as an example when describing TensorFlow's features.

## Recommended Next Steps

* [Download and Setup](../get_started/os_setup.md)
* [Basic Usage](../get_started/basic_usage.md)
* [TensorFlow Mechanics 101](../tutorials/mnist/tf/index.md)

> Original: [Introduction](http://tensorflow.org/get_started) Translation: [@doc001](https://github.com/PFZheng) Proofreading: [@yangtze](https://github.com/sstruct)

================================================
FILE: SOURCE/get_started/os_setup.md
================================================
# Download and Setup

You can install TensorFlow using the Pip, Docker, Virtualenv, or Anaconda methods we provide, or by building from source.

## Pip Installation

[Pip](https://en.wikipedia.org/wiki/Pip_(package_manager)) is a package installation and management tool for Python.

The packages that are installed or upgraded while installing TensorFlow are given in this [list](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/tools/pip_package/setup.py).

First install pip (or pip3 for Python 3):

```bash
# Ubuntu/Linux 64-bit
$ sudo apt-get install python-pip python-dev

# Mac OS X
$ sudo easy_install pip
```

Install TensorFlow:

```bash
# Ubuntu/Linux 64-bit, CPU only, Python 2.7:
$ sudo pip install --upgrade https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.8.0-cp27-none-linux_x86_64.whl

# Ubuntu/Linux 64-bit, GPU enabled, Python 2.7. Requires CUDA toolkit 7.5 and CuDNN v4.
# For other versions, see "Install from sources" below.
$ sudo pip install --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.8.0-cp27-none-linux_x86_64.whl

# Mac OS X, CPU only:
$ sudo easy_install --upgrade six
$ sudo pip install --upgrade https://storage.googleapis.com/tensorflow/mac/tensorflow-0.8.0-py2-none-any.whl
```

For Python 3:

```bash
# Ubuntu/Linux 64-bit, CPU only, Python 3.4:
$ sudo pip3 install --upgrade https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.8.0-cp34-cp34m-linux_x86_64.whl

# Ubuntu/Linux 64-bit, GPU enabled, Python 3.4. Requires CUDA toolkit 7.5 and CuDNN v4.
# For other versions, see "Install from sources" below.
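$ sudo pip3 install --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.8.0-cp34-cp34m-linux_x86_64.whl

# Mac OS X, CPU only:
$ sudo easy_install --upgrade six
$ sudo pip3 install --upgrade https://storage.googleapis.com/tensorflow/mac/tensorflow-0.8.0-py3-none-any.whl
```

Note: if you previously installed a version of TensorFlow older than 0.7.1, first uninstall TensorFlow and protobuf with `pip uninstall`, so that you get a package built against the updated protobuf dependency.

You can then [test](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/g3doc/get_started/os_setup.md#test-the-tensorflow-installation) your installation.

## Docker Installation

We also support running TensorFlow via [Docker](http://docker.com/). The advantage of this approach is that you do not have to worry about software dependencies.

First, [install Docker](http://docs.docker.com/engine/installation/). Once Docker is up and running, you can start a container with one command:

```bash
$ docker run -it b.gcr.io/tensorflow/tensorflow
```

This starts a container with TensorFlow and all of its dependencies already installed.

### Additional images

The default Docker image contains only the minimal set of libraries needed to start and run TensorFlow. We additionally provide the following container, which you can also install with the `docker run` command above:

* `b.gcr.io/tensorflow/tensorflow-full`: TensorFlow in this image is a full installation from source and includes all the tools needed to build and run TensorFlow. With this image you can experiment directly with the source code, without installing any of the dependencies described above.

## VirtualEnv Installation

We recommend using [virtualenv](https://pypi.python.org/pypi/virtualenv) to create an isolated environment in which to install TensorFlow. This is optional, but it makes installation problems easier to track down.

First, install all required tools:

```bash
# On Linux:
$ sudo apt-get install python-pip python-dev python-virtualenv

# On Mac:
$ sudo easy_install pip  # If pip is not already installed
$ sudo pip install --upgrade virtualenv
```

Next, set up a new virtualenv environment.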
To create the environment in the directory `~/tensorflow`, run:

```bash
$ virtualenv --system-site-packages ~/tensorflow
$ cd ~/tensorflow
```

Then activate the virtualenv:

```bash
$ source bin/activate  # If using bash
$ source bin/activate.csh  # If using csh
(tensorflow)$  # Your prompt should change
```

Inside the virtualenv, install TensorFlow:

```bash
(tensorflow)$ pip install --upgrade <$url_to_binary.whl>
```

You can then run a TensorFlow program with commands like these:

```bash
(tensorflow)$ cd tensorflow/models/image/mnist
(tensorflow)$ python convolutional.py

# When you are done using TensorFlow:
(tensorflow)$ deactivate  # Deactivate the virtualenv

$  # Your prompt should change back
```

## Anaconda Installation

[Anaconda](https://www.continuum.io/why-anaconda) is a Python distribution for scientific computing that bundles many third-party scientific libraries. Anaconda uses conda as its package manager and has its own [environment system](http://conda.pydata.org/docs/using/envs.html), similar to Virtualenv.

As with Virtualenv, conda stores the packages needed by different Python projects in different places. Installing TensorFlow in a conda environment does not overwrite previously installed Python packages.

* Install Anaconda.
* Create a conda environment.
* Activate the environment and install TensorFlow in it.
* After installation, activate the conda environment each time you want to use TensorFlow.

Install Anaconda: follow the [instructions](https://www.continuum.io/downloads) on the Anaconda download page.

Create a conda environment called `tensorflow`:

```bash
# Python 2.7
$ conda create -n tensorflow python=2.7

# Python 3.4
$ conda create -n tensorflow python=3.4
```

Activate the `tensorflow` environment, then use its pip to install TensorFlow. Use the `--ignore-installed` flag to prevent errors about `easy_install`.

```bash
$ source activate tensorflow
(tensorflow)$  # Your prompt should change

# Ubuntu/Linux 64-bit, CPU only, Python 2.7:
(tensorflow)$ pip install --ignore-installed --upgrade https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.8.0rc0-cp27-none-linux_x86_64.whl

# Ubuntu/Linux 64-bit, GPU enabled, Python 2.7. Requires CUDA toolkit 7.5 and CuDNN v4.
# For other versions, see "Install from sources" below.
(tensorflow)$ pip install --ignore-installed --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.8.0rc0-cp27-none-linux_x86_64.whl

# Mac OS X, CPU only:
(tensorflow)$ pip install --ignore-installed --upgrade https://storage.googleapis.com/tensorflow/mac/tensorflow-0.8.0rc0-py2-none-any.whl
```

For Python 3.x:

```bash
$ source activate tensorflow
(tensorflow)$  # Your prompt should change

# Ubuntu/Linux 64-bit, CPU only, Python 3.4:
(tensorflow)$ pip install --ignore-installed --upgrade https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.8.0rc0-cp34-cp34m-linux_x86_64.whl

# Ubuntu/Linux 64-bit, GPU enabled, Python 3.4. Requires CUDA toolkit 7.5 and CuDNN v4.
# For other versions, see "Install from sources" below.
(tensorflow)$ pip install --ignore-installed --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.8.0rc0-cp34-cp34m-linux_x86_64.whl

# Mac OS X, CPU only:
(tensorflow)$ pip install --ignore-installed --upgrade https://storage.googleapis.com/tensorflow/mac/tensorflow-0.8.0rc0-py3-none-any.whl
```

With the conda environment activated, you can [test](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/g3doc/get_started/os_setup.md#test-the-tensorflow-installation) your installation.

When you are done using TensorFlow, deactivate the environment:

```bash
(tensorflow)$ source deactivate

$  # Your prompt should change back
```

To use TensorFlow again later, activate the environment again :-)

```bash
$ source activate tensorflow
(tensorflow)$  # Your prompt should change.
# Run Python programs that use TensorFlow.
...
# When you are done using TensorFlow, deactivate the environment.
(tensorflow)$ source deactivate
```

## Try your first TensorFlow program

### (Optional) Enable GPU Support

If you installed the GPU-enabled TensorFlow from a pip binary package, you must make sure the correct versions of the CUDA SDK and CUDNN are installed on your system. See the [CUDA installation instructions](#install_cuda).

You also need to set the `LD_LIBRARY_PATH` and `CUDA_HOME` environment variables. Consider adding the commands below to your `~/.bash_profile` so that they take effect automatically each time you log in.
Note that the commands below assume CUDA is installed in `/usr/local/cuda`:

```bash
export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/cuda/lib64"
export CUDA_HOME=/usr/local/cuda
```

### Run TensorFlow

Open a python terminal:

```bash
$ python

>>> import tensorflow as tf
>>> hello = tf.constant('Hello, TensorFlow!')
>>> sess = tf.Session()
>>> print sess.run(hello)
Hello, TensorFlow!
>>> a = tf.constant(10)
>>> b = tf.constant(32)
>>> print sess.run(a+b)
42
>>>
```

## Installing from sources

### Clone the TensorFlow repository

```bash
$ git clone --recurse-submodules https://github.com/tensorflow/tensorflow
```

The `--recurse-submodules` flag is required; it fetches the protobuf library that TensorFlow depends on.

### Installation for Linux

#### Install Bazel

First follow the [instructions](http://bazel.io/docs/install.html) to install the dependencies for Bazel. Then download the latest stable release for your operating system from [this page](https://github.com/bazelbuild/bazel/releases), and finally run the following commands:

```bash
$ chmod +x PATH_TO_INSTALL.SH
$ ./PATH_TO_INSTALL.SH --user
```

Remember to replace `PATH_TO_INSTALL.SH` with the path of the installer you downloaded.

Add the executable path `output/bazel` to your `$PATH` environment variable.

#### Install other dependencies

```bash
# For Python 2.7:
$ sudo apt-get install python-numpy swig python-dev python-wheel
# For Python 3.x:
$ sudo apt-get install python3-numpy swig python3-dev python3-wheel
```

#### Optional: Install CUDA (GPU support on Linux)

To build and run TensorFlow with GPU support, you first need to install NVIDIA's Cuda Toolkit 7.0 and CUDNN 6.5 V2.

TensorFlow GPU support requires a card with NVidia Compute Capability >= 3.5. Supported cards include but are not limited to:

* NVidia Titan
* NVidia Titan X
* NVidia K20
* NVidia K40

##### Download and install Cuda Toolkit 7.0

[Download link](https://developer.nvidia.com/cuda-toolkit-70)

Install the toolkit into a path such as `/usr/local/cuda`.

##### Download and install CUDNN Toolkit 6.5

[Download link](https://developer.nvidia.com/rdp/cudnn-archive)

Uncompress and copy the CUDNN files into the Cuda Toolkit 7.0 installation path. Assuming Cuda Toolkit 7.0 is installed in `/usr/local/cuda`, run the following commands:

``` bash
tar xvzf cudnn-6.5-linux-x64-v2.tgz
sudo cp cudnn-6.5-linux-x64-v2/cudnn.h /usr/local/cuda/include
sudo cp cudnn-6.5-linux-x64-v2/libcudnn* /usr/local/cuda/lib64
```

##### Configure TensorFlow's Cuda options

From the root of your source tree, run:

``` bash
$ ./configure
Do you wish to build TensorFlow with GPU support?
[y/n] y
GPU support will be enabled for TensorFlow

Please specify the location where CUDA 7.0 toolkit is installed. Refer to
README.md for more details. [default is: /usr/local/cuda]: /usr/local/cuda

Please specify the location where CUDNN 6.5 V2 library is installed. Refer to
README.md for more details. [default is: /usr/local/cuda]: /usr/local/cuda

Setting up Cuda include
Setting up Cuda lib64
Setting up Cuda bin
Setting up Cuda nvvm
Configuration finished
```

This creates symbolic links to the Cuda libraries on your system. Whenever the path of the Cuda libraries changes, you must run this step again; otherwise the bazel build command cannot be invoked.

##### Build your target with GPU support

From the root of your source tree, run:

```bash
$ bazel build -c opt --config=cuda //tensorflow/cc:tutorials_example_trainer

$ bazel-bin/tensorflow/cc/tutorials_example_trainer --use_gpu
# Lots of output. This tutorial iteratively calculates the major eigenvalue of
# a 2x2 matrix, on GPU. The last few lines look like this.
000009/000005 lambda = 2.000000 x = [0.894427 -0.447214] y = [1.788854 -0.894427]
000006/000001 lambda = 2.000000 x = [0.894427 -0.447214] y = [1.788854 -0.894427]
000009/000009 lambda = 2.000000 x = [0.894427 -0.447214] y = [1.788854 -0.894427]
```

Note that GPU support is enabled with the build option "--config=cuda".

##### Known issues

* Although it is possible to build both the Cuda-enabled and the Cuda-disabled configurations in the same source tree, we recommend running "bazel clean" when switching between the two.
* You must run configure before running the bazel build; otherwise the build fails with an error message. In the future we may fold the configure step into the build process to simplify things, provided bazel gains the features needed to support this.

### Installation for Mac OS X

Mac needs the same set of dependencies as Linux, but the installation process is quite different. The following links help you install those dependencies on Mac OS X:

#### Bazel

See the Mac OS X installation instructions on [this page](http://bazel.io/docs/install.html).

#### SWIG

[Mac OS X installation instructions](http://www.swig.org/Doc3.0/Preface.html#Preface_osx_installation).

Note: you need to install [PCRE](ftp://ftp.csx.cam.ac.uk/pub/software/programming/pcre/), and *not* PCRE2.

#### Numpy

See the [installation instructions](http://docs.scipy.org/doc/numpy/user/install.html).
### Create the pip package and install

```bash
$ bazel build -c opt //tensorflow/tools/pip_package:build_pip_package

$ bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg

# The name of the .whl file will depend on your platform.
$ pip install /tmp/tensorflow_pkg/tensorflow-0.5.0-cp27-none-linux_x86_64.whl
```

## Train your first TensorFlow neural net model

From the root of your source tree, run:

```python
$ cd tensorflow/models/image/mnist
$ python convolutional.py
Succesfully downloaded train-images-idx3-ubyte.gz 9912422 bytes.
Succesfully downloaded train-labels-idx1-ubyte.gz 28881 bytes.
Succesfully downloaded t10k-images-idx3-ubyte.gz 1648877 bytes.
Succesfully downloaded t10k-labels-idx1-ubyte.gz 4542 bytes.
Extracting data/train-images-idx3-ubyte.gz
Extracting data/train-labels-idx1-ubyte.gz
Extracting data/t10k-images-idx3-ubyte.gz
Extracting data/t10k-labels-idx1-ubyte.gz
Initialized!
Epoch 0.00
Minibatch loss: 12.054, learning rate: 0.010000
Minibatch error: 90.6%
Validation error: 84.6%
Epoch 0.12
Minibatch loss: 3.285, learning rate: 0.010000
Minibatch error: 6.2%
Validation error: 7.0%
...
...
```

## Common Problems

### GPU-related issues

If you encounter the following error when trying to run a TensorFlow program:

```python
ImportError: libcudart.so.7.0: cannot open shared object file: No such file or directory
```

Make sure you installed GPU support correctly; see the [relevant section](#install_cuda).

### On Linux

If you encounter:

```python
...
 "__add__", "__radd__",
             ^
SyntaxError: invalid syntax
```

Solution: make sure you are using Python 2.7.

### On Mac OS X

If you encounter:

```python
import six.moves.copyreg as copyreg

ImportError: No module named copyreg
```

Solution: the protobuf used by TensorFlow depends on `six-1.10.0`, but Apple's default python environment ships `six-1.4.1`, which can be difficult to upgrade. There are several ways to work around this:

1. Upgrade the system-wide copy of `six`:

    ```bash
    sudo easy_install -U six
    ```

2. Install a separate copy of python via homebrew:

    ```bash
    brew install python
    ```

3. Build or use TensorFlow inside a [`virtualenv`](#virtualenv_install).
If you encounter:

```
>>> import tensorflow as tf
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/site-packages/tensorflow/__init__.py", line 4, in <module>
    from tensorflow.python import *
  File "/usr/local/lib/python2.7/site-packages/tensorflow/python/__init__.py", line 13, in <module>
    from tensorflow.core.framework.graph_pb2 import *
...
  File "/usr/local/lib/python2.7/site-packages/tensorflow/core/framework/tensor_shape_pb2.py", line 22, in <module>
    serialized_pb=_b('\n,tensorflow/core/framework/tensor_shape.proto\x12\ntensorflow\"d\n\x10TensorShapeProto\x12-\n\x03\x64im\x18\x02 \x03(\x0b\x32 .tensorflow.TensorShapeProto.Dim\x1a!\n\x03\x44im\x12\x0c\n\x04size\x18\x01 \x01(\x03\x12\x0c\n\x04name\x18\x02 \x01(\tb\x06proto3')
TypeError: __init__() got an unexpected keyword argument 'syntax'
```

This is caused by a conflicting protobuf version: TensorFlow requires protobuf 3.0.0. The best current solution is to make sure no older version of protobuf is installed. You can reinstall protobuf with the following command to resolve the conflict:

```bash
brew reinstall --devel protobuf
```

> Original: [Download and Setup](http://tensorflow.org/get_started/os_setup.md) Translation: [@doc001](https://github.com/PFZheng) Proofreading: [@yangtze](https://github.com/sstruct)

================================================
FILE: SOURCE/how_tos/__init__.py
================================================


================================================
FILE: SOURCE/how_tos/adding_an_op/__init__.py
================================================


================================================
FILE: SOURCE/how_tos/adding_an_op/attr_examples.cc
================================================
#include <stdio.h>
#include "tensorflow/core/framework/op.h"

REGISTER_OP("RestrictedTypeExample").Attr("t: {int32, float, bool}");

REGISTER_OP("NumberType").Attr("t: numbertype");

REGISTER_OP("EnumExample").Attr("e: {'apple', 'orange'}");

REGISTER_OP("MinIntExample").Attr("a: int >= 2");

REGISTER_OP("TypeListExample").Attr("a: list({int32, float}) >= 3");

REGISTER_OP("AttrDefaultExample").Attr("i: int = 0");

REGISTER_OP("AttrDefaultExampleForAllTypes")
.Attr("s: string = 'foo'") .Attr("i: int = 0") .Attr("f: float = 1.0") .Attr("b: bool = true") .Attr("ty: type = DT_INT32") .Attr("sh: shape = { dim { size: 1 } dim { size: 2 } }") .Attr("te: tensor = { dtype: DT_INT32 int_val: 5 }") .Attr("l_empty: list(int) = []") .Attr("l_int: list(int) = [2, 3, 5, 7]"); int main(int argc, char* argv[]) { printf("All registered ops:\n%s\n", tensorflow::OpRegistry::Global()->DebugString(false).c_str()); return 0; } ================================================ FILE: SOURCE/how_tos/adding_an_op/fact_test.py ================================================ """Test that user ops can be used as expected.""" from __future__ import print_function import tensorflow.python.platform import tensorflow as tf class FactTest(tf.test.TestCase): def test(self): with self.test_session(): print(tf.user_ops.my_fact().eval()) if __name__ == '__main__': tf.test.main() ================================================ FILE: SOURCE/how_tos/adding_an_op/index.md ================================================ # Adding a New Op PREREQUISITES: * Some familiarity with C++. * Must have [downloaded TensorFlow source](../../get_started/introduction.md), and be able to build it. If you'd like to incorporate an operation that isn't covered by the existing library, you can create a custom Op. To incorporate your custom Op, you'll need to: * Register the new Op in a C++ file. The Op registration is independent of the implementation, and describes the semantics of how the Op is invoked. For example, it defines the Op name, and specifies its inputs and outputs. * Implement the Op in C++. This implementation is called a "kernel", and there can be multiple kernels for different architectures (e.g. CPUs, GPUs) or input / output types. * Create a Python wrapper. This wrapper is the public API to create the Op. A default wrapper is generated from the Op registration, which can be used directly or added to. 
* Optionally, write a function to compute gradients for the Op. * Optionally, write a function that describes the input and output shapes for the Op. This allows shape inference to work with your Op. * Test the Op, typically in Python. ## Contents ### [Adding a New Op](#AUTOGENERATED-adding-a-new-op) * [Define the Op's interface](#define_interface) * [Implement the kernel for the Op](#AUTOGENERATED-implement-the-kernel-for-the-op) * [Generate the client wrapper](#AUTOGENERATED-generate-the-client-wrapper) * [The Python Op wrapper](#AUTOGENERATED-the-python-op-wrapper) * [The C++ Op wrapper](#AUTOGENERATED-the-c---op-wrapper) * [Verify it works](#AUTOGENERATED-verify-it-works) * [Validation](#Validation) * [Op registration](#AUTOGENERATED-op-registration) * [Attrs](#Attrs) * [Attr types](#AUTOGENERATED-attr-types) * [Polymorphism](#Polymorphism) * [Inputs and Outputs](#AUTOGENERATED-inputs-and-outputs) * [Backwards compatibility](#AUTOGENERATED-backwards-compatibility) * [GPU Support](#mult-archs) * [Implement the gradient in Python](#AUTOGENERATED-implement-the-gradient-in-python) * [Implement a shape function in Python](#AUTOGENERATED-implement-a-shape-function-in-python) ## Define the Op's interface You define the interface of an Op by registering it with the TensorFlow system. In the registration, you specify the name of your Op, its inputs (types and names) and outputs (types and names), as well as docstrings and any [attrs](#Attrs) the Op might require. To see how this works, suppose you'd like to create an Op that takes a tensor of `int32`s and outputs a copy of the tensor, with all but the first element set to zero. 
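As a reference for the behaviour the example Op is meant to have, here is a plain-Python sketch of "copy the input, keep the first element, zero the rest". It is not TensorFlow code; the helper name `zero_out_reference` is hypothetical, and it works on a flat Python list rather than a tensor.

```python
def zero_out_reference(values):
    """Return a copy of `values` with all but the first element set to 0."""
    result = [0] * len(values)
    if values:
        # Preserve the first input value when the input is non-empty.
        result[0] = values[0]
    return result


print(zero_out_reference([5, 4, 3, 2, 1]))  # [5, 0, 0, 0, 0]
```

A reference like this is handy when testing the real Op later: the kernel's output, flattened, should match what the sketch returns for the same flattened input.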
Create file [`tensorflow/core/user_ops`][user_ops]`/zero_out.cc` and
add a call to the `REGISTER_OP` macro that defines the interface for such an Op:

```c++
#include "tensorflow/core/framework/op.h"

REGISTER_OP("ZeroOut")
    .Input("to_zero: int32")
    .Output("zeroed: int32");
```

This `ZeroOut` Op takes one tensor `to_zero` of 32-bit integers as input, and
outputs a tensor `zeroed` of 32-bit integers.

> A note on naming: The name of the Op should be unique and CamelCase. Names
> starting with an underscore (`_`) are reserved for internal use.

## Implement the kernel for the Op

After you define the interface, provide one or more implementations of the Op.
To create one of these kernels, create a class that extends `OpKernel` and
overrides the `Compute` method. The `Compute` method takes one `context`
argument of type `OpKernelContext*`, from which you can access useful things
like the input and output tensors.

Add your kernel to the file you created above. The kernel might look something
like this:

```c++
#include "tensorflow/core/framework/op_kernel.h"

using namespace tensorflow;

class ZeroOutOp : public OpKernel {
 public:
  explicit ZeroOutOp(OpKernelConstruction* context) : OpKernel(context) {}

  void Compute(OpKernelContext* context) override {
    // Grab the input tensor
    const Tensor& input_tensor = context->input(0);
    auto input = input_tensor.flat<int32>();

    // Create an output tensor
    Tensor* output_tensor = NULL;
    OP_REQUIRES_OK(context, context->allocate_output(0, input_tensor.shape(),
                                                     &output_tensor));
    auto output = output_tensor->template flat<int32>();

    // Set all but the first element of the output tensor to 0.
    const int N = input.size();
    for (int i = 1; i < N; i++) {
      output(i) = 0;
    }

    // Preserve the first input value if possible.
    if (N > 0) output(0) = input(0);
  }
};
```

After implementing your kernel, you register it with the TensorFlow system. In
the registration, you specify different constraints under which this kernel
will run.
For example, you might have one kernel made for CPUs, and a separate one for GPUs. To do this for the `ZeroOut` op, add the following to `zero_out.cc`: ```c++ REGISTER_KERNEL_BUILDER(Name("ZeroOut").Device(DEVICE_CPU), ZeroOutOp); ``` Once you [build and reinstall TensorFlow](../../get_started/os_setup.md#create-pip), the Tensorflow system can reference and use the Op when requested. ## Generate the client wrapper ### The Python Op wrapper Python op wrappers are created automatically in `bazel-genfiles/tensorflow/python/ops/gen_user_ops.py` for all ops placed in the [`tensorflow/core/user_ops`][user_ops] directory when you build Tensorflow. Those ops are imported into [`tensorflow/python/user_ops/user_ops.py`][python-user_ops] with the statement: ```python from tensorflow.python.ops.gen_user_ops import * ``` You may optionally use your own function instead. To do this, you first hide the generated code for that op by adding its name to the `hidden` list in the `"user_ops"` rule in [`tensorflow/python/BUILD`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/python/BUILD): ```python tf_gen_op_wrapper_py( name = "user_ops", hidden = [ "Fact", ], require_shape_functions = False, ) ``` List your op next to `"Fact"`. Next you add your replacement function to [`tensorflow/python/user_ops/user_ops.py`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/python/user_ops/user_ops.py). Typically your function will call the generated function to actually add the op to the graph. The hidden version of the generated function will be in the `gen_user_ops` package and start with an underscore ("`_`"). For example: ```python def my_fact(): """Example of overriding the generated code for an Op.""" return gen_user_ops._fact() ``` ### The C++ Op wrapper C++ op wrappers are created automatically for all ops placed in the [`tensorflow/core/user_ops`][user_ops] directory, when you build Tensorflow. 
For example, ops in `tensorflow/core/user_ops/zero_out.cc` will generate wrappers in `bazel-genfiles/tensorflow/cc/ops/user_ops.{h,cc}`. All generated wrappers for user ops are automatically imported into [`tensorflow/cc/ops/standard_ops.h`][standard_ops-cc] with the statement ```c++ #include "tensorflow/cc/ops/user_ops.h" ``` ## Verify it works A good way to verify that you've successfully implemented your Op is to write a test for it. Create the file `tensorflow/python/kernel_tests/zero_out_op_test.py` with the contents: ```python import tensorflow as tf class ZeroOutTest(tf.test.TestCase): def testZeroOut(self): with self.test_session(): result = tf.user_ops.zero_out([5, 4, 3, 2, 1]) self.assertAllEqual(result.eval(), [5, 0, 0, 0, 0]) ``` Then run your test: ```sh $ bazel test tensorflow/python:zero_out_op_test ``` ## Validation The example above assumed that the Op applied to a tensor of any shape. What if it only applied to vectors? That means adding a check to the above OpKernel implementation. ```c++ void Compute(OpKernelContext* context) override { // Grab the input tensor const Tensor& input_tensor = context->input(0); OP_REQUIRES(context, TensorShapeUtils::IsVector(input_tensor.shape()), errors::InvalidArgument("ZeroOut expects a 1-D vector.")); // ... } ``` This asserts that the input is a vector, and returns having set the `InvalidArgument` status if it isn't. The [`OP_REQUIRES` macro][validation-macros] takes three arguments: * The `context`, which can either be an `OpKernelContext` or `OpKernelConstruction` pointer (see [`tensorflow/core/framework/op_kernel.h`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/op_kernel.h)), for its `SetStatus()` method. * The condition. 
  For example, there are functions for validating the shape of a tensor in
  [`tensorflow/core/public/tensor_shape.h`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/public/tensor_shape.h).
* The error itself, which is represented by a `Status` object, see
  [`tensorflow/core/public/status.h`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/public/status.h).
  A `Status` has both a type (frequently `InvalidArgument`, but see the list of
  types) and a message. Functions for constructing an error may be found in
  [`tensorflow/core/lib/core/errors.h`][validation-macros].

Alternatively, if you want to test whether a `Status` object returned from some
function is an error, and if so return it, use
[`OP_REQUIRES_OK`][validation-macros]. Both of these macros return from the
function on error.

## Op registration

### Attrs

Ops can have attrs, whose values are set when the Op is added to a graph. These
are used to configure the Op, and their values can be accessed both within the
kernel implementation and in the types of inputs and outputs in the Op
registration. Prefer using an input instead of an attr when possible, since
inputs are more flexible. They can change every step, be set using a feed, etc.
Attrs are used for things that can't be done with inputs: any configuration
that affects the signature (number or type of inputs or outputs) or that
can't change from step-to-step.

You define an attr when you register the Op, by specifying its name and type
using the `Attr` method, which expects a spec of the form:

```
<name>: <attr-type-expr>
```

where `<name>` begins with a letter and can be composed of alphanumeric
characters and underscores, and `<attr-type-expr>` is a type expression of the
form [described below](#attr-types).

For example, if you'd like the `ZeroOut` Op to preserve a user-specified index,
instead of only the 0th element, you can register the Op like so:

```c++
REGISTER_OP("ZeroOut")
    .Attr("preserve_index: int")
    .Input("to_zero: int32")
    .Output("zeroed: int32");
```

Your kernel can then access this attr in its constructor via the `context` parameter:

```c++
class ZeroOutOp : public OpKernel {
 public:
  explicit ZeroOutOp(OpKernelConstruction* context) : OpKernel(context) {
    // Get the index of the value to preserve
    OP_REQUIRES_OK(context,
                   context->GetAttr("preserve_index", &preserve_index_));
    // Check that preserve_index is non-negative
    OP_REQUIRES(context, preserve_index_ >= 0,
                errors::InvalidArgument("Need preserve_index >= 0, got ",
                                        preserve_index_));
  }
  void Compute(OpKernelContext* context) override {
    // ...
  }
 private:
  int preserve_index_;
};
```

The stored attr value can then be used in the `Compute` method:

```c++
  void Compute(OpKernelContext* context) override {
    // ...

    // Check that preserve_index is in range
    OP_REQUIRES(context, preserve_index_ < input.dimension(0),
                errors::InvalidArgument("preserve_index out of range"));

    // Set all the elements of the output tensor to 0
    const int N = input.size();
    for (int i = 0; i < N; i++) {
      output_flat(i) = 0;
    }

    // Preserve the requested input value
    output_flat(preserve_index_) = input(preserve_index_);
  }
```

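As a cross-check on the kernel logic above, the intended behavior can be modeled in a few lines of plain Python. This is only a sketch for reasoning about expected results: `zero_out_reference` is a hypothetical helper, not part of TensorFlow, and it mirrors the two `preserve_index` checks (non-negative, in range) on a flat list.

```python
def zero_out_reference(values, preserve_index=0):
    """Pure-Python model of ZeroOut with a preserve_index attr.

    Hypothetical helper for illustration; it is not a TensorFlow API.
    """
    # Mirrors the check done once in the kernel constructor.
    if preserve_index < 0:
        raise ValueError("Need preserve_index >= 0, got %d" % preserve_index)
    # Mirrors the range check done on every Compute call.
    if preserve_index >= len(values):
        raise ValueError("preserve_index out of range")

    # Zero everything, then copy back the single preserved element.
    output = [0] * len(values)
    output[preserve_index] = values[preserve_index]
    return output


print(zero_out_reference([5, 4, 3, 2, 1]))
print(zero_out_reference([5, 4, 3, 2, 1], preserve_index=3))
```

A model like this can be useful for writing down the expected values of a test before running the real Op.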
> To preserve [backwards compatibility](#backwards-compatibility), you should
> specify a [default value](#default-values-constraints) when adding an attr to
> an existing op:
>
> ```c++
> REGISTER_OP("ZeroOut")
>     .Attr("preserve_index: int = 0")
>     .Input("to_zero: int32")
>     .Output("zeroed: int32");
> ```

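To see why a default keeps old graphs working, consider a toy model of what happens when a serialized node that predates the new attr is loaded. The sketch below is an illustration under stated assumptions, not TensorFlow's actual loading code: `apply_attr_defaults` and the dictionary layout are hypothetical, and `None` is used here to mean "this attr has no default".

```python
def apply_attr_defaults(node_attrs, attr_defaults):
    """Return the node's attrs with missing ones filled from the op's defaults.

    node_attrs: attrs stored with a serialized node (hypothetical dict layout).
    attr_defaults: maps attr name -> default value, or None for "no default".
    """
    resolved = dict(node_attrs)
    for name, default in attr_defaults.items():
        if name in resolved:
            continue  # The node already carries an explicit value.
        if default is None:
            raise ValueError("Node is missing required attr '%s'" % name)
        resolved[name] = default
    return resolved


# A node saved before `preserve_index` existed carries no value for it.
old_node_attrs = {}

# Registered as "preserve_index: int = 0": the old node still resolves.
print(apply_attr_defaults(old_node_attrs, {"preserve_index": 0}))

# Registered as "preserve_index: int" (no default): the old node is rejected.
try:
    apply_attr_defaults(old_node_attrs, {"preserve_index": None})
except ValueError as err:
    print(err)
```

With the default of `0`, the old node resolves to the original behavior of preserving the first element.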
### Attr types

The following types are supported in an attr:

* `string`: Any sequence of bytes (not required to be UTF8).
* `int`: A signed integer.
* `float`: A floating point number.
* `bool`: True or false.
* `type`: One of the (non-ref) values of [`DataType`][DataTypeString].
* `shape`: A [`TensorShapeProto`][TensorShapeProto].
* `tensor`: A [`TensorProto`][TensorProto].
* `list(<type>)`: A list of `<type>`, where `<type>` is one of the above types.
  Note that `list(list(<type>))` is invalid.

See also: [`op_def_builder.cc:FinalizeAttr`][FinalizeAttr] for a definitive list.

#### Default values & constraints

Attrs may have default values, and some types of attrs can have constraints. To
define an attr with constraints, you can use the following `<attr-type-expr>`s:

* `{'<string1>', '<string2>'}`: The value must be a string that has either the
  value `<string1>` or `<string2>`. The name of the type, `string`, is implied
  when you use this syntax. This emulates an enum:

    ```c++
    REGISTER_OP("EnumExample")
        .Attr("e: {'apple', 'orange'}");
    ```

* `{<type1>, <type2>}`: The value is of type `type`, and must be one of
  `<type1>` or `<type2>`, where `<type1>` and `<type2>` are supported
  [tensor types](../../resources/dims_types.md#data-types). You don't specify
  that the type of the attr is `type`. This is implied when you have a list of
  types in `{...}`. For example, in this case the attr `t` is a type that must
  be an `int32`, a `float`, or a `bool`:

    ```c++
    REGISTER_OP("RestrictedTypeExample")
        .Attr("t: {int32, float, bool}");
    ```

* There are shortcuts for common type constraints:
    * `numbertype`: Type `type` restricted to the numeric (non-string and
      non-bool) types.
    * `realnumbertype`: Like `numbertype` without complex types.
    * `quantizedtype`: Like `numbertype` but just the quantized number types.

    The specific lists of types allowed by these are defined by the functions
    (like `NumberTypes()`) in
    [`tensorflow/core/framework/types.h`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/types.h).
    In this example the attr `t` must be one of the numeric types:

    ```c++
    REGISTER_OP("NumberType")
        .Attr("t: numbertype");
    ```

    For this op:

    ```python
    tf.number_type(t=tf.int32)  # Valid
    tf.number_type(t=tf.bool)   # Invalid
    ```

* `int >= <n>`: The value must be an int whose value is greater than or equal to
  `<n>`, where `<n>` is a natural number.

  For example, the following Op registration specifies that the attr `a` must
  have a value that is at least `2`:

    ```c++
    REGISTER_OP("MinIntExample")
        .Attr("a: int >= 2");
    ```

* `list(<type>) >= <length>`: A list of type `<type>` whose length is greater
  than or equal to `<length>`.

  For example, the following Op registration specifies that the attr `a` is a
  list of types (either `int32` or `float`), and that there must be at least 3
  of them:

    ```c++
    REGISTER_OP("TypeListExample")
        .Attr("a: list({int32, float}) >= 3");
    ```

To set a default value for an attr (making it optional in the generated code),
add `= <default>` to the end, as in:

```c++
REGISTER_OP("AttrDefaultExample")
    .Attr("i: int = 0");
```

The supported syntax of the default value is what would be used in the proto
representation of the resulting GraphDef definition.

Here are examples for how to specify a default for all types:

```c++
REGISTER_OP("AttrDefaultExampleForAllTypes")
    .Attr("s: string = 'foo'")
    .Attr("i: int = 0")
    .Attr("f: float = 1.0")
    .Attr("b: bool = true")
    .Attr("ty: type = DT_INT32")
    .Attr("sh: shape = { dim { size: 1 } dim { size: 2 } }")
    .Attr("te: tensor = { dtype: DT_INT32 int_val: 5 }")
    .Attr("l_empty: list(int) = []")
    .Attr("l_int: list(int) = [2, 3, 5, 7]");
```

Note in particular that the values of type `type` use [the `DT_*` names
for the types](../../resources/dims_types.md#data-types).

### Polymorphism

#### Type Polymorphism

For ops that can take different types as input or produce different output
types, you can specify [an attr](#attrs) in
[an input or output type](#inputs-outputs) in the Op registration. Typically
you would then register an `OpKernel` for each supported type.
For instance, if you'd like the `ZeroOut` Op to work on `float`s in addition to `int32`s, your Op registration might look like:

```c++
REGISTER_OP("ZeroOut")
    .Attr("T: {float, int32}")
    .Input("to_zero: T")
    .Output("zeroed: T");
```

Your Op registration now specifies that the input's type must be `float` or
`int32`, and that its output will be the same type, since both have type `T`.

> A note on naming:{#naming} Inputs, outputs, and attrs generally should be
> given snake_case names. The one exception is attrs that are used as the type
> of an input or in the type of an input. Those attrs can be inferred when the
> op is added to the graph and so don't appear in the op's function. For
> example, this last definition of ZeroOut will generate a Python function that
> looks like:
>
> ```python
> def zero_out(to_zero, name=None):
>   """...
>   Args:
>     to_zero: A `Tensor`. Must be one of the following types:
>         `float32`, `int32`.
>     name: A name for the operation (optional).
>
>   Returns:
>     A `Tensor`. Has the same type as `to_zero`.
>   """
> ```
>
> If `to_zero` is passed an `int32` tensor, then `T` is automatically set to
> `int32` (well, actually `DT_INT32`). Those inferred attrs are given
> Capitalized or CamelCase names.
>
> Compare this with an op that has a type attr that determines the output
> type:
>
> ```c++
> REGISTER_OP("StringToNumber")
>     .Input("string_tensor: string")
>     .Output("output: out_type")
>     .Attr("out_type: {float, int32} = DT_FLOAT")
>     .Doc(R"doc(
> Converts each string in the input Tensor to the specified numeric type.
> )doc");
> ```
>
> In this case, the user has to specify the output type, as in the generated
> Python:
>
> ```python
> def string_to_number(string_tensor, out_type=None, name=None):
>   """Converts each string in the input Tensor to the specified numeric type.
>
>   Args:
>     string_tensor: A `Tensor` of type `string`.
>     out_type: An optional `tf.DType` from: `tf.float32, tf.int32`.
>       Defaults to `tf.float32`.
>     name: A name for the operation (optional).
>
>   Returns:
>     A `Tensor` of type `out_type`.
>   """
> ```


```c++
#include "tensorflow/core/framework/op_kernel.h"

class ZeroOutInt32Op : public OpKernel {
  // as before
};

class ZeroOutFloatOp : public OpKernel {
 public:
  explicit ZeroOutFloatOp(OpKernelConstruction* context)
      : OpKernel(context) {}

  void Compute(OpKernelContext* context) override {
    // Grab the input tensor
    const Tensor& input_tensor = context->input(0);
    auto input = input_tensor.flat<float>();

    // Create an output tensor
    Tensor* output = NULL;
    OP_REQUIRES_OK(context,
                   context->allocate_output(0, input_tensor.shape(), &output));
    auto output_flat = output->template flat<float>();

    // Set all the elements of the output tensor to 0
    const int N = input.size();
    for (int i = 0; i < N; i++) {
      output_flat(i) = 0;
    }

    // Preserve the first input value
    if (N > 0) output_flat(0) = input(0);
  }
};

// Note that TypeConstraint<int32>("T") means that attr "T" (defined
// in the Op registration above) must be "int32" to use this kernel.
REGISTER_KERNEL_BUILDER(
    Name("ZeroOut")
    .Device(DEVICE_CPU)
    .TypeConstraint<int32>("T"),
    ZeroOutInt32Op);
REGISTER_KERNEL_BUILDER(
    Name("ZeroOut")
    .Device(DEVICE_CPU)
    .TypeConstraint<float>("T"),
    ZeroOutFloatOp);
```

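The effect of the two registrations can be pictured as a lookup keyed on the Op name, the device, and the type constraint. The following toy registry is a sketch of that idea only: `register_kernel`, `find_kernel`, and the string stand-ins for kernels, devices, and types are hypothetical and do not reflect how TensorFlow stores kernels internally.

```python
_KERNEL_REGISTRY = []


def register_kernel(op_name, device, type_constraints, kernel):
    """Record a kernel together with the constraints under which it may run."""
    _KERNEL_REGISTRY.append((op_name, device, dict(type_constraints), kernel))


def find_kernel(op_name, device, attrs):
    """Return the first kernel whose constraints all match the node's attrs."""
    for name, dev, constraints, kernel in _KERNEL_REGISTRY:
        if name != op_name or dev != device:
            continue
        if all(attrs.get(attr) == value for attr, value in constraints.items()):
            return kernel
    raise LookupError(
        "No kernel for %s on %s with attrs %r" % (op_name, device, attrs))


# One registration per supported value of the type attr "T".
register_kernel("ZeroOut", "CPU", {"T": "int32"}, "ZeroOutInt32Op")
register_kernel("ZeroOut", "CPU", {"T": "float"}, "ZeroOutFloatOp")

print(find_kernel("ZeroOut", "CPU", {"T": "float"}))
```

In this model, a node whose `T` is `double` finds no match, which is why every type allowed by the Op registration needs its own kernel registration.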
> To preserve [backwards compatibility](#backwards-compatibility), you should
> specify a [default value](#default-values-constraints) when adding an attr to
> an existing op:
>
> ```c++
> REGISTER_OP("ZeroOut")
>     .Attr("T: {float, int32} = DT_INT32")
>     .Input("to_zero: T")
>     .Output("zeroed: T");
> ```

Let's say you want to add more types, such as `double`:

```c++
REGISTER_OP("ZeroOut")
    .Attr("T: {float, double, int32}")
    .Input("to_zero: T")
    .Output("zeroed: T");
```

Instead of writing another `OpKernel` with redundant code as above, often you
will be able to use a C++ template instead. You will still have one kernel
registration (`REGISTER_KERNEL_BUILDER` call) per overload.

```c++
template <typename T>
class ZeroOutOp : public OpKernel {
 public:
  explicit ZeroOutOp(OpKernelConstruction* context) : OpKernel(context) {}

  void Compute(OpKernelContext* context) override {
    // Grab the input tensor
    const Tensor& input_tensor = context->input(0);
    auto input = input_tensor.flat<T>();

    // Create an output tensor
    Tensor* output = NULL;
    OP_REQUIRES_OK(context,
                   context->allocate_output(0, input_tensor.shape(), &output));
    auto output_flat = output->template flat<T>();

    // Set all the elements of the output tensor to 0
    const int N = input.size();
    for (int i = 0; i < N; i++) {
      output_flat(i) = 0;
    }

    // Preserve the first input value
    if (N > 0) output_flat(0) = input(0);
  }
};

// Note that TypeConstraint<int32>("T") means that attr "T" (defined
// in the Op registration above) must be "int32" to use this template
// instantiation.
REGISTER_KERNEL_BUILDER(
    Name("ZeroOut")
    .Device(DEVICE_CPU)
    .TypeConstraint<int32>("T"),
    ZeroOutOp<int32>);
REGISTER_KERNEL_BUILDER(
    Name("ZeroOut")
    .Device(DEVICE_CPU)
    .TypeConstraint<float>("T"),
    ZeroOutOp<float>);
REGISTER_KERNEL_BUILDER(
    Name("ZeroOut")
    .Device(DEVICE_CPU)
    .TypeConstraint<double>("T"),
    ZeroOutOp<double>);
```

If you have more than a couple of overloads, you can put the registration in a
macro.

```c++
#include "tensorflow/core/framework/op_kernel.h"

#define REGISTER_KERNEL(type)                                       \
  REGISTER_KERNEL_BUILDER(                                          \
      Name("ZeroOut").Device(DEVICE_CPU).TypeConstraint<type>("T"), \
      ZeroOutOp<type>)

REGISTER_KERNEL(int32);
REGISTER_KERNEL(float);
REGISTER_KERNEL(double);

#undef REGISTER_KERNEL
```

Depending on the list of types you are registering the kernel for, you may be
able to use a macro provided by
[`tensorflow/core/framework/register_types.h`][register_types]:

```c++
#include "tensorflow/core/framework/op_kernel.h"
#include "tensorflow/core/framework/register_types.h"

REGISTER_OP("ZeroOut")
    .Attr("T: realnumbertype")
    .Input("to_zero: T")
    .Output("zeroed: T");

template <typename T>
class ZeroOutOp : public OpKernel { ... };

#define REGISTER_KERNEL(type)                                       \
  REGISTER_KERNEL_BUILDER(                                          \
      Name("ZeroOut").Device(DEVICE_CPU).TypeConstraint<type>("T"), \
      ZeroOutOp<type>)

TF_CALL_REAL_NUMBER_TYPES(REGISTER_KERNEL);

#undef REGISTER_KERNEL
```

#### List Inputs and Outputs

In addition to being able to accept or produce different types, ops can consume
or produce a variable number of tensors.

In the next example, the attr `T` holds a *list* of types, and is used as the
type of both the input `in` and the output `out`. The input and output are
lists of tensors of that type (and the number and types of tensors in the output
are the same as the input, since both have type `T`).

```c++
REGISTER_OP("PolymorphicListExample")
    .Attr("T: list(type)")
    .Input("in: T")
    .Output("out: T");
```

You can also place restrictions on what types can be specified in the list. In
this next case, the input is a list of `float` and `double` tensors. The Op
accepts, for example, input types `(float, double, float)` and in that case the
output type would also be `(float, double, float)`.

```c++
REGISTER_OP("ListTypeRestrictionExample")
    .Attr("T: list({float, double})")
    .Input("in: T")
    .Output("out: T");
```

If you want all the tensors in a list to be of the same type, you might do
something like:

```c++
REGISTER_OP("IntListInputExample")
    .Attr("N: int")
    .Input("in: N * int32")
    .Output("out: int32");
```

This accepts a list of `int32` tensors, and uses an `int` attr `N` to
specify the length of the list.

This can be made [type polymorphic](#type-polymorphism) as well. In the next
example, the input is a list of tensors (with length `"N"`) of the same (but
unspecified) type (`"T"`), and the output is a single tensor of matching type:

```c++
REGISTER_OP("SameListInputExample")
    .Attr("N: int")
    .Attr("T: type")
    .Input("in: N * T")
    .Output("out: T");
```

By default, tensor lists have a minimum length of 1. You can change that default
using
[a `">="` constraint on the corresponding attr](#default-values-constraints).
In this next example, the input is a list of at least 2 `int32` tensors:

```c++
REGISTER_OP("MinLengthIntListExample")
    .Attr("N: int >= 2")
    .Input("in: N * int32")
    .Output("out: int32");
```

The same syntax works with `"list(type)"` attrs:

```c++
REGISTER_OP("MinimumLengthPolymorphicListExample")
    .Attr("T: list(type) >= 3")
    .Input("in: T")
    .Output("out: T");
```

### Inputs and Outputs

To summarize the above, an Op registration can have multiple inputs and outputs:

```c++
REGISTER_OP("MultipleInsAndOuts")
    .Input("y: int32")
    .Input("z: float")
    .Output("a: string")
    .Output("b: int32");
```

Each input or output spec is of the form:

```
<name>: <io-type-expr>
```

where `<name>` begins with a letter and can be composed of alphanumeric
characters and underscores. `<io-type-expr>` is one of the following type
expressions:

* `<type>`, where `<type>` is a supported input type (e.g. `float`, `int32`,
  `string`). This specifies a single tensor of the given type.

  See
  [the list of supported Tensor types](../../resources/dims_types.md#data-types).

    ```c++
    REGISTER_OP("BuiltInTypesExample")
        .Input("integers: int32")
        .Input("complex_numbers: complex64");
    ```

* `<attr-type>`, where `<attr-type>` is the name of an [Attr](#attrs) with type
  `type` or `list(type)` (with a possible type restriction). This syntax allows
  for [polymorphic ops](#Polymorphism).

    ```c++
    REGISTER_OP("PolymorphicSingleInput")
        .Attr("T: type")
        .Input("in: T");

    REGISTER_OP("RestrictedPolymorphicSingleInput")
        .Attr("T: {int32, int64}")
        .Input("in: T");
    ```

  Referencing an attr of type `list(type)` allows you to accept a sequence of
  tensors.

    ```c++
    REGISTER_OP("ArbitraryTensorSequenceExample")
        .Attr("T: list(type)")
        .Input("in: T")
        .Output("out: T");

    REGISTER_OP("RestrictedTensorSequenceExample")
        .Attr("T: list({int32, int64})")
        .Input("in: T")
        .Output("out: T");
    ```

  Note that the number and types of tensors in the output `out` are the same as
  in the input `in`, since both are of type `T`.

* For a sequence of tensors with the same type: `<number> * <type>`, where
  `<number>` is the name of an [Attr](#attrs) with type `int`. The `<type>` can
  either be
  [a specific type like `int32` or `float`](../../resources/dims_types.md#data-types),
  or the name of an attr with type `type`. As an example of the first, this
  Op accepts a list of `int32` tensors:

    ```c++
    REGISTER_OP("Int32SequenceExample")
        .Attr("NumTensors: int")
        .Input("in: NumTensors * int32");
    ```

  Whereas this Op accepts a list of tensors of any type, as long as they are all
  the same:

    ```c++
    REGISTER_OP("SameTypeSequenceExample")
        .Attr("NumTensors: int")
        .Attr("T: type")
        .Input("in: NumTensors * T");
    ```

* For a reference to a tensor: `Ref(<type>)`, where `<type>` is one of the
  previous types.

> A note on naming: Any attr used in the type of an input will be inferred. By
> convention those inferred attrs use capital names (like `T` or `N`).
> Otherwise inputs, outputs, and attrs have names like function parameters
> (e.g. `num_outputs`). For more details, see the
> [earlier note on naming](#naming).

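The spec forms listed above (a plain type or attr name, a `number * type` sequence, and a `Ref(...)` wrapper) can be told apart mechanically. The following is a toy parser written only to make the grammar concrete. `parse_io_spec` and its result layout are hypothetical, and it does not distinguish a built-in type from an attr name, since both look like plain identifiers.

```python
import re

# A name begins with a letter, followed by alphanumerics or underscores.
_NAME = r"[A-Za-z][A-Za-z0-9_]*"
_SPEC_RE = re.compile(r"^\s*(?P<name>%s)\s*:\s*(?P<expr>.+?)\s*$" % _NAME)
_SEQ_RE = re.compile(r"^(?P<number>%s)\s*\*\s*(?P<type>%s)$" % (_NAME, _NAME))
_SINGLE_RE = re.compile(r"^%s$" % _NAME)


def parse_io_spec(spec):
    """Split an input/output spec string into its parts (toy model)."""
    match = _SPEC_RE.match(spec)
    if match is None:
        raise ValueError("Malformed spec: %r" % spec)
    expr = match.group("expr")

    # Ref(...) wraps any of the other forms.
    is_ref = expr.startswith("Ref(") and expr.endswith(")")
    if is_ref:
        expr = expr[len("Ref("):-1].strip()

    seq = _SEQ_RE.match(expr)
    if seq is not None:
        number_attr, type_name = seq.group("number"), seq.group("type")
    elif _SINGLE_RE.match(expr):
        number_attr, type_name = None, expr
    else:
        raise ValueError("Unsupported type expression: %r" % expr)

    return {"name": match.group("name"), "type": type_name,
            "number_attr": number_attr, "is_ref": is_ref}


print(parse_io_spec("to_zero: int32"))
print(parse_io_spec("in: NumTensors * T"))
```

The real parsing and validation rules are stricter than this sketch and live in the TensorFlow source.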
For more details, see [`tensorflow/core/framework/op_def_builder.h`][op_def_builder]. ### Backwards compatibility In general, changes to specifications must be backwards-compatible: changing the specification of an Op must not break prior serialized GraphDefs constructed from older specfications. There are several ways to preserve backwards-compatibility. 1. Any new attrs added to an operation must have default values defined, and with that default value the Op must have the original behavior. To change an operation from not polymorphic to polymorphic, you *must* give a default value to the new type attr to preserve the original signature by default. For example, if your operation was: ```c++ REGISTER_OP("MyGeneralUnaryOp") .Input("in: float") .Output("out: float"); ``` you can make it polymorphic in a backwards-compatible way using: ```c++ REGISTER_OP("MyGeneralUnaryOp") .Input("in: T") .Output("out: T") .Attr("T: numerictype = float"); ``` 1. You can safely make a constraint on an attr less restrictive. For example, you can change from `{int32, int64}` to `{int32, int64, float}` or from `{"apple", "orange"}` to `{"apple", "banana", "orange"}`. 1. Namespace any new Ops you create, by prefixing the Op names with something unique to your project. This avoids having your Op colliding with any Ops that might be included in future versions of Tensorflow. 1. Plan ahead! Try to anticipate future uses for the Op. Some signature changes can't be done in a compatible way (for example, adding an input, or making a single input into a list). If you cannot make your change to an operation backwards compatible, then create a new operation with a new name with the new semantics. ## GPU Support You can implement different OpKernels and register one for CPU and another for GPU, just like you can [register kernels for different types](#Polymorphism). 
There are several examples of kernels with GPU support in [`tensorflow/core/kernels/`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/kernels/). Notice some kernels have a CPU version in a `.cc` file, a GPU version in a file ending in `_gpu.cu.cc`, and some code shared in common in a `.h` file. For example, the [`pad` op](../../api_docs/python/array_ops.md#pad) has everything but the GPU kernel in [`tensorflow/core/kernels/pad_op.cc`][pad_op]. The GPU kernel is in [`tensorflow/core/kernels/pad_op_gpu.cu.cc`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/kernels/pad_op_gpu.cu.cc), and the shared code is a templated class defined in [`tensorflow/core/kernels/pad_op.h`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/kernels/pad_op.h). One thing to note, even when the GPU kernel version of `pad` is used, it still needs its `"paddings"` input in CPU memory. To mark that inputs or outputs are kept on the CPU, add a `HostMemory()` call to the kernel registration, e.g.: ```c++ #define REGISTER_GPU_KERNEL(T) \ REGISTER_KERNEL_BUILDER(Name("Pad") \ .Device(DEVICE_GPU) \ .TypeConstraint("T") \ .HostMemory("paddings"), \ PadOp) ``` ## Implement the gradient in Python Given a graph of ops, TensorFlow uses automatic differentiation (backpropagation) to add new ops representing gradients with respect to the existing ops (see [Gradient Computation](../../api_docs/python/train.md#gradient-computation)). To make automatic differentiation work for new ops, you must register a gradient function which computes gradients with respect to the ops' inputs given gradients with respect to the ops' outputs. 
Mathematically, if an op computes \\(y = f(x)\\) the registered gradient op converts gradients \\(\partial / \partial y\\) with respect to \\(y\\) into gradients \\(\partial / \partial x\\) with respect to \\(x\\) via the chain rule: $$\frac{\partial}{\partial x} = \frac{\partial}{\partial y} \frac{\partial y}{\partial x} = \frac{\partial}{\partial y} \frac{\partial f}{\partial x}.$$ In the case of `ZeroOut`, only one entry in the input affects the output, so the gradient with respect to the input is a sparse "one hot" tensor. This is expressed as follows: ```python from tensorflow.python.framework import ops from tensorflow.python.ops import array_ops from tensorflow.python.ops import sparse_ops @ops.RegisterGradient("ZeroOut") def _zero_out_grad(op, grad): """The gradients for `zero_out`. Args: op: The `zero_out` `Operation` that we are differentiating, which we can use to find the inputs and outputs of the original op. grad: Gradient with respect to the output of the `zero_out` op. Returns: Gradients with respect to the input of `zero_out`. """ to_zero = op.inputs[0] shape = array_ops.shape(to_zero) index = array_ops.zeros_like(shape) first_grad = array_ops.reshape(grad, [-1])[0] to_zero_grad = sparse_ops.sparse_to_dense(index, shape, first_grad, 0) return [to_zero_grad] # List of one Tensor, since we have one input ``` Details about registering gradient functions with [`ops.RegisterGradient`](../../api_docs/python/framework.md#RegisterGradient): * For an op with one output, the gradient function will take an [`Operation`](../../api_docs/python/framework.md#Operation) `op` and a [`Tensor`](../../api_docs/python/framework.md#Tensor) `grad` and build new ops out of the tensors [`op.inputs[i]`](../../api_docs/python/framework.md#Operation.inputs), [`op.outputs[i]`](../../api_docs/python/framework.md#Operation.outputs), and `grad`. Information about any attrs can be found via [`op.get_attr`](../../api_docs/python/framework.md#Operation.get_attr). 
* If the op has multiple outputs, the gradient function will take `op` and `grads`, where `grads` is a list of gradients with respect to each output. The result of the gradient function must be a list of `Tensor` objects representing the gradients with respect to each input. * If there is no well-defined gradient for some input, such as for integer inputs used as indices, the corresponding returned gradient should be `None`. For example, for an op taking a floating point tensor `x` and an integer index `i`, the gradient function would `return [x_grad, None]`. * If there is no meaningful gradient for the op at all, use `ops.NoGradient("OpName")` to disable automatic differentiation. Note that at the time the gradient function is called, only the data flow graph of ops is available, not the tensor data itself. Thus, all computation must be performed using other tensorflow ops, to be run at graph execution time. ## Implement a shape function in Python The TensorFlow Python API has a feature called "shape inference" that provides information about the shapes of tensors without having to execute the graph. Shape inference is supported by "shape functions" that are registered for each op type, and perform two roles: asserting that the shapes of the inputs are compatible, and specifying the shapes for the outputs. A shape function is a Python function that takes an [`Operation`](../../api_docs/python/framework.md#Operation) as input, and returns a list of [`TensorShape`](../../api_docs/python/framework.md#TensorShape) objects (one per output of the op). To register a shape function, apply the [`tf.RegisterShape` decorator](../../api_docs/python/framework.md#RegisterShape) to a shape function. For example, the [`ZeroOut` op defined above](#define_interface) would have a shape function like the following: ```python @tf.RegisterShape("ZeroOut"): def _zero_out_shape(op): """Shape function for the ZeroOut op. 
This is the unconstrained version of ZeroOut, which produces an output with the same shape as its input. """ return [op.inputs[0].get_shape()] ``` A shape function can also constrain the shape of an input. For the version of [`ZeroOut` with a vector shape constraint](#Validation), the shape function would be as follows: ```python @tf.RegisterShape("ZeroOut"): def _zero_out_shape(op): """Shape function for the ZeroOut op. This is the constrained version of ZeroOut, which requires the input to have rank 1 (a vector). """ input_shape = op.inputs[0].get_shape().with_rank(1) return [input_shape] ``` If your op is [polymorphic with multiple inputs](#Polymorphism), use the properties of the operation to determine the number of shapes to check: ``` @tf.RegisterShape("IntListInputExample") def _int_list_input_example_shape(op): """Shape function for the "IntListInputExample" op. All inputs and the output are matrices of the same size. """ output_shape = tf.TensorShape(None) for input in op.inputs: output_shape = output_shape.merge_with(input.get_shape().with_rank(2)) return [output_shape] ``` Since shape inference is an optional feature, and the shapes of tensors may vary dynamically, shape functions must be robust to incomplete shape information for any of the inputs. The [`merge_with`](../../api_docs/python/framework.md) method allows the caller to assert that two shapes are the same, even if either or both of them do not have complete information. Shape functions are defined for all of the [standard Python ops](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/python/ops/), and provide many different usage examples. 
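The merging behavior described above can be modeled without TensorFlow. The sketch below is a simplified stand-in for `merge_with`, using `None` for an unknown rank or an unknown dimension. `merge_shapes` is a hypothetical helper and ignores the details of the real `TensorShape` class.

```python
def merge_shapes(a, b):
    """Merge two partially known shapes, or raise if they conflict.

    A shape is None (unknown rank) or a list of dimensions, where a
    dimension of None means "unknown size".
    """
    if a is None:
        return b
    if b is None:
        return a
    if len(a) != len(b):
        raise ValueError("Ranks differ: %r vs %r" % (a, b))
    merged = []
    for dim_a, dim_b in zip(a, b):
        if dim_a is None:
            merged.append(dim_b)
        elif dim_b is None or dim_a == dim_b:
            merged.append(dim_a)
        else:
            raise ValueError("Dimensions differ: %r vs %r" % (dim_a, dim_b))
    return merged


# Follows the pattern of the IntListInputExample shape function: start from
# an unknown shape and fold in each input's (possibly incomplete) shape.
output_shape = None
for input_shape in ([None, 3], [2, None], [2, 3]):
    output_shape = merge_shapes(output_shape, input_shape)
print(output_shape)
```

Each input contributes whatever it knows, and a conflict between two known dimensions is reported as an error instead of being silently resolved.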
[core-array_ops]:https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/ops/array_ops.cc
[python-user_ops]:https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/python/user_ops/user_ops.py
[tf-kernels]:https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/kernels/
[user_ops]:https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/user_ops/
[pad_op]:https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/kernels/pad_op.cc
[standard_ops-py]:https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/python/ops/standard_ops.py
[standard_ops-cc]:https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/cc/ops/standard_ops.h
[python-BUILD]:https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/python/BUILD
[validation-macros]:https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/lib/core/errors.h
[op_def_builder]:https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/op_def_builder.h
[register_types]:https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/register_types.h
[FinalizeAttr]:https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/op_def_builder.cc#FinalizeAttr
[DataTypeString]:https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/types.cc#DataTypeString
[types-proto]:https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/types.proto
[TensorShapeProto]:https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/tensor_shape.proto
[TensorProto]:https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/tensor.proto

================================================
FILE: SOURCE/how_tos/adding_an_op/zero_out_1_test.py
================================================
"""Test for version 1 of the zero_out op."""

import tensorflow.python.platform

import tensorflow as tf
from tensorflow.g3doc.how_tos.adding_an_op import gen_zero_out_op_1


class ZeroOut1Test(tf.test.TestCase):

  def test(self):
    with self.test_session():
      result = gen_zero_out_op_1.zero_out([5, 4, 3, 2, 1])
      self.assertAllEqual(result.eval(), [5, 0, 0, 0, 0])


if __name__ == '__main__':
  tf.test.main()

================================================
FILE: SOURCE/how_tos/adding_an_op/zero_out_2_test.py
================================================
"""Test for version 2 of the zero_out op."""

import tensorflow.python.platform

import tensorflow as tf
from tensorflow.g3doc.how_tos.adding_an_op import gen_zero_out_op_2
from tensorflow.g3doc.how_tos.adding_an_op import zero_out_grad_2
from tensorflow.python.kernel_tests import gradient_checker


class ZeroOut2Test(tf.test.TestCase):

  def test(self):
    with self.test_session():
      result = gen_zero_out_op_2.zero_out([5, 4, 3, 2, 1])
      self.assertAllEqual(result.eval(), [5, 0, 0, 0, 0])

  def test_grad(self):
    with self.test_session():
      shape = (5,)
      x = tf.constant([5, 4, 3, 2, 1], dtype=tf.float32)
      y = gen_zero_out_op_2.zero_out(x)
      err = gradient_checker.ComputeGradientError(x, shape, y, shape)
      self.assertLess(err, 1e-4)


if __name__ == '__main__':
  tf.test.main()

================================================
FILE: SOURCE/how_tos/adding_an_op/zero_out_3_test.py
================================================
"""Test for version 3 of the zero_out op."""

import tensorflow.python.platform

import tensorflow as tf
from tensorflow.g3doc.how_tos.adding_an_op import gen_zero_out_op_3


class ZeroOut3Test(tf.test.TestCase):

  def test(self):
    with self.test_session():
      result = gen_zero_out_op_3.zero_out([5, 4, 3, 2, 1])
      self.assertAllEqual(result.eval(), [5, 0, 0, 0, 0])

  def testAttr(self):
    with self.test_session():
      result = gen_zero_out_op_3.zero_out([5, 4, 3, 2, 1], preserve_index=3)
      self.assertAllEqual(result.eval(), [0, 0, 0, 2, 0])

  def testNegative(self):
    with self.test_session():
      result = gen_zero_out_op_3.zero_out([5, 4, 3, 2, 1], preserve_index=-1)
      with self.assertRaisesOpError("Need preserve_index >= 0, got -1"):
        result.eval()

  def testLarge(self):
    with self.test_session():
      result = gen_zero_out_op_3.zero_out([5, 4, 3, 2, 1], preserve_index=17)
      with self.assertRaisesOpError("preserve_index out of range"):
        result.eval()


if __name__ == '__main__':
  tf.test.main()

================================================
FILE: SOURCE/how_tos/adding_an_op/zero_out_grad_2.py
================================================
"""The gradient of the tutorial zero_out op."""

from tensorflow.python.framework import ops
from tensorflow.python.ops import array_ops
from tensorflow.python.ops import sparse_ops


@ops.RegisterGradient("ZeroOut")
def _zero_out_grad(op, grad):
  """The gradients for `zero_out`.

  Args:
    op: The `zero_out` `Operation` that we are differentiating, which we can
      use to find the inputs and outputs of the original op.
    grad: Gradient with respect to the output of the `zero_out` op.

  Returns:
    Gradients with respect to the input of `zero_out`.
  """
  to_zero = op.inputs[0]
  shape = array_ops.shape(to_zero)
  index = array_ops.zeros_like(shape)
  first_grad = array_ops.reshape(grad, [-1])[0]
  to_zero_grad = sparse_ops.sparse_to_dense(index, shape, first_grad, 0)
  return [to_zero_grad]  # List of one Tensor, since we have one input

================================================
FILE: SOURCE/how_tos/adding_an_op/zero_out_op_kernel_1.cc
================================================
#include "tensorflow/core/framework/op.h"
#include "tensorflow/core/framework/op_kernel.h"

using namespace tensorflow;

REGISTER_OP("ZeroOut")
    .Input("to_zero: int32")
    .Output("zeroed: int32")
    .Doc(R"doc(
Zeros out all but the first value of a Tensor.

zeroed: A Tensor whose first value is identical to `to_zero`, and 0
  otherwise.
)doc");

class ZeroOutOp : public OpKernel {
 public:
  explicit ZeroOutOp(OpKernelConstruction* context) : OpKernel(context) {}

  void Compute(OpKernelContext* context) override {
    // Grab the input tensor
    const Tensor& input_tensor = context->input(0);
    auto input = input_tensor.flat<int32>();

    // Create an output tensor
    Tensor* output_tensor = NULL;
    OP_REQUIRES_OK(context, context->allocate_output(0, input_tensor.shape(),
                                                     &output_tensor));
    auto output = output_tensor->template flat<int32>();

    // Set all but the first element of the output tensor to 0.
    const int N = input.size();
    for (int i = 1; i < N; i++) {
      output(i) = 0;
    }

    // Preserve the first input value.
    if (N > 0) output(0) = input(0);
  }
};

REGISTER_KERNEL_BUILDER(Name("ZeroOut").Device(DEVICE_CPU), ZeroOutOp);

================================================
FILE: SOURCE/how_tos/adding_an_op/zero_out_op_kernel_2.cc
================================================
#include "tensorflow/core/framework/op.h"
#include "tensorflow/core/framework/op_kernel.h"
#include "tensorflow/core/framework/register_types.h"

using namespace tensorflow;

REGISTER_OP("ZeroOut")
    .Attr("T: realnumbertype")
    .Input("to_zero: T")
    .Output("zeroed: T")
    .Doc(R"doc(
Zeros out all but the first value of a Tensor.

zeroed: A Tensor whose first value is identical to `to_zero`, and 0
  otherwise.
)doc");

REGISTER_OP("ZeroOut2")
    .Attr("T: realnumbertype")
    .Input("to_zero: T")
    .Output("zeroed: T")
    .Doc(R"doc(
Zeros out all but the first value of a Tensor.

zeroed: A Tensor whose first value is identical to `to_zero`, and 0
  otherwise.
)doc");

REGISTER_OP("ZeroOut3")
    .Attr("T: realnumbertype")
    .Input("to_zero: T")
    .Output("zeroed: T")
    .Doc(R"doc(
Zeros out all but the first value of a Tensor.

zeroed: A Tensor whose first value is identical to `to_zero`, and 0
  otherwise.
)doc");

template <typename T>
class ZeroOutOp : public OpKernel {
 public:
  explicit ZeroOutOp(OpKernelConstruction* context) : OpKernel(context) {}

  void Compute(OpKernelContext* context) override {
    // Grab the input tensor
    const Tensor& input_tensor = context->input(0);
    auto input = input_tensor.flat<T>();

    // Create an output tensor
    Tensor* output = NULL;
    OP_REQUIRES_OK(context,
                   context->allocate_output(0, input_tensor.shape(), &output));
    auto output_flat = output->template flat<T>();

    // Set all the elements of the output tensor to 0
    const int N = input.size();
    for (int i = 0; i < N; i++) {
      output_flat(i) = 0;
    }

    // Preserve the first input value
    if (N > 0) output_flat(0) = input(0);
  }
};

REGISTER_KERNEL_BUILDER(Name("ZeroOut")
                            .Device(DEVICE_CPU)
                            .TypeConstraint<float>("T"),
                        ZeroOutOp<float>);
REGISTER_KERNEL_BUILDER(Name("ZeroOut")
                            .Device(DEVICE_CPU)
                            .TypeConstraint<double>("T"),
                        ZeroOutOp<double>);
REGISTER_KERNEL_BUILDER(Name("ZeroOut")
                            .Device(DEVICE_CPU)
                            .TypeConstraint<int32>("T"),
                        ZeroOutOp<int32>);

#define REGISTER_KERNEL(type)                                         \
  REGISTER_KERNEL_BUILDER(                                            \
      Name("ZeroOut2").Device(DEVICE_CPU).TypeConstraint<type>("T"), \
      ZeroOutOp<type>)

REGISTER_KERNEL(float);
REGISTER_KERNEL(double);
REGISTER_KERNEL(int32);

#undef REGISTER_KERNEL

#define REGISTER_KERNEL(type)                                         \
  REGISTER_KERNEL_BUILDER(                                            \
      Name("ZeroOut3").Device(DEVICE_CPU).TypeConstraint<type>("T"), \
      ZeroOutOp<type>)

TF_CALL_REAL_NUMBER_TYPES(REGISTER_KERNEL);

#undef REGISTER_KERNEL

================================================
FILE: SOURCE/how_tos/adding_an_op/zero_out_op_kernel_3.cc
================================================
#include "tensorflow/core/framework/op.h"
#include "tensorflow/core/framework/op_kernel.h"

using namespace tensorflow;

REGISTER_OP("ZeroOut")
    .Attr("preserve_index: int = 0")
    .Input("to_zero: int32")
    .Output("zeroed: int32");

class ZeroOutOp : public OpKernel {
 public:
  explicit ZeroOutOp(OpKernelConstruction* context) : OpKernel(context) {
    // Get the index of the value to preserve
    OP_REQUIRES_OK(context,
                   context->GetAttr("preserve_index", &preserve_index_));
    // Check that preserve_index is non-negative
    OP_REQUIRES(context, preserve_index_ >= 0,
                errors::InvalidArgument("Need preserve_index >= 0, got ",
                                        preserve_index_));
  }

  void Compute(OpKernelContext* context) override {
    // Grab the input tensor
    const Tensor& input_tensor = context->input(0);
    auto input = input_tensor.flat<int32>();

    // Check that preserve_index is in range
    OP_REQUIRES(context, preserve_index_ < input.dimension(0),
                errors::InvalidArgument("preserve_index out of range"));

    // Create an output tensor
    Tensor* output_tensor = NULL;
    OP_REQUIRES_OK(context, context->allocate_output(0, input_tensor.shape(),
                                                     &output_tensor));
    auto output = output_tensor->template flat<int32>();

    // Set all the elements of the output tensor to 0
    const int N = input.size();
    for (int i = 0; i < N; i++) {
      output(i) = 0;
    }

    // Preserve the requested input value
    output(preserve_index_) = input(preserve_index_);
  }

 private:
  int preserve_index_;
};

REGISTER_KERNEL_BUILDER(Name("ZeroOut").Device(DEVICE_CPU), ZeroOutOp);

================================================
FILE: SOURCE/how_tos/adding_an_op.md
================================================
# Adding a New Op

Prerequisites:

* Some familiarity with C++.
* You have [downloaded the TensorFlow source](tensorflow-zh/SOURCE/get_started/introduction.md#source) and are able to build it.

If the existing library does not cover the operation you need, you can create a custom Op. To make a custom Op work with the existing library, you must:

* Register the new Op in a C++ file. The registration is independent of the implementation and describes how the Op is invoked. For example, it defines the Op's name and specifies its inputs and outputs.
* Implement the Op in C++. Each implementation is called a "kernel", and there can be multiple kernels for different architectures (CPU, GPU, etc.) or for different input/output types.
* Create a Python wrapper. This wrapper is the public API for creating the Op. A default wrapper is generated automatically from the Op registration. You can use the default wrapper directly or add your own.
* (Optional) Write a function to compute gradients for the Op.
* (Optional) Write a function that describes the input and output shapes of the Op. This enables shape inference for the Op.
* Test the Op, typically in Python. If you define gradients, you can verify them with the Python [GradientChecker](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/python/kernel_tests/gradient_checker.py).

## Contents

### [Adding a New Op](#AUTOGENERATED-adding-a-new-op)

* [Define the Op's interface](#define_interface)
* [Implement the kernel for the Op](#AUTOGENERATED-implement-the-kernel-for-the-op)
* [Generate the client wrapper](#AUTOGENERATED-generate-the-client-wrapper)
  * [The Python Op wrapper](#AUTOGENERATED-the-python-op-wrapper)
  * [The C++ Op wrapper](#AUTOGENERATED-the-c---op-wrapper)
* [Verify it works](#AUTOGENERATED-verify-it-works)
* [Validation](#Validation)
* [Op registration](#AUTOGENERATED-op-registration)
  * [Attrs](#Attrs)
  * [Attr types](#AUTOGENERATED-attr-types)
  * [Polymorphism](#Polymorphism)
  * [Inputs and outputs](#AUTOGENERATED-inputs-and-outputs)
  * [Backwards compatibility](#AUTOGENERATED-backwards-compatibility)
* [GPU support](#mult-archs)
* [Implement the gradient in Python](#AUTOGENERATED-implement-the-gradient-in-python)
* [Implement a shape function in Python](#AUTOGENERATED-implement-a-shape-function-in-python)

## Define the Op's interface

You define the interface of an Op by registering it with the TensorFlow system. In the registration, you specify the name of the Op, its inputs (types and names) and outputs (types and names), as well as docstrings and any [attrs](#Attrs) the Op might require.

To see how this works, suppose you want to create an Op that takes a tensor of `int32`s and outputs a copy of that tensor with all but the first element set to zero. Create the file `tensorflow/core/user_ops/zero_out.cc` and add a call to the `REGISTER_OP` macro that defines the interface of the Op:

```
#include "tensorflow/core/framework/op.h"

REGISTER_OP("ZeroOut")
    .Input("to_zero: int32")
    .Output("zeroed: int32");
```

The `ZeroOut` Op takes one tensor `to_zero` of 32-bit integers as input, and outputs a tensor `zeroed` of 32-bit integers.

> A note on naming: The name of the Op must be unique and CamelCase. Names starting with an underscore (`_`) are reserved for internal use.

## Implement the kernel for the Op

After you define the interface, provide one or more implementations of the Op. To create one of these kernels, create a class that extends `OpKernel` and overrides the `Compute` method. The `Compute` method provides one argument, `context`, of type `OpKernelContext*`, from which you can access useful things like the input and output tensors.

Add your kernel to the file you created above. The kernel might look something like this:

```
#include "tensorflow/core/framework/op_kernel.h"

using namespace tensorflow;

class ZeroOutOp : public OpKernel {
 public:
  explicit ZeroOutOp(OpKernelConstruction* context) : OpKernel(context) {}

  void Compute(OpKernelContext* context) override {
    // Grab the input tensor.
    const Tensor& input_tensor = context->input(0);
    auto input = input_tensor.flat<int32>();

    // Create an output tensor.
    Tensor* output_tensor = NULL;
    OP_REQUIRES_OK(context, context->allocate_output(0, input_tensor.shape(),
                                                     &output_tensor));
    auto output = output_tensor->template flat<int32>();

    // Set all but the first element of the output tensor to 0.
    const int N = input.size();
    for (int i = 1; i < N; i++) {
      output(i) = 0;
    }

    // Preserve the first input value if possible.
    if (N > 0) output(0) = input(0);
  }
};
```

After implementing your kernel, you register it with the TensorFlow system. In the registration, you specify the constraints under which this kernel will run. For example, you might have one kernel made for CPUs and a separate one for GPUs.

To do this for the `ZeroOut` op, add the following to `zero_out.cc`:

```
REGISTER_KERNEL_BUILDER(Name("ZeroOut").Device(DEVICE_CPU), ZeroOutOp);
```

Once you [build and reinstall TensorFlow](tensorflow-zh/SOURCE/get_started/os_setup.md#create-pip), the TensorFlow system can reference and use the Op when requested.

## Generate the client wrapper

### The Python Op wrapper

Python op wrappers are created automatically in `bazel-genfiles/tensorflow/python/ops/gen_user_ops.py` for all ops placed in the [`tensorflow/core/user_ops`][user_ops] directory when you build TensorFlow. Those ops are imported into [`tensorflow/python/user_ops/user_ops.py`][python-user_ops] with the statement:

```python
from tensorflow.python.ops.gen_user_ops import *
```

You may optionally replace some of these functions with your own implementation. To do this, you first hide the generated code for that op by adding its name to the `hidden` list in the `"user_ops"` rule in [`tensorflow/python/BUILD`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/python/BUILD):

```python
tf_gen_op_wrapper_py(
    name = "user_ops",
    hidden = [
        "Fact",
    ],
    require_shape_functions = False,
)
```

List your op next to `"Fact"`. Next, add your replacement function to [`tensorflow/python/user_ops/user_ops.py`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/python/user_ops/user_ops.py). Typically your function will call the generated function to actually add the op to the graph. The hidden version of the generated function will be in the `gen_user_ops` package and start with an underscore ("`_`"). For example:

```python
def my_fact():
  """Example of overriding the generated code for an Op."""
  return gen_user_ops._fact()
```

### The C++ Op wrapper

C++ op wrappers are created automatically for all ops placed in the [`tensorflow/core/user_ops`][user_ops] directory when you build TensorFlow. For example, ops in `tensorflow/core/user_ops/zero_out.cc` will generate wrappers in `bazel-genfiles/tensorflow/cc/ops/user_ops.{h,cc}`.

All generated wrappers for user ops are automatically imported into [`tensorflow/cc/ops/standard_ops.h`][standard_ops-cc] with the statement:

```
#include "tensorflow/cc/ops/user_ops.h"
```

## Verify it works

A good way to verify that you have successfully implemented your Op is to write a test for it. Create the file `tensorflow/python/kernel_tests/zero_out_op_test.py` with the contents:

```python
import tensorflow as tf


class ZeroOutTest(tf.test.TestCase):

  def testZeroOut(self):
    with self.test_session():
      result = tf.user_ops.zero_out([5, 4, 3, 2, 1])
      self.assertAllEqual(result.eval(), [5, 0, 0, 0, 0])
```

Then run your test:

```sh
$ bazel test tensorflow/python:zero_out_op_test
```

## Validation

The example above assumed that the Op applies to a tensor of any shape. What if it should only apply to vectors? That means adding a check to the OpKernel implementation above.

```
  void Compute(OpKernelContext* context) override {
    // Grab the input tensor
    const Tensor& input_tensor = context->input(0);

    OP_REQUIRES(context, TensorShapeUtils::IsVector(input_tensor.shape()),
                errors::InvalidArgument("ZeroOut expects a 1-D vector."));
    // ...
  }
```

This asserts that the input is a vector, and returns having set the `InvalidArgument` status if it isn't. The [`OP_REQUIRES` macro][validation-macros] takes three arguments:

* The `context`, which can be either an `OpKernelContext` or `OpKernelConstruction` pointer (see [`tensorflow/core/framework/op_kernel.h`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/op_kernel.h)), for its `SetStatus()` method.
* The condition. For example, there are functions for validating the shape of a tensor in [`tensorflow/core/public/tensor_shape.h`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/public/tensor_shape.h).
* The error itself, which is represented by a `Status` object, see [`tensorflow/core/public/status.h`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/public/status.h). A `Status` has both a type (frequently `InvalidArgument`, but other types are possible) and a message. Functions for constructing an error may be found in [`tensorflow/core/lib/core/errors.h`][validation-macros].

Alternatively, if you want to test whether a `Status` object returned from some function is an error, use [`OP_REQUIRES_OK`][validation-macros]. Both of these macros return from the function on error.

## Op registration

### Attrs

Ops can have attrs, whose values are set when the Op is added to a graph. These are used to configure the Op, and their values can be accessed both within the kernel implementation and in the types of inputs and outputs in the Op registration. Prefer using an input instead of an attr when possible, since inputs are more flexible: they can change every step, be set using a feed, and so on. Attrs are used for things that cannot be done with inputs: any configuration that affects the signature (number or type of inputs or outputs), or that cannot change from step to step.

You define an attr when you register the Op, by specifying its name and type using the `Attr` method, which expects a spec of the form:

```
<name>: <attr-type-expr>
```

where `<name>` begins with a letter and can be composed of alphanumeric characters and underscores, and `<attr-type-expr>` is a type expression of the form [described below](#attr-types).

For example, if you'd like the `ZeroOut` Op to preserve a user-specified index, instead of only the 0th element, you can register the Op like so:

```
REGISTER_OP("ZeroOut")
    .Attr("preserve_index: int")
    .Input("to_zero: int32")
    .Output("zeroed: int32");
```

Your kernel can then access this attr in its constructor via the `context` parameter:

```
class ZeroOutOp : public OpKernel {
 public:
  explicit ZeroOutOp(OpKernelConstruction* context) : OpKernel(context) {
    // Get the index of the value to preserve
    OP_REQUIRES_OK(context,
                   context->GetAttr("preserve_index", &preserve_index_));
    // Check that preserve_index is non-negative
    OP_REQUIRES(context, preserve_index_ >= 0,
                errors::InvalidArgument("Need preserve_index >= 0, got ",
                                        preserve_index_));
  }
  void Compute(OpKernelContext* context) override {
    // ...
  }
 private:
  int preserve_index_;
};
```

which can then be used in the `Compute` method:

```
  void Compute(OpKernelContext* context) override {
    // ...

    // Check that preserve_index is in range
    OP_REQUIRES(context, preserve_index_ < input.dimension(0),
                errors::InvalidArgument("preserve_index out of range"));

    // Set all the elements of the output tensor to 0
    const int N = input.size();
    for (int i = 0; i < N; i++) {
      output_flat(i) = 0;
    }

    // Preserve the requested input value
    output_flat(preserve_index_) = input(preserve_index_);
  }
```

> To preserve [backwards compatibility](#backwards-compatibility), you should
> specify a [default value](#default-values-constraints) when adding an attr to
> an existing op:

```
REGISTER_OP("ZeroOut")
    .Attr("preserve_index: int = 0")
    .Input("to_zero: int32")
    .Output("zeroed: int32");
```

### Attr types

The following types are supported in an attr:

* `string`: Any sequence of bytes (not required to be UTF8).
* `int`: A signed integer.
* `float`: A floating point number.
* `bool`: True or false.
* `type`: One of the (non-ref) values of [`DataType`][DataTypeString].
* `shape`: A [`TensorShapeProto`][TensorShapeProto].
* `tensor`: A [`TensorProto`][TensorProto].
* `list(<type>)`: A list of `<type>`, where `<type>` is one of the above types. Note that `list(list(<type>))` is invalid.

See also [`op_def_builder.cc:FinalizeAttr`][FinalizeAttr] for a definitive list.

#### Default values and constraints

Attrs may have default values, and some types of attrs can have constraints. To define an attr with constraints, you can use the following `<attr-type-expr>`s:

* `{'<string1>', '<string2>'}`: The value must be a string that is either `<string1>` or `<string2>`. The type `string` is implied when you use this syntax. This emulates an enum:

  ```
  REGISTER_OP("EnumExample")
      .Attr("e: {'apple', 'orange'}");
  ```

* `{<type1>, <type2>}`: The value is of type `type`, and must be one of `<type1>` or `<type2>`, where `<type1>` and `<type2>` are supported [tensor types](tensorflow-zh/SOURCE/resources/dims_types.md#data-types). You don't specify that the type of the attr is `type`; this is implied when you give a list of types in `{...}`. For example, in this case the attr `t` is a type that must be `int32`, `float`, or `bool`:

  ```
  REGISTER_OP("RestrictedTypeExample")
      .Attr("t: {int32, float, bool}");
  ```

* There are shortcuts for common type constraints:
  * `numbertype`: Type `type` restricted to the numeric (non-string and non-bool) types.
  * `realnumbertype`: Like `numbertype` but without complex types.
  * `quantizedtype`: Like `numbertype` but just the quantized number types.

  The specific lists of types allowed by these are defined by functions (such as `NumberTypes()`) in [`tensorflow/core/framework/types.h`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/types.h). In this example the attr `t` must be one of the numeric types:

  ```
  REGISTER_OP("NumberType")
      .Attr("t: numbertype");
  ```

  For this op:

  ```python
  tf.number_type(t=tf.int32)  # Valid
  tf.number_type(t=tf.bool)   # Invalid
  ```

* `int >= <n>`: The value must be an int greater than or equal to `<n>`, where `<n>` is a natural number.

  For example, the following Op registration specifies that the attr `a` must have a value of at least `2`:

  ```
  REGISTER_OP("MinIntExample")
      .Attr("a: int >= 2");
  ```

* `list(<type>) >= <n>`: A list of type `<type>` whose length is greater than or equal to `<n>`.

  For example, the following Op registration specifies that the attr `a` is a list of types (either `int32` or `float`), and that there must be at least 3 of them:

  ```
  REGISTER_OP("TypeListExample")
      .Attr("a: list({int32, float}) >= 3");
  ```

To set a default value for an attr (making it optional in the generated code), add `= <default>` to the end, as in:

```
REGISTER_OP("AttrDefaultExample")
    .Attr("i: int = 0");
```

The supported syntax of the default value is what would be used in the proto representation of the resulting GraphDef definition.

Here are examples of how to specify a default for all types:

```
REGISTER_OP("AttrDefaultExampleForAllTypes")
   .Attr("s: string = 'foo'")
   .Attr("i: int = 0")
   .Attr("f: float = 1.0")
   .Attr("b: bool = true")
   .Attr("ty: type = DT_INT32")
   .Attr("sh: shape = { dim { size: 1 } dim { size: 2 } }")
   .Attr("te: tensor = { dtype: DT_INT32 int_val: 5 }")
   .Attr("l_empty: list(int) = []")
   .Attr("l_int: list(int) = [2, 3, 5, 7]");
```

Note in particular that values of type `type` use [the `DT_*` names](tensorflow-zh/SOURCE/resources/dims_types.md#data-types).

### Polymorphism

#### Type Polymorphism

For ops that can take different types as input or produce different output types, you can specify [an attr](#attrs) in an input or output type in the Op registration. Typically you would then register an `OpKernel` for each supported type.

For instance, if you'd like the `ZeroOut` Op to work on `float`s in addition to `int32`s, your Op registration might look like:

```
REGISTER_OP("ZeroOut")
    .Attr("T: {float, int32}")
    .Input("to_zero: T")
    .Output("zeroed: T");
```

Your Op registration now specifies that the input's type must be `float` or `int32`, and that its output will be the same type, since both have type `T`.

> A note on naming:{#naming} Inputs, outputs, and attrs generally should be
> given snake_case names. The one exception is attrs that are used as the type
> of an input or in the type of an input. Those attrs can be inferred when the
> op is added to the graph and so don't appear in the op's function. For
> example, this last definition of ZeroOut will generate a Python function that
> looks like:

```python
def zero_out(to_zero, name=None):
  """...
  Args:
    to_zero: A `Tensor`. Must be one of the following types:
        `float32`, `int32`.
    name: A name for the operation (optional).

  Returns:
    A `Tensor`. Has the same type as `to_zero`.
  """
```

> If `to_zero` is passed an `int32` tensor, then `T` is automatically set to
> `int32` (well, actually `DT_INT32`). Those inferred attrs are given
> capitalized or CamelCase names.
>
> Compare this with an op that has a type attr that determines the output
> type:

```
REGISTER_OP("StringToNumber")
    .Input("string_tensor: string")
    .Output("output: out_type")
    .Attr("out_type: {float, int32}")
    .Doc(R"doc(
Converts each string in the input Tensor to the specified numeric type.
)doc");
```

> In this case, the user has to specify the output type, as in the generated
> Python:

```python
def string_to_number(string_tensor, out_type=None, name=None):
  """Converts each string in the input Tensor to the specified numeric type.

  Args:
    string_tensor: A `Tensor` of type `string`.
    out_type: An optional `tf.DType` from: `tf.float32, tf.int32`.
      Defaults to `tf.float32`.
    name: A name for the operation (optional).

  Returns:
    A `Tensor` of type `out_type`.
  """
```

```
#include "tensorflow/core/framework/op_kernel.h"

class ZeroOutInt32Op : public OpKernel {
  // as before
};

class ZeroOutFloatOp : public OpKernel {
 public:
  explicit ZeroOutFloatOp(OpKernelConstruction* context)
      : OpKernel(context) {}

  void Compute(OpKernelContext* context) override {
    // Grab the input tensor
    const Tensor& input_tensor = context->input(0);
    auto input = input_tensor.flat<float>();

    // Create an output tensor
    Tensor* output = NULL;
    OP_REQUIRES_OK(context,
                   context->allocate_output(0, input_tensor.shape(), &output));
    auto output_flat = output->template flat<float>();

    // Set all the elements of the output tensor to 0
    const int N = input.size();
    for (int i = 0; i < N; i++) {
      output_flat(i) = 0;
    }

    // Preserve the first input value
    if (N > 0) output_flat(0) = input(0);
  }
};

// Note that TypeConstraint<int32>("T") means that attr "T" (defined
// in the Op registration above) must be "int32" to use this kernel.
REGISTER_KERNEL_BUILDER(
    Name("ZeroOut")
    .Device(DEVICE_CPU)
    .TypeConstraint<int32>("T"),
    ZeroOutInt32Op);
REGISTER_KERNEL_BUILDER(
    Name("ZeroOut")
    .Device(DEVICE_CPU)
    .TypeConstraint<float>("T"),
    ZeroOutFloatOp);
```

> To preserve [backwards compatibility](#backwards-compatibility), you should
> specify a [default value](#default-values-constraints) when adding an attr to
> an existing op:

```
REGISTER_OP("ZeroOut")
  .Attr("T: {float, int32} = DT_INT32")
  .Input("to_zero: T")
  .Output("zeroed: T");
```

Let's say you wanted to add more types, such as `double`:

```
REGISTER_OP("ZeroOut")
    .Attr("T: {float, double, int32}")
    .Input("to_zero: T")
    .Output("zeroed: T");
```

Instead of writing another `OpKernel` with redundant code for each new type, you can often use a C++ template instead. You will still have one kernel registration (`REGISTER_KERNEL_BUILDER` call) per overload.

```
template <typename T>
class ZeroOutOp : public OpKernel {
 public:
  explicit ZeroOutOp(OpKernelConstruction* context) : OpKernel(context) {}

  void Compute(OpKernelContext* context) override {
    // Grab the input tensor
    const Tensor& input_tensor = context->input(0);
    auto input = input_tensor.flat<T>();

    // Create an output tensor
    Tensor* output = NULL;
    OP_REQUIRES_OK(context,
                   context->allocate_output(0, input_tensor.shape(), &output));
    auto output_flat = output->template flat<T>();

    // Set all the elements of the output tensor to 0
    const int N = input.size();
    for (int i = 0; i < N; i++) {
      output_flat(i) = 0;
    }

    // Preserve the first input value
    if (N > 0) output_flat(0) = input(0);
  }
};

// Note that TypeConstraint<int32>("T") means that attr "T" (defined
// in the Op registration above) must be "int32" to use this template
// instantiation.
REGISTER_KERNEL_BUILDER(
    Name("ZeroOut")
    .Device(DEVICE_CPU)
    .TypeConstraint<int32>("T"),
    ZeroOutOp<int32>);
REGISTER_KERNEL_BUILDER(
    Name("ZeroOut")
    .Device(DEVICE_CPU)
    .TypeConstraint<float>("T"),
    ZeroOutOp<float>);
REGISTER_KERNEL_BUILDER(
    Name("ZeroOut")
    .Device(DEVICE_CPU)
    .TypeConstraint<double>("T"),
    ZeroOutOp<double>);
```

If you have more than a couple of overloads, you can put the registration in a macro.

```
#include "tensorflow/core/framework/op_kernel.h"

#define REGISTER_KERNEL(type)                                        \
  REGISTER_KERNEL_BUILDER(                                           \
      Name("ZeroOut").Device(DEVICE_CPU).TypeConstraint<type>("T"), \
      ZeroOutOp<type>)

REGISTER_KERNEL(int32);
REGISTER_KERNEL(float);
REGISTER_KERNEL(double);

#undef REGISTER_KERNEL
```

Depending on the list of types you are registering the kernel for, you may be able to use a macro provided by [`tensorflow/core/framework/register_types.h`][register_types]:

```
#include "tensorflow/core/framework/op_kernel.h"
#include "tensorflow/core/framework/register_types.h"

REGISTER_OP("ZeroOut")
    .Attr("T: realnumbertype")
    .Input("to_zero: T")
    .Output("zeroed: T");

template <typename T>
class ZeroOutOp : public OpKernel { ... };

#define REGISTER_KERNEL(type)                                        \
  REGISTER_KERNEL_BUILDER(                                           \
      Name("ZeroOut").Device(DEVICE_CPU).TypeConstraint<type>("T"), \
      ZeroOutOp<type>)

TF_CALL_REAL_NUMBER_TYPES(REGISTER_KERNEL);

#undef REGISTER_KERNEL
```

#### List inputs and outputs

In addition to being able to accept or produce different types, ops can consume or produce a variable number of tensors.

In the next example, the attr `T` holds a *list* of types, and is used as the type of both the input `in` and the output `out`. The input and output are lists of tensors of that type. Since both have type `T`, the number and types of tensors in the output match the input.

```
REGISTER_OP("PolymorphicListExample")
    .Attr("T: list(type)")
    .Input("in: T")
    .Output("out: T");
```

You can also place restrictions on what types can be specified in the list. In this next case, the input is a list of `float` and `double` tensors. The Op accepts, for example, input types `(float, double, float)`, and in that case the output type would also be `(float, double, float)`.

```
REGISTER_OP("ListTypeRestrictionExample")
    .Attr("T: list({float, double})")
    .Input("in: T")
    .Output("out: T");
```

If you want all the tensors in a list to be of the same type, you might do something like:

```
REGISTER_OP("IntListInputExample")
    .Attr("N: int")
    .Input("in: N * int32")
    .Output("out: int32");
```

This accepts a list of `int32` tensors, and uses an `int` attr `N` to specify the length of the list.
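
As a rough illustration of what an `N * int32` input spec implies, the sketch below models how the length attr could be derived from the tensors a caller passes. It is plain Python rather than TensorFlow code: the helper `infer_length_attr`, the use of strings for dtype names, and the default minimum length of 1 are all assumptions made for this example.

```python
def infer_length_attr(input_dtypes, expected_dtype="int32", min_length=1):
  """Returns the value of N implied by a list input declared as `N * int32`.

  Hypothetical helper for illustration only; dtypes are plain strings here.
  """
  n = len(input_dtypes)
  if n < min_length:
    raise ValueError("Expected at least %d tensors, got %d" % (min_length, n))
  # Every element of the list must have the single declared type.
  for i, dtype in enumerate(input_dtypes):
    if dtype != expected_dtype:
      raise TypeError(
          "Input %d has type %s, expected %s" % (i, dtype, expected_dtype))
  return n


print(infer_length_attr(["int32", "int32", "int32"]))  # 3
```

The point of the sketch is that the caller never passes `N` explicitly: it follows from the length of the list, while the element type is fixed by the registration.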
这也可用于[类型推断](#type-polymorphism). 在下一个例子中, 输入是一个 tensor 列表, 长度为 `"N"`, 类型为 `"T"`, 输出是单个 `"T"` 的 tensor: ``` REGISTER_OP("SameListInputExample") .Attr("N: int") .Attr("T: type") .Input("in: N * T") .Output("out: T"); ``` 默认情况下, tensor 列表的最小长度为1. 这个约束条件可以通过 [为指定的属性增加一个 `">="` 约束](#default-values-constraints)来变更: ``` REGISTER_OP("MinLengthIntListExample") .Attr("N: int >= 2") .Input("in: N * int32") .Output("out: int32"); ``` 同样的语法也适用于 `"list(type)"` 属性: ``` REGISTER_OP("MinimumLengthPolymorphicListExample") .Attr("T: list(type) >= 3") .Input("in: T") .Output("out: T"); ``` ### 输入和输出 总结一下上述内容, 一个 Op 注册操作可以指定多个输入和输出: ``` REGISTER_OP("MultipleInsAndOuts") .Input("y: int32") .Input("z: float") .Output("a: string") .Output("b: int32"); ``` 每一个输入或输出形式如下: ``` : ``` 其中, `` 以字母打头, 且只能由数字, 字母和下划线组成. `` 可以是 下列类型表达式之一: * ``, 一个合法的输入类型, 如 `float`, `int32`, `string`. 这可用于指定给定类型的单个 tensor. 参见[合法 Tensor 类型列表](tensorflow-zh/SOURCE/resources/dims_types.md#data-types). ``` REGISTER_OP("BuiltInTypesExample") .Input("integers: int32") .Input("complex_numbers: scomplex64"); ``` * ``, 一个[属性](#attrs)和一个类型 `type` 或类型列表 `list(type)`(可能 包含类型限制). 该语法可实现[多态 Op](#Polymorphism). ``` REGISTER_OP("PolymorphicSingleInput") .Attr("T: type") .Input("in: T); REGISTER_OP("RestrictedPolymorphicSingleInput") .Attr("T: {int32, int64}") .Input("in: T); ``` 将属性的类型设置为 `list(type)` 将允许你接受一个序列的 tensor. ``` REGISTER_OP("ArbitraryTensorSequenceExample") .Attr("T: list(type)") .Input("in: T") .Output("out: T"); REGISTER_OP("RestrictedTensorSequenceExample") .Attr("T: list({int32, int64})") .Input("in: T") .Output("out: T"); ``` 注意, 输入和输出均为 `T`, 意味着输入和输出的类型与数量均相同. * ` * `, 一组拥有相同类型的 tensor, `` 是一个 `int` 类型属性的名称. `` 可以是[一个类似于 `int32` 和 `float` 的特定类型](tensorflow-zh/SOURCE/resources/dims_types.md#data-types), 或者一个 `type` 类型属性的名字. 
前者的例子如下, 该例子接受一个 `int32` tensor 列表作为 Op 输入: ``` REGISTER_OP("Int32SequenceExample") .Attr("NumTensors: int") .Input("in: NumTensors * int32") ``` 后者的例子如下, 该例子接受一个泛型 tensor 列表作为 Op 输入: ``` REGISTER_OP("SameTypeSequenceExample") .Attr("NumTensors: int") .Attr("T: type") .Input("in: NumTensors * T") ``` * Tensor 的引用表示为 `Ref()`, 其中 `` 是上述类型之一. > 一个命名建议: 当使用属性表示一个输入的类型时, 该类型可以被推断出来. 实现该特性, 将需要推断 > 的类型用大写名称表示 (如 `T` 或 `N`), 其它的输入, 输出, 和属性像使用函数参数一样使用这些 > 大写名称. 参见之前的[命名建议](#naming)章节查看更多细节. 更多细节参见 [`tensorflow/core/framework/op_def_builder.h`][op_def_builder]. ### 向后兼容性 通常, 对规范的改变必须保持向后兼容性: Op 使用新规范后, 需保证使用旧规范构造的序列化 GraphDef 仍能正确工作. 下面是几种保持向后兼容性的方式: 1. 任何添加到 Op 的新属性必须有默认值, 且默认值下的行为有明确定义. 将一个非多态的操作变为多态操作, 你*必须*为新的类型属性赋予默认值, 以保持原始的函数签名. 例如, 有如下操作: ``` REGISTER_OP("MyGeneralUnaryOp") .Input("in: float") .Output("out: float"); ``` 可以通过下述方式将其变为多态, 且保持向后兼容性: ``` REGISTER_OP("MyGeneralUnaryOp") .Input("in: T") .Output("out: T") .Attr("T: numerictype = float"); ``` 1.放宽一个属性的约束条件是安全的. 例如, 你可以将 `{int32, int64}` 变为 `{int32, int64, float}`, 或者, 将 `{"apple", "orange"}` 变为 `{"apple", "banana", "orange"}`. 2.通过给 Op 名称添加一些项目中唯一的标识作为前缀, 来为新建的 Op 添加命名空间. 命名空间 可以预防你的 Op 与 TensorFlow 未来版本里的内置 Op 产生命名冲突. 3.超前计划! 尝试着去预测 Op 未来的的用途, 超前设计, 毕竟, 一些签名的变更无法保证兼容性 (例如, 增加新的输入, 或将原来的单元素输入变成一个列表). 如果不能以兼容的方式改变一个操作, 那就创建一个全新的操作, 来实现所需功能. ## GPU 支持 你可以实现不同的 OpKernel, 将其中之一注册到 GPU, 另一个注册到 GPU, 正如[为不同的类型注册 kernel ](#Polymorphism)一样. [`tensorflow/core/kernels/`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/kernels/) 中有一些 GPU 支持的例子. 注意, 一些 kernel 的 CPU 版本位于 `.cc` 文件, GPU 版本位于 `_gpu.cu.cc` 文件, 共享的代码位于 `.h` 文件. 例如, [`pad` op](tensorflow-zh/SOURCE/api_docs/python/array_ops.md#pad) 除了 GPU kernel 外的其它代码 均在 [`tensorflow/core/kernels/pad_op.cc`][pad_op] 中. 
GPU kernel 位于 [`tensorflow/core/kernels/pad_op_gpu.cu.cc`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/kernels/pad_op_gpu.cu.cc), 共享的一个模板类代码定义在 [`tensorflow/core/kernels/pad_op.h`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/kernels/pad_op.h). 需要注意的事情是, 即使使用 `pad` 的 GPU 版本时, 仍然需要将 `"paddings"` 输入放置到内存中. 为了实现这一点, 将输入或输出标记为必须保存在内存中, 为 kernel 注册一个 `HostMemory()` 调用. 如下: ``` #define REGISTER_GPU_KERNEL(T) \ REGISTER_KERNEL_BUILDER(Name("Pad") \ .Device(DEVICE_GPU) \ .TypeConstraint("T") \ .HostMemory("paddings"), \ PadOp) ``` ## 使用 Python 实现梯度 给定一个 Op 组成的图, TensorFlow 使用自动微分 (反向传播) 来添加新的 Op 以表示梯度运算, 同时 不影响已有的 Op (参见[梯度运算](tensorflow-zh/SOURCE/api_docs/python/train.md#gradient-computation)). 为了使自动微分能够与新的 Op 协同工作, 必须注册一个梯度函数, 从 Op 的输入计算梯度, 并返回代表 梯度值的输出. 数学上, 如果一个 Op 计算 \\(y = f(x)\\), 注册的梯度 Op 通过以下链式法则, 将 \\(\partial / \partial y\\) 的梯度运算转化为 \\(\partial / \partial x\\) 的梯度运算. $$\frac{\partial}{\partial x} = \frac{\partial}{\partial y} \frac{\partial y}{\partial x} = \frac{\partial}{\partial y} \frac{\partial f}{\partial x}.$$ 在 `ZeroOut` 的例子中, 输入中只有一个项会影响输出, 所以, 代表输入的梯度值的 tensor 也只有 一个输入项. 如下所示: ```python from tensorflow.python.framework import ops from tensorflow.python.ops import array_ops from tensorflow.python.ops import sparse_ops @ops.RegisterGradient("ZeroOut") def _zero_out_grad(op, grad): """`zero_out` 的梯度. 参数: op: 欲进行微分的 `zero_out` `操作`, 可以用于获取原始 Op 的输入和输出. grad: 代表 `zero_out` 输出的梯度 Op. 返回: 代表输入 `zero_out` 的微分. 
""" to_zero = op.inputs[0] shape = array_ops.shape(to_zero) index = array_ops.zeros_like(shape) first_grad = array_ops.reshape(grad, [-1])[0] to_zero_grad = sparse_ops.sparse_to_dense(index, shape, first_grad, 0) return [to_zero_grad] # 单个 Tensor 的列表, 既然只有一个输入 ``` 使用 [`ops.RegisterGradient`](tensorflow-zh/SOURCE/api_docs/python/framework.md#RegisterGradient) 注册梯度函数需要注意的一些细节: * 对于仅有一个输出的 Op, 梯度函数使用 [`Operation`](tensorflow-zh/SOURCE/api_docs/python/framework.md#Operation) `op` 和一个 [`Tensor`](tensorflow-zh/SOURCE/api_docs/python/framework.md#Tensor) `grad` 作为参数, 并从 [`op.inputs[i]`](tensorflow-zh/SOURCE/api_docs/python/framework.md#Operation.inputs), [`op.outputs[i]`](tensorflow-zh/SOURCE/api_docs/python/framework.md#Operation.outputs), 和 `grad` 构建新的 Op. 属性的信息可以通过 [`op.get_attr`](tensorflow-zh/SOURCE/api_docs/python/framework.md#Operation.get_attr) 获取. * 如果 Op 有多个输出, 梯度函数将使用 `op` 和 `grads` 作为参数, 其中, `grads` 是一个 梯度 Op 的列表, 为每一个输出计算梯度. 梯度函数的输出必须是一个 `Tensor` 对象列表, 对应到 每一个输入的梯度. * 如果没有为一些输入定义梯度, 譬如用作索引的整型, 这些输入返回的梯度为 `None`. 举一个例子, 如果一个 Op 的输入为一个浮点数 tensor `x` 和一个整型索引 `i`, 那么梯度函数将返回 `[x_grad, None]`. * 如果梯度对于一个 Op 来说毫无意义, 使用 `ops.NoGradient("OpName")` 禁用自动差分. 注意当梯度函数被调用时, 作用的对象是数据流图中的 Op, 而不是 tensor 数据本身. 因此, 只有在图运行时, 梯度运算才会被其它 tensorflow Op 的执行动作所触发. ## 在 Python 中实现一个形状函数 TensorFlow Python API 有一个 "形状推断" 功能, 可以不执行图就获取 tensor 的形状信息. 形状推断功能藉由每一个 Op 类型注册的 "形状函数" 来支持, 该函数有两个规则: 假设所有输入的 形状必须是兼容的, 以及指定输出的形状. 一个形状函数以一个 [`Operation`](tensorflow-zh/SOURCE/api_docs/python/framework.md#Operation) 作为输入, 返回一个 [`TensorShape`](tensorflow-zh/SOURCE/api_docs/python/framework.md#TensorShape) 对象列表 (每一个输出一个对象). 使用 [`tf.RegisterShape` 装饰器](tensorflow-zh/SOURCE/api_docs/python/framework.md#RegisterShape) 注册形状函数. 例如, [上文定义的 `ZeroOut` Op](#define_interface) 的形状函数如下: ```python @tf.RegisterShape("ZeroOut"): def _zero_out_shape(op): """ZeroOut Op 的形状函数. 这是 ZeroOut 形状函数的无约束版本, 为每一个输出产生的形状和对应的输入一样. """ return [op.inputs[0].get_shape()] ``` 一个形状函数也可以约束输入的形状. 
The following is a [version of the `ZeroOut` shape function that constrains its input to be a vector](#Validation):

```python
@tf.RegisterShape("ZeroOut")
def _zero_out_shape(op):
  """Shape function for the ZeroOut op.

  This is the constrained version of the ZeroOut shape function, which
  requires the input to have rank 1 (a vector).
  """
  input_shape = op.inputs[0].get_shape().with_rank(1)
  return [input_shape]
```

If your op is [polymorphic with multiple inputs](#Polymorphism), use the properties of the operation to determine the number of shapes to check:

```
@tf.RegisterShape("IntListInputExample")
def _int_list_input_example_shape(op):
  """Shape function for the "IntListInputExample" op.

  All inputs and the output are matrices of the same size.
  """
  output_shape = tf.TensorShape(None)
  for input in op.inputs:
    output_shape = output_shape.merge_with(input.get_shape().with_rank(2))
  return [output_shape]
```

Since shape inference is an optional feature, and the shapes of tensors may vary dynamically, shape functions must be robust to incomplete shape information for any of the inputs. The [`merge_with`](tensorflow-zh/SOURCE/api_docs/python/framework.md) method allows the caller to assert that two shapes are the same, even if either or both of them do not have complete information. Shape functions are defined for all of the [standard Python ops](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/python/ops/), and provide many different usage examples.

[core-array_ops]:https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/ops/array_ops.cc
[python-user_ops]:https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/python/user_ops/user_ops.py
[tf-kernels]:https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/kernels/
[user_ops]:https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/user_ops/
[pad_op]:https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/kernels/pad_op.cc
[standard_ops-py]:https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/python/ops/standard_ops.py
[standard_ops-cc]:https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/cc/ops/standard_ops.h
[python-BUILD]:https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/python/BUILD
[validation-macros]:https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/lib/core/errors.h
[op_def_builder]:https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/op_def_builder.h
[register_types]:https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/register_types.h
[FinalizeAttr]:https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/op_def_builder.cc#FinalizeAttr
[DataTypeString]:https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/types.cc#DataTypeString
[types-proto]:https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/types.proto
[TensorShapeProto]:https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/tensor_shape.proto
[TensorProto]:https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/tensor.proto

> Original: [Adding a New Op](http://www.tensorflow.org/how_tos/adding_an_op/index.html#adding-a-new-op) Translation: [@doc001](https://github.com/PFZheng) Proofreading: [@ZHNathanielLee](https://github.com/ZHNathanielLee)

================================================
FILE: SOURCE/how_tos/graph_viz/index.md
================================================
# TensorBoard: Graph Visualization

TensorFlow computation graphs are powerful but complicated. The graph visualization can help you understand and debug them. Here's an example of the visualization at work.

![Visualization of a TensorFlow graph](./graph_vis_animation.gif "Visualization of a TensorFlow graph")

*Visualization of a TensorFlow graph.*

To see your own graph, run TensorBoard pointing it to the log directory of the job, click on the graph tab on the top pane and select the appropriate run using the menu at the upper left corner. For in-depth information on how to run TensorBoard and make sure you are logging all the necessary information, see [Summaries and TensorBoard](../../how_tos/summaries_and_tensorboard/index.md).
## Name scoping and nodes

Typical TensorFlow graphs can have many thousands of nodes--far too many to see easily all at once, or even to lay out using standard graph tools. To simplify, variable names can be scoped and the visualization uses this information to define a hierarchy on the nodes in the graph. By default, only the top of this hierarchy is shown. Here is an example that defines three operations under the `hidden` name scope using [`tf.name_scope`](../../api_docs/python/framework.md#name_scope):

```python
import tensorflow as tf

with tf.name_scope('hidden') as scope:
  a = tf.constant(5, name='alpha')
  W = tf.Variable(tf.random_uniform([1, 2], -1.0, 1.0), name='weights')
  b = tf.Variable(tf.zeros([1]), name='biases')
```

This results in the following three op names:

* *hidden*/alpha
* *hidden*/weights
* *hidden*/biases

By default, the visualization will collapse all three into a node labeled `hidden`. The extra detail isn't lost. You can double-click, or click on the orange `+` sign in the top right to expand the node, and then you'll see three subnodes for `alpha`, `weights` and `biases`.

Here's a real-life example of a more complicated node in its initial and expanded states.
Unexpanded name scope | Expanded name scope
--- | ---
Initial view of top-level name scope `pool_1`. Clicking on the orange `+` button on the top right or double-clicking on the node itself will expand it. | Expanded view of `pool_1` name scope. Clicking on the orange `-` button on the top right or double-clicking on the node itself will collapse the name scope.
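The collapsing behaviour described above can be illustrated with a small, self-contained sketch. It only mimics the idea of grouping op names by their top-level name scope; the function name `group_by_top_scope` is hypothetical and this is not how TensorBoard itself is implemented.

```python
from collections import defaultdict


def group_by_top_scope(op_names):
    """Group op names by the text before the first '/' (their top-level scope)."""
    groups = defaultdict(list)
    for name in op_names:
        scope, sep, rest = name.partition("/")
        # An op with no scope is kept as a top-level node under its own name.
        groups[scope].append(rest if sep else name)
    return dict(groups)


# The three op names from the example above, plus one unscoped op.
op_names = ["hidden/alpha", "hidden/weights", "hidden/biases", "init"]
collapsed = group_by_top_scope(op_names)
for scope, children in collapsed.items():
    print(scope, children)
```

Each key of `collapsed` corresponds to one node in the default (collapsed) view, and its list holds the subnodes that appear when the node is expanded.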
Grouping nodes by name scopes is critical to making a legible graph. If you're building a model, name scopes give you control over the resulting visualization. **The better your name scopes, the better your visualization.**

The figure above illustrates a second aspect of the visualization. TensorFlow graphs have two kinds of connections: data dependencies and control dependencies. Data dependencies show the flow of tensors between two ops and are shown as solid arrows, while control dependencies use dotted lines. In the expanded view (right side of the figure above) all the connections are data dependencies with the exception of the dotted line connecting `CheckNumerics` and `control_dependency`.

There's a second trick to simplifying the layout. Most TensorFlow graphs have a few nodes with many connections to other nodes. For example, many nodes might have a control dependency on an initialization step. Drawing all edges between the `init` node and its dependencies would create a very cluttered view.

To reduce clutter, the visualization separates out all high-degree nodes to an *auxiliary* area on the right and doesn't draw lines to represent their edges. Instead of lines, we draw small *node icons* to indicate the connections. Separating out the auxiliary nodes typically doesn't remove critical information since these nodes are usually related to bookkeeping functions.
`conv_1` is part of the main graph | `save` is extracted as auxiliary node
--- | ---
Node `conv_1` is connected to `save`. Note the little `save` node icon on its right. | `save` has a high degree, and will appear as an auxiliary node. The connection with `conv_1` is shown as a node icon on its left. To further reduce clutter, since `save` has a lot of connections, we show the first 5 and abbreviate the others as `... 12 more`.
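As a rough illustration of separating out high-degree nodes, the sketch below counts how many edges touch each node and moves nodes at or above a threshold into an auxiliary set. The threshold value, the function name `split_auxiliary`, and the example edges are assumptions made for illustration; the text does not state the rule TensorBoard actually uses.

```python
from collections import Counter


def split_auxiliary(edges, degree_threshold):
    """Return (auxiliary_nodes, main_edges) for a list of (src, dst) edges."""
    degree = Counter()
    for src, dst in edges:
        degree[src] += 1
        degree[dst] += 1
    # Nodes touching many edges are pulled out of the main layout.
    auxiliary = {node for node, d in degree.items() if d >= degree_threshold}
    # Edges to auxiliary nodes are not drawn as lines in the main graph.
    main_edges = [(s, d) for s, d in edges if s not in auxiliary and d not in auxiliary]
    return auxiliary, main_edges


edges = [
    ("conv_1", "save"), ("conv_2", "save"), ("pool_1", "save"), ("fc_1", "save"),
    ("conv_1", "pool_1"), ("pool_1", "conv_2"), ("conv_2", "fc_1"),
]
auxiliary, main_edges = split_auxiliary(edges, degree_threshold=4)
print(auxiliary)
print(main_edges)
```

In this toy graph only `save` reaches the threshold, so the main graph keeps just the three edges along the `conv_1`, `pool_1`, `conv_2`, `fc_1` chain.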
One last structural simplification is *series collapsing*. Sequential motifs--that is, nodes whose names differ by a number at the end and have isomorphic structures--are collapsed into a single *stack* of nodes, as shown below. For networks with long sequences, this greatly simplifies the view. As with hierarchical nodes, double-clicking expands the series.
Sequence of nodes | Expanded sequence of nodes
--- | ---
A collapsed view of a node sequence. | A small piece of the expanded view, after double-click.
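The name-based half of series collapsing can be sketched as follows: nodes whose names differ only by a trailing number are grouped into one stack. This sketch deliberately ignores the second condition from the text (isomorphic structure), and the function name `collapse_series` and the example node names are hypothetical.

```python
import re
from collections import defaultdict

# Splits a name into a prefix and a trailing run of digits.
_TRAILING_NUMBER = re.compile(r"^(.*?)(\d+)$")


def collapse_series(names):
    """Return (stacks, singles): stacks maps a prefix to its numbered members."""
    series = defaultdict(list)
    singles = []
    for name in names:
        match = _TRAILING_NUMBER.match(name)
        if match:
            series[match.group(1)].append((int(match.group(2)), name))
        else:
            singles.append(name)
    stacks = {}
    for prefix, members in series.items():
        if len(members) > 1:
            # Order members by their numeric suffix.
            stacks[prefix] = [name for _, name in sorted(members)]
        else:
            # A lone numbered node is not a series.
            singles.append(members[0][1])
    return stacks, singles


stacks, singles = collapse_series(["layer_2", "layer_1", "layer_3", "softmax", "pool_1"])
print(stacks)
print(singles)
```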
Finally, as one last aid to legibility, the visualization uses special icons for constants and summary nodes. To summarize, here's a table of node symbols:

Symbol | Meaning
--- | ---
![Name scope](./namespace_node.png "Name scope") | *High-level* node representing a name scope. Double-click to expand a high-level node.
![Sequence of unconnected nodes](./horizontal_stack.png "Sequence of unconnected nodes") | Sequence of numbered nodes that are not connected to each other.
![Sequence of connected nodes](./vertical_stack.png "Sequence of connected nodes") | Sequence of numbered nodes that are connected to each other.
![Operation node](./op_node.png "Operation node") | An individual operation node.
![Constant node](./constant.png "Constant node") | A constant.
![Summary node](./summary.png "Summary node") | A summary node.
![Data flow edge](./dataflow_edge.png "Data flow edge") | Edge showing the data flow between operations.
![Control dependency edge](./control_edge.png "Control dependency edge") | Edge showing the control dependency between operations.
![Reference edge](./reference_edge.png "Reference edge") | A reference edge showing that the outgoing operation node can mutate the incoming tensor.

## Interaction

Navigate the graph by panning and zooming. Click and drag to pan, and use a scroll gesture to zoom. Double-click on a node, or click on its `+` button, to expand a name scope that represents a group of operations. To easily keep track of the current viewpoint when zooming and panning, there is a minimap in the bottom right corner.

To close an open node, double-click it again or click its `-` button. You can also click once to select a node. It will turn a darker color, and details about it and the nodes it connects to will appear in the info card at the upper right corner of the visualization.
Info card of a name scope | Info card of operation node
--- | ---
Info card showing detailed information for the `conv2` name scope. The inputs and outputs are combined from the inputs and outputs of the operation nodes inside the name scope. For name scopes no attributes are shown. | Info card showing detailed information for the `DecodeRaw` operation node. In addition to inputs and outputs, the card shows the device and the attributes associated with the current operation.
Selection can also be helpful in understanding high-degree nodes. Select any high-degree node, and the corresponding node icons for its other connections will be selected as well. This makes it easy, for example, to see which nodes are being saved--and which aren't.

Clicking on a node name in the info card will select it. If necessary, the viewpoint will automatically pan so that the node is visible.

Finally, you can choose between two color schemes for your graph, using the color menu above the legend. The default *Structure View* shows structure: when two high-level nodes have the same structure, they appear in the same color of the rainbow. Uniquely structured nodes are gray. There's a second view, which shows what device the different operations run on. Name scopes are colored proportionally to the fraction of devices for the operations inside them.

The images below give an illustration for a piece of a real-life graph.
Color by structure | Color by device
--- | ---
Structure view: The gray nodes have unique structure. The orange `conv1` and `conv2` nodes have the same structure, and analogously for nodes with other colors. | Device view: Name scopes are colored proportionally to the fraction of devices of the operation nodes inside them. Here, purple means GPU and the green is CPU.
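To make "colored proportionally to the fraction of devices" concrete, here is a minimal sketch that computes, for each top-level name scope, the share of its operations placed on each device. The op names, the placement mapping, and the function name `device_fractions` are invented for illustration; how TensorBoard turns these fractions into colors is not described in the text.

```python
from collections import Counter, defaultdict


def device_fractions(op_devices):
    """Map each top-level name scope to {device: fraction of its ops on that device}."""
    counts = defaultdict(Counter)
    for op_name, device in op_devices.items():
        scope = op_name.split("/", 1)[0]
        counts[scope][device] += 1
    return {
        scope: {device: n / sum(per_device.values()) for device, n in per_device.items()}
        for scope, per_device in counts.items()
    }


# Hypothetical placement of ops on devices.
placement = {
    "conv1/weights": "GPU",
    "conv1/conv": "GPU",
    "conv1/relu": "GPU",
    "conv1/summary": "CPU",
    "save/restore": "CPU",
}
fractions = device_fractions(placement)
print(fractions)
```

Under this reading, `conv1` would be drawn three quarters in the GPU color and one quarter in the CPU color, while `save` would be entirely in the CPU color.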
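================================================
FILE: SOURCE/how_tos/graph_viz.md
================================================
# TensorBoard: Graph Visualization

TensorFlow computation graphs are powerful but complicated. The graph visualization can help you understand and debug them. Here's an example of the visualization at work.

![Visualization of a TensorFlow graph](../images/graph_vis_animation.gif "Visualization of a TensorFlow graph")

*Visualization of a TensorFlow graph.*

To see your own graph, run TensorBoard pointing it to the log directory of the job, click on the graph tab on the top pane and select the appropriate run using the menu at the upper left corner. For in-depth information on how to run TensorBoard and make sure you are logging all the necessary information, see [Summaries and TensorBoard](tensorflow-zh/SOURCE/how_tos/summaries_and_tensorboard/index.md).

## Name scoping and nodes

Typical TensorFlow graphs can have many thousands of nodes, far too many to see easily all at once, or even to lay out using standard graph tools. To simplify, variable names can be scoped and the visualization uses this information to define a hierarchy on the nodes in the graph. By default, only the top of this hierarchy is shown. Here is an example that defines three operations under the `hidden` name scope using [`tf.name_scope`](tensorflow-zh/SOURCE/api_docs/python/framework.md#name_scope):

```python
import tensorflow as tf

with tf.name_scope('hidden') as scope:
  a = tf.constant(5, name='alpha')
  W = tf.Variable(tf.random_uniform([1, 2], -1.0, 1.0), name='weights')
  b = tf.Variable(tf.zeros([1]), name='biases')
```

This results in the following three op names:

* *hidden*/alpha
* *hidden*/weights
* *hidden*/biases

By default, the visualization will collapse all three into a node labeled `hidden`. The extra detail isn't lost. You can double-click, or click on the orange `+` sign in the top right to expand the node, and then you'll see three subnodes for `alpha`, `weights` and `biases`.

Here's a real-life example of a more complicated node in its initial and expanded states.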
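Unexpanded name scope | Expanded name scope
--- | ---
Initial view of top-level name scope `pool_1`. Clicking on the orange `+` button on the top right or double-clicking on the node itself will expand it. | Expanded view of `pool_1` name scope. Clicking on the orange `-` button on the top right or double-clicking on the node itself will collapse the name scope.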
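Grouping nodes by name scopes is critical to making a legible graph. If you're building a model, name scopes give you control over the resulting visualization. **The better your name scopes, the better your visualization.**

The figure above illustrates a second aspect of the visualization. TensorFlow graphs have two kinds of connections: data dependencies and control dependencies. Data dependencies show the flow of tensors between two ops and are shown as solid arrows, while control dependencies use dotted lines. In the expanded view (right side of the figure above) all the connections are data dependencies with the exception of the dotted line connecting `CheckNumerics` and `control_dependency`.

There's a second trick to simplifying the layout. Most TensorFlow graphs have a few nodes with many connections to other nodes. For example, many nodes might have a control dependency on an initialization step. Drawing all edges between the `init` node and its dependencies would create a very cluttered view.

To reduce clutter, the visualization separates out all high-degree nodes to an *auxiliary* area on the right and doesn't draw lines to represent their edges. Instead of lines, we draw small *node icons* to indicate the connections. Separating out the auxiliary nodes typically doesn't remove critical information since these nodes are usually related to bookkeeping functions.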
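`conv_1` is part of the main graph | `save` is extracted as auxiliary node
--- | ---
Node `conv_1` is connected to `save`. Note the little `save` node icon on its right. | `save` has a high degree, and will appear as an auxiliary node. The connection with `conv_1` is shown as a node icon on its left. To further reduce clutter, since `save` has a lot of connections, we show the first 5 and abbreviate the others as `... 12 more`.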
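One last structural simplification is *series collapsing*. Sequential motifs, that is, nodes whose names differ by a number at the end and have isomorphic structures, are collapsed into a single *stack* of nodes. For networks with long sequences, this greatly simplifies the view. As with hierarchical nodes, double-clicking expands the series.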
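Sequence of nodes | Expanded sequence of nodes
--- | ---
A collapsed view of a node sequence. | A small piece of the expanded view, after double-click.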
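Finally, as one last aid to legibility, the visualization uses special icons for constants and summary nodes. To summarize, here's a table of node symbols:

Symbol | Meaning
--- | ---
![Name scope](../images/namespace_node.png "Name scope") | *High-level* node representing a name scope. Double-click to expand a high-level node.
![Sequence of unconnected nodes](../images/horizontal_stack.png "Sequence of unconnected nodes") | Sequence of numbered nodes that are not connected to each other.
![Sequence of connected nodes](../images/vertical_stack.png "Sequence of connected nodes") | Sequence of numbered nodes that are connected to each other.
![Operation node](../images/op_node.png "Operation node") | An individual operation node.
![Constant node](../images/constant.png "Constant node") | A constant.
![Summary node](../images/summary.png "Summary node") | A summary node.
![Data flow edge](../images/dataflow_edge.png "Data flow edge") | Edge showing the data flow between operations.
![Control dependency edge](../images/control_edge.png "Control dependency edge") | Edge showing the control dependency between operations.
![Reference edge](../images/reference_edge.png "Reference edge") | A reference edge showing that the outgoing operation node can mutate the incoming tensor.

## Interaction

Navigate the graph by panning and zooming. Click and drag to pan, and use a scroll gesture to zoom. Double-click on a node, or click on its `+` button, to expand a name scope that represents a group of operations. To easily keep track of the current viewpoint when zooming and panning, there is a minimap in the bottom right corner.

To close an open node, double-click it again or click its `-` button. You can also click once to select a node. It will turn a darker color, and details about it and the nodes it connects to will appear in the info card at the upper right corner of the visualization.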
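Info card of a name scope | Info card of operation node
--- | ---
Info card showing detailed information for the `conv2` name scope. The inputs and outputs are combined from the inputs and outputs of the operation nodes inside the name scope. For name scopes no attributes are shown. | Info card showing detailed information for the `DecodeRaw` operation node. In addition to inputs and outputs, the card shows the device and the attributes associated with the current operation.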
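Selection can also be helpful in understanding high-degree nodes. Select any high-degree node, and the corresponding node icons for its other connections will be selected as well. This makes it easy, for example, to see which nodes are being saved and which aren't.

Clicking on a node name in the info card will select it. If necessary, the viewpoint will automatically pan so that the node is visible.

Finally, you can choose between two color schemes for your graph, using the color menu above the legend. The default *Structure View* shows structure: when two high-level nodes have the same structure, they appear in the same color of the rainbow. Uniquely structured nodes are gray. There's a second view, which shows what device the different operations run on. Name scopes are colored proportionally to the fraction of devices for the operations inside them.

The images below give an illustration for a piece of a real-life graph.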
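Color by structure | Color by device
--- | ---
Structure view: The gray nodes have unique structure. The orange `conv1` and `conv2` nodes have the same structure, and analogously for nodes with other colors. | Device view: Name scopes are colored proportionally to the fraction of devices of the operation nodes inside them. Here, purple means GPU and green means CPU.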
Original: [TensorBoard: Graph Visualization](../images/index.html#tensorboard-graph-visualization) Translation: [@Warln](https://github.com/Warln) Proofreading: [lucky521](https://github.com/lucky521)

================================================
FILE: SOURCE/how_tos/new_data_formats/Custom Data Readers.md
================================================
# Custom Data Readers

Prerequisites:

* Some familiarity with C++.
* Must have [downloaded TensorFlow source](../../get_started/os_setup.md#source), and be able to build it.

We divide the task of supporting a file format into two pieces:

* File formats: We use a *Reader* Op to read a *record* (which can be any string) from a file.
* Record formats: We use decoder or parsing Ops to turn a string record into tensors usable by TensorFlow.

For example, to read a [CSV file](https://en.wikipedia.org/wiki/Comma-separated_values), we use [a Reader for text files](../../api_docs/python/io_ops.md#TextLineReader) followed by [an Op that parses CSV data from a line of text](../../api_docs/python/io_ops.md#decode_csv).

## Contents
### [Custom Data Readers](#AUTOGENERATED-custom-data-readers)
* [Writing a Reader for a file format](#AUTOGENERATED-writing-a-reader-for-a-file-format)
* [Writing an Op for a record format](#AUTOGENERATED-writing-an-op-for-a-record-format)

## Writing a Reader for a file format

A Reader is something that reads records from a file. There are some examples of Reader Ops already built into TensorFlow:

* [tf.TFRecordReader](../../api_docs/python/io_ops.md#TFRecordReader) ([source in kernels/tf_record_reader_op.cc](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/kernels/tf_record_reader_op.cc))
* [tf.FixedLengthRecordReader](../../api_docs/python/io_ops.md#FixedLengthRecordReader) ([source in kernels/fixed_length_record_reader_op.cc](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/kernels/fixed_length_record_reader_op.cc))
* [tf.TextLineReader](../../api_docs/python/io_ops.md#TextLineReader) ([source in kernels/text_line_reader_op.cc](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/kernels/text_line_reader_op.cc))

You can see these all expose the same interface; the only differences are in their constructors. The most important method is `read`. It takes a queue argument, which is where it gets filenames to read from whenever it needs one (e.g. when the `read` op first runs, or the previous `read` reads the last record from a file). It produces two scalar tensors: a string key and a string value.

To create a new reader called `SomeReader`, you will need to:

1. In C++, define a subclass of [tensorflow::ReaderBase](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/kernels/reader_base.h) called "SomeReader".
2. In C++, register a new reader op and kernel with the name "SomeReader".
3. In Python, define a subclass of [tf.ReaderBase](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/python/ops/io_ops.py) called "SomeReader".

You can put all the C++ code in the file `tensorflow/core/user_ops/some_reader_op.cc`. The code to read a file will live in a descendant of the C++ `ReaderBase` class, which is defined in [tensorflow/core/kernels/reader_base.h](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/kernels/reader_base.h). You will need to implement the following methods:

* `OnWorkStartedLocked`: open the next file
* `ReadLocked`: read a record or report EOF/error
* `OnWorkFinishedLocked`: close the current file
* `ResetLocked`: get a clean slate after, e.g., an error

These methods have names ending in "Locked" since `ReaderBase` makes sure to acquire a mutex before calling any of them, so you generally don't have to worry about thread safety (though that only protects the members of the class, not global state).

For `OnWorkStartedLocked`, the name of the file to open is the value returned by the `current_work()` method. `ReadLocked` has this signature:

```c++
Status ReadLocked(string* key, string* value, bool* produced, bool* at_end)
```

If `ReadLocked` successfully reads a record from the file, it should fill in:

* `*key`: an identifier for the record that can be used to find this record again. You can include the filename from `current_work()`, and append a record number or other information.
* `*value`: the contents of the record.
* `*produced`: set to `true`.

If you hit the end of a file (EOF), set `*at_end` to `true`. In either case, return `Status::OK()`. If there is an error, simply return it using one of the helper functions from [tensorflow/core/lib/core/errors.h](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/lib/core/errors.h) without modifying any arguments.

Next you will create the actual Reader op. It will help if you are familiar with [the adding an op how-to](../../how_tos/adding_an_op/index.md). The main steps are:

* Register the op.
* Define and register an `OpKernel`.

To register the op, you will use a `REGISTER_OP` call defined in [tensorflow/core/framework/op.h](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/op.h). Reader ops never take any input and always have a single output with type `Ref(string)`. They should always call `SetIsStateful()`, and have string `container` and `shared_name` attrs. You may optionally define additional attrs for configuration or include documentation in a `Doc`. For examples, see [tensorflow/core/ops/io_ops.cc](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/ops/io_ops.cc), e.g.:

```c++
#include "tensorflow/core/framework/op.h"

REGISTER_OP("TextLineReader")
    .Output("reader_handle: Ref(string)")
    .Attr("skip_header_lines: int = 0")
    .Attr("container: string = ''")
    .Attr("shared_name: string = ''")
    .SetIsStateful()
    .Doc(R"doc(
A Reader that outputs the lines of a file delimited by '\n'.
)doc");
```

To define an `OpKernel`, Readers can use the shortcut of descending from `ReaderOpKernel`, defined in [tensorflow/core/framework/reader_op_kernel.h](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/reader_op_kernel.h), and implement a constructor that calls `SetReaderFactory`. After defining your class, you will need to register it using `REGISTER_KERNEL_BUILDER(...)`. An example with no attrs:

```c++
#include "tensorflow/core/framework/reader_op_kernel.h"

class TFRecordReaderOp : public ReaderOpKernel {
 public:
  explicit TFRecordReaderOp(OpKernelConstruction* context)
      : ReaderOpKernel(context) {
    Env* env = context->env();
    SetReaderFactory([this, env]() { return new TFRecordReader(name(), env); });
  }
};

REGISTER_KERNEL_BUILDER(Name("TFRecordReader").Device(DEVICE_CPU),
                        TFRecordReaderOp);
```

An example with attrs:

```c++
#include "tensorflow/core/framework/reader_op_kernel.h"

class TextLineReaderOp : public ReaderOpKernel {
 public:
  explicit TextLineReaderOp(OpKernelConstruction* context)
      : ReaderOpKernel(context) {
    int skip_header_lines = -1;
    OP_REQUIRES_OK(context,
                   context->GetAttr("skip_header_lines", &skip_header_lines));
    OP_REQUIRES(context, skip_header_lines >= 0,
                errors::InvalidArgument("skip_header_lines must be >= 0 not ",
                                        skip_header_lines));
    Env* env = context->env();
    SetReaderFactory([this, skip_header_lines, env]() {
      return new TextLineReader(name(), skip_header_lines, env);
    });
  }
};

REGISTER_KERNEL_BUILDER(Name("TextLineReader").Device(DEVICE_CPU),
                        TextLineReaderOp);
```

The last step is to add the Python wrapper. You will import `tensorflow.python.ops.io_ops` in [tensorflow/python/user_ops/user_ops.py](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/python/user_ops/user_ops.py) and add a descendant of [io_ops.ReaderBase](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/python/ops/io_ops.py).

```python
from tensorflow.python.framework import ops
from tensorflow.python.ops import common_shapes
from tensorflow.python.ops import io_ops

class SomeReader(io_ops.ReaderBase):

    def __init__(self, name=None):
        rr = gen_user_ops.some_reader(name=name)
        super(SomeReader, self).__init__(rr)


ops.NoGradient("SomeReader")
ops.RegisterShape("SomeReader")(common_shapes.scalar_shape)
```

You can see some examples in [tensorflow/python/ops/io_ops.py](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/python/ops/io_ops.py).

## Writing an Op for a record format

Generally this is an ordinary op that takes a scalar string record as input, and so follows [the instructions to add an Op](../../how_tos/adding_an_op/index.md). You may optionally take a scalar string key as input, and include that in error messages reporting improperly formatted data.

Examples of Ops useful for decoding records:

* [tf.parse_single_example](../../api_docs/python/io_ops.md#parse_single_example) (and [tf.parse_example](../../api_docs/python/io_ops.md#parse_example))
* [tf.decode_csv](../../api_docs/python/io_ops.md#decode_csv)
* [tf.decode_raw](../../api_docs/python/io_ops.md#decode_raw)

Note that it can be useful to use multiple Ops to decode a particular record format. For example, you may have an image saved as a string in [a tf.train.Example protocol buffer](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/example/example.proto). Depending on the format of that image, you might take the corresponding output from a [tf.parse_single_example](../../api_docs/python/io_ops.md#parse_single_example) op and call [tf.decode_jpeg](../../api_docs/python/image.md#decode_jpeg), [tf.decode_png](../../api_docs/python/image.md#decode_png), or [tf.decode_raw](../../api_docs/python/io_ops.md#decode_raw). It is common to take the output of `tf.decode_raw` and use [tf.slice](../../api_docs/python/array_ops.md#slice) and [tf.reshape](../../api_docs/python/array_ops.md#reshape) to extract pieces.

> Original: [Custom Data Readers](http://tensorflow.org/how_tos/new_data_formats/index.html#custom-data-readers) Translation: [@derekshang](https://github.com/derekshang) Proofreading:
================================================ FILE: SOURCE/how_tos/new_data_formats/index.md ================================================ # Custom Data Readers PREREQUISITES: * Some familiarity with C++. * Must have [downloaded TensorFlow source](../../get_started/os_setup.md#source), and be able to build it. We divide the task of supporting a file format into two pieces: * File formats: We use a *Reader* Op to read a *record* (which can be any string) from a file. * Record formats: We use decoder or parsing Ops to turn a string record into tensors usable by TensorFlow. For example, to read a [CSV file](https://en.wikipedia.org/wiki/Comma-separated_values), we use [a Reader for text files](../../api_docs/python/io_ops.md#TextLineReader) followed by [an Op that parses CSV data from a line of text](../../api_docs/python/io_ops.md#decode_csv). ## Contents ### [Custom Data Readers](#AUTOGENERATED-custom-data-readers) * [Writing a Reader for a file format](#AUTOGENERATED-writing-a-reader-for-a-file-format) * [Writing an Op for a record format](#AUTOGENERATED-writing-an-op-for-a-record-format) ## Writing a Reader for a file format A `Reader` is something that reads records from a file. 
There are some examples of Reader Ops already built into TensorFlow:

* [`tf.TFRecordReader`](../../api_docs/python/io_ops.md#TFRecordReader) ([source in `kernels/tf_record_reader_op.cc`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/kernels/tf_record_reader_op.cc))
* [`tf.FixedLengthRecordReader`](../../api_docs/python/io_ops.md#FixedLengthRecordReader) ([source in `kernels/fixed_length_record_reader_op.cc`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/kernels/fixed_length_record_reader_op.cc))
* [`tf.TextLineReader`](../../api_docs/python/io_ops.md#TextLineReader) ([source in `kernels/text_line_reader_op.cc`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/kernels/text_line_reader_op.cc))

You can see these all expose the same interface; the only differences are in their constructors. The most important method is `read`. It takes a queue argument, which is where it gets filenames to read from whenever it needs one (e.g. when the `read` op first runs, or the previous `read` reads the last record from a file). It produces two scalar tensors: a string key and a string value.

To create a new reader called `SomeReader`, you will need to:

1. In C++, define a subclass of [`tensorflow::ReaderBase`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/kernels/reader_base.h) called `SomeReader`.
2. In C++, register a new reader op and kernel with the name `"SomeReader"`.
3. In Python, define a subclass of [`tf.ReaderBase`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/python/ops/io_ops.py) called `SomeReader`.

You can put all the C++ code in a file in `tensorflow/core/user_ops/some_reader_op.cc`. The code to read a file will live in a descendant of the C++ `ReaderBase` class, which is defined in [`tensorflow/core/kernels/reader_base.h`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/kernels/reader_base.h).
You will need to implement the following methods: * `OnWorkStartedLocked`: open the next file * `ReadLocked`: read a record or report EOF/error * `OnWorkFinishedLocked`: close the current file, and * `ResetLocked`: get a clean slate after, e.g., an error These methods have names ending in "Locked" since `ReaderBase` makes sure to acquire a mutex before calling any of these methods, so you generally don't have to worry about thread safety (though that only protects the members of the class, not global state). For `OnWorkStartedLocked`, the name of the file to open is the value returned by the `current_work()` method. `ReadLocked` has this signature: ```c++ Status ReadLocked(string* key, string* value, bool* produced, bool* at_end) ``` If `ReadLocked` successfully reads a record from the file, it should fill in: * `*key`: with an identifier for the record, that a human could use to find this record again. You can include the filename from `current_work()`, and append a record number or whatever. * `*value`: with the contents of the record. * `*produced`: set to `true`. If you hit the end of a file (EOF), set `*at_end` to `true`. In either case, return `Status::OK()`. If there is an error, simply return it using one of the helper functions from [`tensorflow/core/lib/core/errors.h`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/lib/core/errors.h) without modifying any arguments. Next you will create the actual Reader op. It will help if you are familiar with [the adding an op how-to](../../how_tos/adding_an_op/index.md). The main steps are: * Registering the op. * Define and register an `OpKernel`. To register the op, you will use a `REGISTER_OP` call defined in [`tensorflow/core/framework/op.h`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/op.h). Reader ops never take any input and always have a single output with type `Ref(string)`. 
They should always call `SetIsStateful()`, and have a string `container` and `shared_name` attrs. You may optionally define additional attrs for configuration or include documentation in a `Doc`. For examples, see [`tensorflow/core/ops/io_ops.cc`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/ops/io_ops.cc), e.g.: ```c++ #include "tensorflow/core/framework/op.h" REGISTER_OP("TextLineReader") .Output("reader_handle: Ref(string)") .Attr("skip_header_lines: int = 0") .Attr("container: string = ''") .Attr("shared_name: string = ''") .SetIsStateful() .Doc(R"doc( A Reader that outputs the lines of a file delimited by '\n'. )doc"); ``` To define an `OpKernel`, Readers can use the shortcut of descending from `ReaderOpKernel`, defined in [`tensorflow/core/framework/reader_op_kernel.h`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/reader_op_kernel.h), and implement a constructor that calls `SetReaderFactory`. After defining your class, you will need to register it using `REGISTER_KERNEL_BUILDER(...)`. 
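Before looking at the kernel examples, it may help to see the `ReadLocked` contract described earlier modelled in plain Python. This is only an analogy under stated assumptions: the class `LineReaderSketch`, its method names, and the in-memory `files` dictionary are hypothetical stand-ins, not part of the TensorFlow API, and locking is omitted entirely.

```python
class LineReaderSketch:
    """Hypothetical Python analogue of a ReaderBase subclass that reads lines."""

    def __init__(self, files):
        self._files = files      # filename -> contents; stands in for the file system
        self._work = None        # plays the role of current_work()
        self._lines = None
        self._line_number = 0

    def on_work_started(self, work):
        """Open the next file (analogue of OnWorkStartedLocked)."""
        self._work = work
        self._lines = iter(self._files[work].splitlines())
        self._line_number = 0

    def read(self):
        """Return (key, value, produced, at_end), mirroring ReadLocked's out-parameters."""
        try:
            value = next(self._lines)
        except StopIteration:
            # End of file: nothing produced, at_end set.
            return None, None, False, True
        self._line_number += 1
        # The key combines the filename with a record number.
        key = f"{self._work}:{self._line_number}"
        return key, value, True, False

    def on_work_finished(self):
        """Close the current file (analogue of OnWorkFinishedLocked)."""
        self._work = None
        self._lines = None


reader = LineReaderSketch({"a.txt": "first\nsecond"})
reader.on_work_started("a.txt")
records = []
while True:
    key, value, produced, at_end = reader.read()
    if at_end:
        break
    if produced:
        records.append((key, value))
reader.on_work_finished()
print(records)
```

The loop at the bottom plays the part that `ReaderBase` plays in C++: it keeps calling the read method until end of file is reported, collecting each produced key and value.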
An example with no attrs: ```c++ #include "tensorflow/core/framework/reader_op_kernel.h" class TFRecordReaderOp : public ReaderOpKernel { public: explicit TFRecordReaderOp(OpKernelConstruction* context) : ReaderOpKernel(context) { Env* env = context->env(); SetReaderFactory([this, env]() { return new TFRecordReader(name(), env); }); } }; REGISTER_KERNEL_BUILDER(Name("TFRecordReader").Device(DEVICE_CPU), TFRecordReaderOp); ``` An example with attrs: ```c++ #include "tensorflow/core/framework/reader_op_kernel.h" class TextLineReaderOp : public ReaderOpKernel { public: explicit TextLineReaderOp(OpKernelConstruction* context) : ReaderOpKernel(context) { int skip_header_lines = -1; OP_REQUIRES_OK(context, context->GetAttr("skip_header_lines", &skip_header_lines)); OP_REQUIRES(context, skip_header_lines >= 0, errors::InvalidArgument("skip_header_lines must be >= 0 not ", skip_header_lines)); Env* env = context->env(); SetReaderFactory([this, skip_header_lines, env]() { return new TextLineReader(name(), skip_header_lines, env); }); } }; REGISTER_KERNEL_BUILDER(Name("TextLineReader").Device(DEVICE_CPU), TextLineReaderOp); ``` The last step is to add the Python wrapper. You will import `tensorflow.python.ops.io_ops` in [`tensorflow/python/user_ops/user_ops.py`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/python/user_ops/user_ops.py) and add a descendant of [`io_ops.ReaderBase`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/python/ops/io_ops.py). 
```python from tensorflow.python.framework import ops from tensorflow.python.ops import common_shapes from tensorflow.python.ops import io_ops class SomeReader(io_ops.ReaderBase): def __init__(self, name=None): rr = gen_user_ops.some_reader(name=name) super(SomeReader, self).__init__(rr) ops.NoGradient("SomeReader") ops.RegisterShape("SomeReader")(common_shapes.scalar_shape) ``` You can see some examples in [`tensorflow/python/ops/io_ops.py`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/python/ops/io_ops.py). ## Writing an Op for a record format Generally this is an ordinary op that takes a scalar string record as input, and so follow [the instructions to add an Op](../../how_tos/adding_an_op/index.md). You may optionally take a scalar string key as input, and include that in error messages reporting improperly formatted data. That way users can more easily track down where the bad data came from. Examples of Ops useful for decoding records: * [`tf.parse_single_example`](../../api_docs/python/io_ops.md#parse_single_example) (and [`tf.parse_example`](../../api_docs/python/io_ops.md#parse_example)) * [`tf.decode_csv`](../../api_docs/python/io_ops.md#decode_csv) * [`tf.decode_raw`](../../api_docs/python/io_ops.md#decode_raw) Note that it can be useful to use multiple Ops to decode a particular record format. For example, you may have an image saved as a string in [a `tf.train.Example` protocol buffer](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/example/example.proto). Depending on the format of that image, you might take the corresponding output from a [`tf.parse_single_example`](../../api_docs/python/io_ops.md#parse_single_example) op and call [`tf.decode_jpeg`](../../api_docs/python/image.md#decode_jpeg), [`tf.decode_png`](../../api_docs/python/image.md#decode_png), or [`tf.decode_raw`](../../api_docs/python/io_ops.md#decode_raw). 
It is common to take the output of `tf.decode_raw` and use [`tf.slice`](../../api_docs/python/array_ops.md#slice) and [`tf.reshape`](../../api_docs/python/array_ops.md#reshape) to extract pieces.

================================================
FILE: SOURCE/how_tos/new_data_formats.md
================================================
# Custom Data Readers

Prerequisites:

* Some familiarity with C++.
* You have [downloaded the TensorFlow source](tensorflow-zh/SOURCE/get_started/os_setup.md#source) and can build it.

We divide the task of supporting a file format into two pieces:

* File formats: we use a *Reader* Op to read a *record* (which can be any string) from a file.
* Record formats: we use decoder or parsing Ops to turn a string record into tensors usable by TensorFlow.

For example, to read a [CSV file](https://en.wikipedia.org/wiki/Comma-separated_values), we use [a Reader for text files](tensorflow-zh/SOURCE/api_docs/python/io_ops.md#TextLineReader) followed by [an Op that parses CSV data from a line of text](tensorflow-zh/SOURCE/api_docs/python/io_ops.md#decode_csv).

## Contents
### [Custom Data Readers](#AUTOGENERATED-custom-data-readers)
* [Writing a Reader for a file format](#AUTOGENERATED-writing-a-reader-for-a-file-format)
* [Writing an Op for a record format](#AUTOGENERATED-writing-an-op-for-a-record-format)

## Writing a Reader for a file format

A Reader is something that reads records from a file. TensorFlow already has some Reader Ops built in:

* [tf.TFRecordReader](tensorflow-zh/SOURCE/api_docs/python/io_ops.md#TFRecordReader) ([source in kernels/tf_record_reader_op.cc](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/kernels/tf_record_reader_op.cc))
* [tf.FixedLengthRecordReader](tensorflow-zh/SOURCE/api_docs/python/io_ops.md#FixedLengthRecordReader) ([source in kernels/fixed_length_record_reader_op.cc](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/kernels/fixed_length_record_reader_op.cc))
* [tf.TextLineReader](tensorflow-zh/SOURCE/api_docs/python/io_ops.md#TextLineReader) ([source in kernels/text_line_reader_op.cc](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/kernels/text_line_reader_op.cc))

You can see that these all expose the same interface; the only differences are in their constructors. The most important method is Read. It takes a queue argument, which is where it gets filenames to read from whenever it needs one (for example, when the Read Op first runs, or when the previous Read reads the last record from a file). It produces two scalar tensors: a string key and a string value.

To create a new reader called SomeReader, you will need to:

1. In C++, define a subclass of [tensorflow::ReaderBase](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/kernels/reader_base.h) called "SomeReader".
2. In C++, register a new reader Op and kernel with the name "SomeReader".
3. In Python, define a subclass of [tf.ReaderBase](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/python/ops/io_ops.py) called "SomeReader".

You can put all the C++ code in the file `tensorflow/core/user_ops/some_reader_op.cc`. The code that reads a file lives in a descendant of the C++ ReaderBase class, which is defined in [tensorflow/core/kernels/reader_base.h](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/kernels/reader_base.h). You will need to implement the following methods:

* OnWorkStartedLocked: open the next file
* ReadLocked: read a record or report EOF/error
* OnWorkFinishedLocked: close the current file
* ResetLocked: get a clean slate after, for example, an error

These method names all end in "Locked" because ReaderBase makes sure to acquire a mutex before calling any of them, so you generally do not have to worry about thread safety (though that only protects the members of the class, not global state).

For OnWorkStartedLocked, the name of the file to open is the value returned by the `current_work()` method. ReadLocked has this signature:

```
Status ReadLocked(string* key, string* value, bool* produced, bool* at_end)
```

If ReadLocked successfully reads a record from the file, it should fill in:

* `*key`: an identifier for the record that a human could use to find this record again. You can include the filename from `current_work()` and append a record number or other information.
* `*value`: the contents of the record.
* `*produced`: set to true.

If you hit the end of the file (EOF), set `*at_end` to true. In either case, return Status::OK(). If there is an error, simply return it using one of the helper functions from [tensorflow/core/lib/core/errors.h](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/lib/core/errors.h), without modifying any arguments.

Next you will create the actual Reader Op. It will help if you are familiar with [adding a new Op](tensorflow-zh/SOURCE/how_tos/adding_an_op/index.md). The main steps are:

* Register the Op.
* Define and register an OpKernel.

To register the Op, you will use a REGISTER_OP call, defined in [tensorflow/core/framework/op.h](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/op.h). Reader Ops take no inputs and have a single output of type Ref(string). They call SetIsStateful() and have string container and shared_name attrs.
You may optionally define additional attrs for configuration or include documentation in a Doc. For examples, see [tensorflow/core/ops/io_ops.cc](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/ops/io_ops.cc), e.g.:

```
#include "tensorflow/core/framework/op.h"

REGISTER_OP("TextLineReader")
    .Output("reader_handle: Ref(string)")
    .Attr("skip_header_lines: int = 0")
    .Attr("container: string = ''")
    .Attr("shared_name: string = ''")
    .SetIsStateful()
    .Doc(R"doc(
A Reader that outputs the lines of a file delimited by '\n'.
)doc");
```

To define an OpKernel, Readers can use the shortcut of descending from ReaderOpKernel, defined in [tensorflow/core/framework/reader_op_kernel.h](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/reader_op_kernel.h), and implement a constructor that calls SetReaderFactory. After defining your class, you will need to register it using REGISTER_KERNEL_BUILDER(...). An example with no attrs:

```
#include "tensorflow/core/framework/reader_op_kernel.h"

class TFRecordReaderOp : public ReaderOpKernel {
 public:
  explicit TFRecordReaderOp(OpKernelConstruction* context)
      : ReaderOpKernel(context) {
    Env* env = context->env();
    SetReaderFactory([this, env]() { return new TFRecordReader(name(), env); });
  }
};

REGISTER_KERNEL_BUILDER(Name("TFRecordReader").Device(DEVICE_CPU),
                        TFRecordReaderOp);
```

An example with attrs:

```
#include "tensorflow/core/framework/reader_op_kernel.h"

class TextLineReaderOp : public ReaderOpKernel {
 public:
  explicit TextLineReaderOp(OpKernelConstruction* context)
      : ReaderOpKernel(context) {
    int skip_header_lines = -1;
    OP_REQUIRES_OK(context,
                   context->GetAttr("skip_header_lines", &skip_header_lines));
    OP_REQUIRES(context, skip_header_lines >= 0,
                errors::InvalidArgument("skip_header_lines must be >= 0 not ",
                                        skip_header_lines));
    Env* env = context->env();
    SetReaderFactory([this, skip_header_lines, env]() {
      return new TextLineReader(name(), skip_header_lines, env);
    });
  }
};

REGISTER_KERNEL_BUILDER(Name("TextLineReader").Device(DEVICE_CPU),
                        TextLineReaderOp);
```

The last step is to add the Python wrapper. You will import tensorflow.python.ops.io_ops in
[tensorflow/python/user_ops/user_ops.py](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/python/user_ops/user_ops.py) and add a descendant of [io_ops.ReaderBase](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/python/ops/io_ops.py).

```
from tensorflow.python.framework import ops
from tensorflow.python.ops import common_shapes
from tensorflow.python.ops import io_ops

class SomeReader(io_ops.ReaderBase):

  def __init__(self, name=None):
    rr = gen_user_ops.some_reader(name=name)
    super(SomeReader, self).__init__(rr)


ops.NoGradient("SomeReader")
ops.RegisterShape("SomeReader")(common_shapes.scalar_shape)
```

You can see some examples in [tensorflow/python/ops/io_ops.py](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/python/ops/io_ops.py).

## Writing an Op for a record format

Generally this is an ordinary Op that takes a scalar string record as input, so follow [the instructions to add an Op](tensorflow-zh/SOURCE/how_tos/adding_an_op/index.md). You may optionally take a scalar string key as input and include it in error messages that report improperly formatted data.

Examples of Ops useful for decoding records:

* [tf.parse_single_example](tensorflow-zh/SOURCE/api_docs/python/io_ops.md#parse_single_example) (and [tf.parse_example](tensorflow-zh/SOURCE/api_docs/python/io_ops.md#parse_example))
* [tf.decode_csv](tensorflow-zh/SOURCE/api_docs/python/io_ops.md#decode_csv)
* [tf.decode_raw](tensorflow-zh/SOURCE/api_docs/python/io_ops.md#decode_raw)

Note that it can be useful to use multiple Ops to decode a particular record format. For example, you may have an image saved as a string in [a tf.train.Example protocol buffer](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/example/example.proto). Depending on the format of that image, you might take the corresponding output from a [tf.parse_single_example](tensorflow-zh/SOURCE/api_docs/python/io_ops.md#parse_single_example) Op and call [tf.decode_jpeg](tensorflow-zh/SOURCE/api_docs/python/image.md#decode_jpeg), [tf.decode_png](tensorflow-zh/SOURCE/api_docs/python/image.md#decode_png), or [tf.decode_raw](tensorflow-zh/SOURCE/api_docs/python/io_ops.md#decode_raw). It is common to take the output of tf.decode_raw and use [tf.slice](tensorflow-zh/SOURCE/api_docs/python/array_ops.md#slice) and [tf.reshape](tensorflow-zh/SOURCE/api_docs/python/array_ops.md#reshape) to extract pieces.

> Original: [Custom
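Data Readers](http://tensorflow.org/how_tos/new_data_formats/index.html#custom-data-readers) Translation: [@derekshang](https://github.com/derekshang) Proofreading: [Wiki](https://github.com/jikexueyuanwiki)

================================================
FILE: SOURCE/how_tos/overview.md
================================================
# Overview

## Variables: Creation, Initialization, Saving, and Restoring

TensorFlow Variables are in-memory buffers that hold tensors. This section describes how to use them to create, save, and update model parameters during training.

[See the tutorial](../how_tos/variables.md)

## TensorFlow Mechanics 101

Using MNIST handwritten digit recognition as a small example, this tutorial walks step by step through the details of training models at scale with the TensorFlow infrastructure.

[See the tutorial](../tutorials/mnist_tf.md)

## TensorBoard: Visualizing Learning

TensorBoard is a useful visualization tool when training and evaluating a model. This tutorial explains how to set up and run TensorBoard, and how to add Summary ops that automatically write data to the event files TensorBoard uses.

[See the tutorial](../how_tos/summaries_and_tensorboard.md)

## TensorBoard: Graph Visualization

This tutorial describes how to use the graph visualizer in TensorBoard, which can help you understand the dataflow graph and debug it.

[See the tutorial](../how_tos/graph_viz.md)

## Reading Data

This tutorial describes the three main methods of getting data into a TensorFlow program: Feeding, Reading, and Preloading.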
[See the tutorial](../how_tos/reading_data.md)

## Threading and Queues

This tutorial describes the various constructs TensorFlow provides to make asynchronous and concurrent training easier.

[See the tutorial](../how_tos/threading_and_queues.md)

## Adding a New Op

TensorFlow already provides a large suite of node operations that you can compose freely in your graph, but here are the details of how to add a custom op.

[See the tutorial](../how_tos/adding_an_op.md).

## Custom Data Readers

If you have a sizable custom data set, you may want to extend TensorFlow's Data Readers so that they read your data directly in its native format.

[See the tutorial](../how_tos/new_data_formats.md).

## Using GPUs

This tutorial describes how to construct and run models on multiple GPUs.

[See the tutorial](../how_tos/using_gpu.md)

## Sharing Variables

When deploying large models on multiple GPUs, or when unrolling complex LSTMs or RNNs, it is often necessary to access the same Variable objects from different places in the model construction code. The Variable Scope mechanism is designed to make this easier.

[See the tutorial](../how_tos/variable_scope/index.md).

Original: [How-to](http://tensorflow.org/how_tos/index.html) Translation: [Terence Cooper](https://github.com/TerenceCooper) Proofreading: [lonlonago](https://github.com/lonlonago)

================================================
FILE: SOURCE/how_tos/reading_data/__init__.py
================================================

================================================
FILE: SOURCE/how_tos/reading_data/convert_to_records.py
================================================
"""Converts MNIST data to TFRecords file format with Example protos."""
from __future__ import print_function

import os

import tensorflow.python.platform

import numpy
import tensorflow as tf
from tensorflow.g3doc.tutorials.mnist import input_data


TRAIN_IMAGES = 'train-images-idx3-ubyte.gz'  # MNIST filenames
TRAIN_LABELS = 'train-labels-idx1-ubyte.gz'
TEST_IMAGES = 't10k-images-idx3-ubyte.gz'
TEST_LABELS = 't10k-labels-idx1-ubyte.gz'


tf.app.flags.DEFINE_string('directory', 'data',
                           'Directory to download data files and write the '
                           'converted result')
tf.app.flags.DEFINE_integer('validation_size', 5000,
                            'Number of examples to separate from the training '
                            'data for the validation set.')
FLAGS = tf.app.flags.FLAGS


def _int64_feature(value):
  return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))


def _bytes_feature(value):
  return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))


def convert_to(images, labels, name):
  num_examples = labels.shape[0]
  if images.shape[0] != num_examples:
    raise ValueError("Images size %d does not match label size %d." %
                     (images.shape[0], num_examples))
  rows = images.shape[1]
  cols = images.shape[2]
  depth = images.shape[3]

  filename = os.path.join(FLAGS.directory, name + '.tfrecords')
  print('Writing', filename)
  writer = tf.python_io.TFRecordWriter(filename)
  for index in range(num_examples):
    image_raw = images[index].tostring()
    example = tf.train.Example(features=tf.train.Features(feature={
        'height': _int64_feature(rows),
        'width': _int64_feature(cols),
        'depth': _int64_feature(depth),
        'label': _int64_feature(int(labels[index])),
        'image_raw': _bytes_feature(image_raw)}))
    writer.write(example.SerializeToString())


def main(argv):
  # Get the data.
  train_images_filename = input_data.maybe_download(
      TRAIN_IMAGES, FLAGS.directory)
  train_labels_filename = input_data.maybe_download(
      TRAIN_LABELS, FLAGS.directory)
  test_images_filename = input_data.maybe_download(
      TEST_IMAGES, FLAGS.directory)
  test_labels_filename = input_data.maybe_download(
      TEST_LABELS, FLAGS.directory)

  # Extract it into numpy arrays.
  train_images = input_data.extract_images(train_images_filename)
  train_labels = input_data.extract_labels(train_labels_filename)
  test_images = input_data.extract_images(test_images_filename)
  test_labels = input_data.extract_labels(test_labels_filename)

  # Generate a validation set.
  validation_images = train_images[:FLAGS.validation_size, :, :, :]
  validation_labels = train_labels[:FLAGS.validation_size]
  train_images = train_images[FLAGS.validation_size:, :, :, :]
  train_labels = train_labels[FLAGS.validation_size:]

  # Convert to Examples and write the result to TFRecords.
  convert_to(train_images, train_labels, 'train')
  convert_to(validation_images, validation_labels, 'validation')
  convert_to(test_images, test_labels, 'test')


if __name__ == '__main__':
  tf.app.run()

================================================
FILE: SOURCE/how_tos/reading_data/fully_connected_preloaded.py
================================================
"""Trains the MNIST network using preloaded data in a constant.

Command to run this py_binary target:

bazel run -c opt \
    <...>/tensorflow/g3doc/how_tos/reading_data:fully_connected_preloaded
"""
from __future__ import print_function

import os.path
import time

import tensorflow.python.platform

import numpy
import tensorflow as tf

from tensorflow.g3doc.tutorials.mnist import input_data
from tensorflow.g3doc.tutorials.mnist import mnist


# Basic model parameters as external flags.
flags = tf.app.flags
FLAGS = flags.FLAGS
flags.DEFINE_float('learning_rate', 0.01, 'Initial learning rate.')
flags.DEFINE_integer('num_epochs', 2, 'Number of epochs to run trainer.')
flags.DEFINE_integer('hidden1', 128, 'Number of units in hidden layer 1.')
flags.DEFINE_integer('hidden2', 32, 'Number of units in hidden layer 2.')
flags.DEFINE_integer('batch_size', 100, 'Batch size. '
                     'Must divide evenly into the dataset sizes.')
flags.DEFINE_string('train_dir', 'data', 'Directory to put the training data.')
flags.DEFINE_boolean('fake_data', False, 'If true, uses fake data '
                     'for unit testing.')


def run_training():
  """Train MNIST for a number of epochs."""
  # Get the sets of images and labels for training, validation, and
  # test on MNIST.
  data_sets = input_data.read_data_sets(FLAGS.train_dir, FLAGS.fake_data)

  # Tell TensorFlow that the model will be built into the default Graph.
  with tf.Graph().as_default():
    with tf.name_scope('input'):
      # Input data
      input_images = tf.constant(data_sets.train.images)
      input_labels = tf.constant(data_sets.train.labels)

      image, label = tf.train.slice_input_producer(
          [input_images, input_labels], num_epochs=FLAGS.num_epochs)
      label = tf.cast(label, tf.int32)
      images, labels = tf.train.batch(
          [image, label], batch_size=FLAGS.batch_size)

    # Build a Graph that computes predictions from the inference model.
    logits = mnist.inference(images, FLAGS.hidden1, FLAGS.hidden2)

    # Add to the Graph the Ops for loss calculation.
    loss = mnist.loss(logits, labels)

    # Add to the Graph the Ops that calculate and apply gradients.
    train_op = mnist.training(loss, FLAGS.learning_rate)

    # Add the Op to compare the logits to the labels during evaluation.
    eval_correct = mnist.evaluation(logits, labels)

    # Build the summary operation based on the TF collection of Summaries.
    summary_op = tf.merge_all_summaries()

    # Create a saver for writing training checkpoints.
    saver = tf.train.Saver()

    # Create the op for initializing variables.
    init_op = tf.initialize_all_variables()

    # Create a session for running Ops on the Graph.
    sess = tf.Session()

    # Run the Op to initialize the variables.
    sess.run(init_op)

    # Instantiate a SummaryWriter to output summaries and the Graph.
    summary_writer = tf.train.SummaryWriter(FLAGS.train_dir,
                                            graph_def=sess.graph_def)

    # Start input enqueue threads.
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)

    # And then after everything is built, start the training loop.
    try:
      step = 0
      while not coord.should_stop():
        start_time = time.time()

        # Run one step of the model.
        _, loss_value = sess.run([train_op, loss])

        duration = time.time() - start_time

        # Write the summaries and print an overview fairly often.
        if step % 100 == 0:
          # Print status to stdout.
          print('Step %d: loss = %.2f (%.3f sec)' % (step, loss_value,
                                                     duration))
          # Update the events file.
          summary_str = sess.run(summary_op)
          summary_writer.add_summary(summary_str, step)

        # Save a checkpoint periodically.
        if (step + 1) % 1000 == 0:
          print('Saving')
          saver.save(sess, FLAGS.train_dir, global_step=step)

        # Count each training step exactly once.
        step += 1
    except tf.errors.OutOfRangeError:
      print('Saving')
      saver.save(sess, FLAGS.train_dir, global_step=step)
      print('Done training for %d epochs, %d steps.' % (FLAGS.num_epochs, step))
    finally:
      # When done, ask the threads to stop.
      coord.request_stop()

    # Wait for threads to finish.
    coord.join(threads)
    sess.close()


def main(_):
  run_training()


if __name__ == '__main__':
  tf.app.run()

================================================
FILE: SOURCE/how_tos/reading_data/fully_connected_preloaded_var.py
================================================
"""Trains the MNIST network using preloaded data stored in a variable.

Command to run this py_binary target:

bazel run -c opt \
    <...>/tensorflow/g3doc/how_tos/reading_data:fully_connected_preloaded_var
"""
from __future__ import print_function

import os.path
import time

import tensorflow.python.platform

import numpy
import tensorflow as tf

from tensorflow.g3doc.tutorials.mnist import input_data
from tensorflow.g3doc.tutorials.mnist import mnist


# Basic model parameters as external flags.
flags = tf.app.flags
FLAGS = flags.FLAGS
flags.DEFINE_float('learning_rate', 0.01, 'Initial learning rate.')
flags.DEFINE_integer('num_epochs', 2, 'Number of epochs to run trainer.')
flags.DEFINE_integer('hidden1', 128, 'Number of units in hidden layer 1.')
flags.DEFINE_integer('hidden2', 32, 'Number of units in hidden layer 2.')
flags.DEFINE_integer('batch_size', 100, 'Batch size. '
                     'Must divide evenly into the dataset sizes.')
flags.DEFINE_string('train_dir', 'data', 'Directory to put the training data.')
flags.DEFINE_boolean('fake_data', False, 'If true, uses fake data '
                     'for unit testing.')


def run_training():
  """Train MNIST for a number of epochs."""
  # Get the sets of images and labels for training, validation, and
  # test on MNIST.
  data_sets = input_data.read_data_sets(FLAGS.train_dir, FLAGS.fake_data)

  # Tell TensorFlow that the model will be built into the default Graph.
  with tf.Graph().as_default():
    with tf.name_scope('input'):
      # Input data
      images_initializer = tf.placeholder(
          dtype=data_sets.train.images.dtype,
          shape=data_sets.train.images.shape)
      labels_initializer = tf.placeholder(
          dtype=data_sets.train.labels.dtype,
          shape=data_sets.train.labels.shape)
      input_images = tf.Variable(
          images_initializer, trainable=False, collections=[])
      input_labels = tf.Variable(
          labels_initializer, trainable=False, collections=[])

      image, label = tf.train.slice_input_producer(
          [input_images, input_labels], num_epochs=FLAGS.num_epochs)
      label = tf.cast(label, tf.int32)
      images, labels = tf.train.batch(
          [image, label], batch_size=FLAGS.batch_size)

    # Build a Graph that computes predictions from the inference model.
    logits = mnist.inference(images, FLAGS.hidden1, FLAGS.hidden2)

    # Add to the Graph the Ops for loss calculation.
    loss = mnist.loss(logits, labels)

    # Add to the Graph the Ops that calculate and apply gradients.
    train_op = mnist.training(loss, FLAGS.learning_rate)

    # Add the Op to compare the logits to the labels during evaluation.
    eval_correct = mnist.evaluation(logits, labels)

    # Build the summary operation based on the TF collection of Summaries.
    summary_op = tf.merge_all_summaries()

    # Create a saver for writing training checkpoints.
    saver = tf.train.Saver()

    # Create the op for initializing variables.
    init_op = tf.initialize_all_variables()

    # Create a session for running Ops on the Graph.
    sess = tf.Session()

    # Run the Op to initialize the variables.
    sess.run(init_op)
    sess.run(input_images.initializer,
             feed_dict={images_initializer: data_sets.train.images})
    sess.run(input_labels.initializer,
             feed_dict={labels_initializer: data_sets.train.labels})

    # Instantiate a SummaryWriter to output summaries and the Graph.
    summary_writer = tf.train.SummaryWriter(FLAGS.train_dir,
                                            graph_def=sess.graph_def)

    # Start input enqueue threads.
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)

    # And then after everything is built, start the training loop.
    try:
      step = 0
      while not coord.should_stop():
        start_time = time.time()

        # Run one step of the model.
        _, loss_value = sess.run([train_op, loss])

        duration = time.time() - start_time

        # Write the summaries and print an overview fairly often.
        if step % 100 == 0:
          # Print status to stdout.
          print('Step %d: loss = %.2f (%.3f sec)' % (step, loss_value,
                                                     duration))
          # Update the events file.
          summary_str = sess.run(summary_op)
          summary_writer.add_summary(summary_str, step)

        # Save a checkpoint periodically.
        if (step + 1) % 1000 == 0:
          print('Saving')
          saver.save(sess, FLAGS.train_dir, global_step=step)

        # Count each training step exactly once.
        step += 1
    except tf.errors.OutOfRangeError:
      print('Saving')
      saver.save(sess, FLAGS.train_dir, global_step=step)
      print('Done training for %d epochs, %d steps.' % (FLAGS.num_epochs, step))
    finally:
      # When done, ask the threads to stop.
      coord.request_stop()

    # Wait for threads to finish.
    coord.join(threads)
    sess.close()


def main(_):
  run_training()


if __name__ == '__main__':
  tf.app.run()

================================================
FILE: SOURCE/how_tos/reading_data/fully_connected_reader.py
================================================
"""Train and Eval the MNIST network.

This version is like fully_connected_feed.py but uses data converted
to a TFRecords file containing tf.train.Example protocol buffers.
See tensorflow/g3doc/how_tos/reading_data.md#reading-from-files for context.

YOU MUST run convert_to_records before running this (but you only need to
run it once).
"""
from __future__ import print_function

import os.path
import time

import tensorflow.python.platform

import numpy
import tensorflow as tf

from tensorflow.g3doc.tutorials.mnist import mnist


# Basic model parameters as external flags.
flags = tf.app.flags
FLAGS = flags.FLAGS
flags.DEFINE_float('learning_rate', 0.01, 'Initial learning rate.')
flags.DEFINE_integer('num_epochs', 2, 'Number of epochs to run trainer.')
flags.DEFINE_integer('hidden1', 128, 'Number of units in hidden layer 1.')
flags.DEFINE_integer('hidden2', 32, 'Number of units in hidden layer 2.')
flags.DEFINE_integer('batch_size', 100, 'Batch size.')
flags.DEFINE_string('train_dir', 'data', 'Directory with the training data.')

# Constants used for dealing with the files, matches convert_to_records.
TRAIN_FILE = 'train.tfrecords'
VALIDATION_FILE = 'validation.tfrecords'


def read_and_decode(filename_queue):
  reader = tf.TFRecordReader()
  _, serialized_example = reader.read(filename_queue)
  features = tf.parse_single_example(
      serialized_example,
      dense_keys=['image_raw', 'label'],
      # Defaults are not specified since both keys are required.
      dense_types=[tf.string, tf.int64])

  # Convert from a scalar string tensor (whose single string has
  # length mnist.IMAGE_PIXELS) to a uint8 tensor with shape
  # [mnist.IMAGE_PIXELS].
  image = tf.decode_raw(features['image_raw'], tf.uint8)
  image.set_shape([mnist.IMAGE_PIXELS])

  # OPTIONAL: Could reshape into a 28x28 image and apply distortions
  # here.  Since we are not applying any distortions in this
  # example, and the next step expects the image to be flattened
  # into a vector, we don't bother.

  # Convert from [0, 255] -> [-0.5, 0.5] floats.
  image = tf.cast(image, tf.float32) * (1. / 255) - 0.5

  # Convert label from a scalar uint8 tensor to an int32 scalar.
  label = tf.cast(features['label'], tf.int32)

  return image, label


def inputs(train, batch_size, num_epochs):
  """Reads input data num_epochs times.

  Args:
    train: Selects between the training (True) and validation (False) data.
    batch_size: Number of examples per returned batch.
    num_epochs: Number of times to read the input data, or 0/None to
       train forever.

  Returns:
    A tuple (images, labels), where:
    * images is a float tensor with shape [batch_size, mnist.IMAGE_PIXELS]
      in the range [-0.5, 0.5].
    * labels is an int32 tensor with shape [batch_size] with the true label,
      a number in the range [0, mnist.NUM_CLASSES).
    Note that a tf.train.QueueRunner is added to the graph, which
    must be run using e.g. tf.train.start_queue_runners().
  """
  if not num_epochs: num_epochs = None
  filename = os.path.join(FLAGS.train_dir,
                          TRAIN_FILE if train else VALIDATION_FILE)

  with tf.name_scope('input'):
    filename_queue = tf.train.string_input_producer(
        [filename], num_epochs=num_epochs)

    # Even when reading in multiple threads, share the filename
    # queue.
    image, label = read_and_decode(filename_queue)

    # Shuffle the examples and collect them into batch_size batches.
    # (Internally uses a RandomShuffleQueue.)
    # We run this in two threads to avoid being a bottleneck.
    images, sparse_labels = tf.train.shuffle_batch(
        [image, label], batch_size=batch_size, num_threads=2,
        capacity=1000 + 3 * batch_size,
        # Ensures a minimum amount of shuffling of examples.
        min_after_dequeue=1000)

    return images, sparse_labels


def run_training():
  """Train MNIST for a number of steps."""

  # Tell TensorFlow that the model will be built into the default Graph.
  with tf.Graph().as_default():
    # Input images and labels.
    images, labels = inputs(train=True, batch_size=FLAGS.batch_size,
                            num_epochs=FLAGS.num_epochs)

    # Build a Graph that computes predictions from the inference model.
    logits = mnist.inference(images,
                             FLAGS.hidden1,
                             FLAGS.hidden2)

    # Add to the Graph the loss calculation.
    loss = mnist.loss(logits, labels)

    # Add to the Graph operations that train the model.
    train_op = mnist.training(loss, FLAGS.learning_rate)

    # The op for initializing the variables.
    init_op = tf.initialize_all_variables()

    # Create a session for running operations in the Graph.
    sess = tf.Session()

    # Initialize the variables (the trained variables and the
    # epoch counter).
    sess.run(init_op)

    # Start input enqueue threads.
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)

    try:
      step = 0
      while not coord.should_stop():
        start_time = time.time()

        # Run one step of the model.  The return values are
        # the activations from the `train_op` (which is
        # discarded) and the `loss` op.  To inspect the values
        # of your ops or variables, you may include them in
        # the list passed to sess.run() and the value tensors
        # will be returned in the tuple from the call.
        _, loss_value = sess.run([train_op, loss])

        duration = time.time() - start_time

        # Print an overview fairly often.
        if step % 100 == 0:
          print('Step %d: loss = %.2f (%.3f sec)' % (step, loss_value,
                                                     duration))
        step += 1
    except tf.errors.OutOfRangeError:
      print('Done training for %d epochs, %d steps.' % (FLAGS.num_epochs, step))
    finally:
      # When done, ask the threads to stop.
      coord.request_stop()

    # Wait for threads to finish.
    coord.join(threads)
    sess.close()


def main(_):
  run_training()


if __name__ == '__main__':
  tf.app.run()

================================================
FILE: SOURCE/how_tos/reading_data/index.md
================================================
# Reading data

There are three main methods of getting data into a TensorFlow program:

* Feeding: Python code provides the data when running each step.
* Reading from files: an input pipeline reads the data from files at the beginning of a TensorFlow graph.
* Preloaded data: a constant or variable in the TensorFlow graph holds all the data (for small data sets).

## Contents
### [Reading data](#AUTOGENERATED-reading-data)
* [Feeding](#Feeding)
* [Reading from files](#AUTOGENERATED-reading-from-files)
  * [Filenames, shuffling, and epoch limits](#AUTOGENERATED-filenames--shuffling--and-epoch-limits)
  * [File formats](#AUTOGENERATED-file-formats)
  * [Preprocessing](#AUTOGENERATED-preprocessing)
  * [Batching](#AUTOGENERATED-batching)
  * [Creating threads to prefetch using `QueueRunner` objects](#QueueRunner)
  * [Filtering records or producing multiple examples per record](#AUTOGENERATED-filtering-records-or-producing-multiple-examples-per-record)
  * [Sparse input data](#AUTOGENERATED-sparse-input-data)
* [Preloaded data](#AUTOGENERATED-preloaded-data)
* [Multiple input pipelines](#AUTOGENERATED-multiple-input-pipelines)

## Feeding

TensorFlow's feed mechanism lets you inject data into any Tensor in a computation graph. A Python computation can thus feed data directly into the graph.

Supply feed data through the `feed_dict` argument to a run() or eval() call that initiates computation.

```python
with tf.Session():
  input = tf.placeholder(tf.float32)
  classifier = ...
  print(classifier.eval(feed_dict={input: my_python_preprocessing_fn()}))
```

While you can replace any Tensor with feed data, including variables and constants, the best practice is to use a [`placeholder` op](../../api_docs/python/io_ops.md#placeholder) node. A `placeholder` exists solely to serve as the target of feeds. It is not initialized and contains no data. A placeholder generates an error if it is executed without a feed, so you won't forget to feed it.

An example using `placeholder` and feeding to train on MNIST data can be found in [`tensorflow/g3doc/tutorials/mnist/fully_connected_feed.py`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/g3doc/tutorials/mnist/fully_connected_feed.py), and is described in the [MNIST tutorial](../../tutorials/mnist/tf/index.md).
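
To make the feed semantics above concrete, here is a toy model in plain Python. It is not TensorFlow code and does not reflect how TensorFlow implements feeding; the `Placeholder`, `Scale`, and `evaluate` names are hypothetical and exist only for this sketch. It shows the two behaviours described above: a fed value overrides whatever a node would compute, and a placeholder raises an error when evaluated without a feed.

```python
class Placeholder(object):
    """Toy stand-in for a placeholder node: it holds no data of its own."""

    def __init__(self, name):
        self.name = name


class Scale(object):
    """Toy op node that multiplies each element of its input by a constant."""

    def __init__(self, source, factor):
        self.source = source
        self.factor = factor


def evaluate(node, feed_dict=None):
    """Evaluates a node; fed values override any node in the toy graph."""
    feed_dict = feed_dict or {}
    if node in feed_dict:
        # A fed value replaces whatever the node would have computed.
        return feed_dict[node]
    if isinstance(node, Placeholder):
        raise ValueError("placeholder %r evaluated without a feed" % node.name)
    if isinstance(node, Scale):
        return [x * node.factor for x in evaluate(node.source, feed_dict)]
    raise TypeError("unknown node type: %r" % type(node))


pixels = Placeholder("pixels")
scaled = Scale(pixels, 0.5)

# Feeding the placeholder supplies the input data.
fed_result = evaluate(scaled, feed_dict={pixels: [2.0, 4.0]})
# Feeding an intermediate node bypasses its computation entirely.
overridden = evaluate(scaled, feed_dict={scaled: [9.0]})
```

Calling `evaluate(scaled)` with no feed raises `ValueError`, which mirrors why a placeholder is the safer feed target: forgetting the feed fails loudly instead of silently using stale data.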

## Reading from files

A typical pipeline for reading records from files has the following stages:

1. The list of filenames
2. *Optional* filename shuffling
3. *Optional* epoch limit
4. Filename queue
5. A Reader for the file format
6. A decoder for a record read by the reader
7. *Optional* preprocessing
8. Example queue

### Filenames, shuffling, and epoch limits

For the list of filenames, use either a constant string Tensor (like `["file0", "file1"]` or `[("file%d" % i) for i in range(2)]`) or the [`tf.train.match_filenames_once` function](../../api_docs/python/io_ops.md#match_filenames_once).

Pass the list of filenames to the [`tf.train.string_input_producer` function](../../api_docs/python/io_ops.md#string_input_producer). `string_input_producer` creates a FIFO queue for holding the filenames until the reader needs them.

`string_input_producer` has options for shuffling and setting a maximum number of epochs. A queue runner adds the whole list of filenames to the queue once for each epoch, shuffling the filenames within an epoch if `shuffle=True`. This procedure provides a uniform sampling of files, so that examples are not under- or over- sampled relative to each other.

The queue runner works in a thread separate from the reader that pulls filenames from the queue, so the shuffling and enqueuing process does not block the reader.

### File formats

Select the reader that matches your input file format and pass the filename queue to the reader's read method. The read method outputs a key identifying the file and record (useful for debugging if you have some weird records), and a scalar string value. Use one (or more) of the decoder and conversion ops to decode this string into the tensors that make up an example.
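
The filename queue and the reader's key/value output can be mimicked in plain Python. The sketch below is not TensorFlow code: `filename_epochs`, `read_lines`, the in-memory `FAKE_FILES`, and the `filename:line` key format are all assumptions made for illustration. It shows the whole filename list being emitted once per epoch (optionally shuffled within the epoch) and a reader turning each file into `(key, value)` records.

```python
import io
import random


def filename_epochs(filenames, num_epochs=None, shuffle=True, seed=None):
    """Yields every filename once per epoch, reshuffling within each epoch."""
    rng = random.Random(seed)
    epoch = 0
    while num_epochs is None or epoch < num_epochs:
        names = list(filenames)
        if shuffle:
            rng.shuffle(names)
        for name in names:
            yield name
        epoch += 1


def read_lines(filename_iter, open_fn):
    """Yields (key, value) pairs, one per line, in filename-queue order."""
    for filename in filename_iter:
        with open_fn(filename) as handle:
            for line_number, line in enumerate(handle, start=1):
                # The "filename:line" key format is an assumption of this sketch.
                yield "%s:%d" % (filename, line_number), line.rstrip("\n")


# Hypothetical in-memory "files" so the sketch runs without touching disk.
FAKE_FILES = {"file0": "a,1\nb,2\n", "file1": "c,3\n"}


def open_fake(filename):
    return io.StringIO(FAKE_FILES[filename])


records = list(read_lines(
    filename_epochs(["file0", "file1"], num_epochs=2, shuffle=False),
    open_fake))
shuffled = list(
    filename_epochs(["file0", "file1", "file2"], num_epochs=2, seed=0))
```

Because each epoch emits every filename exactly once before the next epoch starts, no file is over- or under-sampled, whatever order the shuffle produces.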

#### CSV files

To read text files in [comma-separated value (CSV) format](https://tools.ietf.org/html/rfc4180), use a [`TextLineReader`](../../api_docs/python/io_ops.md#TextLineReader) with the [`decode_csv`](../../api_docs/python/io_ops.md#decode_csv) operation. For example:

```python
filename_queue = tf.train.string_input_producer(["file0.csv", "file1.csv"])

reader = tf.TextLineReader()
key, value = reader.read(filename_queue)

# Default values, in case of empty columns. Also specifies the type of the
# decoded result.
record_defaults = [[1], [1], [1], [1], [1]]
col1, col2, col3, col4, col5 = tf.decode_csv(
    value, record_defaults=record_defaults)
features = tf.concat(0, [col1, col2, col3, col4])

with tf.Session() as sess:
  # Start populating the filename queue.
  coord = tf.train.Coordinator()
  threads = tf.train.start_queue_runners(coord=coord)

  for i in range(1200):
    # Retrieve a single instance:
    example, label = sess.run([features, col5])

  coord.request_stop()
  coord.join(threads)
```

Each execution of `read` reads a single line from the file. The `decode_csv` op then parses the result into a list of tensors. The `record_defaults` argument determines the type of the resulting tensors and sets the default value to use if a value is missing in the input string.

You must call `tf.train.start_queue_runners` to populate the queue before you call `run` or `eval` to execute the `read`. Otherwise `read` will block while it waits for filenames from the queue.

#### Fixed length records

To read binary files in which each record is a fixed number of bytes, use [`tf.FixedLengthRecordReader`](../../api_docs/python/io_ops.md#FixedLengthRecordReader) with the [`tf.decode_raw`](../../api_docs/python/io_ops.md#decode_raw) operation. The `decode_raw` op converts from a string to a uint8 tensor.
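
As a rough picture of what fixed-length decoding involves, the following plain-Python sketch splits a byte string into equal-sized records and then slices each record into a label and a payload. It is not TensorFlow code: `split_records` and `decode_record` are hypothetical helpers, and the tiny record layout (1 label byte followed by 4 payload bytes) is made up for the example.

```python
def split_records(data, record_bytes):
    """Splits a byte string into consecutive records of record_bytes each."""
    if len(data) % record_bytes != 0:
        raise ValueError("data length is not a multiple of record_bytes")
    return [data[i:i + record_bytes]
            for i in range(0, len(data), record_bytes)]


def decode_record(record, label_bytes, row_length):
    """Decodes one record into (label, rows) using plain lists of ints."""
    # Interpreting every byte as an integer in [0, 255] plays the role of
    # turning the record string into a uint8 vector.
    values = list(bytearray(record))
    label = values[:label_bytes]
    payload = values[label_bytes:]
    # Slicing and regrouping stands in for slice and reshape operations.
    rows = [payload[i:i + row_length]
            for i in range(0, len(payload), row_length)]
    return label, rows


# Two made-up records: 1 label byte followed by a 2x2 payload.
data = bytes(bytearray([7, 1, 2, 3, 4, 9, 5, 6, 7, 8]))
decoded = [decode_record(r, label_bytes=1, row_length=2)
           for r in split_records(data, record_bytes=5)]
```

The key property is that record boundaries are known from the record size alone, so no delimiter or header has to be parsed.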

For example, [the CIFAR-10 dataset](http://www.cs.toronto.edu/~kriz/cifar.html) uses a file format where each record is represented using a fixed number of bytes: 1 byte for the label followed by 3072 bytes of image data. Once you have a uint8 tensor, standard operations can slice out each piece and reformat as needed. For CIFAR-10, you can see how to do the reading and decoding in [`tensorflow/models/image/cifar10/cifar10_input.py`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/models/image/cifar10/cifar10_input.py) and described in [this tutorial](../../tutorials/deep_cnn/index.md#prepare-the-data).

#### Standard TensorFlow format

Another approach is to convert whatever data you have into a supported format. This approach makes it easier to mix and match data sets and network architectures. The recommended format for TensorFlow is a TFRecords file containing [`tf.train.Example` protocol buffers](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/example/example.proto) (which contain [`Features`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/example/feature.proto) as a field). You write a little program that gets your data, stuffs it in an `Example` protocol buffer, serializes the protocol buffer to a string, and then writes the string to a TFRecords file using the [`tf.python_io.TFRecordWriter` class](../../api_docs/python/python_io.md#TFRecordWriter). For example, [`tensorflow/g3doc/how_tos/reading_data/convert_to_records.py`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/g3doc/how_tos/reading_data/convert_to_records.py) converts MNIST data to this format.

To read a file of TFRecords, use [`tf.TFRecordReader`](../../api_docs/python/io_ops.md#TFRecordReader) with the [`tf.parse_single_example`](../../api_docs/python/io_ops.md#parse_single_example) decoder. The `parse_single_example` op decodes the example protocol buffers into tensors.
An MNIST example using the data produced by `convert_to_records` can be found in [`tensorflow/g3doc/how_tos/reading_data/fully_connected_reader.py`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/g3doc/how_tos/reading_data/fully_connected_reader.py), which you can compare with the `fully_connected_feed` version.

### Preprocessing

You can then do any preprocessing of these examples you want. This would be any processing that doesn't depend on trainable parameters. Examples include normalization of your data, picking a random slice, adding noise or distortions, etc. See [`tensorflow/models/image/cifar10/cifar10.py`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/models/image/cifar10/cifar10.py) for an example.

### Batching

At the end of the pipeline we use another queue to batch together examples for training, evaluation, or inference. For this we use a queue that randomizes the order of examples, using the [`tf.train.shuffle_batch` function](../../api_docs/python/io_ops.md#shuffle_batch).

Example:

```
def read_my_file_format(filename_queue):
  reader = tf.SomeReader()
  key, record_string = reader.read(filename_queue)
  example, label = tf.some_decoder(record_string)
  processed_example = some_processing(example)
  return processed_example, label

def input_pipeline(filenames, batch_size, num_epochs=None):
  filename_queue = tf.train.string_input_producer(
      filenames, num_epochs=num_epochs, shuffle=True)
  example, label = read_my_file_format(filename_queue)
  # min_after_dequeue defines how big a buffer we will randomly sample
  #   from -- bigger means better shuffling but slower start up and more
  #   memory used.
  # capacity must be larger than min_after_dequeue and the amount larger
  #   determines the maximum we will prefetch.  Recommendation:
  #   min_after_dequeue + (num_threads + a small safety margin) * batch_size
  min_after_dequeue = 10000
  capacity = min_after_dequeue + 3 * batch_size
  example_batch, label_batch = tf.train.shuffle_batch(
      [example, label], batch_size=batch_size, capacity=capacity,
      min_after_dequeue=min_after_dequeue)
  return example_batch, label_batch
```

If you need more parallelism or shuffling of examples between files, use multiple reader instances using the [`tf.train.shuffle_batch_join` function](../../api_docs/python/io_ops.md#shuffle_batch_join). For example:

```
def read_my_file_format(filename_queue):
  # Same as above

def input_pipeline(filenames, batch_size, read_threads, num_epochs=None):
  filename_queue = tf.train.string_input_producer(
      filenames, num_epochs=num_epochs, shuffle=True)
  example_list = [read_my_file_format(filename_queue)
                  for _ in range(read_threads)]
  min_after_dequeue = 10000
  capacity = min_after_dequeue + 3 * batch_size
  example_batch, label_batch = tf.train.shuffle_batch_join(
      example_list, batch_size=batch_size, capacity=capacity,
      min_after_dequeue=min_after_dequeue)
  return example_batch, label_batch
```

You still only use a single filename queue that is shared by all the readers. That way we ensure that the different readers use different files from the same epoch until all the files from the epoch have been started. (It is also usually sufficient to have a single thread filling the filename queue.)

An alternative is to use a single reader via the [`tf.train.shuffle_batch` function](../../api_docs/python/io_ops.md#shuffle_batch) with `num_threads` bigger than 1. This will make it read from a single file at the same time (but faster than with 1 thread), instead of N files at once. This can be important:

* If you have more reading threads than input files, to avoid the risk that you will have two threads reading the same example from the same file near each other.
* Or if reading N files in parallel causes too many disk seeks.
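The interplay between `capacity` and `min_after_dequeue` noted in the code comments above can be illustrated without TensorFlow. The following is a single-threaded sketch in plain Python 3, not how `tf.train.shuffle_batch` is implemented: the function name `shuffle_buffer` and its `seed` parameter are assumptions made for this example. It prefetches up to `capacity` items, samples randomly while keeping at least `min_after_dequeue` items buffered, and drains the buffer completely once the input is exhausted.

```python
import random


def shuffle_buffer(source, capacity, min_after_dequeue, seed=None):
    """Yield the items of `source` in a locally shuffled order."""
    if capacity <= min_after_dequeue:
        raise ValueError("capacity must be larger than min_after_dequeue")
    rng = random.Random(seed)
    source = iter(source)
    buffer = []
    exhausted = False
    while True:
        # Prefetch until the buffer is full or the input runs out.
        while not exhausted and len(buffer) < capacity:
            try:
                buffer.append(next(source))
            except StopIteration:
                exhausted = True  # plays the role of a closed queue
        if not buffer:
            return
        # Dequeue only while at least min_after_dequeue items would remain;
        # once the input is exhausted that restriction is lifted.
        while buffer and (exhausted or len(buffer) > min_after_dequeue):
            index = rng.randrange(len(buffer))
            buffer[index], buffer[-1] = buffer[-1], buffer[index]
            yield buffer.pop()


items = list(range(100))
shuffled = list(shuffle_buffer(items, capacity=20, min_after_dequeue=10, seed=0))
print(shuffled[:10])
```

A larger `min_after_dequeue` mixes items from further apart in the input, at the cost of more memory and a longer wait before the first item is available.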
How many threads do you need? The `tf.train.shuffle_batch*` functions add a summary to the graph that indicates how full the example queue is. If you have enough reading threads, that summary will stay above zero. You can [view your summaries as training progresses using TensorBoard](../../how_tos/summaries_and_tensorboard/index.md).

### Creating threads to prefetch using `QueueRunner` objects

The short version: many of the `tf.train` functions listed above add [`QueueRunner`](../../api_docs/python/train.md#QueueRunner) objects to your graph. These require that you call [`tf.train.start_queue_runners`](../../api_docs/python/train.md#start_queue_runners) before running any training or inference steps; otherwise those steps will hang forever. The call starts threads that run the input pipeline, filling the example queue so that the dequeue to get the examples will succeed. This is best combined with a [`tf.train.Coordinator`](../../api_docs/python/train.md#Coordinator) to cleanly shut down these threads when there are errors. If you set a limit on the number of epochs, the pipeline uses an epoch counter that needs to be initialized. The recommended code pattern combining these is:

```python
# Create the graph, etc.
init_op = tf.initialize_all_variables()

# Create a session for running operations in the Graph.
sess = tf.Session()

# Initialize the variables (like the epoch counter).
sess.run(init_op)

# Start input enqueue threads.
coord = tf.train.Coordinator()
threads = tf.train.start_queue_runners(sess=sess, coord=coord)

try:
    while not coord.should_stop():
        # Run training steps or whatever
        sess.run(train_op)

except tf.errors.OutOfRangeError:
    print('Done training -- epoch limit reached')
finally:
    # When done, ask the threads to stop.
    coord.request_stop()

# Wait for threads to finish.
coord.join(threads)
sess.close()
```

#### Aside: What is happening here?

First we create the graph. It will have a few pipeline stages that are connected by queues.
The first stage will generate filenames to read and enqueue them in the filename queue. The second stage consumes filenames (using a `Reader`), produces examples, and enqueues them in an example queue. Depending on how you have set things up, you may actually have a few independent copies of the second stage, so that you can read from multiple files in parallel. At the end of these stages is an enqueue operation, which enqueues into a queue that the next stage dequeues from. We want to start threads running these enqueuing operations, so that our training loop can dequeue examples from the example queue.
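The staged structure described above can be sketched with Python's standard `queue` and `threading` modules. This is an analogy in plain Python 3 rather than TensorFlow code: the function `run_pipeline`, the `DONE` sentinel, and the fake records of the form `file0:record0` are all assumptions invented for this illustration.

```python
import queue
import threading

DONE = object()  # sentinel standing in for a closed queue


def run_pipeline(filenames, num_readers, records_per_file=3):
    """Run a two-stage pipeline and return every example that reaches the end."""
    filename_queue = queue.Queue()
    example_queue = queue.Queue(maxsize=8)  # bounded, like a prefetch buffer

    def enqueue_filenames():
        # Stage 1: generate filenames and enqueue them.
        for name in filenames:
            filename_queue.put(name)
        for _ in range(num_readers):
            filename_queue.put(DONE)  # one end marker per reader thread

    def read_examples():
        # Stage 2: dequeue a filename, "read" it, and enqueue its examples.
        while True:
            name = filename_queue.get()
            if name is DONE:
                example_queue.put(DONE)
                return
            for i in range(records_per_file):
                example_queue.put("%s:record%d" % (name, i))

    threads = [threading.Thread(target=enqueue_filenames)]
    threads += [threading.Thread(target=read_examples)
                for _ in range(num_readers)]
    for thread in threads:
        thread.start()

    # The "training loop": dequeue examples until every reader has finished.
    examples = []
    finished_readers = 0
    while finished_readers < num_readers:
        item = example_queue.get()
        if item is DONE:
            finished_readers += 1
        else:
            examples.append(item)

    for thread in threads:
        thread.join()
    return examples


examples = run_pipeline(["file0", "file1"], num_readers=2)
print(sorted(examples))
```

The order in which examples arrive depends on thread scheduling, which is why the sketch sorts them before printing.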
The helpers in `tf.train` that create these queues and enqueuing operations add a [`tf.train.QueueRunner`](../../api_docs/python/train.md#QueueRunner) to the graph using the [`tf.train.add_queue_runner`](../../api_docs/python/train.md#add_queue_runner) function. Each `QueueRunner` is responsible for one stage, and holds the list of enqueue operations that need to be run in threads. Once the graph is constructed, the [`tf.train.start_queue_runners`](../../api_docs/python/train.md#start_queue_runners) function asks each `QueueRunner` in the graph to start its threads running the enqueuing operations.

If all goes well, you can now run your training steps and the queues will be filled by the background threads. If you have set an epoch limit, at some point an attempt to dequeue examples will get a [`tf.errors.OutOfRangeError`](../../api_docs/python/client.md#OutOfRangeError). This is the TensorFlow equivalent of "end of file" (EOF): the epoch limit has been reached and no more examples are available.

The last ingredient is the [`Coordinator`](../../api_docs/python/train.md#Coordinator). It is responsible for letting all the threads know if anything has signalled a shutdown. Most commonly this would be because an exception was raised, for example one of the threads got an error when running some operation (or an ordinary Python exception).

For more about threading, queues, QueueRunners, and Coordinators [see here](../../how_tos/threading_and_queues/index.md).

#### Aside: How clean shut-down when limiting epochs works

Imagine you have a model that has set a limit on the number of epochs to train on. That means that the thread generating filenames will only run that many times before generating an `OutOfRange` error. The `QueueRunner` will catch that error, close the filename queue, and exit the thread. Closing the queue does two things:

* Any future attempt to enqueue in the filename queue will generate an error.
  At this point there shouldn't be any threads trying to do that, but this is helpful when queues are closed due to other errors.
* Any current or future dequeue will either succeed (if there are enough elements left) or fail (with an `OutOfRange` error) immediately. They won't block waiting for more elements to be enqueued, since by the previous point that can't happen.

The point is that when the filename queue is closed, there will likely still be many filenames in that queue, so the next stage of the pipeline (with the reader and other preprocessing) may continue running for some time. Once the filename queue is exhausted, though, the next attempt to dequeue a filename (e.g. from a reader that has finished with the file it was working on) will trigger an `OutOfRange` error. In this case, though, you might have multiple threads associated with a single `QueueRunner`. If this isn't the last thread in the `QueueRunner`, the `OutOfRange` error just causes the one thread to exit. This allows the other threads, which are still finishing up their last file, to proceed until they finish as well. (Assuming you are using a [`tf.train.Coordinator`](../../api_docs/python/train.md#Coordinator), other types of errors will cause all the threads to stop.) Only once all the reader threads have hit the `OutOfRange` error does the next queue, the example queue, get closed.

Again, the example queue will have some elements queued, so training will continue until those are exhausted. If the example queue is a [`RandomShuffleQueue`](../../api_docs/python/io_ops.md#RandomShuffleQueue), say because you are using `shuffle_batch` or `shuffle_batch_join`, it normally will avoid ever having fewer than its `min_after_dequeue` attr elements buffered. However, once the queue is closed that restriction will be lifted and the queue will eventually empty. At that point the actual training threads, when they try to dequeue from the example queue, will start getting `OutOfRange` errors and exiting.
Once all the training threads are done, [`tf.train.Coordinator.join`](../../api_docs/python/train.md#Coordinator.join) will return and you can exit cleanly.

### Filtering records or producing multiple examples per record

Instead of producing examples with shape `[x, y, z]`, have each record produce a batch of examples with shape `[batch, x, y, z]`. The batch size can be 0 if you want to filter this record out (maybe it is in a hold-out set?), or bigger than 1 if you are producing multiple examples per record. Then simply set `enqueue_many=True` when calling one of the batching functions (such as `shuffle_batch` or `shuffle_batch_join`).

### Sparse input data

SparseTensors don't play well with queues. If you use SparseTensors you have to decode the string records using [`tf.parse_example`](../../api_docs/python/io_ops.md#parse_example) **after** batching (instead of using `tf.parse_single_example` before batching).

## Preloaded data

This is only used for small data sets that can be loaded entirely in memory. There are two approaches:

* Store the data in a constant.
* Store the data in a variable, that you initialize and then never change.

Using a constant is a bit simpler, but uses more memory (since the constant is stored inline in the graph data structure, which may be duplicated a few times).

```python
training_data = ...
training_labels = ...
with tf.Session():
  input_data = tf.constant(training_data)
  input_labels = tf.constant(training_labels)
  ...
```

To instead use a variable, you need to also initialize it after the graph has been built.

```python
training_data = ...
training_labels = ...
with tf.Session() as sess:
  data_initializer = tf.placeholder(dtype=training_data.dtype,
                                    shape=training_data.shape)
  label_initializer = tf.placeholder(dtype=training_labels.dtype,
                                     shape=training_labels.shape)
  input_data = tf.Variable(data_initializer, trainable=False, collections=[])
  input_labels = tf.Variable(label_initializer, trainable=False, collections=[])
  ...
  sess.run(input_data.initializer,
           feed_dict={data_initializer: training_data})
  sess.run(input_labels.initializer,
           feed_dict={label_initializer: training_labels})
```

Setting `trainable=False` keeps the variable out of the `GraphKeys.TRAINABLE_VARIABLES` collection in the graph, so we won't try to update it when training. Setting `collections=[]` keeps the variable out of the `GraphKeys.VARIABLES` collection used for saving and restoring checkpoints.

Either way, the [`tf.train.slice_input_producer` function](../../api_docs/python/io_ops.md#slice_input_producer) can be used to produce a slice at a time. This shuffles the examples across an entire epoch, so further shuffling when batching is undesirable. So instead of using the `shuffle_batch` functions, we use the plain [`tf.train.batch` function](../../api_docs/python/io_ops.md#batch). To use multiple preprocessing threads, set the `num_threads` parameter to a number bigger than 1.

An MNIST example that preloads the data using constants can be found in [`tensorflow/g3doc/how_tos/reading_data/fully_connected_preloaded.py`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/g3doc/how_tos/reading_data/fully_connected_preloaded.py), and one that preloads the data using variables can be found in [`tensorflow/g3doc/how_tos/reading_data/fully_connected_preloaded_var.py`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/g3doc/how_tos/reading_data/fully_connected_preloaded_var.py). You can compare these with the `fully_connected_feed` and `fully_connected_reader` versions above.

## Multiple input pipelines

Commonly you will want to train on one dataset and evaluate (or "eval") on another. One way to do this is to actually have two separate processes:

* The training process reads training input data and periodically writes checkpoint files with all the trained variables.
* The evaluation process restores the checkpoint files into an inference model that reads validation input data.
This is what is done in [the example CIFAR-10 model](../../tutorials/deep_cnn/index.md#save-and-restore-checkpoints). This has a couple of benefits:

* The eval is performed on a single snapshot of the trained variables.
* You can perform the eval even after training has completed and exited.

Alternatively, you can have the train and eval in the same graph in the same process, and share their trained variables. See [the shared variables tutorial](../../how_tos/variable_scope/index.md).

================================================
FILE: SOURCE/how_tos/reading_data.md
================================================
# Reading data

There are three ways for a TensorFlow program to read data:

* Feeding: Python code provides the data at each step of the TensorFlow program.
* Reading from files: an input pipeline reads the data from files at the beginning of the TensorFlow graph.
* Preloaded data: a constant or variable in the TensorFlow graph holds all the data (only suitable for small data sets).

## Contents

### [Reading data](#AUTOGENERATED-reading-data)

* [Feeding](#Feeding)
* [Reading from files](#AUTOGENERATED-reading-from-files)
  * [Filenames, shuffling, and epoch limits](#AUTOGENERATED-filenames--shuffling--and-epoch-limits)
  * [File formats](#AUTOGENERATED-file-formats)
  * [Preprocessing](#AUTOGENERATED-preprocessing)
  * [Batching](#AUTOGENERATED-batching)
  * [Creating threads to prefetch using `QueueRunner` objects](#QueueRunner)
  * [Filtering records or producing multiple examples per record](#AUTOGENERATED-filtering-records-or-producing-multiple-examples-per-record)
  * [Sparse input data](#AUTOGENERATED-sparse-input-data)
* [Preloaded data](#AUTOGENERATED-preloaded-data)
* [Multiple input pipelines](#AUTOGENERATED-multiple-input-pipelines)

## Feeding

TensorFlow's feed mechanism lets you inject data into any tensor in a computation graph, so Python code can feed data directly into the graph.

Supply feed data through the `feed_dict` argument to a `run()` or `eval()` call that initiates computation.

```python
with tf.Session():
  input = tf.placeholder(tf.float32)
  classifier = ...
  print(classifier.eval(feed_dict={input: my_python_preprocessing_fn()}))
```

While you can replace any tensor with feed data, including variables and constants, the best practice is to use a [`placeholder` op](tensorflow-zh/SOURCE/api_docs/python/io_ops.md#placeholder) node. A `placeholder` exists solely to serve as the target of feeds. It is not initialized and contains no data. If it is executed without a feed, TensorFlow generates an error, so you won't forget to feed it.

An example using `placeholder` and feeding to train on MNIST data can be found in [`tensorflow/g3doc/tutorials/mnist/fully_connected_feed.py`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/g3doc/tutorials/mnist/fully_connected_feed.py), and is described in the [MNIST tutorial](tensorflow-zh/SOURCE/tutorials/mnist/tf/index.md).

## Reading from files

A typical pipeline for reading records from files has the following stages:

1. The list of filenames
2. *Optional* filename shuffling
3. *Optional* epoch limit
4. Filename queue
5. A Reader for the file format
6. A decoder for a record read by the reader
7. *Optional* preprocessing
8. Example queue

### Filenames, shuffling, and epoch limits

For the list of filenames, use either a constant string tensor (like `["file0", "file1"]` or `[("file%d" % i) for i in range(2)]`) or the [`tf.train.match_filenames_once` function](tensorflow-zh/SOURCE/api_docs/python/io_ops.md#match_filenames_once).

Pass the list of filenames to the [`tf.train.string_input_producer` function](tensorflow-zh/SOURCE/api_docs/python/io_ops.md#string_input_producer). `string_input_producer` creates a FIFO queue that holds the filenames until the reader needs them.

`string_input_producer` has options for shuffling and for setting a maximum number of epochs. A `QueueRunner` adds the whole list of filenames to the queue once for each epoch, shuffling the filenames within the epoch if `shuffle=True`. This sampling is uniform, so the filename queue stays balanced across files.

The `QueueRunner` works in a thread separate from the reader that pulls filenames from the queue, so the shuffling and enqueuing process does not block the reader.

### File formats

Select the reader that matches your input file format and pass the filename queue to the reader's `read` method. The `read` method outputs a key identifying the file and record (useful for debugging) and a scalar string value. Use one (or more) of the decoder and conversion ops to decode this string into the tensors that make up an example.

#### CSV files

To read data from CSV files, use a [`TextLineReader`](tensorflow-zh/SOURCE/api_docs/python/io_ops.md#TextLineReader) with the [`decode_csv`](tensorflow-zh/SOURCE/api_docs/python/io_ops.md#decode_csv) operation, as in the following example:

```python
filename_queue = tf.train.string_input_producer(["file0.csv", "file1.csv"])

reader = tf.TextLineReader()
key, value = reader.read(filename_queue)

# Default values, in case of empty columns. Also specifies the type of the
# decoded result.
record_defaults = [[1], [1], [1], [1], [1]]
col1, col2, col3, col4, col5 = tf.decode_csv(
    value, record_defaults=record_defaults)
# Each column is a scalar tensor, so stack the columns with tf.pack;
# tf.concat cannot join scalars.
features = tf.pack([col1, col2, col3, col4])

with tf.Session() as sess:
  # Start populating the filename queue.
  coord = tf.train.Coordinator()
  threads = tf.train.start_queue_runners(coord=coord)

  for i in range(1200):
    # Retrieve a single instance:
    example, label = sess.run([features, col5])

  coord.request_stop()
  coord.join(threads)
```

Each execution of `read` reads a single line from the file. The `decode_csv` op then parses the line into a list of tensors. The `record_defaults` argument determines the type of the resulting tensors and sets the default value to use if a value is missing in the input.

You must call `tf.train.start_queue_runners` to populate the queue before you call `run` or `eval` to execute the `read`. Otherwise `read` will block until the filename queue has a value.

#### Fixed length records

To read fixed-length records from binary files, use [`tf.FixedLengthRecordReader`](tensorflow-zh/SOURCE/api_docs/python/io_ops.md#FixedLengthRecordReader) with the [`tf.decode_raw`](tensorflow-zh/SOURCE/api_docs/python/io_ops.md#decode_raw) operation. The `decode_raw` op converts a string to a uint8 tensor.

For example, [the CIFAR-10 dataset](http://www.cs.toronto.edu/~kriz/cifar.html) uses a file format in which every record has a fixed length: 1 byte for the label followed by 3072 bytes of image data. Once you have a uint8 tensor, standard operations can slice out each piece and reformat it as needed. The example code is in [`tensorflow/models/image/cifar10/cifar10_input.py`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/models/image/cifar10/cifar10_input.py), and it is described in [this tutorial](tensorflow-zh/SOURCE/tutorials/deep_cnn/index.md#prepare-the-data).
#### Standard TensorFlow format

Another approach is to convert whatever data you have into a format that TensorFlow supports. This makes it easier to mix and match data sets and network architectures. The recommended format is a TFRecords file containing [`tf.train.Example` protocol buffers](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/example/example.proto) (which contain [`Features`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/example/feature.proto) as a field). You write a small program that gets your data, stores it in an `Example` protocol buffer, serializes the protocol buffer to a string, and writes the string to a TFRecords file using the [`tf.python_io.TFRecordWriter` class](tensorflow-zh/SOURCE/api_docs/python/python_io.md#TFRecordWriter). [`tensorflow/g3doc/how_tos/reading_data/convert_to_records.py`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/g3doc/how_tos/reading_data/convert_to_records.py) is one such example.

To read data from a TFRecords file, use [`tf.TFRecordReader`](tensorflow-zh/SOURCE/api_docs/python/io_ops.md#TFRecordReader) with the [`tf.parse_single_example`](tensorflow-zh/SOURCE/api_docs/python/io_ops.md#parse_single_example) decoder. The `parse_single_example` op decodes the `Example` protocol buffers into tensors. An MNIST example that uses the data produced by `convert_to_records` can be found in [`tensorflow/g3doc/how_tos/reading_data/fully_connected_reader.py`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/g3doc/how_tos/reading_data/fully_connected_reader.py), which you can compare with the `fully_connected_feed` version.

### Preprocessing

You can apply any preprocessing to the input examples, as long as it does not depend on trainable parameters. Examples of normalizing data, picking a random slice, and adding noise or distortions can be found in [`tensorflow/models/image/cifar10/cifar10.py`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/models/image/cifar10/cifar10.py).

### Batching

At the end of the input pipeline we use another queue to batch together examples for training, evaluation, or inference. For this we use the [`tf.train.shuffle_batch` function](tensorflow-zh/SOURCE/api_docs/python/io_ops.md#shuffle_batch), which randomizes the order of the examples in the queue.

Example:

```
def read_my_file_format(filename_queue):
  reader = tf.SomeReader()
  key, record_string = reader.read(filename_queue)
  example, label = tf.some_decoder(record_string)
  processed_example = some_processing(example)
  return processed_example, label

def input_pipeline(filenames, batch_size, num_epochs=None):
  filename_queue = tf.train.string_input_producer(
      filenames, num_epochs=num_epochs, shuffle=True)
  example, label = read_my_file_format(filename_queue)
  # min_after_dequeue defines how big a buffer we will randomly sample
  #   from -- bigger means better shuffling but slower start up and more
  #   memory used.
  # capacity must be larger than min_after_dequeue and the amount larger
  #   determines the maximum we will prefetch.  Recommendation:
  #   min_after_dequeue + (num_threads + a small safety margin) * batch_size
  min_after_dequeue = 10000
  capacity = min_after_dequeue + 3 * batch_size
  example_batch, label_batch = tf.train.shuffle_batch(
      [example, label], batch_size=batch_size, capacity=capacity,
      min_after_dequeue=min_after_dequeue)
  return example_batch, label_batch
```

If you need more parallelism or more shuffling of examples between files, use multiple reader instances with the [`tf.train.shuffle_batch_join` function](tensorflow-zh/SOURCE/api_docs/python/io_ops.md#shuffle_batch_join).

Example:

```
def read_my_file_format(filename_queue):
  # Same as above

def input_pipeline(filenames, batch_size, read_threads, num_epochs=None):
  filename_queue = tf.train.string_input_producer(
      filenames, num_epochs=num_epochs, shuffle=True)
  example_list = [read_my_file_format(filename_queue)
                  for _ in range(read_threads)]
  min_after_dequeue = 10000
  capacity = min_after_dequeue + 3 * batch_size
  example_batch, label_batch = tf.train.shuffle_batch_join(
      example_list, batch_size=batch_size, capacity=capacity,
      min_after_dequeue=min_after_dequeue)
  return example_batch, label_batch
```

In this example you still use only a single filename queue, but TensorFlow ensures that the different readers use different files from the same epoch until all the files from that epoch have been started. (It is usually sufficient to have a single thread filling the filename queue.)

An alternative is to use the [`tf.train.shuffle_batch` function](tensorflow-zh/SOURCE/api_docs/python/io_ops.md#shuffle_batch) with `num_threads` set to a value bigger than 1. This makes all threads read from a single file at the same time (still faster than with one thread), instead of reading from several files at once. The advantages of this approach are:

* It avoids two different threads reading the same example from the same file.
* It avoids too many disk seeks.

How many reading threads do you need?
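The `tf.train.shuffle_batch*` functions add a summary to the TensorFlow graph that indicates how full the example queue is. If you have enough reading threads, that summary will stay above zero. For details, see [TensorBoard: Visualizing Learning](tensorflow-zh/SOURCE/how_tos/summaries_and_tensorboard/index.md).

### Creating threads to prefetch using `QueueRunner` objects

The short version: many of the `tf.train` functions listed above add [`QueueRunner`](../../api_docs/python/train.md#QueueRunner) objects to your graph. You must call [`tf.train.start_queue_runners`](../../api_docs/python/train.md#start_queue_runners) before running any training steps; otherwise the graph will hang forever. [`tf.train.start_queue_runners`](../../api_docs/python/train.md#start_queue_runners) starts the threads that run the input pipeline, filling the example queue so that the dequeue operation can get examples from it. This is best combined with a [`tf.train.Coordinator`](../../api_docs/python/train.md#Coordinator), which cleanly shuts down these threads when errors occur. If you set a limit on the number of epochs, the pipeline uses an epoch counter that needs to be initialized. The recommended code pattern is:

```python
# Create the graph, etc.
init_op = tf.initialize_all_variables()

# Create a session for running operations in the Graph.
sess = tf.Session()

# Initialize the variables (like the epoch counter).
sess.run(init_op)

# Start input enqueue threads.
coord = tf.train.Coordinator()
threads = tf.train.start_queue_runners(sess=sess, coord=coord)

try:
    while not coord.should_stop():
        # Run training steps or whatever
        sess.run(train_op)

except tf.errors.OutOfRangeError:
    print('Done training -- epoch limit reached')
finally:
    # When done, ask the threads to stop.
    coord.request_stop()

# Wait for threads to finish.
coord.join(threads)
sess.close()
```

#### Aside: What is happening here?

First we create the graph. It consists of a few pipeline stages connected by queues. The first stage generates the filenames to read and enqueues them in the filename queue. The second stage reads data from the files (using a `Reader`), produces examples, and enqueues them in an example queue. Depending on how you have set things up, you may have several independent copies of the second stage, so that you can read from multiple files in parallel. At the end of each stage is an enqueue operation that puts elements into a queue from which the next stage dequeues. We want to start threads that run these enqueue operations, so that our training loop can keep dequeuing examples from the example queue.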
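The helpers in `tf.train` that create these queues and enqueue operations add a [`tf.train.QueueRunner`](../../api_docs/python/train.md#QueueRunner) to the graph using the [`tf.train.add_queue_runner`](../../api_docs/python/train.md#add_queue_runner) function. Each `QueueRunner` is responsible for one stage and holds the list of enqueue operations that need to be run in threads. Once the graph is constructed, the [`tf.train.start_queue_runners`](../../api_docs/python/train.md#start_queue_runners) function asks each `QueueRunner` in the graph to start its threads running the enqueue operations.

If all goes well, you can now run your training steps while the background threads fill the queues. If you have set an epoch limit, at some point an attempt to dequeue examples will get a [`tf.errors.OutOfRangeError`](../../api_docs/python/client.md#OutOfRangeError). This is the TensorFlow equivalent of "end of file" (EOF): the epoch limit has been reached and no more examples are available.

The last ingredient is the [`Coordinator`](../../api_docs/python/train.md#Coordinator). It is responsible for letting all the threads know when anything has signalled a shutdown. Most commonly this happens because an exception was raised, for example when one of the threads got an error while running some operation (or an ordinary Python exception).

For more about threading, queues, QueueRunners, and Coordinators, [see here](../../how_tos/threading_and_queues/index.md).

#### Aside: How clean shut-down when limiting epochs works

Imagine you have a model that has set a limit on the number of epochs to train on. That means that the thread generating filenames will only run that many times before generating an `OutOfRange` error. The `QueueRunner` will catch that error, close the filename queue, and exit the thread. Closing the queue does two things:

* Any future attempt to enqueue in the filename queue will generate an error. No thread should be trying to do that at this point, but this is helpful when queues are closed due to other errors.
* Any current or future dequeue will either succeed (if there are enough elements left) or fail immediately (with an `OutOfRange` error). It will not block waiting for more elements to be enqueued, since by the previous point that cannot happen.

The point is that when the filename queue is closed, there will likely still be many filenames in that queue, so the next stage of the pipeline (with the reader and other preprocessing) may continue running for some time. Once the filename queue is exhausted, though, the next attempt to dequeue a filename (for example, from a reader that has finished with the file it was working on) will trigger an `OutOfRange` error. In this case, though, you might have multiple threads associated with a single `QueueRunner`. If this isn't the last thread in the `QueueRunner`, the `OutOfRange` error just causes that one thread to exit. This allows the other threads, which are still finishing their last file, to proceed until they finish as well. (Assuming you are using a [`tf.train.Coordinator`](../../api_docs/python/train.md#Coordinator), other types of errors will cause all the threads to stop.) Only once all the reader threads have hit the `OutOfRange` error does the next queue, the example queue, get closed.

Again, the example queue will have some elements queued, so training will continue until those are exhausted. If the example queue is a [`RandomShuffleQueue`](../../api_docs/python/io_ops.md#RandomShuffleQueue), say because you are using `shuffle_batch` or `shuffle_batch_join`, it normally avoids ever having fewer than `min_after_dequeue` elements buffered.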
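However, once the queue is closed, the `min_after_dequeue` restriction is lifted and the queue will eventually empty. At that point the actual training threads, when they try to dequeue from the example queue, will start getting `OutOfRange` errors and exiting. Once all the training threads are done, [`tf.train.Coordinator.join`](../../api_docs/python/train.md#Coordinator.join) will return and you can exit cleanly.

### Filtering records or producing multiple examples per record

Instead of producing examples with shape `[x, y, z]`, have each record produce a batch of examples with shape `[batch, x, y, z]`. The batch size can be 0 if you want to filter the record out (maybe it is in a hold-out set?), or bigger than 1 if you are producing multiple examples per record. Then simply set `enqueue_many=True` when calling one of the batching functions (such as `shuffle_batch` or `shuffle_batch_join`).

### Sparse input data

SparseTensors don't play well with queues. If you use SparseTensors you have to decode the string records using [`tf.parse_example`](../../api_docs/python/io_ops.md#parse_example) **after** batching (instead of using `tf.parse_single_example` **before** batching).

## Preloaded data

This is only used for small data sets that can be loaded entirely in memory. There are two approaches:

* Store the data in a constant.
* Store the data in a variable that you initialize and then never change.

Using a constant is a bit simpler, but uses more memory (since the constant is stored inline in the graph data structure, which may be duplicated a few times).

```python
training_data = ...
training_labels = ...
with tf.Session():
  input_data = tf.constant(training_data)
  input_labels = tf.constant(training_labels)
  ...
```

To use a variable instead, you also need to initialize it after the graph has been built.

```python
training_data = ...
training_labels = ...
with tf.Session() as sess:
  data_initializer = tf.placeholder(dtype=training_data.dtype,
                                    shape=training_data.shape)
  label_initializer = tf.placeholder(dtype=training_labels.dtype,
                                     shape=training_labels.shape)
  input_data = tf.Variable(data_initializer, trainable=False, collections=[])
  input_labels = tf.Variable(label_initializer, trainable=False, collections=[])
  ...
  sess.run(input_data.initializer,
           feed_dict={data_initializer: training_data})
  sess.run(input_labels.initializer,
           feed_dict={label_initializer: training_labels})
```

Setting `trainable=False` keeps the variable out of the `GraphKeys.TRAINABLE_VARIABLES` collection in the graph, so we won't try to update it when training. Setting `collections=[]` keeps the variable out of the `GraphKeys.VARIABLES` collection used for saving and restoring checkpoints.

Either way, the [`tf.train.slice_input_producer` function](../../api_docs/python/io_ops.md#slice_input_producer) can be used to produce a slice at a time. This shuffles the examples across an entire epoch, so further shuffling when batching is unnecessary. So instead of using the `shuffle_batch` functions, we use the plain [`tf.train.batch` function](../../api_docs/python/io_ops.md#batch). To use multiple preprocessing threads, set the `num_threads` parameter to a number bigger than 1.

An MNIST example that preloads the data using constants can be found in [`tensorflow/g3doc/how_tos/reading_data/fully_connected_preloaded.py`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/g3doc/how_tos/reading_data/fully_connected_preloaded.py), and one that preloads the data using variables can be found in [`tensorflow/g3doc/how_tos/reading_data/fully_connected_preloaded_var.py`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/g3doc/how_tos/reading_data/fully_connected_preloaded_var.py). You can compare these with the `fully_connected_feed` and `fully_connected_reader` versions above.

## Multiple input pipelines

Commonly you will want to train on one dataset and evaluate (or "eval") on another. One way to do this is to have two separate processes:

* The training process reads training input data and periodically writes checkpoint files with all the trained variables.
* The evaluation process restores the checkpoint files into an inference model that reads validation input data.

This is what is done in [the example CIFAR-10 model](../../tutorials/deep_cnn/index.md#save-and-restore-checkpoints). This has a couple of benefits:

* The eval is performed on a single snapshot of the trained variables.
* You can perform the eval even after training has completed and exited.

Alternatively, you can have the train and eval in the same graph in the same process, and share their trained variables. See [the shared variables tutorial](../../how_tos/variable_scope/index.md).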
Original: [Reading data](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/how_tos/reading_data/index.md) Translation: [volvet](https://github.com/volvet) and [zhangkom](https://github.com/zhangkom) Proofreading:

================================================
FILE: SOURCE/how_tos/summaries_and_tensorboard/index.md
================================================
# TensorBoard: Visualizing Learning

The computations you'll use TensorBoard for, like training a massive deep neural network, can be complex and confusing. To make it easier to understand, debug, and optimize TensorFlow programs, we've included a suite of visualization tools called TensorBoard. You can use TensorBoard to visualize your TensorFlow graph, plot quantitative metrics about the execution of your graph, and show additional data like images that pass through it. When TensorBoard is fully configured, it looks like this:

![MNIST TensorBoard](./mnist_tensorboard.png "MNIST TensorBoard")

## Serializing the data

TensorBoard operates by reading TensorFlow events files, which contain summary data that you can generate when running TensorFlow. Here's the general lifecycle for summary data within TensorBoard.

First, create the TensorFlow graph that you'd like to collect summary data from, and decide which nodes you would like to annotate with [summary operations](../../api_docs/python/train.md#summary-operations).

For example, suppose you are training a convolutional neural network for recognizing MNIST digits. You'd like to record how the learning rate varies over time, and how the objective function is changing. Collect these by attaching [`scalar_summary`](../../api_docs/python/train.md#scalar_summary) ops to the nodes that output the learning rate and loss respectively. Then, give each `scalar_summary` a meaningful `tag`, like `'learning rate'` or `'loss function'`.
Perhaps you'd also like to visualize the distributions of activations coming off a particular layer, or the distribution of gradients or weights. Collect this data by attaching [`histogram_summary`](../../api_docs/python/train.md#histogram_summary) ops to the gradient outputs and to the variable that holds your weights, respectively.

For details on all of the summary operations available, check out the docs on [summary operations](../../api_docs/python/train.md#summary-operations).

Operations in TensorFlow don't do anything until you run them, or an op that depends on their output. And the summary nodes that we've just created are peripheral to your graph: none of the ops you are currently running depend on them. So, to generate summaries, we need to run all of these summary nodes. Managing them by hand would be tedious, so use [`tf.merge_all_summaries`](../../api_docs/python/train.md#merge_all_summaries) to combine them into a single op that generates all the summary data.

Then, you can just run the merged summary op, which will generate a serialized `Summary` protobuf object with all of your summary data at a given step. Finally, to write this summary data to disk, pass the summary protobuf to a [`tf.train.SummaryWriter`](../../api_docs/python/train.md#SummaryWriter).

The `SummaryWriter` takes a logdir in its constructor. This logdir is quite important: it's the directory where all of the events will be written out. Also, the `SummaryWriter` can optionally take a `GraphDef` in its constructor. If it receives one, then TensorBoard will visualize your graph as well.

Now that you've modified your graph and have a `SummaryWriter`, you're ready to start running your network! If you want, you could run the merged summary op every single step, and record a ton of training data. That's likely to be more data than you need, though. Instead, consider running the merged summary op every hundred steps or so, as in the following code example.
```python
# 'session', 'training' and 'training_op' are assumed to be defined elsewhere.
merged_summary_op = tf.merge_all_summaries()
summary_writer = tf.train.SummaryWriter('/tmp/mnist_logs', session.graph)
total_step = 0
while training:
  total_step += 1
  session.run(training_op)
  # Only write summaries every 100 steps.
  if total_step % 100 == 0:
    summary_str = session.run(merged_summary_op)
    summary_writer.add_summary(summary_str, total_step)
```

You're now all set to visualize this data using TensorBoard.

## Launching TensorBoard

To run TensorBoard, use the command

    python tensorflow/tensorboard/tensorboard.py --logdir=path/to/log-directory

where `logdir` points to the directory where the `SummaryWriter` serialized its data. If this `logdir` directory contains subdirectories which contain serialized data from separate runs, then TensorBoard will visualize the data from all of those runs. Once TensorBoard is running, navigate your web browser to `localhost:6006` to view the TensorBoard.

If you installed TensorBoard with pip, you can use the simpler command

    tensorboard --logdir=/path/to/log-directory

When looking at TensorBoard, you will see the navigation tabs in the top right corner. Each tab represents a set of serialized data that can be visualized. If the logs TensorBoard is reading contain no data relevant to the tab you are looking at, a message is displayed indicating how to serialize data that is applicable to that tab.

For in-depth information on how to use the *graph* tab to visualize your graph, see [TensorBoard: Visualizing your graph](../../how_tos/graph_viz/index.md).
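
Since TensorBoard visualizes every run it finds in subdirectories of `logdir`, one way to keep runs apart is to give each run its own subdirectory. The following is a minimal sketch using only the Python standard library; the helper name `make_run_logdir` and the run names are hypothetical and not part of TensorFlow.

```python
import os
import tempfile


def make_run_logdir(root_logdir, run_name):
    """Create (if needed) and return a per-run subdirectory of root_logdir."""
    run_dir = os.path.join(root_logdir, run_name)
    if not os.path.isdir(run_dir):
        os.makedirs(run_dir)
    return run_dir


# A temporary directory stands in for a real log root such as /tmp/mnist_logs.
root = tempfile.mkdtemp()
run_a = make_run_logdir(root, "run_a")
run_b = make_run_logdir(root, "run_b")
print(sorted(os.listdir(root)))
```

Under this assumption, each run's directory would be passed to its own `SummaryWriter`, and `--logdir` would point at the common root.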
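================================================
FILE: SOURCE/how_tos/summaries_and_tensorboard.md
================================================
# TensorBoard: Visualizing Learning

The computations you use TensorBoard for, such as training a massive deep neural network, can be complex and hard to follow. To make TensorFlow programs easier to understand, debug, and optimize, we have released a suite of visualization tools called TensorBoard. You can use TensorBoard to display your TensorFlow graph, plot quantitative metrics about the execution of the graph, and show additional data. When TensorBoard is fully set up, it looks like this:

![MNIST TensorBoard](../images/mnist_tensorboard.png)

## Serializing the data

TensorBoard works by reading TensorFlow events files, which contain the summary data you generate while running TensorFlow. Here is the general lifecycle of summary data within TensorBoard.

First, create the TensorFlow graph you want to collect summary data from, and decide which nodes to annotate with [summary operations](../api_docs/python/train.md#summary-operations).

For example, suppose you are training a convolutional neural network to recognize MNIST digits. You may want to record how the learning rate and the objective function change over time. Collect these by attaching [`scalar_summary`](../api_docs/python/train.md#scalar_summary) ops to the nodes that output the learning rate and the loss, respectively. Then give each `scalar_summary` a meaningful `tag`, such as `'learning rate'` or `'loss function'`.

You may also want to visualize the distribution of activations coming off a particular layer, or the distribution of gradients or weights. Collect this data by attaching [`histogram_summary`](../api_docs/python/train.md#histogram_summary) ops to the gradient outputs and to the variable that holds the weights, respectively.

For details on all available summary operations, see the [summary operations](../api_docs/python/train.md#summary-operations) documentation.

In TensorFlow, an operation does nothing until you run it, or run an op that depends on its output. The summary nodes we have just created are peripheral to your graph: none of the ops you are currently running depend on them. So, to generate summaries, all of these summary nodes need to be run. Managing them by hand would be tedious, so use [`tf.merge_all_summaries`](../api_docs/python/train.md#merge_all_summaries) to combine them into a single op.

You can then run the merged op, which generates a serialized `Summary` protobuf object with all of your summary data at a given step. Finally, to write this summary data to disk, pass the summary protobuf to a [`tf.train.SummaryWriter`](../api_docs/python/train.md#SummaryWriter).

The `SummaryWriter` constructor takes a `logdir` argument. This logdir is important: it is the directory where all events are written. The constructor can also take an optional `GraphDef`. If it receives one, TensorBoard visualizes your graph as well.

Now that you have modified your graph and have a `SummaryWriter`, you are ready to run your network! If you want, you can run the merged summary op at every step and record a large amount of training data, but that is likely more than you need. Instead, consider running the merged summary op every hundred steps or so, as in the following code example.

```python
merged_summary_op = tf.merge_all_summaries()
summary_writer = tf.train.SummaryWriter('/tmp/mnist_logs', session.graph)
total_step = 0
while training:
  total_step += 1
  session.run(training_op)
  if total_step % 100 == 0: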
    summary_str = session.run(merged_summary_op)
    summary_writer.add_summary(summary_str, total_step)
```

You are now ready to visualize this data with TensorBoard.

## Launching TensorBoard

To run TensorBoard, use the command

```
python tensorflow/tensorboard/tensorboard.py --logdir=path/to/log-directory
```

where `logdir` points to the directory where the `SummaryWriter` serialized its data. If this `logdir` directory contains subdirectories with data from separate runs, TensorBoard visualizes the data from all of those runs. Once TensorBoard is running, open `localhost:6006` in your web browser to view it.

If you installed TensorBoard with pip, you can use the simpler command

```
tensorboard --logdir=/path/to/log-directory
```

In the TensorBoard interface, the navigation tabs are in the top right corner. Each tab represents a set of serialized data that can be visualized. If the logs contain no data relevant to the tab you are looking at, a message is displayed indicating how to serialize data for that tab.

For more detail on using the *graph* tab to visualize your graph, see [TensorBoard: Graph Visualization](./graph_viz.md).

Original article: [TensorBoard: Visualizing Learning](http://tensorflow.org/how_tos/summaries_and_tensorboard/index.html#tensorboard-visualizing-learning) Translated by: [thylaco1eo](https://github.com/thylaco1eo) Proofread by: [lucky521](https://github.com/lucky521)

================================================
FILE: SOURCE/how_tos/threading_and_queues/index.md
================================================
# Threading and Queues

Queues are a powerful mechanism for asynchronous computation using TensorFlow.

Like everything in TensorFlow, a queue is a node in a TensorFlow graph. It's a stateful node, like a variable: other nodes can modify its content. In particular, nodes can enqueue new items into the queue, or dequeue existing items from the queue.

To get a feel for queues, let's consider a simple example. We will create a "first in, first out" queue (`FIFOQueue`) and fill it with zeros. Then we'll construct a graph that takes an item off the queue, adds one to that item, and puts it back on the end of the queue. Slowly, the numbers on the queue increase.
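
The first-in, first-out behaviour described above can be mimicked with a plain Python `collections.deque`. This is only an analogy for the queue semantics, not TensorFlow graph code, and the function names `increment_step` and `run_steps` are made up for illustration.

```python
from collections import deque


def increment_step(q):
    """Take the front item, add one, and put it back at the end."""
    item = q.popleft()
    q.append(item + 1)


def run_steps(initial_items, num_steps):
    """Apply increment_step num_steps times and return the queue contents."""
    q = deque(initial_items)
    for _ in range(num_steps):
        increment_step(q)
    return list(q)


# Start with three zeros and run four steps.
print(run_steps([0, 0, 0], 4))  # prints [1, 1, 2]
```

After three steps every item has been incremented once; the fourth step increments the front item a second time and moves it to the end.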
`Enqueue`, `EnqueueMany`, and `Dequeue` are special nodes. They take a pointer to the queue instead of a normal value, allowing them to change it. We recommend you think of these as being like methods of the queue. In fact, in the Python API, they are methods of the queue object (e.g. `q.enqueue(...)`).

Now that you have a bit of a feel for queues, let's dive into the details...

## Queue Use Overview

Queues, such as `FIFOQueue` and `RandomShuffleQueue`, are important TensorFlow objects for computing tensors asynchronously in a graph.

For example, a typical input architecture is to use a `RandomShuffleQueue` to prepare inputs for training a model:

* Multiple threads prepare training examples and push them in the queue.
* A training thread executes a training op that dequeues mini-batches from the queue.

This architecture has many benefits, as highlighted in the [Reading data how to](../reading_data), which also gives an overview of functions that simplify the construction of input pipelines.

The TensorFlow `Session` object is multithreaded, so multiple threads can easily use the same session and run ops in parallel. However, it is not always easy to implement a Python program that drives threads as described above. All threads must be able to stop together, exceptions must be caught and reported, and queues must be properly closed when stopping.

TensorFlow provides two classes to help: [tf.train.Coordinator](../../api_docs/python/train.md#Coordinator) and [tf.train.QueueRunner](../../api_docs/python/train.md#QueueRunner). These two classes are designed to be used together. The `Coordinator` class helps multiple threads stop together and report exceptions to a program that waits for them to stop. The `QueueRunner` class is used to create a number of threads cooperating to enqueue tensors in the same queue.

## Coordinator

The `Coordinator` class helps multiple threads stop together.

Its key methods are:

* `should_stop()`: returns True if the threads should stop.
* `request_stop()`: requests that threads should stop.
* `join()`: waits until the specified threads have stopped.

You first create a `Coordinator` object, and then create a number of threads that use the coordinator. The threads typically run loops that stop when `should_stop()` returns `True`.

Any thread can decide that the computation should stop. It only has to call `request_stop()`; the other threads then stop because `should_stop()` returns `True`.

```python
import threading
import tensorflow as tf

# Thread body: loop until the coordinator indicates a stop was requested.
# If some condition becomes true, ask the coordinator to stop.
def MyLoop(coord):
  while not coord.should_stop():
    ...do something...
    if ...some condition...:
      coord.request_stop()

# Main code: create a coordinator.
coord = tf.train.Coordinator()

# Create 10 threads that run 'MyLoop()'.
# Note the trailing comma: 'args' must be a tuple.
threads = [threading.Thread(target=MyLoop, args=(coord,)) for i in xrange(10)]

# Start the threads and wait for all of them to stop.
for t in threads:
  t.start()
coord.join(threads)
```

The coordinator can manage threads doing very different things. They don't all have to be the same as in the example above. The coordinator also has support to capture and report exceptions. See the [Coordinator class](../../api_docs/python/train.md#Coordinator) documentation for more details.

## QueueRunner

The `QueueRunner` class creates a number of threads that repeatedly run an enqueue op. These threads can use a coordinator to stop together. In addition, a queue runner runs a *closer thread* that automatically closes the queue if an exception is reported to the coordinator.

You can use a queue runner to implement the architecture described above.

First build a graph that uses a `Queue` for input examples. Add ops that process examples and enqueue them in the queue. Add training ops that start by dequeueing from the queue.

```python
example = ...ops to create one example...
# Create a queue, and an op that enqueues examples one at a time in the queue.
queue = tf.RandomShuffleQueue(...)
enqueue_op = queue.enqueue(example)
# Create a training graph that starts by dequeuing a batch of examples.
inputs = queue.dequeue_many(batch_size)
train_op = ...use 'inputs' to build the training part of the graph...
```

In the Python training program, create a `QueueRunner` that will run a few threads to process and enqueue examples. Create a `Coordinator` and ask the queue runner to start its threads with the coordinator. Write a training loop that also uses the coordinator.

```
# Create a queue runner that will run 4 threads in parallel to enqueue
# examples.
qr = tf.train.QueueRunner(queue, [enqueue_op] * 4)

# Launch the graph.
sess = tf.Session()
# Create a coordinator, launch the queue runner threads.
coord = tf.train.Coordinator()
enqueue_threads = qr.create_threads(sess, coord=coord, start=True)
# Run the training loop, controlling termination with the coordinator.
for step in xrange(1000000):
  if coord.should_stop():
    break
  sess.run(train_op)
# When done, ask the threads to stop.
coord.request_stop()
# And wait for them to actually do it.
coord.join(enqueue_threads)
```

## Handling Exceptions

Threads started by queue runners do more than just run the enqueue ops. They also catch and handle exceptions generated by queues, including `OutOfRangeError`, which is used to report that a queue was closed.

A training program that uses a coordinator must similarly catch and report exceptions in its main loop.

Here is an improved version of the training loop above.

```python
try:
  for step in xrange(1000000):
    if coord.should_stop():
      break
    sess.run(train_op)
except Exception as e:
  # Report exceptions to the coordinator.
  coord.request_stop(e)

# Terminate as usual.  It is innocuous to request stop twice.
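coord.request_stop()
coord.join(enqueue_threads)
```

Original article: [Threading and Queues](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/how_tos/threading_and_queues/index.md) Translated by: [zhangkom](https://github.com/zhangkom) Proofread by:

================================================
FILE: SOURCE/how_tos/threading_and_queues.md
================================================
# Threading and Queues

Queues are a powerful mechanism for asynchronous computation in TensorFlow.

Like everything else in TensorFlow, a queue is a node in a TensorFlow graph. It is a stateful node, like a variable: other nodes can modify its content. In particular, nodes can enqueue new items at the rear of the queue, or dequeue existing items from the front.

To get a feel for queues, let's look at a simple example. We first create a "first in, first out" queue (`FIFOQueue`) and fill it with zeros. Then we construct a TensorFlow graph that takes an item off the front of the queue, adds one to it, and puts it back at the end of the queue. Slowly, the numbers in the queue increase.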
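`Enqueue`, `EnqueueMany`, and `Dequeue` are special nodes. They take a pointer to the queue instead of a normal value, which allows them to change it. We recommend thinking of them as methods of the queue. In fact, in the Python API they are methods of the queue object (e.g. `q.enqueue(...)`).

Now that you have a feel for queues, let's dive into the details...

## Queue Use Overview

Queues such as `FIFOQueue` and `RandomShuffleQueue` are important TensorFlow objects for computing tensors asynchronously in a graph.

For example, a typical input architecture uses a `RandomShuffleQueue` to prepare inputs for training a model:

* Multiple threads prepare training examples and push them onto the queue.
* A training thread executes a training op that dequeues mini-batches from the queue.

This architecture has many benefits, as highlighted in the [Reading data how to](../reading_data), which also gives an overview of functions that simplify the construction of input pipelines.

The TensorFlow `Session` object is multithreaded, so multiple threads can easily use the same session and run ops in parallel. However, it is not always easy to implement a Python program that drives threads this way. All threads must be able to stop together, exceptions must be caught and reported, and queues must be properly closed when stopping.

TensorFlow provides two classes to help: [tf.train.Coordinator](tensorflow-zh/SOURCE/api_docs/python/train.md#Coordinator) and [tf.train.QueueRunner](tensorflow-zh/SOURCE/api_docs/python/train.md#QueueRunner). The two classes are designed to be used together. The `Coordinator` class helps multiple threads stop together and report exceptions to a program that waits for them to stop. The `QueueRunner` class creates a number of threads that cooperate to enqueue tensors in the same queue.

## Coordinator

The `Coordinator` class helps multiple threads stop together.

Its key methods are:

* `should_stop()`: returns True if the threads should stop.
* `request_stop()`: requests that threads should stop.
* `join()`: waits until the specified threads have stopped.

First create a `Coordinator` object, then create a number of threads that use it. The threads typically run loops that stop when `should_stop()` returns `True`.

Any thread can decide that the computation should stop. It only has to call `request_stop()`; `should_stop()` then returns `True` in the other threads and they stop as well.

```python
# Thread body: loop until the coordinator indicates a stop was requested.
# If some condition becomes true, ask the coordinator to stop.
def MyLoop(coord):
  while not coord.should_stop():
    ...do something...
    if ...some condition...:
      coord.request_stop()

# Main code: create a coordinator.
coord = tf.train.Coordinator()

# Create 10 threads that run 'MyLoop()'.
# Note the trailing comma: 'args' must be a tuple.
threads = [threading.Thread(target=MyLoop, args=(coord,)) for i in xrange(10)]

# Start the threads and wait for all of them to stop.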
for t in threads:
  t.start()
coord.join(threads)
```

The coordinator can manage threads doing very different things. The code above is only a simple example, and the threads do not all have to be the same. The coordinator also supports capturing and reporting exceptions. See the [Coordinator class](tensorflow-zh/SOURCE/api_docs/python/train.md#Coordinator) documentation for details.

## QueueRunner

The `QueueRunner` class creates a number of threads that repeatedly run an enqueue op. These threads can use a coordinator to stop together. In addition, a queue runner runs a *closer thread* that automatically closes the queue if an exception is reported to the coordinator.

You can use a queue runner to implement the architecture described above.

First build a TensorFlow graph that uses a queue for input examples. Add ops that process examples and enqueue them in the queue. Add training ops that start by dequeuing from the queue.

```python
example = ...ops to create one example...
# Create a queue, and an op that enqueues examples one at a time in the queue.
queue = tf.RandomShuffleQueue(...)
enqueue_op = queue.enqueue(example)
# Create a training graph that starts by dequeuing a batch of examples.
inputs = queue.dequeue_many(batch_size)
train_op = ...use 'inputs' to build the training part of the graph...
```

In the Python training program, create a `QueueRunner` that runs a few threads to process and enqueue examples. Create a `Coordinator` and ask the queue runner to start its threads with the coordinator. Write a training loop that also uses the coordinator to control termination.

```
# Create a queue runner that will run 4 threads in parallel to enqueue
# examples.
qr = tf.train.QueueRunner(queue, [enqueue_op] * 4)

# Launch the graph.
sess = tf.Session()
# Create a coordinator, launch the queue runner threads.
coord = tf.train.Coordinator()
enqueue_threads = qr.create_threads(sess, coord=coord, start=True)
# Run the training loop, controlling termination with the coordinator.
for step in xrange(1000000):
  if coord.should_stop():
    break
  sess.run(train_op)
# When done, ask the threads to stop.
coord.request_stop()
# And wait for them to actually do it.
coord.join(enqueue_threads)
```

## Handling Exceptions

Threads started by queue runners do more than just run the enqueue ops. They also catch and handle exceptions generated by queues, including `OutOfRangeError`, which is used to report that a queue was closed.

A training program that uses a coordinator must similarly catch and report exceptions in its main loop.

Here is an improved version of the training loop above.

```python
try:
  for step in xrange(1000000):
    if coord.should_stop():
      break
    sess.run(train_op)
except Exception as e:
  # Report exceptions to the coordinator.
  coord.request_stop(e)

# Terminate as usual.  It is innocuous to request stop twice.
coord.request_stop()
coord.join(enqueue_threads)
```

Original article: [Threading and Queues](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/how_tos/threading_and_queues/index.md) Translated by: [zhangkom](https://github.com/zhangkom) Proofread by: [volvet](https://github.com/volvet)

================================================
FILE: SOURCE/how_tos/using_gpu/index.md
================================================
# Using GPUs

## Supported devices

On a typical system, there are multiple computing devices. In TensorFlow, the supported device types are `CPU` and `GPU`. They are represented as `strings`. For example:

* `"/cpu:0"`: The CPU of your machine.
* `"/gpu:0"`: The GPU of your machine, if you have one.
* `"/gpu:1"`: The second GPU of your machine, etc.

If a TensorFlow operation has both CPU and GPU implementations, the GPU devices will be given priority when the operation is assigned to a device. For example, `matmul` has both CPU and GPU kernels. On a system with devices `cpu:0` and `gpu:0`, `gpu:0` will be selected to run `matmul`.

## Logging device placement

To find out which devices your operations and tensors are assigned to, create the session with the `log_device_placement` configuration option set to `True`.

```python
# Creates a graph.
a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
c = tf.matmul(a, b)
# Creates a session with log_device_placement set to True.
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
# Runs the op.
print sess.run(c)
```

You should see the following output:

```
Device mapping:
/job:localhost/replica:0/task:0/gpu:0 -> device: 0, name: Tesla K40c, pci bus id: 0000:05:00.0
b: /job:localhost/replica:0/task:0/gpu:0
a: /job:localhost/replica:0/task:0/gpu:0
MatMul: /job:localhost/replica:0/task:0/gpu:0
[[ 22.  28.]
 [ 49.
  64.]]
```

## Manual device placement

If you would like a particular operation to run on a device of your choice instead of what's automatically selected for you, you can use `with tf.device` to create a device context such that all the operations within that context will have the same device assignment.

```python
# Creates a graph.
with tf.device('/cpu:0'):
  a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
  b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
c = tf.matmul(a, b)
# Creates a session with log_device_placement set to True.
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
# Runs the op.
print sess.run(c)
```

You will see that now `a` and `b` are assigned to `cpu:0`.

```
Device mapping:
/job:localhost/replica:0/task:0/gpu:0 -> device: 0, name: Tesla K40c, pci bus id: 0000:05:00.0
b: /job:localhost/replica:0/task:0/cpu:0
a: /job:localhost/replica:0/task:0/cpu:0
MatMul: /job:localhost/replica:0/task:0/gpu:0
[[ 22.  28.]
 [ 49.  64.]]
```

## Using a single GPU on a multi-GPU system

If you have more than one GPU in your system, the GPU with the lowest ID will be selected by default. If you would like to run on a different GPU, you will need to specify the preference explicitly:

```python
# Creates a graph.
with tf.device('/gpu:2'):
  a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
  b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
  c = tf.matmul(a, b)
# Creates a session with log_device_placement set to True.
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
# Runs the op.
print sess.run(c)
```

If the device you have specified does not exist, you will get `InvalidArgumentError`:

```
InvalidArgumentError: Invalid argument: Cannot assign a device to node 'b':
Could not satisfy explicit device specification '/gpu:2'
   [[Node: b = Const[dtype=DT_FLOAT, value=Tensor, _device="/gpu:2"]()]]
```

If you would like TensorFlow to automatically choose an existing and supported device to run the operations in case the specified one doesn't exist, you can set `allow_soft_placement` to `True` in the configuration option when creating the session.

```python
# Creates a graph.
with tf.device('/gpu:2'):
  a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
  b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
  c = tf.matmul(a, b)
# Creates a session with allow_soft_placement and log_device_placement set
# to True.
sess = tf.Session(config=tf.ConfigProto(
    allow_soft_placement=True, log_device_placement=True))
# Runs the op.
print sess.run(c)
```

## Using multiple GPUs

If you would like to run TensorFlow on multiple GPUs, you can construct your model in a multi-tower fashion where each tower is assigned to a different GPU. For example:

```
# Creates a graph.
c = []
for d in ['/gpu:2', '/gpu:3']:
  with tf.device(d):
    a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3])
    b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2])
    c.append(tf.matmul(a, b))
with tf.device('/cpu:0'):
  sum = tf.add_n(c)
# Creates a session with log_device_placement set to True.
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
# Runs the op.
print sess.run(sum)
```

You will see the following output.
```
Device mapping:
/job:localhost/replica:0/task:0/gpu:0 -> device: 0, name: Tesla K20m, pci bus id: 0000:02:00.0
/job:localhost/replica:0/task:0/gpu:1 -> device: 1, name: Tesla K20m, pci bus id: 0000:03:00.0
/job:localhost/replica:0/task:0/gpu:2 -> device: 2, name: Tesla K20m, pci bus id: 0000:83:00.0
/job:localhost/replica:0/task:0/gpu:3 -> device: 3, name: Tesla K20m, pci bus id: 0000:84:00.0
Const_3: /job:localhost/replica:0/task:0/gpu:3
Const_2: /job:localhost/replica:0/task:0/gpu:3
MatMul_1: /job:localhost/replica:0/task:0/gpu:3
Const_1: /job:localhost/replica:0/task:0/gpu:2
Const: /job:localhost/replica:0/task:0/gpu:2
MatMul: /job:localhost/replica:0/task:0/gpu:2
AddN: /job:localhost/replica:0/task:0/cpu:0
[[  44.   56.]
 [  98.  128.]]
```

The [cifar10 tutorial](../../tutorials/deep_cnn/index.md) is a good example demonstrating how to do training with multiple GPUs.

================================================
FILE: SOURCE/how_tos/using_gpu.md
================================================
# Using GPUs

## Supported devices

A typical system has multiple computing devices. TensorFlow supports two device types, CPU and GPU. They are represented as `strings`. For example:

* `"/cpu:0"`: The CPU of your machine.
* `"/gpu:0"`: The GPU of your machine, if you have one.
* `"/gpu:1"`: The second GPU of your machine, and so on.

If a TensorFlow operation has both CPU and GPU implementations, the GPU is given priority when the operation is assigned to a device. For example, `matmul` has both CPU and GPU kernels. On a system with devices `cpu:0` and `gpu:0`, the `matmul` operation is assigned to `gpu:0`.

## Logging device placement

To find out which devices your operations and tensors are assigned to, create the session with the `log_device_placement` configuration option set to `True`.

```python
# Creates a graph.
a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
c = tf.matmul(a, b)
# Creates a session with log_device_placement set to True.
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
# Runs the op.
print sess.run(c)
```

You should see the following output:

```
Device mapping:
/job:localhost/replica:0/task:0/gpu:0 -> device: 0, name: Tesla K40c, pci bus id: 0000:05:00.0
b: /job:localhost/replica:0/task:0/gpu:0
a: /job:localhost/replica:0/task:0/gpu:0
MatMul: /job:localhost/replica:0/task:0/gpu:0
[[ 22.  28.]
 [ 49.  64.]]
```

## Manual device placement

If you want an operation to run on a device of your choice instead of the one selected automatically, use `with tf.device` to create a device context. All operations within that context are assigned to the device it specifies.

```python
# Creates a graph.
with tf.device('/cpu:0'):
  a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
  b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
c = tf.matmul(a, b)
# Creates a session with log_device_placement set to True.
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
# Runs the op.
print sess.run(c)
```

You will see that `a` and `b` are now assigned to `cpu:0`.

```
Device mapping:
/job:localhost/replica:0/task:0/gpu:0 -> device: 0, name: Tesla K40c, pci bus id: 0000:05:00.0
b: /job:localhost/replica:0/task:0/cpu:0
a: /job:localhost/replica:0/task:0/cpu:0
MatMul: /job:localhost/replica:0/task:0/gpu:0
[[ 22.  28.]
 [ 49.  64.]]
```

## Using a single GPU on a multi-GPU system

If your system has more than one GPU, the GPU with the lowest ID is selected by default. To run on a different GPU, specify the preference explicitly:

```python
# Creates a graph.
with tf.device('/gpu:2'):
  a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
  b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
  c = tf.matmul(a, b)
# Creates a session with log_device_placement set to True.
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
# Runs the op.
print sess.run(c)
```

If the device you specified does not exist, you get an `InvalidArgumentError`:

```
InvalidArgumentError: Invalid argument: Cannot assign a device to node 'b':
Could not satisfy explicit device specification '/gpu:2'
   [[Node: b = Const[dtype=DT_FLOAT, value=Tensor, _device="/gpu:2"]()]]
```

If you want TensorFlow to automatically choose an existing and supported device when the specified one does not exist, set `allow_soft_placement` to `True` in the configuration option when creating the session.

```python
# Creates a graph.
with tf.device('/gpu:2'):
  a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
  b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
  c = tf.matmul(a, b)
# Creates a session with allow_soft_placement and log_device_placement set
# to True.
sess = tf.Session(config=tf.ConfigProto(
    allow_soft_placement=True, log_device_placement=True))
# Runs the op.
print sess.run(c)
```

## Using multiple GPUs

If you want to run TensorFlow on multiple GPUs, you can construct your model in a multi-tower fashion, where each tower is assigned to a different GPU. For example:

```
# Creates a graph.
c = []
for d in ['/gpu:2', '/gpu:3']:
  with tf.device(d):
    a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3])
    b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2])
    c.append(tf.matmul(a, b))
with tf.device('/cpu:0'):
  sum = tf.add_n(c)
# Creates a session with log_device_placement set to True.
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
# Runs the op.
print sess.run(sum)
```

You will see the following output:

```
Device mapping:
/job:localhost/replica:0/task:0/gpu:0 -> device: 0, name: Tesla K20m, pci bus id: 0000:02:00.0
/job:localhost/replica:0/task:0/gpu:1 -> device: 1, name: Tesla K20m, pci bus id: 0000:03:00.0
/job:localhost/replica:0/task:0/gpu:2 -> device: 2, name: Tesla K20m, pci bus id: 0000:83:00.0
/job:localhost/replica:0/task:0/gpu:3 -> device: 3, name: Tesla K20m, pci bus id: 0000:84:00.0
Const_3: /job:localhost/replica:0/task:0/gpu:3
Const_2: /job:localhost/replica:0/task:0/gpu:3
MatMul_1: /job:localhost/replica:0/task:0/gpu:3
Const_1: /job:localhost/replica:0/task:0/gpu:2
Const: /job:localhost/replica:0/task:0/gpu:2
MatMul: /job:localhost/replica:0/task:0/gpu:2
AddN: /job:localhost/replica:0/task:0/cpu:0
[[  44.   56.]
 [  98.  128.]]
```

The [cifar10 tutorial](tensorflow-zh/SOURCE/tutorials/deep_cnn/index.md) is a good example demonstrating how to train with multiple GPUs.
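
The placement rules stated in this guide (GPU preferred over CPU, lowest GPU ID by default, an error for a missing device unless soft placement is allowed) can be summarized in a small toy model. This is a sketch in plain Python under those stated rules only; it is not TensorFlow's placement algorithm, the names `gpu_id` and `place_op` are hypothetical, and `ValueError` stands in for `InvalidArgumentError`.

```python
def gpu_id(device):
    """Return the numeric ID from a device string such as '/gpu:2'."""
    return int(device.split(":")[1])


def place_op(requested, available, allow_soft_placement=False):
    """Pick a device for an op that has both CPU and GPU kernels (toy model)."""
    if requested is None:
        # No explicit request: prefer the GPU with the lowest ID, else the CPU.
        gpus = sorted((d for d in available if d.startswith("/gpu:")), key=gpu_id)
        return gpus[0] if gpus else "/cpu:0"
    if requested in available:
        return requested
    if allow_soft_placement:
        # Fall back to automatic placement on a device that exists.
        return place_op(None, available)
    # Stand-in for the InvalidArgumentError shown in this guide.
    raise ValueError(
        "Could not satisfy explicit device specification %r" % requested)


devices = ["/cpu:0", "/gpu:0", "/gpu:1"]
print(place_op(None, devices))
print(place_op("/gpu:2", devices, allow_soft_placement=True))
```

Under these assumptions, both calls select `/gpu:0`, while `place_op("/gpu:2", devices)` without soft placement raises the error.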
> Original article: [using_gpu](http://tensorflow.org/how_tos/using_gpu/index.md) Translated by: [@lianghyv](https://github.com/lianghyv) Proofread by: [Wiki](https://github.com/jikexueyuanwiki)

================================================
FILE: SOURCE/how_tos/variable_scope/index.md
================================================
# Sharing Variables

You can create, initialize, save, and load single variables in the way described in the [Variables HowTo](../../how_tos/variables/index.md). But when building complex models you often need to share large sets of variables, and you may want to initialize all of them in one place. This tutorial shows how this can be done using `tf.variable_scope()` and `tf.get_variable()`.

## The problem

Imagine you create a simple model for image filters, similar to our [Convolutional Neural Networks Tutorial](../../tutorials/deep_cnn/index.md) model, but with only two convolutions (to keep the example simple). If you use just `tf.Variable`, as explained in the [Variables HowTo](../../how_tos/variables/index.md), your model might look like this.

```python
def my_image_filter(input_images):
  conv1_weights = tf.Variable(tf.random_normal([5, 5, 32, 32]),
                              name="conv1_weights")
  conv1_biases = tf.Variable(tf.zeros([32]), name="conv1_biases")
  conv1 = tf.nn.conv2d(input_images, conv1_weights,
                       strides=[1, 1, 1, 1], padding='SAME')
  relu1 = tf.nn.relu(conv1 + conv1_biases)

  conv2_weights = tf.Variable(tf.random_normal([5, 5, 32, 32]),
                              name="conv2_weights")
  conv2_biases = tf.Variable(tf.zeros([32]), name="conv2_biases")
  conv2 = tf.nn.conv2d(relu1, conv2_weights,
                       strides=[1, 1, 1, 1], padding='SAME')
  return tf.nn.relu(conv2 + conv2_biases)
```

As you can easily imagine, models quickly get much more complicated than this one, and even here we already have four different variables: `conv1_weights`, `conv1_biases`, `conv2_weights`, and `conv2_biases`.

The problem arises when you want to reuse this model. Assume you want to apply your image filter to two different images, `image1` and `image2`. You want both images processed by the same filter with the same parameters. You can call `my_image_filter()` twice, but this creates two sets of variables.

```python
# First call creates one set of variables.
result1 = my_image_filter(image1)
# Another set is created in the second call.
result2 = my_image_filter(image2)
```

A common way to share variables is to create them in a separate piece of code and pass them to the functions that use them. For example, using a dictionary:

```python
variables_dict = {
    "conv1_weights": tf.Variable(tf.random_normal([5, 5, 32, 32]),
                                 name="conv1_weights"),
    "conv1_biases": tf.Variable(tf.zeros([32]), name="conv1_biases"),
    ... etc. ...
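}

def my_image_filter(input_images, variables_dict):
  conv1 = tf.nn.conv2d(input_images, variables_dict["conv1_weights"],
                       strides=[1, 1, 1, 1], padding='SAME')
  relu1 = tf.nn.relu(conv1 + variables_dict["conv1_biases"])

  conv2 = tf.nn.conv2d(relu1, variables_dict["conv2_weights"],
                       strides=[1, 1, 1, 1], padding='SAME')
  return tf.nn.relu(conv2 + variables_dict["conv2_biases"])

# The 2 calls to my_image_filter() now use the same variables
result1 = my_image_filter(image1, variables_dict)
result2 = my_image_filter(image2, variables_dict)
```

While convenient, creating variables this way, outside of the model code, breaks encapsulation:

* The code that builds the graph must document the names, types, and shapes of the variables to create.
* When the code changes, the callers may have to create more, fewer, or different variables.

One way to address the problem is to use classes to create a model, where the classes take care of managing the variables they need. For a lighter solution that does not involve classes, TensorFlow provides a *Variable Scope* mechanism that makes it easy to share named variables while constructing a graph.

## Variable Scope Example

The Variable Scope mechanism in TensorFlow consists of two main functions:

* `tf.get_variable(<name>, <shape>, <initializer>)`: Creates or returns a variable with a given name.
* `tf.variable_scope(<scope_name>)`: Manages namespaces for names passed to `tf.get_variable()`.

The function `tf.get_variable()` is used to get or create a variable instead of a direct call to `tf.Variable`. It uses an *initializer* instead of passing the value directly, as `tf.Variable` does. An initializer is a function that takes a shape and provides a tensor with that shape. Here are some initializers available in TensorFlow:

* `tf.constant_initializer(value)` initializes everything to the provided value,
* `tf.random_uniform_initializer(a, b)` initializes uniformly from a to b,
* `tf.random_normal_initializer(mean, stddev)` initializes from the normal distribution with the given mean and standard deviation.

To see how `tf.get_variable()` solves the problem discussed before, let's refactor the code that created one convolution into a separate function, named `conv_relu`:

```python
def conv_relu(input, kernel_shape, bias_shape):
  # Create variable named "weights".
  weights = tf.get_variable("weights", kernel_shape,
                            initializer=tf.random_normal_initializer())
  # Create variable named "biases".
  biases = tf.get_variable("biases", bias_shape,
                           initializer=tf.constant_initializer(0.0))
  conv = tf.nn.conv2d(input, weights,
                      strides=[1, 1, 1, 1], padding='SAME')
  return tf.nn.relu(conv + biases)
```

This function uses the short names `"weights"` and `"biases"`. We'd like to use it for both `conv1` and `conv2`, but the variables need to have different names. This is where `tf.variable_scope()` comes into play: it pushes a namespace for variables.

```python
def my_image_filter(input_images):
  with tf.variable_scope("conv1"):
    # Variables created here will be named "conv1/weights", "conv1/biases".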
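    relu1 = conv_relu(input_images, [5, 5, 32, 32], [32])
  with tf.variable_scope("conv2"):
    # Variables created here will be named "conv2/weights", "conv2/biases".
    return conv_relu(relu1, [5, 5, 32, 32], [32])
```

Now, let's see what happens when we call `my_image_filter()` twice.

```
result1 = my_image_filter(image1)
result2 = my_image_filter(image2)
# Raises ValueError(... conv1/weights already exists ...)
```

As you can see, `tf.get_variable()` checks that already existing variables are not shared by accident. If you want to share them, you need to say so by calling `reuse_variables()` as follows.

```
with tf.variable_scope("image_filters") as scope:
  result1 = my_image_filter(image1)
  scope.reuse_variables()
  result2 = my_image_filter(image2)
```

This is a good way to share variables: lightweight and safe.

## How Does Variable Scope Work?

### Understanding `tf.get_variable()`

To understand variable scope, it is necessary to first fully understand how `tf.get_variable()` works. Here is how `tf.get_variable` is usually called.

```python
v = tf.get_variable(name, shape, dtype, initializer)
```

This call does one of two things, depending on the scope it is called in. Here are the two cases.

* Case 1: the scope is set for creating new variables, as evidenced by `tf.get_variable_scope().reuse == False`.

In this case, `v` is a newly created `tf.Variable` with the provided shape and data type. The full name of the created variable is the current variable scope name + the provided `name`, and a check is performed to ensure that no variable with this full name exists yet. If a variable with this full name already exists, the function raises a `ValueError`. If a new variable is created, it is initialized to the value `initializer(shape)`. For example:

```python
with tf.variable_scope("foo"):
    v = tf.get_variable("v", [1])
assert v.name == "foo/v:0"
```

* Case 2: the scope is set for reusing variables, as evidenced by `tf.get_variable_scope().reuse == True`.

In this case, the call searches for an already existing variable whose full name equals the current variable scope name + the provided `name`. If no such variable exists, a `ValueError` is raised. If the variable is found, it is returned. For example:

```python
with tf.variable_scope("foo"):
    v = tf.get_variable("v", [1])
with tf.variable_scope("foo", reuse=True):
    v1 = tf.get_variable("v", [1])
assert v1 == v
```

### Basics of `tf.variable_scope()`

Knowing how `tf.get_variable()` works makes it easy to understand variable scope. The primary function of variable scope is to carry a name that is used as a prefix for variable names, and a reuse flag that distinguishes the two cases described above. Nesting variable scopes appends their names in a way analogous to how directories work:

```python
with tf.variable_scope("foo"):
    with tf.variable_scope("bar"):
        v = tf.get_variable("v", [1])
assert v.name == "foo/bar/v:0"
```

The current variable scope can be retrieved using `tf.get_variable_scope()`, and the `reuse` flag of the current variable scope can be set to `True` by calling `tf.get_variable_scope().reuse_variables()`: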
```python
with tf.variable_scope("foo"):
    v = tf.get_variable("v", [1])
    tf.get_variable_scope().reuse_variables()
    v1 = tf.get_variable("v", [1])
assert v1 == v
```

Note that you *cannot* set the `reuse` flag to `False`. The reason is to make it possible to compose functions that create models. Consider the function `my_image_filter(inputs)` written earlier. Someone calling it in a variable scope with `reuse=True` expects all inner variables to be reused as well. Allowing `reuse=False` to be forced inside the function would break this contract and make it hard to share parameters in this way.

Even though you cannot set `reuse` to `False` explicitly, you can enter a reusing variable scope and then exit it, going back to a non-reusing one. This can be done using a `reuse=True` parameter when opening a variable scope. Note also that, for the same reason as above, the `reuse` parameter is inherited. So when you open a reusing variable scope, all sub-scopes are reusing too.

```python
with tf.variable_scope("root"):
    # At start, the scope is not reusing.
    assert tf.get_variable_scope().reuse == False
    with tf.variable_scope("foo"):
        # Opened a sub-scope, still not reusing.
        assert tf.get_variable_scope().reuse == False
    with tf.variable_scope("foo", reuse=True):
        # Explicitly opened a reusing scope.
        assert tf.get_variable_scope().reuse == True
        with tf.variable_scope("bar"):
            # Now sub-scope inherits the reuse flag.
            assert tf.get_variable_scope().reuse == True
    # Exited the reusing scope, back to a non-reusing one.
    assert tf.get_variable_scope().reuse == False
```

### Capturing variable scope

In all the examples above, we shared parameters only because their names agreed, that is, because we opened a reusing variable scope with exactly the same string. In more complex cases, it can be useful to pass a variable scope object rather than rely on getting the names right. To this end, variable scopes can be captured and used instead of names when opening a new variable scope.

```python
with tf.variable_scope("foo") as foo_scope:
    v = tf.get_variable("v", [1])
with tf.variable_scope(foo_scope):
    w = tf.get_variable("w", [1])
with tf.variable_scope(foo_scope, reuse=True):
    v1 = tf.get_variable("v", [1])
    w1 = tf.get_variable("w", [1])
assert v1 == v
assert w1 == w
```

When opening a variable scope using a previously existing scope, we jump out of the current variable scope prefix to an entirely different one. This is fully independent of where we do it.

```python
with tf.variable_scope("foo") as foo_scope:
    assert foo_scope.name == "foo"
with tf.variable_scope("bar"):
    with tf.variable_scope("baz") as other_scope:
        assert other_scope.name == "bar/baz"
        with tf.variable_scope(foo_scope) as foo_scope2:
            assert foo_scope2.name == "foo"  # Not changed.
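```

### Initializers in variable scope

Using `tf.get_variable()` lets you write functions that create or reuse variables and can be called transparently from outside. But what if we want to change the initializer of the created variables? Do we need to pass an extra argument to every function that creates variables? And what about the most common case, where we want to set the default initializer for all variables of all functions in one place? To help with these cases, a variable scope can carry a default initializer. It is inherited by sub-scopes and passed on to each `tf.get_variable()` call, but it is overridden whenever another initializer is specified explicitly.

```python
# The asserts are illustrative: they assume a default session in which the
# variables have been initialized.
with tf.variable_scope("foo", initializer=tf.constant_initializer(0.4)):
    v = tf.get_variable("v", [1])
    assert v.eval() == 0.4  # Default initializer as set above.
    w = tf.get_variable("w", [1], initializer=tf.constant_initializer(0.3))
    assert w.eval() == 0.3  # Specific initializer overrides the default.
    with tf.variable_scope("bar"):
        v = tf.get_variable("v", [1])
        assert v.eval() == 0.4  # Inherited default initializer.
    with tf.variable_scope("baz", initializer=tf.constant_initializer(0.2)):
        v = tf.get_variable("v", [1])
        assert v.eval() == 0.2  # Changed default initializer.
```

### Names of ops in `tf.variable_scope()`

We have discussed how `tf.variable_scope` governs the names of variables. But how does it influence the names of other ops in the scope? It is natural that ops created inside a variable scope should share its name too. For this reason, `with tf.variable_scope("name")` implicitly opens a `tf.name_scope("name")`. For example:

```python
with tf.variable_scope("foo"):
    x = 1.0 + tf.get_variable("v", [1])
assert x.op.name == "foo/add"
```

Name scopes can also be opened inside a variable scope; they then affect only the names of ops, not the names of variables.

```python
with tf.variable_scope("foo"):
    with tf.name_scope("bar"):
        v = tf.get_variable("v", [1])
        x = 1.0 + v
assert v.name == "foo/v:0"
assert x.op.name == "foo/bar/add"
```

When a variable scope is opened with a captured object instead of a string, the current name scope for ops is not altered.

## Examples of Use

Here are pointers to a few files that make use of variable scope. In particular, it is used heavily for [recurrent neural networks](https://zh.wikipedia.org/wiki/%E9%80%92%E5%BD%92%E7%A5%9E%E7%BB%8F%E7%BD%91%E7%BB%9C) and `sequence-to-sequence` models.

File | What's in it?
--- | ---
`models/image/cifar10.py` | Model for detecting objects in images.
`models/rnn/rnn_cell.py` | Cell functions for recurrent neural networks.
`models/rnn/seq2seq.py` | Functions for building `sequence-to-sequence` models.
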
Original: [Sharing Variables](http://www.tensorflow.org/how_tos/variable_scope/index.md) Translation: [nb312](https://github.com/nb312)

================================================
FILE: SOURCE/how_tos/variable_scope.md
================================================
# Sharing Variables

You can create, initialize, save and load single variables in the way described in the [Variables HowTo](tensorflow-zh/SOURCE/how_tos/variables/index.md). But when building complex models you often need to share large sets of variables, and you may also want to initialize all of them in one place. This tutorial shows how to do that with `tf.variable_scope()` and `tf.get_variable()`.

## The Problem

Imagine you create a simple model for image filters, similar to the model in our [Convolutional Neural Networks Tutorial](tensorflow-zh/SOURCE/tutorials/deep_cnn/index.md) but with only two convolutions (to keep the example simple). If you use just `tf.Variable`, as explained in the [Variables HowTo](tensorflow-zh/SOURCE/how_tos/variables/index.md), your model might look like this.

```python
def my_image_filter(input_images):
    conv1_weights = tf.Variable(tf.random_normal([5, 5, 32, 32]),
        name="conv1_weights")
    conv1_biases = tf.Variable(tf.zeros([32]), name="conv1_biases")
    conv1 = tf.nn.conv2d(input_images, conv1_weights,
        strides=[1, 1, 1, 1], padding='SAME')
    relu1 = tf.nn.relu(conv1 + conv1_biases)

    conv2_weights = tf.Variable(tf.random_normal([5, 5, 32, 32]),
        name="conv2_weights")
    conv2_biases = tf.Variable(tf.zeros([32]), name="conv2_biases")
    conv2 = tf.nn.conv2d(relu1, conv2_weights,
        strides=[1, 1, 1, 1], padding='SAME')
    return tf.nn.relu(conv2 + conv2_biases)
```

As you can easily imagine, models quickly become much more complicated than this one, and even here we already have four different variables: `conv1_weights`, `conv1_biases`, `conv2_weights`, and `conv2_biases`.

The problem arises when you want to reuse this model. Suppose you want to apply your image filter to two different images, `image1` and `image2`, and you want both processed by the same filter with the same parameters. You can call `my_image_filter()` twice, but this creates two sets of variables:

```python
# First call creates one set of variables.
result1 = my_image_filter(image1)
# Another set is created in the second call.
result2 = my_image_filter(image2)
```

A common way to share variables is to create them in a separate piece of code and pass them to the functions that use them, for example with a dictionary:

```python
variables_dict = {
    "conv1_weights": tf.Variable(tf.random_normal([5, 5, 32, 32]),
        name="conv1_weights"),
    "conv1_biases": tf.Variable(tf.zeros([32]), name="conv1_biases"),
    # ... etc. ...
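}

def my_image_filter(input_images, variables_dict):
    conv1 = tf.nn.conv2d(input_images, variables_dict["conv1_weights"],
        strides=[1, 1, 1, 1], padding='SAME')
    relu1 = tf.nn.relu(conv1 + variables_dict["conv1_biases"])

    conv2 = tf.nn.conv2d(relu1, variables_dict["conv2_weights"],
        strides=[1, 1, 1, 1], padding='SAME')
    return tf.nn.relu(conv2 + variables_dict["conv2_biases"])

# The 2 calls to my_image_filter() now use the same variables
result1 = my_image_filter(image1, variables_dict)
result2 = my_image_filter(image2, variables_dict)
```

Creating variables this way, outside of the model code, is convenient but breaks encapsulation:

* The code that builds the graph must document the names, types, and shapes of the variables to create.
* When the code changes, the callers may have to create more, fewer, or different variables.

One way to address the problem is to build models with classes, where each class takes care of managing the variables it needs. For a lighter solution that does not involve classes, TensorFlow provides a *Variable Scope* mechanism that makes it easy to share named variables while constructing a graph.

## Variable Scope Example

The Variable Scope mechanism in TensorFlow consists of two main functions:

* `tf.get_variable(<name>, <shape>, <initializer>)`: creates or returns a variable with the given name.
* `tf.variable_scope(<scope_name>)`: manages namespaces for the names passed to `tf.get_variable()`.

The function `tf.get_variable()` is used to get or create a variable in place of a direct call to `tf.Variable`. Instead of taking the value directly, as `tf.Variable` does, it takes an *initializer*. An initializer is a function that takes a shape and returns a tensor with that shape. Here are some of the initializers available in TensorFlow:

* `tf.constant_initializer(value)` initializes everything to the provided value,
* `tf.random_uniform_initializer(a, b)` initializes uniformly from [a, b],
* `tf.random_normal_initializer(mean, stddev)` initializes from the normal distribution with the given mean and standard deviation.

To see how `tf.get_variable()` solves the problem discussed above, let's refactor the code that creates one convolution into a separate function named `conv_relu`:

```python
def conv_relu(input, kernel_shape, bias_shape):
    # Create variable named "weights".
    weights = tf.get_variable("weights", kernel_shape,
        initializer=tf.random_normal_initializer())
    # Create variable named "biases".
    biases = tf.get_variable("biases", bias_shape,
        initializer=tf.constant_initializer(0.0))
    conv = tf.nn.conv2d(input, weights,
        strides=[1, 1, 1, 1], padding='SAME')
    return tf.nn.relu(conv + biases)
```

This function uses the short names `"weights"` and `"biases"`. We would like to use it for both `conv1` and `conv2`, but the variables need different names. This is where `tf.variable_scope()` comes into play: it pushes a namespace for variables.

```python
def my_image_filter(input_images):
    with tf.variable_scope("conv1"):
        # Variables created here will be named "conv1/weights", "conv1/biases".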
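        relu1 = conv_relu(input_images, [5, 5, 32, 32], [32])
    with tf.variable_scope("conv2"):
        # Variables created here will be named "conv2/weights", "conv2/biases".
        return conv_relu(relu1, [5, 5, 32, 32], [32])
```

Now let's see what happens when we call `my_image_filter()` twice.

```
result1 = my_image_filter(image1)
result2 = my_image_filter(image2)
# Raises ValueError(... conv1/weights already exists ...)
```

As you can see, `tf.get_variable()` checks that existing variables are not shared by accident. If you do want to share them, you must say so by calling `reuse_variables()`, as follows.

```
with tf.variable_scope("image_filters") as scope:
    result1 = my_image_filter(image1)
    scope.reuse_variables()
    result2 = my_image_filter(image2)
```

This is a good way to share variables: lightweight and safe.

## How Does Variable Scope Work?

### Understanding `tf.get_variable()`

To understand variable scope, you first need to understand fully how `tf.get_variable()` works. Here is how `tf.get_variable` is usually called.

```python
v = tf.get_variable(name, shape, dtype, initializer)
```

This call does one of two things, depending on the scope it is called in. The two cases are:

* Case 1: the scope is set for creating new variables, as evidenced by `tf.get_variable_scope().reuse == False`.

In this case, `v` is a newly created `tf.Variable` with the provided shape and data type. The full name of the created variable is the current variable scope name + the provided `name`, and a check ensures that no variable with this full name exists yet. If one already exists, the function raises a `ValueError`. If a new variable is created, it is initialized to the value `initializer(shape)`. For example:

```python
with tf.variable_scope("foo"):
    v = tf.get_variable("v", [1])
assert v.name == "foo/v:0"
```

* Case 2: the scope is set for reusing variables, as evidenced by `tf.get_variable_scope().reuse == True`.

In this case, the call searches for an existing variable whose full name equals the current variable scope name + the provided `name`. If no such variable exists, a `ValueError` is raised. If the variable is found, it is returned. For example:

```python
with tf.variable_scope("foo"):
    v = tf.get_variable("v", [1])
with tf.variable_scope("foo", reuse=True):
    v1 = tf.get_variable("v", [1])
assert v1 == v
```

### Basics of `tf.variable_scope()`

Knowing how `tf.get_variable()` works makes variable scope easy to understand. The primary job of a variable scope is to carry a name that is used as a prefix for variable names, together with a reuse flag that distinguishes the two cases above. Nested variable scopes append their names in the same way directories do:

```python
with tf.variable_scope("foo"):
    with tf.variable_scope("bar"):
        v = tf.get_variable("v", [1])
assert v.name == "foo/bar/v:0"
```

The current variable scope can be retrieved with `tf.get_variable_scope()`, and its `reuse` flag can be set to `True` by calling `tf.get_variable_scope().reuse_variables()`:
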
```python
with tf.variable_scope("foo"):
    v = tf.get_variable("v", [1])
    tf.get_variable_scope().reuse_variables()
    v1 = tf.get_variable("v", [1])
assert v1 == v
```

Note that you *cannot* set the `reuse` flag to `False`. The reason is to make it possible to compose functions that create models. Think of the function `my_image_filter(inputs)` written earlier. Someone who calls it inside a variable scope with `reuse=True` expects all inner variables to be reused as well. Allowing the function body to force `reuse=False` would break this contract and make it hard to share parameters this way.

Even though you cannot set `reuse` to `False` explicitly, you can enter a reusing variable scope and then exit it, which takes you back to a non-reusing one. You do this by passing `reuse=True` when opening a variable scope. Note also that, for the same reason as above, the `reuse` parameter is inherited: when you open a reusing variable scope, all of its sub-scopes are reusing too.

```python
with tf.variable_scope("root"):
    # At start, the scope is not reusing.
    assert tf.get_variable_scope().reuse == False
    with tf.variable_scope("foo"):
        # Opened a sub-scope, still not reusing.
        assert tf.get_variable_scope().reuse == False
    with tf.variable_scope("foo", reuse=True):
        # Explicitly opened a reusing scope.
        assert tf.get_variable_scope().reuse == True
        with tf.variable_scope("bar"):
            # Now sub-scope inherits the reuse flag.
            assert tf.get_variable_scope().reuse == True
    # Exited the reusing scope, back to a non-reusing one.
    assert tf.get_variable_scope().reuse == False
```

### Capturing variable scope

In all the examples above, parameters were shared only because their names agreed, that is, because we opened a reusing variable scope with exactly the same string. In more complex cases it can be useful to pass a variable scope object rather than rely on getting the names right. To this end, a variable scope can be captured and used in place of a name when opening a new variable scope.

```python
with tf.variable_scope("foo") as foo_scope:
    v = tf.get_variable("v", [1])
with tf.variable_scope(foo_scope):
    w = tf.get_variable("w", [1])
with tf.variable_scope(foo_scope, reuse=True):
    v1 = tf.get_variable("v", [1])
    w1 = tf.get_variable("w", [1])
assert v1 == v
assert w1 == w
```

When a variable scope is opened from a previously captured scope, we jump out of the current variable scope prefix into an entirely different one. This happens regardless of where we do it.

```python
with tf.variable_scope("foo") as foo_scope:
    assert foo_scope.name == "foo"
with tf.variable_scope("bar"):
    with tf.variable_scope("baz") as other_scope:
        assert other_scope.name == "bar/baz"
        with tf.variable_scope(foo_scope) as foo_scope2:
            assert foo_scope2.name == "foo"  # Not changed.
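```

### Initializers in variable scope

Using `tf.get_variable()` lets you write functions that create or reuse variables and can be called transparently from outside. But what if we want to change the initializer of the created variables? Do we need to pass an extra argument to every function that creates variables? And what about the most common case, where we want to set the default initializer for all variables of all functions in one place? To help with these cases, a variable scope can carry a default initializer. It is inherited by sub-scopes and passed on to each `tf.get_variable()` call, but it is overridden whenever another initializer is specified explicitly.

```python
# The asserts are illustrative: they assume a default session in which the
# variables have been initialized.
with tf.variable_scope("foo", initializer=tf.constant_initializer(0.4)):
    v = tf.get_variable("v", [1])
    assert v.eval() == 0.4  # Default initializer as set above.
    w = tf.get_variable("w", [1], initializer=tf.constant_initializer(0.3))
    assert w.eval() == 0.3  # Specific initializer overrides the default.
    with tf.variable_scope("bar"):
        v = tf.get_variable("v", [1])
        assert v.eval() == 0.4  # Inherited default initializer.
    with tf.variable_scope("baz", initializer=tf.constant_initializer(0.2)):
        v = tf.get_variable("v", [1])
        assert v.eval() == 0.2  # Changed default initializer.
```

### Names of ops in `tf.variable_scope()`

We have discussed how `tf.variable_scope` governs the names of variables. But how does it influence the names of other ops in the scope? It is natural that ops created inside a variable scope should share its name too. For this reason, `with tf.variable_scope("name")` implicitly opens a `tf.name_scope("name")`. For example:

```python
with tf.variable_scope("foo"):
    x = 1.0 + tf.get_variable("v", [1])
assert x.op.name == "foo/add"
```

Name scopes can also be opened inside a variable scope; they then affect only the names of ops, not the names of variables.

```python
with tf.variable_scope("foo"):
    with tf.name_scope("bar"):
        v = tf.get_variable("v", [1])
        x = 1.0 + v
assert v.name == "foo/v:0"
assert x.op.name == "foo/bar/add"
```

When a variable scope is opened with a captured object instead of a string, the current name scope for ops is not altered.

## Examples of Use

Here are pointers to a few files that make use of variable scope. In particular, it is used heavily for [recurrent neural networks](https://zh.wikipedia.org/wiki/%E9%80%92%E5%BD%92%E7%A5%9E%E7%BB%8F%E7%BD%91%E7%BB%9C) and `sequence-to-sequence` models.

File | What's in it?
--- | ---
`models/image/cifar10.py` | Model for detecting objects in images.
`models/rnn/rnn_cell.py` | Cell functions for recurrent neural networks.
`models/rnn/seq2seq.py` | Functions for building `sequence-to-sequence` models.
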
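
The difference between the two kinds of scope, as described in the section on op names, can be pictured with a small plain-Python model. This is a sketch under stated assumptions: `ToyGraph` and its methods are hypothetical names invented for illustration, and the model only tracks name prefixes; it is not how TensorFlow is implemented.

```python
import contextlib


class ToyGraph(object):
    """Toy model of naming only: a variable scope extends both prefixes,
    a name scope extends the op prefix alone. Not TensorFlow code."""

    def __init__(self):
        self.var_prefix = []   # changed only by variable_scope
        self.op_prefix = []    # changed by variable_scope and name_scope

    @contextlib.contextmanager
    def variable_scope(self, name):
        self.var_prefix.append(name)
        self.op_prefix.append(name)
        try:
            yield
        finally:
            self.var_prefix.pop()
            self.op_prefix.pop()

    @contextlib.contextmanager
    def name_scope(self, name):
        self.op_prefix.append(name)
        try:
            yield
        finally:
            self.op_prefix.pop()

    def variable_name(self, name):
        # The ":0" suffix follows the variable names shown in this document.
        return "/".join(self.var_prefix + [name]) + ":0"

    def op_name(self, name):
        return "/".join(self.op_prefix + [name])


g = ToyGraph()
with g.variable_scope("foo"):
    with g.name_scope("bar"):
        v_name = g.variable_name("v")   # the name scope is ignored here
        x_name = g.op_name("add")       # both scopes apply here
```

In this model `v_name` comes out as `foo/v:0` and `x_name` as `foo/bar/add`, the same names as in the name-scope example of this document.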
原文:[Sharing Variables](http://www.tensorflow.org/how_tos/variable_scope/index.md) 翻译:[nb312](https://github.com/nb312)校对:[Wiki](https://github.com/jikexueyuanwiki) ================================================ FILE: SOURCE/how_tos/variables/index.md ================================================ # Variables: Creation, Initialization, Saving, and Loading When you train a model, you use [variables](../../api_docs/python/state_ops.md) to hold and update parameters. Variables are in-memory buffers containing tensors. They must be explicitly initialized and can be saved to disk during and after training. You can later restore saved values to exercise or analyse the model. This document references the following TensorFlow classes. Follow the links to their reference manual for a complete description of their API: * The [`tf.Variable`](../../api_docs/python/state_ops.md#Variable) class. * The [`tf.train.Saver`](../../api_docs/python/state_ops.md#Saver) class. ## Creation When you create a [Variable](../../api_docs/python/state_ops.md) you pass a `Tensor` as its initial value to the `Variable()` constructor. TensorFlow provides a collection of ops that produce tensors often used for initialization from [constants or random values](../../api_docs/python/constant_op.md). Note that all these ops require you to specify the shape of the tensors. That shape automatically becomes the shape of the variable. Variables generally have a fixed shape, but TensorFlow provides advanced mechanisms to reshape variables. ```python # Create two variables. weights = tf.Variable(tf.random_normal([784, 200], stddev=0.35), name="weights") biases = tf.Variable(tf.zeros([200]), name="biases") ``` Calling `tf.Variable()` adds several ops to the graph: * A `variable` op that holds the variable value. * An initializer op that sets the variable to its initial value. This is actually a `tf.assign` op. 
* The ops for the initial value, such as the `zeros` op for the `biases` variable in the example are also added to the graph. The value returned by `tf.Variable()` value is an instance of the Python class `tf.Variable`. ## Initialization Variable initializers must be run explicitly before other ops in your model can be run. The easiest way to do that is to add an op that runs all the variable initializers, and run that op before using the model. You can alternatively restore variable values from a checkpoint file, see below. Use `tf.initialize_all_variables()` to add an op to run variable initializers. Only run that op after you have fully constructed your model and launched it in a session. ```python # Create two variables. weights = tf.Variable(tf.random_normal([784, 200], stddev=0.35), name="weights") biases = tf.Variable(tf.zeros([200]), name="biases") ... # Add an op to initialize the variables. init_op = tf.initialize_all_variables() # Later, when launching the model with tf.Session() as sess: # Run the init operation. sess.run(init_op) ... # Use the model ... ``` ### Initialization from another Variable You sometimes need to initialize a variable from the initial value of another variable. As the op added by `tf.initialize_all_variables()` initializes all variables in parallel you have to be careful when this is needed. To initialize a new variable from the value of another variable use the other variable's `initialized_value()` property. You can use the initialized value directly as the initial value for the new variable, or you can use it as any other tensor to compute a value for the new variable. ```python # Create a variable with a random value. weights = tf.Variable(tf.random_normal([784, 200], stddev=0.35), name="weights") # Create another variable with the same value as 'weights'. 
w2 = tf.Variable(weights.initialized_value(), name="w2")
# Create another variable with twice the value of 'weights'
w_twice = tf.Variable(weights.initialized_value() * 2.0, name="w_twice")
```

### Custom Initialization

The convenience function `tf.initialize_all_variables()` adds an op to initialize *all variables* in the model. To initialize only an explicit list of variables, use `tf.initialize_variables()` instead. See the [Variables Documentation](../../api_docs/python/state_ops.md) for more options, including checking if variables are initialized.

## Saving and Restoring

The easiest way to save and restore a model is to use a `tf.train.Saver` object. The constructor adds `save` and `restore` ops to the graph for all, or a specified list, of the variables in the graph. The saver object provides methods to run these ops, specifying paths for the checkpoint files to write to or read from.

### Checkpoint Files

Variables are saved in binary files that, roughly, contain a map from variable names to tensor values.

When you create a `Saver` object, you can optionally choose names for the variables in the checkpoint files. By default, it uses the value of the [`Variable.name`](../../api_docs/python/state_ops.md#Variable.name) property for each variable.

### Saving Variables

Create a `Saver` with `tf.train.Saver()` to manage all variables in the model.

```python
# Create some variables.
v1 = tf.Variable(..., name="v1")
v2 = tf.Variable(..., name="v2")
...
# Add an op to initialize the variables.
init_op = tf.initialize_all_variables()

# Add ops to save and restore all the variables.
saver = tf.train.Saver()

# Later, launch the model, initialize the variables, do some work, save the
# variables to disk.
with tf.Session() as sess:
  sess.run(init_op)
  # Do some work with the model.
  ..
  # Save the variables to disk.
  save_path = saver.save(sess, "/tmp/model.ckpt")
  print("Model saved in file: %s" % save_path)
```

### Restoring Variables

The same `Saver` object is used to restore variables.
Note that when you restore variables from a file you do not have to initialize them beforehand.

```python
# Create some variables.
v1 = tf.Variable(..., name="v1")
v2 = tf.Variable(..., name="v2")
...
# Add ops to save and restore all the variables.
saver = tf.train.Saver()

# Later, launch the model, use the saver to restore variables from disk, and
# do some work with the model.
with tf.Session() as sess:
  # Restore variables from disk.
  saver.restore(sess, "/tmp/model.ckpt")
  print("Model restored.")
  # Do some work with the model
  ...
```

### Choosing which Variables to Save and Restore

If you do not pass any argument to `tf.train.Saver()` the saver handles all variables in the graph. Each one of them is saved under the name that was passed when the variable was created.

It is sometimes useful to explicitly specify names for variables in the checkpoint files. For example, you may have trained a model with a variable named `"weights"` whose value you want to restore in a new variable named `"params"`.

It is also sometimes useful to only save or restore a subset of the variables used by a model. For example, you may have trained a neural net with 5 layers, and you now want to train a new model with 6 layers, restoring the parameters from the 5 layers of the previously trained model into the first 5 layers of the new model.

You can easily specify the names and variables to save by passing to the `tf.train.Saver()` constructor a Python dictionary: keys are the names to use, values are the variables to manage.

Notes:

* You can create as many saver objects as you want if you need to save and restore different subsets of the model variables. The same variable can be listed in multiple saver objects; its value is only changed when the saver `restore()` method is run.

* If you only restore a subset of the model variables at the start of a session, you have to run an initialize op for the other variables.
See [`tf.initialize_variables()`](../../api_docs/python/state_ops.md#initialize_variables) for more information. ```python # Create some variables. v1 = tf.Variable(..., name="v1") v2 = tf.Variable(..., name="v2") ... # Add ops to save and restore only 'v2' using the name "my_v2" saver = tf.train.Saver({"my_v2": v2}) # Use the saver object normally after that. ... ``` ================================================ FILE: SOURCE/how_tos/variables.md ================================================ # 变量:创建、初始化、保存和加载# 当训练模型时,用[变量](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/api_docs/python/state_ops.md)来存储和更新参数。变量包含张量 (Tensor)存放于内存的缓存区。建模时它们需要被明确地初始化,模型训练后它们必须被存储到磁盘。这些变量的值可在之后模型训练和分析是被加载。 本文档描述以下两个TensorFlow类。点击以下链接可查看完整的API文档: - [`tf.Variable`](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/api_docs/python/state_ops.md#Variable) 类 - [`tf.train.Saver`](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/api_docs/python/state_ops.md#Saver) 类 ## 创建 当创建一个[变量](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/api_docs/python/state_ops.md)时,你将一个`张量`作为初始值传入构造函数`Variable()`。TensorFlow提供了一系列操作符来初始化张量,初始值是[常量或是随机值](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/api_docs/python/constant_op.md)。 注意,所有这些操作符都需要你指定张量的shape。那个形状自动成为变量的shape。变量的shape通常是固定的,但TensorFlow提供了高级的机制来重新调整其行列数。 ```python # Create two variables. weights = tf.Variable(tf.random_normal([784, 200], stddev=0.35), name="weights") biases = tf.Variable(tf.zeros([200]), name="biases") ``` 调用`tf.Variable()`添加一些操作(Op, operation)到graph: - 一个`Variable`操作存放变量的值。 - 一个初始化op将变量设置为初始值。这事实上是一个`tf.assign`操作. - 初始值的操作,例如示例中对`biases`变量的`zeros`操作也被加入了graph。 `tf.Variable`的返回值是Python的`tf.Variable`类的一个实例。 ## 初始化 变量的初始化必须在模型的其它操作运行之前先明确地完成。最简单的方法就是添加一个给所有变量初始化的操作,并在使用模型之前首先运行那个操作。 你或者可以从检查点文件中重新获取变量值,详见下文。 使用`tf.initialize_all_variables()`添加一个操作对变量做初始化。记得在完全构建好模型并加载之后再运行那个操作。 ```python # Create two variables. 
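weights = tf.Variable(tf.random_normal([784, 200], stddev=0.35),
                      name="weights")
biases = tf.Variable(tf.zeros([200]), name="biases")
...
# Add an op to initialize the variables.
init_op = tf.initialize_all_variables()

# Later, when launching the model
with tf.Session() as sess:
  # Run the init operation.
  sess.run(init_op)
  ...
  # Use the model
  ...
```

### Initialization from another Variable

You sometimes need to initialize a variable from the initial value of another variable. Because the op added by `tf.initialize_all_variables()` initializes all variables in parallel, you have to be careful when this is needed.

To initialize a new variable from the value of another variable, use the other variable's `initialized_value()` property. You can use the initialized value directly as the initial value for the new variable, or use it like any other tensor to compute a value for the new variable.

```python
# Create a variable with a random value.
weights = tf.Variable(tf.random_normal([784, 200], stddev=0.35),
                      name="weights")
# Create another variable with the same value as 'weights'.
w2 = tf.Variable(weights.initialized_value(), name="w2")
# Create another variable with twice the value of 'weights'
w_twice = tf.Variable(weights.initialized_value() * 2.0, name="w_twice")
```

### Custom Initialization

The convenience function `tf.initialize_all_variables()` adds an op to initialize all variables in the model. To initialize only an explicit list of variables, use `tf.initialize_variables()` instead. See the [Variables Documentation](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/api_docs/python/state_ops.md) for more options, including checking whether variables are initialized.

## Saving and Restoring

The easiest way to save and restore a model is to use a `tf.train.Saver` object. The constructor adds `save` and `restore` ops to the graph for all, or a specified list, of the variables in the graph. The saver object provides methods to run these ops, specifying the paths of the checkpoint files to write to or read from.

### Checkpoint Files

Variables are saved in binary files that, roughly, contain a map from variable names to tensor values.

When you create a `Saver` object, you can optionally choose names for the variables in the checkpoint files. By default, it uses the value of each variable's [`Variable.name`](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/api_docs/python/state_ops.md#Variable.name) property.

### Saving Variables

Create a `Saver` with `tf.train.Saver()` to manage all variables in the model.

```python
# Create some variables.
v1 = tf.Variable(..., name="v1")
v2 = tf.Variable(..., name="v2")
...
# Add an op to initialize the variables.
init_op = tf.initialize_all_variables()

# Add ops to save and restore all the variables.
saver = tf.train.Saver()

# Later, launch the model, initialize the variables, do some work, save the
# variables to disk.
with tf.Session() as sess:
  sess.run(init_op)
  # Do some work with the model.
  ..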
# Save the variables to disk. save_path = saver.save(sess, "/tmp/model.ckpt") print "Model saved in file: ", save_path ``` ### 恢复变量 用同一个`Saver`对象来恢复变量。注意,当你从文件中恢复变量时,不需要事先对它们做初始化。 ```python # Create some variables. v1 = tf.Variable(..., name="v1") v2 = tf.Variable(..., name="v2") ... # Add ops to save and restore all the variables. saver = tf.train.Saver() # Later, launch the model, use the saver to restore variables from disk, and # do some work with the model. with tf.Session() as sess: # Restore variables from disk. saver.restore(sess, "/tmp/model.ckpt") print "Model restored." # Do some work with the model ... ``` ### 选择存储和恢复哪些变量 如果你不给`tf.train.Saver()`传入任何参数,那么saver将处理graph中的所有变量。其中每一个变量都以变量创建时传入的名称被保存。 有时候在检查点文件中明确定义变量的名称很有用。举个例子,你也许已经训练得到了一个模型,其中有个变量命名为`"weights"`,你想把它的值恢复到一个新的变量`"params"`中。 有时候仅保存和恢复模型的一部分变量很有用。再举个例子,你也许训练得到了一个5层神经网络,现在想训练一个6层的新模型,可以将之前5层模型的参数导入到新模型的前5层中。 你可以通过给`tf.train.Saver()`构造函数传入Python字典,很容易地定义需要保持的变量及对应名称:键对应使用的名称,值对应被管理的变量。 注意: - 如果需要保存和恢复模型变量的不同子集,可以创建任意多个saver对象。同一个变量可被列入多个saver对象中,只有当saver的`restore()`函数被运行时,它的值才会发生改变。 - 如果你仅在session开始时恢复模型变量的一个子集,你需要对剩下的变量执行初始化op。详情请见[`tf.initialize_variables()`](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/api_docs/python/state_ops.md#initialize_variables)。 ```python # Create some variables. v1 = tf.Variable(..., name="v1") v2 = tf.Variable(..., name="v2") ... # Add ops to save and restore only 'v2' using the name "my_v2" saver = tf.train.Saver({"my_v2": v2}) # Use the saver object normally after that. ... 
``` >原文链接: [http://tensorflow.org/how_tos/variables/index.html](http://tensorflow.org/how_tos/variables/index.html) >翻译:[赵屹华](https://github.com/zhyhooo) 校对:[Wiki](https://github.com/jikexueyuanwiki) ================================================ FILE: SOURCE/images/getting_started.dot ================================================ digraph Dependencies { node [shape = oval]; "predictions: MatMul()" -> "data: Concat()" "data: Concat()" -> data_left "data: Concat()" -> data_right "predictions: MatMul()" -> "weight_matrix: Reshape()" "weight_matrix: Reshape()" -> "new_weights: Add()" "new_weights: Add()" -> weights "new_weights: Add()" -> deltas "update: Assign()" -> weights "update: Assign()" -> "new_weights: Add()" "InitializeAllVariables()" -> weights "InitializeAllVariables()" -> init_value } ================================================ FILE: SOURCE/index.md ================================================ # TensorFlow ## Introduction TensorFlow™ is an open source software library for numerical computation using data flow graphs. Nodes in the graph represent mathematical operations, while the graph edges represent the multidimensional data arrays (tensors) that flow between them. This flexible architecture allows you to deploy computation to one or more CPUs or GPUs in a desktop, server, or mobile device without rewriting code. TensorFlow was originally developed by researchers and engineers working on the Google Brain team within Google's Machine Intelligence research organization for the purposes of conducting machine learning and deep neural networks research. The system is general enough to be applicable in a wide variety of other domains as well. The following documents show you how to set up and use the TensorFlow system. 
## Table of Contents ================================================ FILE: SOURCE/personal.md ================================================ ## TensorFlow 个人学习心得 说明:本章所列链接为个人学习TensorFlow的心得,很多是博客地址,从实践的角度帮助大家更好的理解官方文档的内容 示例: - [我的TensorFlow学习博客](http://wiki.jikexueyuan.com) ================================================ FILE: SOURCE/resource.md ================================================ ## TensorFlow 相关资源 - [Google官方Blog宣布TensorFlow开源](https://googleblog.blogspot.com/2015/11/tensorflow-smarter-machine-learning-for.html) - [TensorFlow WhitePaper(PDF下载)](http://download.tensorflow.org/paper/whitepaper2015.pdf) - [Jeff Dean 介绍 TensorFlow(视频)](https://www.youtube.com/watch?v=90-S1M7Ny_o&t=21m2s) - [TensorFlow 简化版接口 Scikit Flow](https://github.com/google/skflow) - [TensorFlow 使用样例](https://github.com/aymericdamien/TensorFlow-Examples) - [TensorFlow 与 mxnet, caffe 对比 @chenrudan](http://chenrudan.github.io/blog/2015/11/18/comparethreeopenlib.html) - [TensorFlow 与 torch7, caffe, mxnet 在内存使用量以及性能对比 @mli](https://github.com/dmlc/mxnet/blob/master/doc/overview_chn.md) ================================================ FILE: SOURCE/resources/bib.md ================================================ # BibTex 引用 如果你在研究中使用了 TensorFlow,并且希望引用 TensorFlow系统。我们建议你引用一下白皮书。 ``` @misc{tensorflow2015-whitepaper, title={{TensorFlow}: Large-Scale Machine Learning on Heterogeneous Systems}, url={http://tensorflow.org/}, note={Software available from tensorflow.org}, author={ Mart\'{\i}n~Abadi and Ashish~Agarwal and Paul~Barham and Eugene~Brevdo and Zhifeng~Chen and Craig~Citro and Greg~S.~Corrado and Andy~Davis and Jeffrey~Dean and Matthieu~Devin and Sanjay~Ghemawat and Ian~Goodfellow and Andrew~Harp and Geoffrey~Irving and Michael~Isard and Yangqing Jia and Rafal~Jozefowicz and Lukasz~Kaiser and Manjunath~Kudlur and Josh~Levenberg and Dan~Man\'{e} and Rajat~Monga and Sherry~Moore and Derek~Murray and Chris~Olah and Mike~Schuster and Jonathon~Shlens and Benoit~Steiner and 
Ilya~Sutskever and Kunal~Talwar and Paul~Tucker and Vincent~Vanhoucke and Vijay~Vasudevan and Fernanda~Vi\'{e}gas and Oriol~Vinyals and Pete~Warden and Martin~Wattenberg and Martin~Wicke and Yuan~Yu and Xiaoqiang~Zheng}, year={2015}, } ``` 文本形式如下: ``` Martín Abadi, Ashish Agarwal, Paul Barham, Eugene Brevdo, Zhifeng Chen, Craig Citro, Greg S. Corrado, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Ian Goodfellow, Andrew Harp, Geoffrey Irving, Michael Isard, Rafal Jozefowicz, Yangqing Jia, Lukasz Kaiser, Manjunath Kudlur, Josh Levenberg, Dan Mané, Mike Schuster, Rajat Monga, Sherry Moore, Derek Murray, Chris Olah, Jonathon Shlens, Benoit Steiner, Ilya Sutskever, Kunal Talwar, Paul Tucker, Vincent Vanhoucke, Vijay Vasudevan, Fernanda Viégas, Oriol Vinyals, Pete Warden, Martin Wattenberg, Martin Wicke, Yuan Yu, and Xiaoqiang Zheng. TensorFlow: Large-scale machine learning on heterogeneous systems, 2015. Software available from tensorflow.org. ``` > 原文:http://tensorflow.org/resources/bib.md 翻译:[Jim-Zenn](https://github.com/Jim-Zenn) 校对:[Wiki](https://github.com/jikexueyuanwiki) ================================================ FILE: SOURCE/resources/dims_types.md ================================================ # 张量的阶、形状、数据类型 TensorFlow用张量这种数据结构来表示所有的数据.你可以把一个张量想象成一个n维的数组或列表.一个张量有一个静态类型和动态类型的维数.张量可以在图中的节点之间流通. ## 阶 在TensorFlow系统中,张量的维数来被描述为*阶*.但是张量的阶和矩阵的阶并不是同一个概念.张量的阶(有时是关于如*顺序*或*度数*或者是*n维*)是张量维数的一个数量描述.比如,下面的张量(使用Python中list定义的)就是2阶. ``` t = [[1, 2, 3], [4, 5, 6], [7, 8, 9]] ``` 你可以认为一个二阶张量就是我们平常所说的矩阵,一阶张量可以认为是一个向量.对于一个二阶张量你可以用语句`t[i, j]`来访问其中的任何元素.而对于三阶张量你可以用't[i, j, k]'来访问其中的任何元素. 
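
As a small illustration of rank, the nesting depth of a plain Python list can be computed directly. The helpers `rank` and `shape` below are hypothetical names written for this sketch, not TensorFlow functions, and they assume a non-empty, rectangular nested list.

```python
def rank(t):
    """Nesting depth of a non-empty rectangular nested list; a scalar has rank 0."""
    depth = 0
    while isinstance(t, list):
        depth += 1
        t = t[0]   # assumes every level is non-empty
    return depth


def shape(t):
    """Length of each nesting level, outermost first."""
    dims = []
    while isinstance(t, list):
        dims.append(len(t))
        t = t[0]
    return dims


s = 483
v = [1.1, 2.2, 3.3]
m = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

ranks = [rank(s), rank(v), rank(m)]
m_shape = shape(m)
# With plain nested lists, an element is reached one index at a time.
element = m[1][2]
```

Note that plain Python lists are indexed as `m[i][j]`; the `t[i, j]` notation in the text is the multi-index form used for array-like objects.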
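Rank | Math entity | Python example
--- | --- | ---
0 | Scalar (magnitude only) | `s = 483`
1 | Vector (magnitude and direction) | `v = [1.1, 2.2, 3.3]`
2 | Matrix (table of numbers) | `m = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]`
3 | 3-Tensor (cube of numbers) | `t = [[[2], [4], [6]], [[8], [10], [12]], [[14], [16], [18]]]`
n | n-Tensor (you get the idea) | `....`

## Shape

The TensorFlow documentation uses three notational conventions to describe tensor dimensionality: rank, shape, and dimension number. The following table shows how they relate to one another:

Rank | Shape | Dimension number | Example
--- | --- | --- | ---
0 | [] | 0-D | A 0-D tensor. A scalar.
1 | [D0] | 1-D | A 1-D tensor with shape [5].
2 | [D0, D1] | 2-D | A 2-D tensor with shape [3, 4].
3 | [D0, D1, D2] | 3-D | A 3-D tensor with shape [1, 4, 3].
n | [D0, D1, ... Dn-1] | n-D | A tensor with shape [D0, D1, ... Dn-1].

Shapes can be represented by Python lists or tuples of ints, or with the [`TensorShape` class](../api_docs/python/framework.md#TensorShape).

## Data types

In addition to dimensionality, tensors have a data type. You can assign any one of the following data types to a tensor:

Data type | Python type | Description
--- | --- | ---
`DT_FLOAT` | `tf.float32` | 32-bit floating point.
`DT_DOUBLE` | `tf.float64` | 64-bit floating point.
`DT_INT64` | `tf.int64` | 64-bit signed integer.
`DT_INT32` | `tf.int32` | 32-bit signed integer.
`DT_INT16` | `tf.int16` | 16-bit signed integer.
`DT_INT8` | `tf.int8` | 8-bit signed integer.
`DT_UINT8` | `tf.uint8` | 8-bit unsigned integer.
`DT_STRING` | `tf.string` | Variable-length byte array. Each element of the tensor is a byte array.
`DT_BOOL` | `tf.bool` | Boolean.
`DT_COMPLEX64` | `tf.complex64` | Complex number made of two 32-bit floating points: real and imaginary parts.
`DT_QINT32` | `tf.qint32` | 32-bit signed integer used in quantized Ops.
`DT_QINT8` | `tf.qint8` | 8-bit signed integer used in quantized Ops.
`DT_QUINT8` | `tf.quint8` | 8-bit unsigned integer used in quantized Ops.
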
Original: [Tensor Ranks, Shapes, and Types](http://www.tensorflow.org/resources/dims_types.md) Translation: [nb312](https://github.com/nb312) Proofreading: [lonlonago](https://github.com/lonlonago)

================================================
FILE: SOURCE/resources/faq.md
================================================
# Frequently Asked Questions

This document answers some of the frequently asked questions about TensorFlow. If your question is not covered here, you may find an answer in the [community resources](tensorflow-zh/SOURCE/resoreces/index.md).

## Contents

### [Frequently Asked Questions](#AUTOGENERATED-frequently-asked-questions)

* [Building a TensorFlow graph](#AUTOGENERATED-building-a-tensorflow-graph)
* [Running a TensorFlow computation](#AUTOGENERATED-running-a-tensorflow-computation)
* [Variables](#AUTOGENERATED-variables)
* [Tensor shapes](#AUTOGENERATED-tensor-shapes)
* [TensorBoard](#AUTOGENERATED-tensorboard)
* [Extending TensorFlow](#AUTOGENERATED-extending-tensorflow)
* [Miscellaneous](#AUTOGENERATED-miscellaneous)

## Building a TensorFlow graph

See also the [API documentation on building graphs](../api_docs/python/framework.md).

#### Why does `c = tf.matmul(a, b)` not execute the matrix multiplication immediately?

In the TensorFlow Python API, `a`, `b`, and `c` are [`Tensor`](../api_docs/python/framework.md#Tensor) objects. A `Tensor` object is a symbolic handle to the result of an operation; it does not actually hold the values of the operation's output. Instead, TensorFlow encourages users to build up complicated expressions (such as an entire neural network and its gradients) as a data flow graph. You then hand the computation of the entire data flow graph to a TensorFlow [`Session`](../api_docs/python/client.md#Session), which can run the whole computation much more efficiently than executing the operations one by one.

#### How are devices named?

The supported device names are `"/device:CPU:0"` (or `"/cpu:0"`) for the CPU device, and `"/device:GPU:i"` (or `"/gpu:i"`) for the *i*th GPU device.

#### How do I place operations on a particular device?

To place a group of operations on a device, create them within a [`with tf.device(name):`](../api_docs/python/framework.md#device) context. See the how-to documentation on [using GPUs with TensorFlow](../how_tos/using_gpu/index.md) for details of how TensorFlow assigns operations to devices, and the [CIFAR-10 tutorial](../tutorials/deep_cnn/index.md) for an example model that uses multiple GPUs.

#### What are the different types of tensors that are available?

TensorFlow supports a variety of data types and tensor shapes. See the [ranks, shapes, and type reference](../resources/dims_types.md) for more details.

## Running a TensorFlow computation

See also the [API documentation on running graphs](../api_docs/python/client.md).
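#### What's the deal with feeding and placeholders?

Feeding is a mechanism in the TensorFlow Session API that allows you to substitute different values for one or more tensors at run time. The `feed_dict` argument to [`Session.run()`](../api_docs/python/client.md#Session.run) is a dictionary that maps [`Tensor`](../api_docs/python/framework.md) objects to numpy arrays (and some other types), which are used as the values of those tensors during the execution of a step.

Often, certain tensors, such as inputs, will always be fed. The [`tf.placeholder()`](../api_docs/python/io_ops.md#placeholder) operation allows you to define tensors that must be fed, and optionally to constrain their shape as well. See the [beginners' MNIST tutorial](../tutorials/mnist/beginners/index.md) for an example of how placeholders and feeding can be used to provide the training data for a neural network.

#### What is the difference between `Session.run()` and `Tensor.eval()`?

If `t` is a [`Tensor`](../api_docs/python/framework.md#Tensor) object, [`t.eval()`](../api_docs/python/framework.md#Tensor.eval) is shorthand for [`sess.run(t)`](../api_docs/python/client.md#Session.run) (where `sess` is the current [default session](../api_docs/python/client.md#get_default_session)). The following two snippets of code are equivalent:

```python
import tensorflow as tf

# Using `Session.run()`.
sess = tf.Session()
c = tf.constant(5.0)
print(sess.run(c))

# Using `Tensor.eval()`.
c = tf.constant(5.0)
with tf.Session():
  print(c.eval())
```

In the second example, the session acts as a [context manager](https://docs.python.org/2.7/reference/compound_stmts.html#with), which installs it as the default session for the lifetime of the `with` block. The context manager approach can lead to more concise code for simple use cases (such as unit tests); if your code deals with multiple graphs and sessions, explicit calls to `Session.run()` may be more straightforward.

#### Do Sessions have a lifetime? What about intermediate tensors?

Sessions can own resources, such as [variables](../api_docs/python/state_ops.md#Variable), [queues](../api_docs/python/io_ops.md#QueueBase), and [readers](../api_docs/python/io_ops.md#ReaderBase), and these resources can use a significant amount of memory. These resources (and the associated memory) are released when the session is closed by calling [`Session.close()`](../api_docs/python/client.md#Session.close).

The intermediate tensors that are created as part of a call to [`Session.run()`](../api_docs/python/client.md) are freed at or before the end of the call.

#### Can I run distributed training on multiple computers?

The initial open-source release of TensorFlow supports multiple devices (CPUs and GPUs) in a single computer. We are also working on a distributed version: if you are interested, please let us know so that we can prioritize accordingly.

#### Does the runtime parallelize parts of graph execution?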
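The TensorFlow runtime parallelizes graph execution across many different dimensions:

* Individual operations have parallel implementations, using multiple cores in a CPU or multiple threads in a GPU.
* Independent nodes in a TensorFlow graph can run in parallel on multiple devices, which makes it possible to speed up [CIFAR-10 training using multiple GPUs](../tutorials/deep_cnn/index.md).
* The Session API allows multiple concurrent steps (i.e. calls to [Session.run()](../api_docs/python/client.md#Session.run)) to run in parallel. This enables the runtime to achieve higher throughput if a single step does not use all of the resources in your computer.

#### Which client languages are supported in TensorFlow?

TensorFlow is designed to support multiple client languages. Currently, the best-supported client language is [Python](../api_docs/python/index.md). The [C++ client API](../api_docs/cc/index.md) provides an interface for launching graphs and running steps; we also have an experimental API for [building graphs in C++](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/cc/tutorials/example_trainer.cc).

We would like to support more client languages, as determined by community interest. TensorFlow has a [C-based client API](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/public/tensor_c_api.h) that makes it easy to build a client in many different languages. We invite contributions of new language bindings.

#### Does TensorFlow make use of all the devices (GPUs and CPUs) available on my machine?

TensorFlow supports multiple GPUs and CPUs. See the how-to documentation on [using GPUs with TensorFlow](../how_tos/using_gpu/index.md) for details of how TensorFlow assigns operations to devices, and the [CIFAR-10 tutorial](../tutorials/deep_cnn/index.md) for an example model that uses multiple GPUs.

Note that TensorFlow only uses GPU devices with a compute capability greater than 3.5.

#### Why does `Session.run()` hang when using a reader or a queue?

The [reader](../api_docs/python/io_ops.md#ReaderBase) and [queue](../api_docs/python/io_ops.md#QueueBase) classes provide special operations that can *block* until input (or free space in a bounded queue) becomes available. These operations allow you to build sophisticated [input pipelines](../how_tos/reading_data/index.md), at the cost of making the TensorFlow computation somewhat more complicated. See the how-to documentation on [using `QueueRunner` objects to drive queues and readers](../how_tos/reading_data/index.md#QueueRunners) for more information on how to use them.

## Variables

See also the how-to documentation on [variables](../how_tos/variables/index.md) and [variable scopes](../how_tos/variable_scope/index.md), and the [API documentation for variables](../api_docs/python/state_ops.md).

#### What is the lifetime of a variable?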
A variable is created when you first run the [`tf.Variable.initializer`](../api_docs/python/state_ops.md#Variable.initializer) operation for that variable in a session. It is destroyed when that [`session is closed`](../api_docs/python/client.md#Session.close).

#### How do variables behave when they are concurrently accessed?

Variables allow concurrent read and write operations. The value read from a variable may change if it is concurrently updated. By default, concurrent assignment operations to a variable are allowed to run with no mutual exclusion. To acquire a lock when assigning to a variable, pass `use_locking=True` to [`Variable.assign()`](../api_docs/python/state_ops.md#Variable.assign).

## Tensor shapes

See also the [`TensorShape` API documentation](../api_docs/python/framework.md#TensorShape).

#### In Python, how do I determine the shape of a tensor?

In TensorFlow, a tensor has both a static shape and a dynamic shape. The static shape can be read using the [`tf.Tensor.get_shape()`](../api_docs/python/framework.md#Tensor.get_shape) method: this shape is inferred from the operations that were used to create the tensor, and may be [partially complete](../api_docs/python/framework.md#TensorShape). If the static shape is not fully defined, the dynamic shape of a tensor can be determined by evaluating [`tf.shape(t)`](../api_docs/python/array_ops.md#shape).

#### What is the difference between `x.set_shape()` and `x = tf.reshape(x)`?

The [`tf.Tensor.set_shape()`](../api_docs/python/framework.md) method updates the static shape of a `Tensor` object, and it is typically used to provide additional shape information when this cannot be inferred directly. It does not change the dynamic shape of the tensor.

The [`tf.reshape()`](../api_docs/python/array_ops.md#reshape) operation creates a new tensor with a different dynamic shape.

#### How do I build a graph that works with variable batch sizes?
It is often useful to build a graph that works with variable batch sizes, for example so that the same code can be used for (mini-)batch training and single-instance inference. The resulting graph can be [saved as a protocol buffer](../api_docs/python/framework.md#Graph.as_graph_def) and [imported into another program](../api_docs/python/framework.md#import_graph_def).

When building a variable-size graph, the most important thing to remember is not to encode the batch size as a Python constant, but instead to use a symbolic `Tensor` to represent it. The following tips may be useful:

* Use [`batch_size = tf.shape(input)[0]`](../api_docs/python/array_ops.md#shape) to extract the batch dimension from a `Tensor` called `input`, and store it in a `Tensor` called `batch_size`.
* Use [`tf.reduce_mean()`](../api_docs/python/math_ops.md#reduce_mean) instead of `tf.reduce_sum(...) / batch_size`.
* If you use [placeholders for feeding input](../how_tos/reading_data/index.md#Feeding), you can specify a variable batch dimension by creating the placeholder with [`tf.placeholder(..., shape=[None, ...])`](../api_docs/python/io_ops.md#placeholder). The `None` element of the shape corresponds to a variable-sized dimension.

## TensorBoard

#### How can I visualize a TensorFlow graph?

See the [graph visualization tutorial](../how_tos/graph_viz/index.md).

#### What is the simplest way to send data to TensorBoard?

Add summary ops to your TensorFlow graph, and use a [`SummaryWriter`](../api_docs/python/train.md#SummaryWriter) to write these summaries to a log directory. Then start TensorBoard with the following command:

`python tensorflow/tensorboard/tensorboard.py --logdir=path/to/log-directory`

For more details, see the [Summaries and TensorBoard tutorial](../how_tos/summaries_and_tensorboard/index.md).

## Extending TensorFlow

See also the how-to documentation on [adding a new operation to TensorFlow](../how_tos/adding_an_op/index.md).

#### My data is in a custom format. How do I read it using TensorFlow?
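There are two main options for dealing with data in a custom format.

The easier option is to write parsing code in Python that transforms the data into a numpy array, and then feed a [`tf.placeholder()`](../api_docs/python/io_ops.md#placeholder) tensor with that data. See the documentation on [using placeholders for input](../how_tos/reading_data/index.md#Feeding) for more details. This approach is easy to get up and running, but the parsing can become a performance bottleneck.

The more efficient option is to [add a new op written in C++](../how_tos/adding_an_op/index.md) that parses your data format. The [guide to handling new data formats](../how_tos/new_data_formats/index.md) has more information about the steps involved.

#### How do I define an operation that takes a variable number of inputs?

The TensorFlow op registration mechanism allows you to define inputs that are a single tensor, a list of tensors with the same type (for example when adding together a variable-length list of tensors), or a list of tensors with different types (for example when enqueuing a tuple of tensors to a queue). See the how-to documentation on [adding an op with a list of inputs or outputs](../how_tos/adding_an_op/index.md#list-input-output) for more details of how to define these different input types.

## Miscellaneous

#### Does TensorFlow work with Python 3?

We have only tested TensorFlow using Python 2.7. We are aware of some changes that will be required for Python 3 compatibility, and welcome contributions towards this effort.

#### What is TensorFlow's coding style convention?

The TensorFlow Python API adheres to the [PEP8](https://www.python.org/dev/peps/pep-0008/) conventions.* In particular, we use `CamelCase` names for classes, and `snake_case` names for functions, methods, and properties. We also adhere to the [Google Python style guide](https://google.github.io/styleguide/pyguide.html).

The TensorFlow C++ code base adheres to the [Google C++ style guide](http://google.github.io/styleguide/cppguide.html).

(* With one exception: we use 2-space indentation instead of 4-space indentation.)

Original: [Frequently Asked Questions](http://tensorflow.org/resources/faq.md) Translation: [Terence Cooper](https://github.com/TerenceCooper) Proofreading: [Wiki](https://github.com/jikexueyuanwiki)

================================================
FILE: SOURCE/resources/glossary.md
================================================
# Glossary

### Broadcasting operation

An operation that uses [numpy-style broadcasting](http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html) to make the shapes of its tensor arguments compatible.

### Device

A piece of hardware that can run computation and has its own address space, such as a GPU or CPU.

### eval

A method of `Tensor` that returns the value of the `Tensor`, triggering any graph computation required to determine that value. You may only call eval() on a `Tensor` in a graph that has been launched in a session.

### Feed

TensorFlow's mechanism for patching a tensor directly into any node in a graph launched in a session. You apply feeds when you trigger the execution of a graph, not when you build the graph. A feed temporarily replaces a node with a tensor value. You supply feed data as an argument to the run() or eval() call that initiates computation. After the call completes, the substituted feed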
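disappears, while the original node definition remains. You usually designate specific nodes to be feed nodes by creating them with tf.placeholder(). See [Basic Usage](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/get_started/basic_usage.md) for more information.

### Fetch

TensorFlow's mechanism for retrieving the outputs of operations. You request fetches when you trigger the execution of a graph, not when you build the graph. To fetch the tensor value of one or more nodes, execute the graph by calling run() on a Session object and passing a list of the nodes to retrieve. See [Basic Usage](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/get_started/basic_usage.md) for more information.

### Graph

Describes a computation as a directed acyclic graph (DAG). Nodes in the graph represent operations that must be performed. Edges in the graph represent either data or control dependencies. GraphDef is the protocol (API) used to describe a graph to the system, and it consists of a collection of NodeDefs. A GraphDef may be converted to a graph object that is easier to operate on.

### IndexedSlices

In the Python API, TensorFlow's representation of a tensor that is sparse along only its first dimension. If the tensor is k-dimensional, an IndexedSlices instance logically represents a collection of (k-1)-dimensional slices along the tensor's first dimension. The indices of the slices are stored concatenated into a single 1-dimensional vector, and the corresponding slices are concatenated to form a single k-dimensional tensor. Use SparseTensor if the sparsity is not restricted to the first dimension.

### Node

An element of a graph.

Describes how to invoke a specific operation as one node in a specific computation graph, including the values for any attributes needed to configure the operation. For operations that are polymorphic, the attributes include sufficient information to completely determine the signature of the Node. See graph.proto for details.

### Op/operation

In the TensorFlow runtime: a type of computation such as add, matmul, or concat. You can add new ops to the runtime as described in [how to add an op](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/how_tos/adding_an_op/index.md).

In the Python API: a node in the graph. Ops are represented by instances of the class [tf.Operation](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/framework.md#Operation). The type property of an Operation indicates the kind of operation the node runs, such as add or matmul.

### Run

The action of executing ops in a launched graph. Requires that the graph be launched in a session.

In the Python API: a method of the Session class, [tf.Session.run](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/client.md#Session). You can pass tensors to feed and to fetch to the run() call.

In the C++ API: a method of the [tensorflow::Session](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/client.md#Session) class.

### Session

The first step in launching a graph is to create a Session object. A Session provides methods to execute ops in the graph.

In the Python API: [tf.Session](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/client.md#Session).

In the C++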
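API: [tensorflow::Session](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/cc/ClassSession.md) is the class used to create a graph and run operations.

### Shape

The dimensions of a Tensor and their sizes.

In a launched graph: a property of the tensors that flow between nodes. Some ops have strong requirements on the shape of their inputs and report errors if these are not met.

In the Python API: an attribute of a Tensor in the graph construction API. During construction, the shape of a tensor may be only partially known, or even entirely unknown. See [tf.TensorShape](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/framework.md#TensorShape).

In the C++ API: a class used to represent the dimensions of a Tensor, [tensorflow::TensorShape](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/cc/ClassTensorShape.md).

### SparseTensor

In the Python API, TensorFlow's representation of a tensor that is sparse in arbitrary positions. A SparseTensor stores only the non-empty values along with their indices, using a dictionary-of-keys format. In other words, if there are m non-empty values, it maintains a length-m vector of values and a matrix with m rows of indices. For efficiency, SparseTensor requires the indices to be sorted along increasing dimension number, i.e. in row-major order. Use IndexedSlices if the sparsity is only along the first dimension.

### Tensor

A Tensor is a typed multi-dimensional array. For example, a 4-D array of floating point numbers can represent a mini-batch of images with dimensions [batch, height, width, channel].

In a launched graph: the type of the data that flows between nodes.

In the Python API: the class used to represent the inputs and outputs of ops added to the graph, [tf.Tensor](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/framework.md#Tensor). Instances of this class do not hold data.

In the C++ API: the class used to represent tensors returned from a [Session::Run()](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/cc/ClassSession.md) call, [tensorflow::Tensor](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/cc/ClassTensor.md). Instances of this class hold data.

Original: [Glossary](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/g3doc/resources/glossary.md) Translation: [leege100](https://github.com/leege100) Proofreading: [lonlonago](https://github.com/lonlonago)

================================================
FILE: SOURCE/resources/overview.md
================================================
# Additional Resources

## TensorFlow White Paper

Additional details about the TensorFlow programming model and the underlying implementation can be found in this white paper:

* [TensorFlow: Large-scale machine learning on heterogeneous systems](http://download.tensorflow.org/paper/whitepaper2015.pdf)

### Citation

If you use TensorFlow in your research and would like to cite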
TensorFlow, we suggest you cite the paper above. You can use this [BibTeX entry](../resources/bib.md). As the project progresses, we may add new papers to this suggested citation list.

## Community

### Discuss

* [GitHub](https://github.com/tensorflow/tensorflow)
* [Stack Overflow](https://stackoverflow.com/questions/tagged/tensorflow)
* [TensorFlow discuss mailing list](https://groups.google.com/a/tensorflow.org/d/forum/discuss)

### Report Issues

* [TensorFlow issues](https://github.com/tensorflow/tensorflow/issues)

### Development

* If you are interested in contributing code to TensorFlow, please [read the contributing guide](https://github.com/tensorflow/tensorflow/blob/master/CONTRIBUTING.md).

> Original: [http://tensorflow.org/resources/bib.md](http://tensorflow.org/resources/bib.md)
>
> Translation: [Jim-Zenn](https://github.com/Jim-Zenn)
>
> Proofreading: [lonlonago](https://github.com/lonlonago)

================================================
FILE: SOURCE/resources/uses.md
================================================
# Example Uses

This page describes some of the current uses of the TensorFlow system.

> If you are using TensorFlow for research, for education, or for production
> usage in some product, we would love to add something about your usage here.
> Please feel free to email us a brief description of how you are using TensorFlow,
> or send us a pull request to add an entry to this file.

Listed below are some of the ways TensorFlow is being used.

* **RankBrain**
    * **Organization**: Google
    * **Domain**: Information Retrieval
    * **Description**: A large-scale deployment of deep neural networks for search ranking on www.google.com.
    * **More info**: ["Google Turning Over Its Lucrative Search to AI Machines"](http://www.bloomberg.com/news/articles/2015-10-26/google-turning-its-lucrative-web-search-over-to-ai-machines)
* **Inception Image Classification Model**
    * **Organization**: Google
    * **Description**: Research on highly accurate computer vision models, which won the 2014 ImageNet image classification challenge (ILSVRC 2014).
    * **More info**: The baseline model is described in this [Arxiv paper](http://arxiv.org/abs/1409.4842)
* **SmartReply**
    * **Organization**: Google
    * **Description**: A deep LSTM model that automatically generates email replies.
    * **More info**: [Google research blog post](http://googleresearch.blogspot.com/2015/11/computer-respond-to-this-email.html)
* **Massively Multitask Networks for Drug Discovery**
    * **Organization**: Google and Stanford University
    * **Domain**: Drug discovery
    * **Description**: A deep neural network model for drug discovery.
    * **More info**: [Arxiv paper](http://arxiv.org/abs/1502.02072)
* **On-Device Computer Vision for OCR**
    * **Organization**: Google
    * **Description**:
An on-device computer vision model that performs optical character recognition (OCR) to enable real-time translation.
    * **More info**: [Google Research blog post](http://googleresearch.blogspot.com/2015/07/how-google-translate-squeezes-deep.html)

Original: [http://tensorflow.org/resources/uses.md](http://tensorflow.org/resources/uses.md) Translation: [andyiac](https://github.com/andyiac) Proofreading: [lonlonago](https://github.com/lonlonago)

================================================
FILE: SOURCE/tutorials/BUILD
================================================
# Description:
# Top-level tutorials files

package(default_visibility = ["//tensorflow:internal"])

licenses(["notice"])  # Apache 2.0

exports_files(["LICENSE"])

filegroup(
    name = "all_files",
    srcs = glob(
        ["**/*"],
        exclude = [
            "**/METADATA",
            "**/OWNERS",
        ],
    ),
)

================================================
FILE: SOURCE/tutorials/__init__.py
================================================

================================================
FILE: SOURCE/tutorials/deep_cnn/cifar_tensorboard.html
================================================
TensorBoard Demo

================================================
FILE: SOURCE/tutorials/deep_cnn/index.md
================================================
# Convolutional Neural Networks

> **NOTE:** This tutorial is intended for *advanced* users of TensorFlow and assumes expertise and experience in machine learning.

## Overview

CIFAR-10 classification is a common benchmark problem in machine learning. The problem is to classify RGB 32x32 pixel images across 10 categories: ```airplane, automobile, bird, cat, deer, dog, frog, horse, ship, and truck.```

![CIFAR-10 Samples](./cifar_samples.png "CIFAR-10 Samples, from http://www.cs.toronto.edu/~kriz/cifar.html")

For more details refer to the [CIFAR-10 page](http://www.cs.toronto.edu/~kriz/cifar.html) and a [Tech Report](http://www.cs.toronto.edu/~kriz/learning-features-2009-TR.pdf) by Alex Krizhevsky.

### Goals

The goal of this tutorial is to build a relatively small convolutional neural network (CNN) for recognizing images.
In the process, this tutorial: 1. Highlights a canonical organization for network architecture, training and evaluation. 2. Provides a template for constructing larger and more sophisticated models. The reason CIFAR-10 was selected was that it is complex enough to exercise much of TensorFlow's ability to scale to large models. At the same time, the model is small enough to train fast, which is ideal for trying out new ideas and experimenting with new techniques. ### Highlights of the Tutorial The CIFAR-10 tutorial demonstrates several important constructs for designing larger and more sophisticated models in TensorFlow: * Core mathematical components including [convolution]( ../../api_docs/python/nn.md#conv2d), [rectified linear activations]( ../../api_docs/python/nn.md#relu), [max pooling]( ../../api_docs/python/nn.md#max_pool) and [local response normalization]( ../../api_docs/python/nn.md#local_response_normalization). * [Visualization](../../how_tos/summaries_and_tensorboard/index.md) of network activities during training, including input images, losses and distributions of activations and gradients. * Routines for calculating the [moving average](../../api_docs/python/train.md#ExponentialMovingAverage) of learned parameters and using these averages during evaluation to boost predictive performance. * Implementation of a [learning rate schedule](../../api_docs/python/train.md#exponential_decay) that systematically decrements over time. * Prefetching [queues](../../api_docs/python/io_ops.md#shuffle_batch) for input data to isolate the model from disk latency and expensive image pre-processing. We also provide a multi-GPU version of the model which demonstrates: * Configuring a model to train across multiple GPU cards in parallel. * Sharing and updating variables among multiple GPUs. We hope that this tutorial provides a launch point for building larger CNNs for vision tasks on TensorFlow. 
### Model Architecture The model in this CIFAR-10 tutorial is a multi-layer architecture consisting of alternating convolutions and nonlinearities. These layers are followed by fully connected layers leading into a softmax classifier. The model follows the architecture described by [Alex Krizhevsky](https://code.google.com/p/cuda-convnet/), with a few differences in the top few layers. This model achieves a peak performance of about 86% accuracy within a few hours of training time on a GPU. Please see [below](#evaluating-a-model) and the code for details. It consists of 1,068,298 learnable parameters and requires about 19.5M multiply-add operations to compute inference on a single image. ## Code Organization The code for this tutorial resides in [`tensorflow/models/image/cifar10/`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/models/image/cifar10/). File | Purpose --- | --- [`cifar10_input.py`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/models/image/cifar10/cifar10_input.py) | Reads the native CIFAR-10 binary file format. [`cifar10.py`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/models/image/cifar10/cifar10.py) | Builds the CIFAR-10 model. [`cifar10_train.py`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/models/image/cifar10/cifar10_train.py) | Trains a CIFAR-10 model on a CPU or GPU. [`cifar10_multi_gpu_train.py`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/models/image/cifar10/cifar10_multi_gpu_train.py) | Trains a CIFAR-10 model on multiple GPUs. [`cifar10_eval.py`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/models/image/cifar10/cifar10_eval.py) | Evaluates the predictive performance of a CIFAR-10 model. ## CIFAR-10 Model The CIFAR-10 network is largely contained in [`cifar10.py`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/models/image/cifar10/cifar10.py). 
The complete training graph contains roughly 765 operations. We find that we can make the code most reusable by constructing the graph with the following modules: 1. [**Model inputs:**](#model-inputs) `inputs()` and `distorted_inputs()` add operations that read and preprocess CIFAR images for evaluation and training, respectively. 1. [**Model prediction:**](#model-prediction) `inference()` adds operations that perform inference, i.e. classification, on supplied images. 1. [**Model training:**](#model-training) `loss()` and `train()` add operations that compute the loss, gradients, variable updates and visualization summaries. ### Model Inputs The input part of the model is built by the functions `inputs()` and `distorted_inputs()` which read images from the CIFAR-10 binary data files. These files contain fixed byte length records, so we use [`tf.FixedLengthRecordReader`](../../api_docs/python/io_ops.md#FixedLengthRecordReader). See [Reading Data](../../how_tos/reading_data/index.md#reading-from-files) to learn more about how the `Reader` class works. The images are processed as follows: * They are cropped to 24 x 24 pixels, centrally for evaluation or [randomly](../../api_docs/python/image.md#random_crop) for training. * They are [approximately whitened](../../api_docs/python/image.md#per_image_whitening) to make the model insensitive to dynamic range. For training, we additionally apply a series of random distortions to artificially increase the data set size: * [Randomly flip](../../api_docs/python/image.md#random_flip_left_right) the image from left to right. * Randomly distort the [image brightness](../../api_docs/python/image.md#random_brightness). * Randomly distort the [image contrast](../../api_docs/python/image.md#tf_image_random_contrast). Please see the [Images](../../api_docs/python/image.md) page for the list of available distortions. 
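
The preprocessing steps above can be sketched in plain Python to make the arithmetic concrete. This is a minimal illustration, not the tutorial's implementation: it assumes a single-channel image stored as a list of rows, the helper names are hypothetical, and the whitening formula `(x - mean) / max(stddev, 1 / sqrt(N))` is an assumption about how the approximate whitening guards against division by zero on uniform images.

```python
import math
import random


def central_crop(image, size):
    """Crop a size x size window from the centre of a 2-D image (list of rows)."""
    height, width = len(image), len(image[0])
    top = (height - size) // 2
    left = (width - size) // 2
    return [row[left:left + size] for row in image[top:top + size]]


def random_crop(image, size, rng):
    """Crop a size x size window at a random position (training only)."""
    height, width = len(image), len(image[0])
    top = rng.randint(0, height - size)
    left = rng.randint(0, width - size)
    return [row[left:left + size] for row in image[top:top + size]]


def random_flip_left_right(image, rng):
    """Mirror the image horizontally with probability 0.5."""
    if rng.random() < 0.5:
        return [row[::-1] for row in image]
    return image


def whiten(image):
    """Shift to zero mean and scale to (approximately) unit variance."""
    pixels = [p for row in image for p in row]
    n = len(pixels)
    mean = sum(pixels) / float(n)
    variance = sum((p - mean) ** 2 for p in pixels) / float(n)
    # Lower bound on the divisor: an assumption made to avoid dividing by zero.
    adjusted_std = max(math.sqrt(variance), 1.0 / math.sqrt(n))
    return [[(p - mean) / adjusted_std for p in row] for row in image]


rng = random.Random(0)
image = [[float(rng.randint(0, 255)) for _ in range(32)] for _ in range(32)]

# Evaluation path: deterministic central crop, then whitening.
eval_image = whiten(central_crop(image, 24))
# Training path: random crop and random flip, then whitening.
train_image = whiten(random_flip_left_right(random_crop(image, 24, rng), rng))
```

The evaluation path is deterministic, while the training path yields a different 24 x 24 view of the same source image on each call, which is what artificially enlarges the data set.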
We also attach an [`image_summary`](../../api_docs/python/train.md#image_summary) to the images so that we may visualize them in TensorBoard. This is a good practice to verify that inputs are built correctly.
Reading images from disk and distorting them can use a non-trivial amount of processing time. To prevent these operations from slowing down training, we run them inside 16 separate threads which continuously fill a TensorFlow [queue](../../api_docs/python/io_ops.md#shuffle_batch).

### Model Prediction

The prediction part of the model is constructed by the `inference()` function which adds operations to compute the *logits* of the predictions. That part of the model is organized as follows:

Layer Name | Description
--- | ---
`conv1` | [convolution](../../api_docs/python/nn.md#conv2d) and [rectified linear](../../api_docs/python/nn.md#relu) activation.
`pool1` | [max pooling](../../api_docs/python/nn.md#max_pool).
`norm1` | [local response normalization](../../api_docs/python/nn.md#local_response_normalization).
`conv2` | [convolution](../../api_docs/python/nn.md#conv2d) and [rectified linear](../../api_docs/python/nn.md#relu) activation.
`norm2` | [local response normalization](../../api_docs/python/nn.md#local_response_normalization).
`pool2` | [max pooling](../../api_docs/python/nn.md#max_pool).
`local3` | [fully connected layer with rectified linear activation](../../api_docs/python/nn.md).
`local4` | [fully connected layer with rectified linear activation](../../api_docs/python/nn.md).
`softmax_linear` | linear transformation to produce logits.

Here is a graph generated from TensorBoard describing the inference operation:
> **EXERCISE**: The output of `inference` is un-normalized logits. Try editing the network architecture to return normalized predictions using [`tf.nn.softmax()`](../../api_docs/python/nn.md#softmax).

The `inputs()` and `inference()` functions provide all the components necessary to perform evaluation on a model. We now shift our focus towards building operations for training a model.

> **EXERCISE:** The model architecture in `inference()` differs slightly from the CIFAR-10 model specified in [cuda-convnet](https://code.google.com/p/cuda-convnet/). In particular, the top layers of the cuda-convnet model are locally connected rather than fully connected. Try editing the architecture to replicate that locally connected model exactly.

### Model Training

The usual method for training a network to perform N-way classification is [multinomial logistic regression](https://en.wikipedia.org/wiki/Multinomial_logistic_regression), also known as *softmax regression*. Softmax regression applies a [softmax](../../api_docs/python/nn.md#softmax) nonlinearity to the output of the network and calculates the [cross-entropy](../../api_docs/python/nn.md#softmax_cross_entropy_with_logits) between the normalized predictions and a [1-hot encoding](../../api_docs/python/sparse_ops.md#sparse_to_dense) of the label. For regularization, we also apply the usual [weight decay](../../api_docs/python/nn.md#l2_loss) losses to all learned variables. The objective function for the model is the sum of the cross entropy loss and all these weight decay terms, as returned by the `loss()` function.

We visualize it in TensorBoard with a [scalar_summary](../../api_docs/python/train.md#scalar_summary):

![CIFAR-10 Loss](./cifar_loss.png "CIFAR-10 Total Loss")

We train the model using the standard [gradient descent](https://en.wikipedia.org/wiki/Gradient_descent) algorithm (see [Training](../../api_docs/python/train.md) for other methods) with a learning rate that [exponentially decays](../../api_docs/python/train.md#exponential_decay) over time.
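
To make the learning rate schedule concrete, here is a small pure-Python sketch of exponential decay. It assumes the usual form `rate = initial_rate * decay_rate ** (global_step / decay_steps)`, with an optional staircase mode that uses integer division so the rate drops in discrete steps. The function name and the numeric values below are illustrative only and are not taken from `cifar10.py`.

```python
def exponential_decay(initial_rate, global_step, decay_steps, decay_rate,
                      staircase=False):
    """Return the decayed learning rate at a given training step."""
    if staircase:
        # Integer division: the rate stays flat between multiples of decay_steps.
        exponent = global_step // decay_steps
    else:
        exponent = float(global_step) / decay_steps
    return initial_rate * decay_rate ** exponent


# Hypothetical schedule: start at 0.1 and shrink tenfold every 1000 steps.
for step in (0, 500, 1000, 1500, 2000):
    smooth = exponential_decay(0.1, step, 1000, 0.1)
    stepped = exponential_decay(0.1, step, 1000, 0.1, staircase=True)
    print("step %4d  smooth %.5f  staircase %.5f" % (step, smooth, stepped))
```

The staircase variant keeps the rate constant within each interval of `decay_steps` steps, which produces a step-shaped curve rather than a smooth one.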
![CIFAR-10 Learning Rate Decay](./cifar_lr_decay.png "CIFAR-10 Learning Rate Decay")

The `train()` function adds the operations needed to minimize the objective by calculating the gradient and updating the learned variables (see [`GradientDescentOptimizer`](../../api_docs/python/train.md#GradientDescentOptimizer) for details). It returns an operation that executes all the calculations needed to train and update the model for one batch of images.

## Launching and Training the Model

Now that we have built the model, let's launch it and run the training operation with the script `cifar10_train.py`.

```shell
python cifar10_train.py
```

**NOTE:** The first time you run any target in the CIFAR-10 tutorial, the CIFAR-10 dataset is automatically downloaded. The data set is ~160MB so you may want to grab a quick cup of coffee for your first run.

You should see the output:

```shell
Filling queue with 20000 CIFAR images before starting to train. This will take a few minutes.
2015-11-04 11:45:45.927302: step 0, loss = 4.68 (2.0 examples/sec; 64.221 sec/batch)
2015-11-04 11:45:49.133065: step 10, loss = 4.66 (533.8 examples/sec; 0.240 sec/batch)
2015-11-04 11:45:51.397710: step 20, loss = 4.64 (597.4 examples/sec; 0.214 sec/batch)
2015-11-04 11:45:54.446850: step 30, loss = 4.62 (391.0 examples/sec; 0.327 sec/batch)
2015-11-04 11:45:57.152676: step 40, loss = 4.61 (430.2 examples/sec; 0.298 sec/batch)
2015-11-04 11:46:00.437717: step 50, loss = 4.59 (406.4 examples/sec; 0.315 sec/batch)
...
```

The script reports the total loss every 10 steps as well as the speed at which the last batch of data was processed. A few comments:

* The first batch of data can be inordinately slow (e.g. several minutes) as the preprocessing threads fill up the shuffling queue with 20,000 processed CIFAR images.
* The reported loss is the average loss of the most recent batch. Remember that this loss is the sum of the cross entropy and all weight decay terms.
* Keep an eye on the processing speed of a batch.
The numbers shown above were obtained on a Tesla K40c. If you are running on a CPU, expect slower performance.

> **EXERCISE:** When experimenting, it is sometimes annoying that the first training step can take so long. Try decreasing the number of images that initially fill up the queue. Search for `NUM_EXAMPLES_PER_EPOCH_FOR_TRAIN` in `cifar10.py`.

`cifar10_train.py` periodically [saves](../../api_docs/python/state_ops.md#Saver) all model parameters in [checkpoint files](../../how_tos/variables/index.md#saving-and-restoring) but it does *not* evaluate the model. The checkpoint file will be used by `cifar10_eval.py` to measure the predictive performance (see [Evaluating a Model](#evaluating-a-model) below).

If you followed the previous steps, then you have now started training a CIFAR-10 model. [Congratulations!](https://www.youtube.com/watch?v=9bZkp7q19f0)

The terminal text returned from `cifar10_train.py` provides minimal insight into how the model is training. We want more insight into the model during training:

* Is the loss *really* decreasing or is that just noise?
* Is the model being provided appropriate images?
* Are the gradients, activations and weights reasonable?
* What is the learning rate currently at?

[TensorBoard](../../how_tos/summaries_and_tensorboard/index.md) provides this functionality, displaying data exported periodically from `cifar10_train.py` via a [`SummaryWriter`](../../api_docs/python/train.md#SummaryWriter).

For instance, we can watch how the distribution of activations and degree of sparsity in `local3` features evolve during training:
Individual loss functions, as well as the total loss, are particularly interesting to track over time. However, the loss exhibits a considerable amount of noise due to the small batch size used in training. In practice we find it extremely useful to visualize their moving averages in addition to their raw values. See how the scripts use [`ExponentialMovingAverage`](../../api_docs/python/train.md#ExponentialMovingAverage) for this purpose.

## Evaluating a Model

Let us now evaluate how well the trained model performs on a hold-out data set. The model is evaluated by the script `cifar10_eval.py`. It constructs the model with the `inference()` function and uses all 10,000 images in the evaluation set of CIFAR-10. It calculates the *precision at 1:* how often the top prediction matches the true label of the image.

To monitor how the model improves during training, the evaluation script runs periodically on the latest checkpoint files created by `cifar10_train.py`.

```shell
python cifar10_eval.py
```

> Be careful not to run the evaluation and training binaries on the same GPU or else you might run out of memory. Consider running the evaluation on a separate GPU if available or suspending the training binary while running the evaluation on the same GPU.

You should see the output:

```shell
2015-11-06 08:30:44.391206: precision @ 1 = 0.860
...
```

The script merely returns the precision @ 1 periodically -- in this case it returned 86% accuracy. `cifar10_eval.py` also exports summaries that may be visualized in TensorBoard. These summaries provide additional insight into the model during evaluation.

The training script calculates the [moving average](../../api_docs/python/train.md#ExponentialMovingAverage) version of all learned variables. The evaluation script substitutes all learned model parameters with the moving average version. This substitution boosts model performance at evaluation time.
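
The idea of keeping an averaged "shadow" copy of each parameter can be sketched in a few lines of plain Python. This is a simplified illustration, assuming the update rule `shadow = decay * shadow + (1 - decay) * value`; the class name `ParameterAverager` and the toy weight sequence are hypothetical and do not come from the tutorial scripts.

```python
class ParameterAverager(object):
    """Keeps an exponential moving average of named scalar parameters."""

    def __init__(self, decay):
        self.decay = decay
        self.shadow = {}

    def update(self, params):
        """Fold the current parameter values into the running averages."""
        for name, value in params.items():
            if name not in self.shadow:
                # The first observation initializes the average.
                self.shadow[name] = value
            else:
                self.shadow[name] = (self.decay * self.shadow[name]
                                     + (1.0 - self.decay) * value)

    def averaged(self):
        """Return a copy of the averaged parameters, e.g. for evaluation."""
        return dict(self.shadow)


# A weight that oscillates around 1.0 from step to step during training.
averager = ParameterAverager(decay=0.9)
for step in range(100):
    noisy_weight = 1.0 + (0.5 if step % 2 == 0 else -0.5)
    averager.update({"w": noisy_weight})

# The averaged value stays much closer to 1.0 than either raw value (1.5 or 0.5).
print(averager.averaged()["w"])
```

Evaluating with the averaged copy instead of the latest raw value smooths out step-to-step noise, which is the motivation for the substitution described above.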
> **EXERCISE:** Employing averaged parameters may boost predictive performance by about 3% as measured by precision @ 1. Edit `cifar10_eval.py` to not employ the averaged parameters for the model and verify that the predictive performance drops. ## Training a Model Using Multiple GPU Cards Modern workstations may contain multiple GPUs for scientific computation. TensorFlow can leverage this environment to run the training operation concurrently across multiple cards. Training a model in a parallel, distributed fashion requires coordinating training processes. For what follows we term *model replica* to be one copy of a model training on a subset of data. Naively employing asynchronous updates of model parameters leads to sub-optimal training performance because an individual model replica might be trained on a stale copy of the model parameters. Conversely, employing fully synchronous updates will be as slow as the slowest model replica. In a workstation with multiple GPU cards, each GPU will have similar speed and contain enough memory to run an entire CIFAR-10 model. Thus, we opt to design our training system in the following manner: * Place an individual model replica on each GPU. * Update model parameters synchronously by waiting for all GPUs to finish processing a batch of data. Here is a diagram of this model:
Note that each GPU computes inference as well as the gradients for a unique batch of data. This setup effectively permits dividing up a larger batch of data across the GPUs.

This setup requires that all GPUs share the model parameters. A well-known fact is that transferring data to and from GPUs is quite slow. For this reason, we decide to store and update all model parameters on the CPU (see green box). A fresh set of model parameters is transferred to the GPU when a new batch of data is processed by all GPUs.

The GPUs are synchronized in operation. All gradients are accumulated from the GPUs and averaged (see green box). The model parameters are updated with the gradients averaged across all model replicas.

### Placing Variables and Operations on Devices

Placing operations and variables on devices requires some special abstractions.

The first abstraction we require is a function for computing inference and gradients for a single model replica. In the code we term this abstraction a "tower". We must set two attributes for each tower:

* A unique name for all operations within a tower. [`tf.name_scope()`](../../api_docs/python/framework.md#name_scope) provides this unique name by prepending a scope. For instance, all operations in the first tower are prepended with `tower_0`, e.g. `tower_0/conv1/Conv2D`.

* A preferred hardware device to run the operation within a tower. [`tf.device()`](../../api_docs/python/framework.md#device) specifies this. For instance, all operations in the first tower reside within `device('/gpu:0')` scope indicating that they should be run on the first GPU.

All variables are pinned to the CPU and accessed via [`tf.get_variable()`](../../api_docs/python/state_ops.md#get_variable) in order to share them in a multi-GPU version. See how-to on [Sharing Variables](../../how_tos/variable_scope/index.md).
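The synchronous scheme described in this section can be sketched without TensorFlow. In the toy example below, each "tower" computes the gradient of a squared-error loss on its own slice of a batch, the gradients are averaged, and a single shared parameter is updated once per step. The function names, the one-parameter model `y = w * x`, and the learning rate are all assumptions chosen for illustration; this is not the code in `cifar10_multi_gpu_train.py`.

```python
def tower_gradient(w, batch):
    """Gradient of the mean squared error of y = w * x on one tower's batch."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)


def average_gradients(tower_grads):
    """Average the per-tower gradients at the synchronization point."""
    return sum(tower_grads) / len(tower_grads)


# Shared parameter, stored in one place (the CPU in the tutorial's design).
w = 0.0
learning_rate = 0.05

# One larger batch of (x, y) pairs generated by y = 3 * x, split across two towers.
full_batch = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0), (4.0, 12.0)]
tower_batches = [full_batch[:2], full_batch[2:]]

for step in range(100):
    # Every tower reads the same, current value of w.
    grads = [tower_gradient(w, batch) for batch in tower_batches]
    # Wait for all towers, average their gradients, then apply one update.
    w -= learning_rate * average_gradients(grads)

print(w)
```

Because the two slices have equal size, the averaged gradient equals the gradient of the full batch, which is why this setup behaves like training on one larger batch.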
### Launching and Training the Model on Multiple GPU cards

If you have several GPU cards installed on your machine you can use them to train the model faster with the `cifar10_multi_gpu_train.py` script. It is a variation of the training script that parallelizes the model across multiple GPU cards.

```shell
python cifar10_multi_gpu_train.py --num_gpus=2
```

The training script should output:

```shell
Filling queue with 20000 CIFAR images before starting to train. This will take a few minutes.
2015-11-04 11:45:45.927302: step 0, loss = 4.68 (2.0 examples/sec; 64.221 sec/batch)
2015-11-04 11:45:49.133065: step 10, loss = 4.66 (533.8 examples/sec; 0.240 sec/batch)
2015-11-04 11:45:51.397710: step 20, loss = 4.64 (597.4 examples/sec; 0.214 sec/batch)
2015-11-04 11:45:54.446850: step 30, loss = 4.62 (391.0 examples/sec; 0.327 sec/batch)
2015-11-04 11:45:57.152676: step 40, loss = 4.61 (430.2 examples/sec; 0.298 sec/batch)
2015-11-04 11:46:00.437717: step 50, loss = 4.59 (406.4 examples/sec; 0.315 sec/batch)
...
```

Note that the number of GPU cards used defaults to 1. Additionally, if only 1 GPU is available on your machine, all computations will be placed on it, even if you ask for more.

> **EXERCISE:** The default setting for `cifar10_train.py` is a batch size of 128. Try running `cifar10_multi_gpu_train.py` on 2 GPUs with a batch size of 64 and compare the training speed.

## Next Steps

[Congratulations!](https://www.youtube.com/watch?v=9bZkp7q19f0) You have completed the CIFAR-10 tutorial.

If you are now interested in developing and training your own image classification system, we recommend forking this tutorial and replacing components to address your image classification problem.

> **EXERCISE:** Download the [Street View House Numbers (SVHN)](http://ufldl.stanford.edu/housenumbers/) data set. Fork the CIFAR-10 tutorial and swap in the SVHN as the input data. Try adapting the network architecture to improve predictive performance.
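================================================
FILE: SOURCE/tutorials/deep_cnn.md
================================================
# Convolutional Neural Networks

> **NOTE:** This tutorial is intended for experienced TensorFlow users and assumes expertise and experience in machine learning.

## Overview

CIFAR-10 classification is a public benchmark problem in machine learning. The task is to classify 32x32 RGB images across 10 categories: ```airplane, automobile, bird, cat, deer, dog, frog, horse, ship, and truck.```

![CIFAR-10 Samples](../images/cifar_samples.png "CIFAR-10 Samples, from http://www.cs.toronto.edu/~kriz/cifar.html")

For more details refer to the [CIFAR-10 page](http://www.cs.toronto.edu/~kriz/cifar.html) and the [tech report](http://www.cs.toronto.edu/~kriz/learning-features-2009-TR.pdf) by Alex Krizhevsky.

### Goals

The goal of this tutorial is to build a relatively small convolutional neural network for recognizing images. In the process, this tutorial:

1. Focuses on a canonical organization for network architecture, training, and evaluation.
2. Provides a template for constructing larger and more sophisticated models.

CIFAR-10 was chosen because it is complex enough to exercise much of TensorFlow's functionality and to scale to larger models. At the same time, the model is small enough to train fast, which makes it well suited to trying out new ideas and experimenting with new techniques.

### Highlights of the Tutorial

The CIFAR-10 tutorial demonstrates several important constructs for designing larger and more sophisticated models in TensorFlow:

* Core mathematical components including [convolution](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/nn.md#conv2d), [rectified linear activations](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/nn.md#relu), [max pooling](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/nn.md#max_pool), and [local response normalization](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/nn.md#local_response_normalization).
* [Visualization](https://github.com/jikexueyuanwiki/tensorflow-zh/tree/master/SOURCE/how_tos/summaries_and_tensorboard/index.md) of network activities during training, including input images, losses, and distributions of activations and gradients.
* Routines for calculating the [moving average](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/train.md#ExponentialMovingAverage) of learned parameters and using these averages during evaluation to boost predictive performance.
* Implementation of a [learning rate schedule](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/train.md#exponential_decay) that decays over time.
* Prefetching [queues](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/io_ops.md#shuffle_batch) for input data to isolate the model from disk latency and expensive image pre-processing.

We also provide a multi-GPU version of the model which demonstrates:

* Configuring a model to train across multiple GPU cards in parallel.
* Sharing and updating variables among multiple GPUs.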
We hope that this tutorial provides a launch point for building larger CNNs for vision tasks on TensorFlow.

### Model Architecture

The model in this tutorial is a multi-layer architecture consisting of alternating convolutions and nonlinearities. These layers are followed by fully connected layers leading into a softmax classifier. The model follows the architecture described by [Alex Krizhevsky](https://code.google.com/p/cuda-convnet/), with a few differences in the top few layers.

This model achieves a peak performance of about 86% accuracy within a few hours of training time on a GPU. Please see [below](#evaluating-a-model) and the code for details. It consists of 1,068,298 learnable parameters and requires about 19.5M multiply-add operations to classify a single image.

## Code Organization

The code for this tutorial resides in [`tensorflow/models/image/cifar10/`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/models/image/cifar10/).

File | Purpose
--- | ---
[`cifar10_input.py`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/models/image/cifar10/cifar10_input.py) | Reads the native CIFAR-10 binary file format.
[`cifar10.py`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/models/image/cifar10/cifar10.py) | Builds the CIFAR-10 model.
[`cifar10_train.py`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/models/image/cifar10/cifar10_train.py) | Trains a CIFAR-10 model on a CPU or GPU.
[`cifar10_multi_gpu_train.py`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/models/image/cifar10/cifar10_multi_gpu_train.py) | Trains a CIFAR-10 model on multiple GPUs.
[`cifar10_eval.py`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/models/image/cifar10/cifar10_eval.py) | Evaluates the predictive performance of a CIFAR-10 model.

## CIFAR-10 Model

The CIFAR-10 network is largely contained in [`cifar10.py`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/models/image/cifar10/cifar10.py). The complete training graph contains roughly 765 operations. We find that we can make the code most reusable by constructing the graph with the following modules:

1. [**Model inputs:**](#model-inputs) `inputs()` and `distorted_inputs()` add operations that read and preprocess CIFAR images for evaluation and training, respectively.
2. [**Model prediction:**](#model-prediction) `inference()` adds operations that perform inference, i.e. classification, on supplied images.
3. [**Model training:**](#model-training) `loss()` and `train()` add operations that compute the loss, gradients, variable updates and visualization summaries.

### Model Inputs

The input part of the model is built by the functions `inputs()` and `distorted_inputs()`, which read images from the CIFAR-10 binary data files. Because every image is stored in a fixed number of bytes, these files can be read with [`tf.FixedLengthRecordReader`](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/io_ops.md#FixedLengthRecordReader). See [Reading Data](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/how_tos/reading_data/index.md#reading-from-files) to learn more about how the `Reader` class works.

The images are processed as follows:

* They are cropped to 24x24 pixels, centrally for evaluation or [randomly](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/image.md#random_crop) for training.
* They are [approximately whitened](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/image.md#per_image_whitening) to make the model insensitive to dynamic range.

For training, we additionally apply a series of random distortions to artificially increase the data set size:

* [Randomly flip](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/image.md#random_flip_left_right) the image from left to right.
* Randomly distort the [image brightness](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/image.md#random_brightness).
* Randomly distort the [image contrast](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/image.md#tf_image_random_contrast).

Please see the [Images](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/image.md) page for the list of available distortions. We also attach an [`image_summary`](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/train.md#image_summary) to the images so that we may visualize them in TensorBoard. This is a good practice to verify that inputs are built correctly.
Reading images from disk and distorting them can use a non-trivial amount of processing time. To prevent these operations from slowing down training, we run them inside 16 separate threads which continuously fill a TensorFlow [queue](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/io_ops.md#shuffle_batch).

### Model Prediction

The prediction part of the model is constructed by the `inference()` function, which adds operations to compute the *logits* of the predictions. That part of the model is organized as follows:

Layer Name | Description
--- | ---
`conv1` | [convolution](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/nn.md#conv2d) and [rectified linear](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/nn.md#relu) activation.
`pool1` | [max pooling](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/nn.md#max_pool).
`norm1` | [local response normalization](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/nn.md#local_response_normalization).
`conv2` | [convolution](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/nn.md#conv2d) and [rectified linear](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/nn.md#relu) activation.
`norm2` | [local response normalization](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/nn.md#local_response_normalization).
`pool2` | [max pooling](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/nn.md#max_pool).
`local3` | [fully connected layer with rectified linear activation](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/nn.md).
`local4` | [fully connected layer with rectified linear activation](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/nn.md).
`softmax_linear` | linear transformation to produce logits.

Here is a graph generated from TensorBoard describing the inference operation:
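> **EXERCISE:** The output of `inference` is un-normalized logits. Try editing the network architecture to return normalized predictions using [`tf.nn.softmax()`](tensorflow-zh/SOURCE/api_docs/python/nn.md#softmax).

The `inputs()` and `inference()` functions provide all the components necessary to perform evaluation on a model. We now shift our focus from building a model to training one.

> **EXERCISE:** The model architecture in `inference()` differs slightly from the CIFAR-10 model specified in [cuda-convnet](https://code.google.com/p/cuda-convnet/). In particular, the top layers of the cuda-convnet model are locally connected rather than fully connected. Try editing the architecture to exactly reproduce the locally connected architecture in the top layers.

### Model Training

The usual method for training a network to perform N-way classification is [multinomial logistic regression](https://en.wikipedia.org/wiki/Multinomial_logistic_regression), also known as *softmax regression*. Softmax regression applies a [softmax](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/nn.md#softmax) nonlinearity to the output of the network and calculates the [cross-entropy](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/nn.md#softmax_cross_entropy_with_logits) between the normalized predictions and a [1-hot encoding](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/sparse_ops.md#sparse_to_dense) of the label. For regularization, we also apply the usual [weight decay](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/nn.md#l2_loss) losses to all learned variables. The objective function for the model is the sum of the cross-entropy loss and all these weight decay terms, as returned by the `loss()` function.

We visualize it in TensorBoard with a [scalar_summary](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/train.md#scalar_summary):

![CIFAR-10 Loss](../images/cifar_loss.png "CIFAR-10 Total Loss")

We train the model using the standard gradient descent algorithm (see [Training](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/train.md) for other methods) with a learning rate that decays exponentially over time.

![CIFAR-10 Learning Rate Decay](../images/cifar_lr_decay.png "CIFAR-10 Learning Rate Decay")

The `train()` function adds the operations needed to minimize the objective by calculating the gradient and updating the learned variables (see [`GradientDescentOptimizer`](tensorflow-zh/SOURCE/api_docs/python/train.md#GradientDescentOptimizer) for details). It returns an operation that executes all the calculations needed to train and update the model for one batch of images.

## Launching and Training the Model

We have built the model. Let us now launch it and run the training operation with the script `cifar10_train.py`.

```shell
python cifar10_train.py
```

**NOTE:** The first time you run any target in the CIFAR-10 tutorial, the CIFAR-10 dataset is automatically downloaded. The data set is about 160MB, so you may want to grab a cup of coffee during your first run.

You should see output similar to the following:

```shell
Filling queue with 20000 CIFAR images before starting to train. This will take a few minutes.
2015-11-04 11:45:45.927302: step 0, loss = 4.68 (2.0 examples/sec; 64.221 sec/batch)
2015-11-04 11:45:49.133065: step 10, loss = 4.66 (533.8 examples/sec; 0.240 sec/batch)
2015-11-04 11:45:51.397710: step 20, loss = 4.64 (597.4 examples/sec; 0.214 sec/batch)
2015-11-04 11:45:54.446850: step 30, loss = 4.62 (391.0 examples/sec; 0.327 sec/batch)
2015-11-04 11:45:57.152676: step 40, loss = 4.61 (430.2 examples/sec; 0.298 sec/batch)
2015-11-04 11:46:00.437717: step 50, loss = 4.59 (406.4 examples/sec; 0.315 sec/batch)
...
```

The script reports the total loss every 10 steps as well as the speed at which the last batch of data was processed. A few comments:

* The first batch of data can be inordinately slow (e.g. several minutes) as the preprocessing threads fill up the shuffling queue with 20,000 processed CIFAR images.
* The reported loss is the average loss of the most recent batch. Remember that this loss is the sum of the cross-entropy and all weight decay terms.
* The batch processing speeds shown above were measured on a Tesla K40c. If you are running on a CPU, expect slower performance.

> **EXERCISE:** When experimenting, it is sometimes annoying that the first training step can take so long. Try decreasing the number of images that initially fill up the queue. Search for `NUM_EXAMPLES_PER_EPOCH_FOR_TRAIN` in `cifar10.py` and change it.

`cifar10_train.py` periodically [saves](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/state_ops.md#Saver) all model parameters in [checkpoint files](https://github.com/jikexueyuanwiki/tensorflow-zh/tree/master/SOURCE/how_tos/variables/index.md#saving-and-restoring) but it does *not* evaluate the model. The checkpoint file will be used by `cifar10_eval.py` to measure the predictive performance (see [Evaluating a Model](#evaluating-a-model) below).

If you followed the previous steps, then you have now started training a CIFAR-10 model. [Congratulations!](https://www.youtube.com/watch?v=9bZkp7q19f0)

The terminal output of `cifar10_train.py` provides minimal insight into how the model is training. We want more insight into the model during training:

* Is the loss *really* decreasing or is that just noise?
* Is the model being provided appropriate images?
* Are the gradients, activations and weights reasonable?
* What is the learning rate currently at?

[TensorBoard](https://github.com/jikexueyuanwiki/tensorflow-zh/tree/master/SOURCE/how_tos/summaries_and_tensorboard/index.md) provides this functionality, displaying data exported periodically from `cifar10_train.py` via a [`SummaryWriter`](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/train.md#SummaryWriter).

For instance, we can watch how the distribution of activations and the degree of sparsity in `local3` features evolve during training: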
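Individual loss functions, as well as the total loss, are particularly interesting to track over time. However, the loss exhibits a considerable amount of noise due to the small batch size employed by training. In practice we find it more informative to look at the moving averages of the losses than at their raw values. See how the scripts use [`ExponentialMovingAverage`](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/train.md#ExponentialMovingAverage) for this purpose.

## Evaluating a Model

Let us now evaluate how well the trained model performs on a hold-out data set. The model is evaluated by the script `cifar10_eval.py`. It constructs the model with the `inference()` function and uses all 10,000 images in the evaluation set of CIFAR-10. It calculates the *precision at 1:* how often the top prediction matches the true label of the image.

To monitor how the model improves during training, the evaluation script runs periodically on the latest checkpoint files created by `cifar10_train.py`.

```shell
python cifar10_eval.py
```

> NOTE: Do not run the training and evaluation programs on the same GPU at the same time, or you might run out of memory. Consider running the evaluation on a separate GPU if available, or suspend the training program while running the evaluation on the same GPU.

You should see output like the following:

```shell
2015-11-06 08:30:44.391206: precision @ 1 = 0.860
...
```

The script merely returns the precision @ 1 periodically -- in this case it returned 86% accuracy. `cifar10_eval.py` also exports summaries that may be visualized in TensorBoard. These summaries provide additional insight into the model during evaluation.

The training script calculates the [moving average](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/train.md#ExponentialMovingAverage) version of all learned variables. The evaluation script substitutes all learned model parameters with the moving average version. This substitution boosts model performance at evaluation time.

> **EXERCISE:** Employing averaged parameters may boost predictive performance by about 3% as measured by precision @ 1. Edit `cifar10_eval.py` to not employ the averaged parameters for the model and verify that the predictive performance drops.

## Training a Model Using Multiple GPU Cards

Modern workstations may contain multiple GPUs for scientific computation. TensorFlow can leverage this environment to run the training operation across multiple GPU cards.

Training a model in a parallel, distributed fashion requires coordinating training processes. In what follows, the term *model replica* refers to one copy of a model training on a subset of data.

Naively employing asynchronous updates of model parameters leads to sub-optimal training performance because an individual model replica might be trained on a stale copy of the model parameters. Conversely, employing fully synchronous updates will be as slow as the slowest model replica.

In a workstation with multiple GPU cards, each GPU will have similar speed and contain enough memory to run an entire CIFAR-10 model. Thus, we opt to design our training system in the following manner:

* Place an individual model replica on each GPU.
* Update model parameters synchronously by waiting for all GPUs to finish processing a batch of data.

Here is a diagram of this model: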
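Note that each GPU computes inference as well as the gradients for a unique batch of data. This setup effectively permits dividing up a larger batch of data across the GPUs.

This setup requires that all GPUs share the model parameters. Transferring data to and from GPUs is well known to be slow, so we decide to store and update all model parameters on the CPU (see the green box in the diagram). A fresh set of model parameters is transferred to each GPU before it processes a new batch of data.

The GPUs are synchronized in operation. All gradients are accumulated from the GPUs and averaged (see green box). The model parameters are updated with the gradients averaged across all model replicas.

### Placing Variables and Operations on Devices

Placing operations and variables on devices requires some special abstractions.

The first abstraction we require is a function for computing inference and gradients for a single model replica. In the code we term this abstraction a "tower". We must set two attributes for each tower:

* A unique name for all operations within a tower. [`tf.name_scope()`](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/framework.md#name_scope) provides this unique name by prepending a scope. For instance, all operations in the first tower are prepended with `tower_0`, e.g. `tower_0/conv1/Conv2D`.
* A preferred hardware device to run the operations within a tower. [`tf.device()`](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/framework.md#device) specifies this. For instance, all operations in the first tower reside within the `device('/gpu:0')` scope, indicating that they should be run on the first GPU.

All variables are pinned to the CPU and accessed via [`tf.get_variable()`](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/state_ops.md#get_variable) in order to share them across multiple GPUs. See [Sharing Variables](https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/how_tos/variables/index.md) to learn how to share variables.

### Launching and Training the Model on Multiple GPU cards

If you have several GPU cards installed on your machine you can use them to train the model faster with the `cifar10_multi_gpu_train.py` script. It is a variation of the training script that parallelizes the model across multiple GPU cards.

```shell
python cifar10_multi_gpu_train.py --num_gpus=2
```

The training script should output:

```shell
Filling queue with 20000 CIFAR images before starting to train. This will take a few minutes.
2015-11-04 11:45:45.927302: step 0, loss = 4.68 (2.0 examples/sec; 64.221 sec/batch)
2015-11-04 11:45:49.133065: step 10, loss = 4.66 (533.8 examples/sec; 0.240 sec/batch)
2015-11-04 11:45:51.397710: step 20, loss = 4.64 (597.4 examples/sec; 0.214 sec/batch)
2015-11-04 11:45:54.446850: step 30, loss = 4.62 (391.0 examples/sec; 0.327 sec/batch)
2015-11-04 11:45:57.152676: step 40, loss = 4.61 (430.2 examples/sec; 0.298 sec/batch)
2015-11-04 11:46:00.437717: step 50, loss = 4.59 (406.4 examples/sec; 0.315 sec/batch)
...
```

Note that the number of GPU cards used defaults to 1. Additionally, if only 1 GPU is available on your machine, all computations will be placed on it, even if you ask for more.

> **EXERCISE:** The default setting for `cifar10_train.py` is a batch size of 128. Try running `cifar10_multi_gpu_train.py` on 2 GPUs with a batch size of 64 and compare the training speed.

## Next Steps

[Congratulations!](https://www.youtube.com/watch?v=9bZkp7q19f0) You have completed the CIFAR-10 tutorial.

If you are interested in developing and training your own image classification system, we recommend forking this tutorial and replacing components to address your image classification problem.

> **EXERCISE:** Download the [Street View House Numbers (SVHN)](http://ufldl.stanford.edu/housenumbers/) data set. Fork the CIFAR-10 tutorial and swap in the SVHN as the input data. Try adapting the network architecture to improve predictive performance.

> Original: [Convolutional Neural Networks](http://tensorflow.org/tutorials/deep_cnn/index.md) Translation: [oskycar](https://github.com/oskycar) Proofreading: [KK4SBB](https://github.com/zhyhooo)

================================================
FILE: SOURCE/tutorials/mandelbrot/index.md
================================================
# Mandelbrot Set

Visualizing the Mandelbrot set doesn't have anything to do with machine learning, but it makes for a fun example of how one can use TensorFlow for general mathematics. This is actually a pretty naive implementation of the visualization, but it makes the point. (We may end up providing a more elaborate implementation down the line to produce more truly beautiful images.)

Note: This tutorial was originally prepared as an IPython notebook.

## Basic Setup

We'll need a few imports to get started.

```python
# Import libraries for simulation
import tensorflow as tf
import numpy as np

# Imports for visualization
import PIL.Image
try:
    from cStringIO import StringIO  # Python 2
except ImportError:
    from io import BytesIO as StringIO  # Python 3: image bytes need a binary buffer
from IPython.display import clear_output, Image, display
import scipy.ndimage as nd
```

Now we'll define a function to actually display the image once we have iteration counts.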
```python
def DisplayFractal(a, fmt='jpeg'):
  """Display an array of iteration counts as a
     colorful picture of a fractal."""
  a_cyclic = (6.28*a/20.0).reshape(list(a.shape)+[1])
  img = np.concatenate([10+20*np.cos(a_cyclic),
                        30+50*np.sin(a_cyclic),
                        155-80*np.cos(a_cyclic)], 2)
  img[a==a.max()] = 0
  a = img
  a = np.uint8(np.clip(a, 0, 255))
  f = StringIO()
  PIL.Image.fromarray(a).save(f, fmt)
  display(Image(data=f.getvalue()))
```

## Session and Variable Initialization

For playing around like this, we often use an interactive session, but a regular session would work as well.

```python
sess = tf.InteractiveSession()
```

It's handy that we can freely mix NumPy and TensorFlow.

```python
# Use NumPy to create a 2D array of complex numbers
# with real part in [-2, 1) and imaginary part in [-1.3, 1.3)
Y, X = np.mgrid[-1.3:1.3:0.005, -2:1:0.005]
Z = X+1j*Y
```

Now we define and initialize TensorFlow tensors.

```python
xs = tf.constant(Z.astype("complex64"))
zs = tf.Variable(xs)
ns = tf.Variable(tf.zeros_like(xs, "float32"))
```

TensorFlow requires that you explicitly initialize variables before using them.

```python
tf.initialize_all_variables().run()
```

## Defining and Running the Computation

Now we specify more of the computation...

```python
# Compute the new values of z: z^2 + x
zs_ = zs*zs + xs

# Have we diverged with this new value?
not_diverged = tf.complex_abs(zs_) < 4

# Operation to update the zs and the iteration count.
#
# Note: We keep computing zs after they diverge! This
#       is very wasteful! There are better, if a little
#       less simple, ways to do this.
#
step = tf.group(
  zs.assign(zs_),
  ns.assign_add(tf.cast(not_diverged, "float32"))
  )
```

... and run it for a couple hundred steps

```python
for i in range(200): step.run()
```

Let's see what we've got.

```python
DisplayFractal(ns.eval())
```

![jpeg](mandelbrot_output.jpg)

Not bad!
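To see what the graph above computes for a single grid point, here is a plain-Python sketch of the same update rule. The function name `iteration_count` is hypothetical; the threshold of 4 and the 200 steps mirror the code above. Unlike the TensorFlow version, it stops once a point has diverged instead of continuing to compute.

```python
def iteration_count(x, steps=200, threshold=4.0):
    """Count the steps for which z stays below the threshold, as `ns` does per pixel."""
    z = x
    count = 0
    for _ in range(steps):
        z = z * z + x
        if abs(z) < threshold:
            count += 1
        else:
            # For |x| <= 2.4 (true on the grid above), |z| >= 4 implies
            # |z*z + x| >= 16 - 2.4, so z never returns below the threshold.
            break
    return count


inside = iteration_count(0j)        # 0 never diverges
outside = iteration_count(1 + 0j)   # 1 -> 2 -> 5 leaves the threshold quickly
print(inside, outside)
```

Points that never diverge reach the maximum count, and `DisplayFractal` paints exactly those pixels black via `img[a==a.max()] = 0`.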
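================================================
FILE: SOURCE/tutorials/mandelbrot.md
================================================
# Mandelbrot Set

Visualizing the Mandelbrot set has nothing to do with machine learning, but it is a fun example of how TensorFlow can be used for mathematics more broadly. This is actually a fairly naive implementation of the visualization. (We may eventually provide a more elaborate implementation that produces more truly beautiful images.)

Note: This tutorial uses an IPython notebook.

## Basic Setup

First, we need to import a few libraries.

```python
# Import libraries for simulation
import tensorflow as tf
import numpy as np

# Imports for visualization
import PIL.Image
try:
    from cStringIO import StringIO  # Python 2
except ImportError:
    from io import BytesIO as StringIO  # Python 3: image bytes need a binary buffer
from IPython.display import clear_output, Image, display
import scipy.ndimage as nd
```

Now we define a function to display the image computed from the iteration counts.

```python
def DisplayFractal(a, fmt='jpeg'):
  """Display an array of iteration counts as a colorful picture of a fractal."""
  a_cyclic = (6.28*a/20.0).reshape(list(a.shape)+[1])
  img = np.concatenate([10+20*np.cos(a_cyclic),
                        30+50*np.sin(a_cyclic),
                        155-80*np.cos(a_cyclic)], 2)
  img[a==a.max()] = 0
  a = img
  a = np.uint8(np.clip(a, 0, 255))
  f = StringIO()
  PIL.Image.fromarray(a).save(f, fmt)
  display(Image(data=f.getvalue()))
```

## Session and Variable Initialization

For convenience we often use an interactive session, but a regular session works as well.

```python
sess = tf.InteractiveSession()
```

It is handy that we can freely mix NumPy and TensorFlow.

```python
# Use NumPy to create a 2D array of complex numbers
# with real part in [-2, 1) and imaginary part in [-1.3, 1.3)
Y, X = np.mgrid[-1.3:1.3:0.005, -2:1:0.005]
Z = X+1j*Y
```

Now we define and initialize TensorFlow tensors.

```python
xs = tf.constant(Z.astype("complex64"))
zs = tf.Variable(xs)
ns = tf.Variable(tf.zeros_like(xs, "float32"))
```

TensorFlow requires that you explicitly initialize variables before using them.

```python
tf.initialize_all_variables().run()
```

## Defining and Running the Computation

Now we specify more of the computation...

```python
# Compute the new values of z: z^2 + x
zs_ = zs*zs + xs

# Have we diverged with this new value?
not_diverged = tf.complex_abs(zs_) < 4

# Operation to update the zs and the iteration count.
#
# Note: We keep computing zs after they diverge! This
#       is very wasteful! There are better, if a little
#       less simple, ways to do this.
#
step = tf.group(
  zs.assign(zs_),
  ns.assign_add(tf.cast(not_diverged, "float32"))
  )
```

... and run it for a couple hundred steps

```python
for i in range(200): step.run()
```

Let's see what we've got.

```python
DisplayFractal(ns.eval())
```

![jpeg](../images/mandelbrot_output.jpg)

Not bad!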
> Original: [Mandelbrot Set](http://tensorflow.org/tutorials/mandelbrot/index.md) Translation: [ericxk](https://github.com/ericxk) Proofreading: [tensorfly](https://github.com/tensorfly)

================================================
FILE: SOURCE/tutorials/mnist/__init__.py
================================================

================================================
FILE: SOURCE/tutorials/mnist/beginners/index.md
================================================
# MNIST For ML Beginners

*This tutorial is intended for readers who are new to both machine learning and TensorFlow. If you already know what MNIST is, and what softmax (multinomial logistic) regression is, you might prefer this [faster paced tutorial](../../../tutorials/mnist/pros/index.md).*

When one learns how to program, there's a tradition that the first thing you do is print "Hello World." Just like programming has Hello World, machine learning has MNIST.

MNIST is a simple computer vision dataset. It consists of images of handwritten digits like these:
It also includes labels for each image, telling us which digit it is. For example, the labels for the above images are 5, 0, 4, and 1.

In this tutorial, we're going to train a model to look at images and predict what digits they are. Our goal isn't to train a really elaborate model that achieves state-of-the-art performance -- although we'll give you code to do that later! -- but rather to dip a toe into using TensorFlow. As such, we're going to start with a very simple model, called a Softmax Regression.

The actual code for this tutorial is very short, and all the interesting stuff happens in just three lines. However, it is very important to understand the ideas behind it: both how TensorFlow works and the core machine learning concepts. Because of this, we are going to very carefully work through the code.

## The MNIST Data

The MNIST data is hosted on [Yann LeCun's website](http://yann.lecun.com/exdb/mnist/). For your convenience, we've included some python code to download and install the data automatically. You can either download [the code](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/g3doc/tutorials/mnist/input_data.py) and import it as below, or simply copy and paste it in.

```python
import input_data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
```

The downloaded data is split into two parts, 60,000 data points of training data (`mnist.train`) and 10,000 points of test data (`mnist.test`). This split is very important: it's essential in machine learning that we have separate data which we don't learn from so that we can make sure that what we've learned actually generalizes!

As mentioned earlier, every MNIST data point has two parts: an image of a handwritten digit and a corresponding label. We will call the images "xs" and the labels "ys". Both the training set and test set contain xs and ys, for example the training images are `mnist.train.images` and the train labels are `mnist.train.labels`.
Each image is 28 pixels by 28 pixels. We can interpret this as a big array of numbers:
We can flatten this array into a vector of 28x28 = 784 numbers. It doesn't matter how we flatten the array, as long as we're consistent between images. From this perspective, the MNIST images are just a bunch of points in a 784-dimensional vector space, with a [very rich structure](http://colah.github.io/posts/2014-10-Visualizing-MNIST/) (warning: computationally intensive visualizations). Flattening the data throws away information about the 2D structure of the image. Isn't that bad? Well, the best computer vision methods do exploit this structure, and we will in later tutorials. But the simple method we will be using here, a softmax regression, won't. The result is that `mnist.train.images` is a tensor (an n-dimensional array) with a shape of `[60000, 784]`. The first dimension indexes the images and the second dimension indexes the pixels in each image. Each entry in the tensor is the pixel intensity between 0 and 1, for a particular pixel in a particular image.
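As a small illustration of the flattening step, the sketch below turns a 2D image stored as nested lists into a row-major vector. The helper name `flatten` and the tiny 3x3 "image" are assumptions for illustration; the same idea applied to a 28x28 image yields 784 numbers.

```python
def flatten(image):
    """Concatenate the rows of a 2D image into one flat list (row-major order)."""
    return [pixel for row in image for pixel in row]


# A hypothetical 3x3 image with intensities between 0 and 1.
tiny_image = [
    [0.0, 0.5, 0.0],
    [0.5, 1.0, 0.5],
    [0.0, 0.5, 0.0],
]
tiny_vector = flatten(tiny_image)

# An all-black 28x28 image flattens to a 784-dimensional vector.
blank_digit = [[0.0] * 28 for _ in range(28)]
digit_vector = flatten(blank_digit)
print(len(tiny_vector), len(digit_vector))
```

Any other fixed ordering (column-major, say) would work equally well, as long as every image is flattened the same way.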
The corresponding labels in MNIST are numbers between 0 and 9, describing which digit a given image is of. For the purposes of this tutorial, we're going to want our labels as "one-hot vectors". A one-hot vector is a vector which is 0 in most dimensions, and 1 in a single dimension. In this case, the \\(n\\)th digit will be represented as a vector which is 1 in the \\(n\\)th dimension. For example, 3 would be \\([0,0,0,1,0,0,0,0,0,0]\\). Consequently, `mnist.train.labels` is a `[60000, 10]` array of floats.
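A one-hot encoding is easy to write by hand. The sketch below is plain Python, and the helper name `one_hot` is an assumption for illustration; the tutorial's `input_data` module produces the encoding for you when `one_hot=True` is passed.

```python
def one_hot(digit, num_classes=10):
    """Return a list that is 1.0 at index `digit` and 0.0 everywhere else."""
    if not 0 <= digit < num_classes:
        raise ValueError("digit must be in [0, num_classes)")
    vector = [0.0] * num_classes
    vector[digit] = 1.0
    return vector


# Labels of four example images, encoded one row per image.
labels = [5, 0, 4, 1]
encoded = [one_hot(label) for label in labels]
print(one_hot(3))
```

Stacking one such row per image is what gives the label array its `[number of images, 10]` shape.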
We're now ready to actually make our model!

## Softmax Regressions

We know that every image in MNIST is a digit, whether it's a zero or a nine. We want to be able to look at an image and give probabilities for it being each digit. For example, our model might look at a picture of a nine and be 80% sure it's a nine, but give a 5% chance to it being an eight (because of the top loop) and a bit of probability to all the others because it isn't sure.

This is a classic case where a softmax regression is a natural, simple model. If you want to assign probabilities to an object being one of several different things, softmax is the thing to do. Even later on, when we train more sophisticated models, the final step will be a layer of softmax.

A softmax regression has two steps: first we add up the evidence of our input being in certain classes, and then we convert that evidence into probabilities.

To tally up the evidence that a given image is in a particular class, we do a weighted sum of the pixel intensities. The weight is negative if that pixel having a high intensity is evidence against the image being in that class, and positive if it is evidence in favor.

The following diagram shows the weights one model learned for each of these classes. Red represents negative weights, while blue represents positive weights.
We also add some extra evidence called a bias. Basically, we want to be able to say that some things are more likely independent of the input. The result is that the evidence for a class \\(i\\) given an input \\(x\\) is:

$$\text{evidence}_i = \sum_j W_{i,~ j} x_j + b_i$$

where \\(W\_i\\) is the weights and \\(b\_i\\) is the bias for class \\(i\\), and \\(j\\) is an index for summing over the pixels in our input image \\(x\\). We then convert the evidence tallies into our predicted probabilities \\(y\\) using the "softmax" function:

$$y = \text{softmax}(\text{evidence})$$

Here softmax is serving as an "activation" or "link" function, shaping the output of our linear function into the form we want -- in this case, a probability distribution over 10 cases. You can think of it as converting tallies of evidence into probabilities of our input being in each class. It's defined as:

$$\text{softmax}(x) = \text{normalize}(\exp(x))$$

If you expand that equation out, you get:

$$\text{softmax}(x)_i = \frac{\exp(x_i)}{\sum_j \exp(x_j)}$$

But it's often more helpful to think of softmax the first way: exponentiating its inputs and then normalizing them. The exponentiation means that one unit more evidence increases the weight given to any hypothesis multiplicatively. And conversely, having one less unit of evidence means that a hypothesis gets a fraction of its earlier weight. No hypothesis ever has zero or negative weight. Softmax then normalizes these weights, so that they add up to one, forming a valid probability distribution. (To get more intuition about the softmax function, check out the [section](http://neuralnetworksanddeeplearning.com/chap3.html#softmax) on it in Michael Nielsen's book, complete with an interactive visualization.)

You can picture our softmax regression as looking something like the following, although with a lot more \\(x\\)s. For each output, we compute a weighted sum of the \\(x\\)s, add a bias, and then apply softmax.
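The two steps, tallying evidence and then normalizing, can be sketched in a few lines of plain Python. The function names, the tiny 3-class, 2-pixel model, and its weights are assumptions for illustration only. Subtracting the maximum before exponentiating is a common numerical-stability trick that is not part of the definition above; it does not change the result.

```python
import math


def softmax(evidence):
    """Exponentiate each evidence value, then normalize so the results sum to 1."""
    # Subtracting the maximum leaves the result unchanged but avoids overflow in exp.
    shift = max(evidence)
    exps = [math.exp(e - shift) for e in evidence]
    total = sum(exps)
    return [e / total for e in exps]


def evidence_for_classes(W, x, b):
    """evidence_i = sum_j W[i][j] * x[j] + b[i]"""
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b_i
            for row, b_i in zip(W, b)]


# Hypothetical model with 3 classes and 2 input pixels.
W = [[1.0, 0.0],
     [0.0, 1.0],
     [-1.0, -1.0]]
b = [0.0, 0.0, 0.5]
x = [2.0, 1.0]

evidence = evidence_for_classes(W, x, b)
y = softmax(evidence)
print(evidence, y)
```

Here the first class has exactly one unit more evidence than the second, so its probability is larger by a factor of e, which is the multiplicative effect described above.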
If we write that out as equations, we get:
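Taking three inputs and three classes as a small case (sizes assumed here for illustration), and following the definition of evidence given earlier, the equations can be written as:

$$
\begin{aligned}
y_1 &= \text{softmax}(\text{evidence})_1, & \text{evidence}_1 &= W_{1,1} x_1 + W_{1,2} x_2 + W_{1,3} x_3 + b_1 \\
y_2 &= \text{softmax}(\text{evidence})_2, & \text{evidence}_2 &= W_{2,1} x_1 + W_{2,2} x_2 + W_{2,3} x_3 + b_2 \\
y_3 &= \text{softmax}(\text{evidence})_3, & \text{evidence}_3 &= W_{3,1} x_1 + W_{3,2} x_2 + W_{3,3} x_3 + b_3
\end{aligned}
$$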
We can "vectorize" this procedure, turning it into a matrix multiplication and vector addition. This is helpful for computational efficiency. (It's also a useful way to think.)
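For a small case with three inputs and three classes (sizes assumed here for illustration), the vectorized form reads:

$$
\begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix}
= \text{softmax}\left(
\begin{bmatrix}
W_{1,1} & W_{1,2} & W_{1,3} \\
W_{2,1} & W_{2,2} & W_{2,3} \\
W_{3,1} & W_{3,2} & W_{3,3}
\end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}
+
\begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix}
\right)
$$

Row \\(i\\) of the matrix product is exactly the weighted sum \\(\sum_j W_{i,~ j} x_j\\) from the definition of evidence.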
More compactly, we can just write:

$$y = \text{softmax}(Wx + b)$$

## Implementing the Regression

To do efficient numerical computing in Python, we typically use libraries like NumPy that do expensive operations such as matrix multiplication outside Python, using highly efficient code implemented in another language. Unfortunately, there can still be a lot of overhead from switching back to Python after every operation. This overhead is especially bad if you want to run computations on GPUs or in a distributed manner, where there can be a high cost to transferring data.

TensorFlow also does its heavy lifting outside Python, but it takes things a step further to avoid this overhead. Instead of running a single expensive operation independently from Python, TensorFlow lets us describe a graph of interacting operations that run entirely outside Python. (Approaches like this can be seen in a few machine learning libraries.)

To use TensorFlow, we need to import it.

```python
import tensorflow as tf
```

We describe these interacting operations by manipulating symbolic variables. Let's create one:

```python
x = tf.placeholder("float", [None, 784])
```

`x` isn't a specific value. It's a `placeholder`, a value that we'll input when we ask TensorFlow to run a computation. We want to be able to input any number of MNIST images, each flattened into a 784-dimensional vector. We represent this as a 2d tensor of floating point numbers, with a shape `[None, 784]`. (Here `None` means that a dimension can be of any length.)

We also need the weights and biases for our model. We could imagine treating these like additional inputs, but TensorFlow has an even better way to handle it: `Variable`. A `Variable` is a modifiable tensor that lives in TensorFlow's graph of interacting operations. It can be used and even modified by the computation. For machine learning applications, one generally has the model parameters be `Variable`s.
```python
W = tf.Variable(tf.zeros([784,10]))
b = tf.Variable(tf.zeros([10]))
```

We create these `Variable`s by giving `tf.Variable` the initial value of the `Variable`: in this case, we initialize both `W` and `b` as tensors full of zeros. Since we are going to learn `W` and `b`, it doesn't matter very much what they initially are.

Notice that `W` has a shape of `[784, 10]` because we want to multiply the 784-dimensional image vectors by it to produce 10-dimensional vectors of evidence for the different classes. `b` has a shape of `[10]` so we can add it to the output.

We can now implement our model. It only takes one line!

```python
y = tf.nn.softmax(tf.matmul(x,W) + b)
```

First, we multiply `x` by `W` with the expression `tf.matmul(x,W)`. This is flipped from when we multiplied them in our equation, where we had \\(Wx\\), as a small trick to deal with `x` being a 2D tensor with multiple inputs. We then add `b`, and finally apply `tf.nn.softmax`.

That's it. It only took us one line to define our model, after a couple of short lines of setup. That isn't because TensorFlow is designed to make a softmax regression particularly easy: it's just a very flexible way to describe many kinds of numerical computations, from machine learning models to physics simulations. And once defined, our model can be run on different devices: your computer's CPU, GPUs, and even phones!

## Training

In order to train our model, we need to define what it means for the model to be good. Well, actually, in machine learning we typically define what it means for a model to be bad, called the cost or loss, and then try to minimize how bad it is. But the two are equivalent.

One very common, very nice cost function is "cross-entropy." Surprisingly, cross-entropy arises from thinking about information compressing codes in information theory, but it winds up being an important idea in lots of areas, from gambling to machine learning.
It's defined:

$$H_{y'}(y) = -\sum_i y'_i \log(y_i)$$

Where \\(y\\) is our predicted probability distribution, and \\(y'\\) is the true distribution (the one-hot vector we'll input). In some rough sense, the cross-entropy is measuring how inefficient our predictions are for describing the truth. Going into more detail about cross-entropy is beyond the scope of this tutorial, but it's well worth [understanding](http://colah.github.io/posts/2015-09-Visual-Information/).

To implement cross-entropy we need to first add a new placeholder to input the correct answers:

```python
y_ = tf.placeholder("float", [None,10])
```

Then we can implement the cross-entropy, \\(-\sum y'\log(y)\\):

```python
cross_entropy = -tf.reduce_sum(y_*tf.log(y))
```

First, `tf.log` computes the logarithm of each element of `y`. Next, we multiply each element of `y_` with the corresponding element of `tf.log(y)`. Finally, `tf.reduce_sum` adds all the elements of the tensor. (Note that this isn't just the cross-entropy of the truth with a single prediction, but the sum of the cross-entropies for all 100 images we looked at. How well we are doing on 100 data points is a much better description of how good our model is than a single data point.)

Now that we know what we want our model to do, it's very easy to have TensorFlow train it to do so. Because TensorFlow knows the entire graph of your computations, it can automatically use the [backpropagation algorithm](http://colah.github.io/posts/2015-08-Backprop/) to efficiently determine how your variables affect the cost you ask it to minimize. Then it can apply your choice of optimization algorithm to modify the variables and reduce the cost.

```python
train_step = tf.train.GradientDescentOptimizer(0.01).minimize(cross_entropy)
```

In this case, we ask TensorFlow to minimize `cross_entropy` using the gradient descent algorithm with a learning rate of 0.01.
Gradient descent is a simple procedure, where TensorFlow simply shifts each variable a little bit in the direction that reduces the cost. But TensorFlow also provides [many other optimization algorithms](../../../api_docs/python/train.md#optimizers): using one is as simple as tweaking one line.

What TensorFlow actually does here, behind the scenes, is it adds new operations to your graph which implement backpropagation and gradient descent. Then it gives you back a single operation which, when run, will do a step of gradient descent training, slightly tweaking your variables to reduce the cost.

Now we have our model set up to train. One last thing before we launch it: we have to add an operation to initialize the variables we created:

```python
init = tf.initialize_all_variables()
```

We can now launch the model in a `Session`, and run the operation that initializes the variables:

```python
sess = tf.Session()
sess.run(init)
```

Let's train -- we'll run the training step 1000 times!

```python
for i in range(1000):
  batch_xs, batch_ys = mnist.train.next_batch(100)
  sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})
```

Each step of the loop, we get a "batch" of one hundred random data points from our training set. We run `train_step`, feeding in the batch data to replace the `placeholder`s.

Using small batches of random data is called stochastic training -- in this case, stochastic gradient descent. Ideally, we'd like to use all our data for every step of training because that would give us a better sense of what we should be doing, but that's expensive. So, instead, we use a different subset every time. Doing this is cheap and has much of the same benefit.

## Evaluating Our Model

How well does our model do?

Well, first let's figure out where we predicted the correct label. `tf.argmax` is an extremely useful function which gives you the index of the highest entry in a tensor along some axis.
For example, `tf.argmax(y,1)` is the label our model thinks is most likely for each input, while `tf.argmax(y_,1)` is the correct label. We can use `tf.equal` to check if our prediction matches the truth.

```python
correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))
```

That gives us a list of booleans. To determine what fraction are correct, we cast to floating point numbers and then take the mean. For example, `[True, False, True, True]` would become `[1,0,1,1]` which would become `0.75`.

```python
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
```

Finally, we ask for our accuracy on our test data.

```python
print(sess.run(accuracy, feed_dict={x: mnist.test.images, y_: mnist.test.labels}))
```

This should be about 91%.

Is that good? Well, not really. In fact, it's pretty bad. This is because we're using a very simple model. With some small changes, we can get to 97%. The best models can get to over 99.7% accuracy! (For more information, have a look at this [list of results](http://rodrigob.github.io/are_we_there_yet/build/classification_datasets_results.html).)

What matters is that we learned from this model. Still, if you're feeling a bit down about these results, check out [the next tutorial](../../../tutorials/overview.md) where we do a lot better, and learn how to build more sophisticated models using TensorFlow!

================================================
FILE: SOURCE/tutorials/mnist/download/index.md
================================================
# MNIST Data Download

Code: [tensorflow/g3doc/tutorials/mnist/](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/g3doc/tutorials/mnist/)

The goal of this tutorial is to show how to download the dataset files required for handwritten digit classification using the (classic) MNIST data set.
## Tutorial Files

This tutorial references the following files:

File | Purpose
--- | ---
[`input_data.py`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/g3doc/tutorials/mnist/input_data.py) | The code to download the MNIST dataset for training and evaluation.

## Prepare the Data

MNIST is a classic problem in machine learning. The problem is to look at greyscale 28x28 pixel images of handwritten digits and determine which digit the image represents, for all the digits from zero to nine.

![MNIST Digits](../tf/mnist_digits.png "MNIST Digits")

For more information, refer to [Yann LeCun's MNIST page](http://yann.lecun.com/exdb/mnist/) or [Chris Olah's visualizations of MNIST](http://colah.github.io/posts/2014-10-Visualizing-MNIST/).

### Download

[Yann LeCun's MNIST page](http://yann.lecun.com/exdb/mnist/) also hosts the training and test data for download.

File | Purpose
--- | ---
[`train-images-idx3-ubyte.gz`](http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz) | training set images - 55000 training images, 5000 validation images
[`train-labels-idx1-ubyte.gz`](http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz) | training set labels matching the images
[`t10k-images-idx3-ubyte.gz`](http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz) | test set images - 10000 images
[`t10k-labels-idx1-ubyte.gz`](http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz) | test set labels matching the images

In the `input_data.py` file, the `maybe_download()` function will ensure these files are downloaded into a local data folder for training.

The folder name is specified in a flag variable at the top of the `fully_connected_feed.py` file and may be changed to fit your needs.

### Unpack and Reshape

The files themselves are not in any standard image format and are manually unpacked (following the instructions available at the website) by the `extract_images()` and `extract_labels()` functions in `input_data.py`.
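The unpacking step can be illustrated with a minimal sketch. It assumes an already decompressed image stream whose header is four big-endian 32-bit integers (magic number `2051`, image count, rows, columns), which is what `extract_images()` reads. The names `parse_idx_images` and `fake_file` are hypothetical and the data is synthetic, not real MNIST content.

```python
import io
import struct


def parse_idx_images(stream):
    """Parse an uncompressed IDX image stream into a list of flat pixel lists."""
    # Header: four big-endian unsigned 32-bit integers.
    magic, count, rows, cols = struct.unpack(">IIII", stream.read(16))
    if magic != 2051:
        raise ValueError("Invalid magic number %d in image stream" % magic)
    size = rows * cols
    data = stream.read(count * size)
    # One flat list of pixel intensities per image: [image index][pixel index].
    return [list(data[i * size:(i + 1) * size]) for i in range(count)]


# Hypothetical in-memory "file": two 2x3 images with pixel values 0..11.
fake_file = struct.pack(">IIII", 2051, 2, 2, 3) + bytes(range(12))
images = parse_idx_images(io.BytesIO(fake_file))
print(len(images), len(images[0]))
```

The real files are gzip-compressed, so a stream opened with `gzip.open()` would take the place of the `io.BytesIO` object here.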
The image data is extracted into a 2d tensor of: `[image index, pixel index]` where each entry is the intensity value of a specific pixel in a specific image, rescaled from `[0, 255]` to `[0.0, 1.0]`. The "image index" corresponds to an image in the dataset, counting up from zero to the size of the dataset. And the "pixel index" corresponds to a specific pixel in that image, ranging from zero to the number of pixels in the image.

The 60000 examples in the `train-*` files are then split into 55000 examples for training and 5000 examples for validation. For all of the 28x28 pixel greyscale images in the datasets the image size is 784 and so the output tensor for the training set images is of shape `[55000, 784]`.

The label data is extracted into a 1d tensor of: `[image index]` with the class identifier for each example as the value. For the training set labels, this would then be of shape `[55000]`.

### DataSet Object

The underlying code will download, unpack, and reshape images and labels for the following datasets:

Dataset | Purpose
--- | ---
`data_sets.train` | 55000 images and labels, for primary training.
`data_sets.validation` | 5000 images and labels, for iterative validation of training accuracy.
`data_sets.test` | 10000 images and labels, for final testing of trained accuracy.

The `read_data_sets()` function will return an object with a `DataSet` instance for each of these three sets of data, available as its `train`, `validation`, and `test` attributes. The `DataSet.next_batch()` method can be used to fetch a tuple consisting of `batch_size` lists of images and labels to be fed into the running TensorFlow session.

```python
images_feed, labels_feed = data_set.next_batch(FLAGS.batch_size)
```

================================================
FILE: SOURCE/tutorials/mnist/fully_connected_feed.py
================================================
"""Trains and Evaluates the MNIST network using a feed dictionary.
TensorFlow install instructions:
https://tensorflow.org/get_started/os_setup.html

MNIST tutorial:
https://tensorflow.org/tutorials/mnist/tf/index.html
"""
from __future__ import print_function
# pylint: disable=missing-docstring
import os.path
import time

import tensorflow.python.platform
import numpy
import tensorflow as tf

from tensorflow.g3doc.tutorials.mnist import input_data
from tensorflow.g3doc.tutorials.mnist import mnist


# Basic model parameters as external flags.
flags = tf.app.flags
FLAGS = flags.FLAGS
flags.DEFINE_float('learning_rate', 0.01, 'Initial learning rate.')
flags.DEFINE_integer('max_steps', 2000, 'Number of steps to run trainer.')
flags.DEFINE_integer('hidden1', 128, 'Number of units in hidden layer 1.')
flags.DEFINE_integer('hidden2', 32, 'Number of units in hidden layer 2.')
flags.DEFINE_integer('batch_size', 100, 'Batch size.  '
                     'Must divide evenly into the dataset sizes.')
flags.DEFINE_string('train_dir', 'data', 'Directory to put the training data.')
flags.DEFINE_boolean('fake_data', False, 'If true, uses fake data '
                     'for unit testing.')


def placeholder_inputs(batch_size):
  """Generate placeholder variables to represent the input tensors.

  These placeholders are used as inputs by the rest of the model building
  code and will be fed from the downloaded data in the .run() loop, below.

  Args:
    batch_size: The batch size will be baked into both placeholders.

  Returns:
    images_placeholder: Images placeholder.
    labels_placeholder: Labels placeholder.
  """
  # Note that the shapes of the placeholders match the shapes of the full
  # image and label tensors, except the first dimension is now batch_size
  # rather than the full size of the train or test data sets.
  images_placeholder = tf.placeholder(tf.float32, shape=(batch_size,
                                                         mnist.IMAGE_PIXELS))
  labels_placeholder = tf.placeholder(tf.int32, shape=(batch_size))
  return images_placeholder, labels_placeholder


def fill_feed_dict(data_set, images_pl, labels_pl):
  """Fills the feed_dict for training the given step.

  A feed_dict takes the form of:
  feed_dict = {
      <placeholder>: <tensor of values to be passed for placeholder>,
      ....
  }

  Args:
    data_set: The set of images and labels, from input_data.read_data_sets()
    images_pl: The images placeholder, from placeholder_inputs().
    labels_pl: The labels placeholder, from placeholder_inputs().

  Returns:
    feed_dict: The feed dictionary mapping from placeholders to values.
  """
  # Create the feed_dict for the placeholders filled with the next
  # `batch size` examples.
  images_feed, labels_feed = data_set.next_batch(FLAGS.batch_size,
                                                 FLAGS.fake_data)
  feed_dict = {
      images_pl: images_feed,
      labels_pl: labels_feed,
  }
  return feed_dict


def do_eval(sess,
            eval_correct,
            images_placeholder,
            labels_placeholder,
            data_set):
  """Runs one evaluation against the full epoch of data.

  Args:
    sess: The session in which the model has been trained.
    eval_correct: The Tensor that returns the number of correct predictions.
    images_placeholder: The images placeholder.
    labels_placeholder: The labels placeholder.
    data_set: The set of images and labels to evaluate, from
      input_data.read_data_sets().
  """
  # And run one epoch of eval.
  true_count = 0  # Counts the number of correct predictions.
  steps_per_epoch = int(data_set.num_examples / FLAGS.batch_size)
  num_examples = steps_per_epoch * FLAGS.batch_size
  for step in xrange(steps_per_epoch):
    feed_dict = fill_feed_dict(data_set,
                               images_placeholder,
                               labels_placeholder)
    true_count += sess.run(eval_correct, feed_dict=feed_dict)
  precision = float(true_count) / float(num_examples)
  print('  Num examples: %d  Num correct: %d  Precision @ 1: %0.04f' %
        (num_examples, true_count, precision))


def run_training():
  """Train MNIST for a number of steps."""
  # Get the sets of images and labels for training, validation, and
  # test on MNIST.
  data_sets = input_data.read_data_sets(FLAGS.train_dir, FLAGS.fake_data)

  # Tell TensorFlow that the model will be built into the default Graph.
  with tf.Graph().as_default():
    # Generate placeholders for the images and labels.
    images_placeholder, labels_placeholder = placeholder_inputs(
        FLAGS.batch_size)

    # Build a Graph that computes predictions from the inference model.
    logits = mnist.inference(images_placeholder,
                             FLAGS.hidden1,
                             FLAGS.hidden2)

    # Add to the Graph the Ops for loss calculation.
    loss = mnist.loss(logits, labels_placeholder)

    # Add to the Graph the Ops that calculate and apply gradients.
    train_op = mnist.training(loss, FLAGS.learning_rate)

    # Add the Op to compare the logits to the labels during evaluation.
    eval_correct = mnist.evaluation(logits, labels_placeholder)

    # Build the summary operation based on the TF collection of Summaries.
    summary_op = tf.merge_all_summaries()

    # Create a saver for writing training checkpoints.
    saver = tf.train.Saver()

    # Create a session for running Ops on the Graph.
    sess = tf.Session()

    # Run the Op to initialize the variables.
    init = tf.initialize_all_variables()
    sess.run(init)

    # Instantiate a SummaryWriter to output summaries and the Graph.
    summary_writer = tf.train.SummaryWriter(FLAGS.train_dir,
                                            graph_def=sess.graph_def)

    # And then after everything is built, start the training loop.
    for step in xrange(FLAGS.max_steps):
      start_time = time.time()

      # Fill a feed dictionary with the actual set of images and labels
      # for this particular training step.
      feed_dict = fill_feed_dict(data_sets.train,
                                 images_placeholder,
                                 labels_placeholder)

      # Run one step of the model.  The return values are the activations
      # from the `train_op` (which is discarded) and the `loss` Op.  To
      # inspect the values of your Ops or variables, you may include them
      # in the list passed to sess.run() and the value tensors will be
      # returned in the tuple from the call.
      _, loss_value = sess.run([train_op, loss],
                               feed_dict=feed_dict)

      duration = time.time() - start_time

      # Write the summaries and print an overview fairly often.
      if step % 100 == 0:
        # Print status to stdout.
        print('Step %d: loss = %.2f (%.3f sec)' % (step, loss_value, duration))
        # Update the events file.
        summary_str = sess.run(summary_op, feed_dict=feed_dict)
        summary_writer.add_summary(summary_str, step)

      # Save a checkpoint and evaluate the model periodically.
      if (step + 1) % 1000 == 0 or (step + 1) == FLAGS.max_steps:
        saver.save(sess, FLAGS.train_dir, global_step=step)
        # Evaluate against the training set.
        print('Training Data Eval:')
        do_eval(sess,
                eval_correct,
                images_placeholder,
                labels_placeholder,
                data_sets.train)
        # Evaluate against the validation set.
        print('Validation Data Eval:')
        do_eval(sess,
                eval_correct,
                images_placeholder,
                labels_placeholder,
                data_sets.validation)
        # Evaluate against the test set.
        print('Test Data Eval:')
        do_eval(sess,
                eval_correct,
                images_placeholder,
                labels_placeholder,
                data_sets.test)


def main(_):
  run_training()


if __name__ == '__main__':
  tf.app.run()

================================================
FILE: SOURCE/tutorials/mnist/input_data.py
================================================
"""Functions for downloading and reading MNIST data."""
from __future__ import print_function
import gzip
import os
import urllib

import numpy

SOURCE_URL = 'http://yann.lecun.com/exdb/mnist/'


def maybe_download(filename, work_directory):
  """Download the data from Yann's website, unless it's already here."""
  if not os.path.exists(work_directory):
    os.mkdir(work_directory)
  filepath = os.path.join(work_directory, filename)
  if not os.path.exists(filepath):
    filepath, _ = urllib.urlretrieve(SOURCE_URL + filename, filepath)
    statinfo = os.stat(filepath)
    print('Successfully downloaded', filename, statinfo.st_size, 'bytes.')
  return filepath


def _read32(bytestream):
  dt = numpy.dtype(numpy.uint32).newbyteorder('>')
  return numpy.frombuffer(bytestream.read(4), dtype=dt)


def extract_images(filename):
  """Extract the images into a 4D uint8 numpy array [index, y, x, depth]."""
  print('Extracting', filename)
  with gzip.open(filename) as bytestream:
    magic = _read32(bytestream)
    if magic != 2051:
      raise ValueError(
          'Invalid magic number %d in MNIST image file: %s' %
          (magic, filename))
    num_images = _read32(bytestream)
    rows = _read32(bytestream)
    cols = _read32(bytestream)
    buf = bytestream.read(rows * cols * num_images)
    data = numpy.frombuffer(buf, dtype=numpy.uint8)
    data = data.reshape(num_images, rows, cols, 1)
    return data


def dense_to_one_hot(labels_dense, num_classes=10):
  """Convert class labels from scalars to one-hot vectors."""
  num_labels = labels_dense.shape[0]
  index_offset = numpy.arange(num_labels) * num_classes
  labels_one_hot = numpy.zeros((num_labels, num_classes))
  labels_one_hot.flat[index_offset + labels_dense.ravel()] = 1
  return labels_one_hot


def extract_labels(filename, one_hot=False):
  """Extract the labels into a 1D uint8 numpy array [index]."""
  print('Extracting', filename)
  with gzip.open(filename) as bytestream:
    magic = _read32(bytestream)
    if magic != 2049:
      raise ValueError(
          'Invalid magic number %d in MNIST label file: %s' %
          (magic, filename))
    num_items = _read32(bytestream)
    buf = bytestream.read(num_items)
    labels = numpy.frombuffer(buf, dtype=numpy.uint8)
    if one_hot:
      return dense_to_one_hot(labels)
    return labels


class DataSet(object):

  def __init__(self, images, labels, fake_data=False):
    if fake_data:
      self._num_examples = 10000
    else:
      assert images.shape[0] == labels.shape[0], (
          "images.shape: %s labels.shape: %s" % (images.shape,
                                                 labels.shape))
      self._num_examples = images.shape[0]

      # Convert shape from [num examples, rows, columns, depth]
      # to [num examples, rows*columns] (assuming depth == 1)
      assert images.shape[3] == 1
      images = images.reshape(images.shape[0],
                              images.shape[1] * images.shape[2])
      # Convert from [0, 255] -> [0.0, 1.0].
      images = images.astype(numpy.float32)
      images = numpy.multiply(images, 1.0 / 255.0)
    self._images = images
    self._labels = labels
    self._epochs_completed = 0
    self._index_in_epoch = 0

  @property
  def images(self):
    return self._images

  @property
  def labels(self):
    return self._labels

  @property
  def num_examples(self):
    return self._num_examples

  @property
  def epochs_completed(self):
    return self._epochs_completed

  def next_batch(self, batch_size, fake_data=False):
    """Return the next `batch_size` examples from this data set."""
    if fake_data:
      fake_image = [1.0 for _ in xrange(784)]
      fake_label = 0
      return [fake_image for _ in xrange(batch_size)], [
          fake_label for _ in xrange(batch_size)]
    start = self._index_in_epoch
    self._index_in_epoch += batch_size
    if self._index_in_epoch > self._num_examples:
      # Finished epoch
      self._epochs_completed += 1
      # Shuffle the data
      perm = numpy.arange(self._num_examples)
      numpy.random.shuffle(perm)
      self._images = self._images[perm]
      self._labels = self._labels[perm]
      # Start next epoch
      start = 0
      self._index_in_epoch = batch_size
      assert batch_size <= self._num_examples
    end = self._index_in_epoch
    return self._images[start:end], self._labels[start:end]


def read_data_sets(train_dir, fake_data=False, one_hot=False):
  class DataSets(object):
    pass
  data_sets = DataSets()

  if fake_data:
    data_sets.train = DataSet([], [], fake_data=True)
    data_sets.validation = DataSet([], [], fake_data=True)
    data_sets.test = DataSet([], [], fake_data=True)
    return data_sets

  TRAIN_IMAGES = 'train-images-idx3-ubyte.gz'
  TRAIN_LABELS = 'train-labels-idx1-ubyte.gz'
  TEST_IMAGES = 't10k-images-idx3-ubyte.gz'
  TEST_LABELS = 't10k-labels-idx1-ubyte.gz'
  VALIDATION_SIZE = 5000

  local_file = maybe_download(TRAIN_IMAGES, train_dir)
  train_images = extract_images(local_file)

  local_file = maybe_download(TRAIN_LABELS, train_dir)
  train_labels = extract_labels(local_file, one_hot=one_hot)

  local_file = maybe_download(TEST_IMAGES, train_dir)
  test_images = extract_images(local_file)

  local_file = maybe_download(TEST_LABELS, train_dir)
  test_labels = extract_labels(local_file, one_hot=one_hot)

  validation_images = train_images[:VALIDATION_SIZE]
  validation_labels = train_labels[:VALIDATION_SIZE]
  train_images = train_images[VALIDATION_SIZE:]
  train_labels = train_labels[VALIDATION_SIZE:]

  data_sets.train = DataSet(train_images, train_labels)
  data_sets.validation = DataSet(validation_images, validation_labels)
  data_sets.test = DataSet(test_images, test_labels)

  return data_sets

================================================
FILE: SOURCE/tutorials/mnist/mnist.py
================================================
"""Builds the MNIST network.

Implements the inference/loss/training pattern for model building.

1. inference() - Builds the model as far as is required for running the network
forward to make predictions.
2. loss() - Adds to the inference model the layers required to generate loss.
3. training() - Adds to the loss model the Ops required to generate and
apply gradients.

This file is used by the various "fully_connected_*.py" files and not meant to
be run.

TensorFlow install instructions:
https://tensorflow.org/get_started/os_setup.html

MNIST tutorial:
https://tensorflow.org/tutorials/mnist/tf/index.html
"""
import math

import tensorflow.python.platform
import tensorflow as tf

# The MNIST dataset has 10 classes, representing the digits 0 through 9.
NUM_CLASSES = 10

# The MNIST images are always 28x28 pixels.
IMAGE_SIZE = 28
IMAGE_PIXELS = IMAGE_SIZE * IMAGE_SIZE


def inference(images, hidden1_units, hidden2_units):
  """Build the MNIST model up to where it may be used for inference.

  Args:
    images: Images placeholder, from inputs().
    hidden1_units: Size of the first hidden layer.
    hidden2_units: Size of the second hidden layer.

  Returns:
    softmax_linear: Output tensor with the computed logits.
  """
  # Hidden 1
  with tf.name_scope('hidden1') as scope:
    weights = tf.Variable(
        tf.truncated_normal([IMAGE_PIXELS, hidden1_units],
                            stddev=1.0 / math.sqrt(float(IMAGE_PIXELS))),
        name='weights')
    biases = tf.Variable(tf.zeros([hidden1_units]),
                         name='biases')
    hidden1 = tf.nn.relu(tf.matmul(images, weights) + biases)
  # Hidden 2
  with tf.name_scope('hidden2') as scope:
    weights = tf.Variable(
        tf.truncated_normal([hidden1_units, hidden2_units],
                            stddev=1.0 / math.sqrt(float(hidden1_units))),
        name='weights')
    biases = tf.Variable(tf.zeros([hidden2_units]),
                         name='biases')
    hidden2 = tf.nn.relu(tf.matmul(hidden1, weights) + biases)
  # Linear
  with tf.name_scope('softmax_linear') as scope:
    weights = tf.Variable(
        tf.truncated_normal([hidden2_units, NUM_CLASSES],
                            stddev=1.0 / math.sqrt(float(hidden2_units))),
        name='weights')
    biases = tf.Variable(tf.zeros([NUM_CLASSES]),
                         name='biases')
    logits = tf.matmul(hidden2, weights) + biases
  return logits


def loss(logits, labels):
  """Calculates the loss from the logits and the labels.

  Args:
    logits: Logits tensor, float - [batch_size, NUM_CLASSES].
    labels: Labels tensor, int32 - [batch_size].

  Returns:
    loss: Loss tensor of type float.
  """
  # Convert from sparse integer labels in the range [0, NUM_CLASSES)
  # to 1-hot dense float vectors (that is we will have batch_size vectors,
  # each with NUM_CLASSES values, all of which are 0.0 except there will
  # be a 1.0 in the entry corresponding to the label).
  batch_size = tf.size(labels)
  labels = tf.expand_dims(labels, 1)
  indices = tf.expand_dims(tf.range(0, batch_size, 1), 1)
  concated = tf.concat(1, [indices, labels])
  onehot_labels = tf.sparse_to_dense(
      concated, tf.pack([batch_size, NUM_CLASSES]), 1.0, 0.0)
  cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits,
                                                          onehot_labels,
                                                          name='xentropy')
  loss = tf.reduce_mean(cross_entropy, name='xentropy_mean')
  return loss


def training(loss, learning_rate):
  """Sets up the training Ops.

  Creates a summarizer to track the loss over time in TensorBoard.
  Creates an optimizer and applies the gradients to all trainable variables.

  The Op returned by this function is what must be passed to the
  `sess.run()` call to cause the model to train.

  Args:
    loss: Loss tensor, from loss().
    learning_rate: The learning rate to use for gradient descent.

  Returns:
    train_op: The Op for training.
  """
  # Add a scalar summary for the snapshot loss.
  tf.scalar_summary(loss.op.name, loss)
  # Create the gradient descent optimizer with the given learning rate.
  optimizer = tf.train.GradientDescentOptimizer(learning_rate)
  # Create a variable to track the global step.
  global_step = tf.Variable(0, name='global_step', trainable=False)
  # Use the optimizer to apply the gradients that minimize the loss
  # (and also increment the global step counter) as a single training step.
  train_op = optimizer.minimize(loss, global_step=global_step)
  return train_op


def evaluation(logits, labels):
  """Evaluate the quality of the logits at predicting the label.

  Args:
    logits: Logits tensor, float - [batch_size, NUM_CLASSES].
    labels: Labels tensor, int32 - [batch_size], with values in the
      range [0, NUM_CLASSES).

  Returns:
    A scalar int32 tensor with the number of examples (out of batch_size)
    that were predicted correctly.
  """
  # For a classifier model, we can use the in_top_k Op.
  # It returns a bool tensor with shape [batch_size] that is true for
  # the examples where the label was in the top k (here k=1)
  # of all logits for that example.
  correct = tf.nn.in_top_k(logits, labels, 1)
  # Return the number of true entries.
  return tf.reduce_sum(tf.cast(correct, tf.int32))

================================================
FILE: SOURCE/tutorials/mnist/mnist_softmax.py
================================================
"""A very simple MNIST classifier.

See extensive documentation at ???????
(insert public URL)
"""
from __future__ import print_function

# Import data
import input_data
mnist = input_data.read_data_sets("/tmp/data/", one_hot=True)

import tensorflow as tf
sess = tf.InteractiveSession()

# Create the model
x = tf.placeholder("float", [None, 784])
W = tf.Variable(tf.zeros([784,10]))
b = tf.Variable(tf.zeros([10]))
#y = tf.nn.softmax(tf.matmul(x,W) + b)  # this will lead to an error because of log(0)
y = tf.nn.log_softmax(tf.matmul(x,W) + b)

# Define loss and optimizer
y_ = tf.placeholder("float", [None,10])
#cross_entropy = -tf.reduce_sum(y_*tf.log(y))
cross_entropy = -tf.reduce_sum(y_*y)
train_step = tf.train.GradientDescentOptimizer(0.01).minimize(cross_entropy)

# Train
tf.initialize_all_variables().run()
for i in range(1000):
  batch_xs, batch_ys = mnist.train.next_batch(100)
  train_step.run({x: batch_xs, y_: batch_ys})

# Test trained model
correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
print(accuracy.eval({x: mnist.test.images, y_: mnist.test.labels}))

================================================
FILE: SOURCE/tutorials/mnist/pros/index.md
================================================
# Deep MNIST for Experts

TensorFlow is a powerful library for doing large-scale numerical computation. One of the tasks at which it excels is implementing and training deep neural networks. In this tutorial we will learn the basic building blocks of a TensorFlow model while constructing a deep convolutional MNIST classifier.

*This introduction assumes familiarity with neural networks and the MNIST dataset. If you don't have a background with them, check out the [introduction for beginners](../../../tutorials/mnist/beginners/index.md).*

## Setup

Before we create our model, we will first load the MNIST dataset, and start a TensorFlow session.
### Load MNIST Data

For your convenience, we've included [a script](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/g3doc/tutorials/mnist/input_data.py) which automatically downloads and imports the MNIST dataset. It will create a directory `'MNIST_data'` in which to store the data files.

```python
import input_data
mnist = input_data.read_data_sets('MNIST_data', one_hot=True)
```

Here `mnist` is a lightweight class which stores the training, validation, and testing sets as NumPy arrays. It also provides a function for iterating through data minibatches, which we will use below.

### Start TensorFlow InteractiveSession

TensorFlow relies on a highly efficient C++ backend to do its computation. The connection to this backend is called a session. The common usage for TensorFlow programs is to first create a graph and then launch it in a session.

Here we instead use the convenience `InteractiveSession` class, which makes TensorFlow more flexible about how you structure your code. It allows you to interleave operations which build a [computation graph](../../../get_started/basic_usage.md#the-computation-graph) with ones that run the graph. This is particularly convenient when working in interactive contexts like IPython. If you are not using an `InteractiveSession`, then you should build the entire computation graph before starting a session and [launching the graph](../../../get_started/basic_usage.md#launching-the-graph-in-a-session).

```python
import tensorflow as tf
sess = tf.InteractiveSession()
```

#### Computation Graph

To do efficient numerical computing in Python, we typically use libraries like NumPy that do expensive operations such as matrix multiplication outside Python, using highly efficient code implemented in another language. Unfortunately, there can still be a lot of overhead from switching back to Python every operation.
This overhead is especially bad if you want to run computations on GPUs or in a distributed manner, where there can be a high cost to transferring data. TensorFlow also does its heavy lifting outside Python, but it takes things a step further to avoid this overhead. Instead of running a single expensive operation independently from Python, TensorFlow lets us describe a graph of interacting operations that run entirely outside Python. This approach is similar to that used in Theano or Torch. The role of the Python code is therefore to build this external computation graph, and to dictate which parts of the computation graph should be run. See the [Computation Graph](../../../get_started/basic_usage.md#the-computation-graph) section of [Basic Usage](../../../get_started/basic_usage.md) for more detail. ## Build a Softmax Regression Model In this section we will build a softmax regression model with a single linear layer. In the next section, we will extend this to the case of softmax regression with a multilayer convolutional network. ### Placeholders We start building the computation graph by creating nodes for the input images and target output classes. ```python x = tf.placeholder("float", shape=[None, 784]) y_ = tf.placeholder("float", shape=[None, 10]) ``` Here `x` and `y_` aren't specific values. Rather, they are each a `placeholder` -- a value that we'll input when we ask TensorFlow to run a computation. The input images `x` will consist of a 2d tensor of floating point numbers. Here we assign it a `shape` of `[None, 784]`, where `784` is the dimensionality of a single flattened MNIST image, and `None` indicates that the first dimension, corresponding to the batch size, can be of any size. The target output classes `y_` will also consist of a 2d tensor, where each row is a one-hot 10-dimensional vector indicating which digit class the corresponding MNIST image belongs to. 
The `shape` argument to `placeholder` is optional, but it allows TensorFlow to automatically catch bugs stemming from inconsistent tensor shapes.

### Variables

We now define the weights `W` and biases `b` for our model. We could imagine treating these like additional inputs, but TensorFlow has an even better way to handle them: `Variable`. A `Variable` is a value that lives in TensorFlow's computation graph. It can be used and even modified by the computation. In machine learning applications, one generally has the model parameters be `Variable`s.

```python
W = tf.Variable(tf.zeros([784,10]))
b = tf.Variable(tf.zeros([10]))
```

We pass the initial value for each parameter in the call to `tf.Variable`. In this case, we initialize both `W` and `b` as tensors full of zeros. `W` is a 784x10 matrix (because we have 784 input features and 10 outputs) and `b` is a 10-dimensional vector (because we have 10 classes).

Before `Variable`s can be used within a session, they must be initialized using that session. This step takes the initial values (in this case tensors full of zeros) that have already been specified, and assigns them to each `Variable`. This can be done for all `Variable`s at once.

```python
sess.run(tf.initialize_all_variables())
```

### Predicted Class and Cost Function

We can now implement our regression model. It only takes one line! We multiply the vectorized input images `x` by the weight matrix `W`, add the bias `b`, and compute the softmax probabilities that are assigned to each class.

```python
y = tf.nn.softmax(tf.matmul(x,W) + b)
```

The cost function to be minimized during training can be specified just as easily. Our cost function will be the cross-entropy between the target and the model's prediction.

```python
cross_entropy = -tf.reduce_sum(y_*tf.log(y))
```

Note that `tf.reduce_sum` sums across all images in the minibatch, as well as all classes. We are computing the cross entropy for the entire minibatch.
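To make the summation concrete, here is a small NumPy sketch of the same quantity, computed outside any TensorFlow graph. The two-example, three-class arrays are made-up values for illustration, and `summed_cross_entropy` is a hypothetical helper name, not part of the tutorial code.

```python
import numpy as np

def summed_cross_entropy(y_true, y_pred):
    # Mirrors -tf.reduce_sum(y_ * tf.log(y)): the sum runs over
    # both the class axis and the minibatch axis.
    return -np.sum(y_true * np.log(y_pred))

# Made-up predicted probabilities for 2 examples and 3 classes.
y_pred = np.array([[0.7, 0.2, 0.1],
                   [0.1, 0.8, 0.1]])
# One-hot targets: the first example is class 0, the second is class 1.
y_true = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0]])

loss = summed_cross_entropy(y_true, y_pred)

# Because the targets are one-hot, only the probability assigned to the
# true class of each example contributes to the sum.
per_example = -np.log(y_pred[np.arange(2), y_true.argmax(axis=1)])
```

Since the loss is a sum rather than a mean, its magnitude grows with the minibatch size, which is worth keeping in mind when choosing a step length.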
## Train the Model

Now that we have defined our model and training cost function, it is straightforward to train using TensorFlow. Because TensorFlow knows the entire computation graph, it can use automatic differentiation to find the gradients of the cost with respect to each of the variables. TensorFlow has a variety of [built-in optimization algorithms](../../../api_docs/python/train.md#optimizers). For this example, we will use steepest gradient descent, with a step length of 0.01, to descend the cross entropy.

```python
train_step = tf.train.GradientDescentOptimizer(0.01).minimize(cross_entropy)
```

What TensorFlow actually did in that single line was to add new operations to the computation graph. These operations included ones to compute gradients, compute parameter update steps, and apply update steps to the parameters.

The returned operation `train_step`, when run, will apply the gradient descent updates to the parameters. Training the model can therefore be accomplished by repeatedly running `train_step`.

```python
for i in range(1000):
  batch = mnist.train.next_batch(50)
  train_step.run(feed_dict={x: batch[0], y_: batch[1]})
```

In each training iteration we load 50 training examples. We then run the `train_step` operation, using `feed_dict` to replace the `placeholder` tensors `x` and `y_` with the training examples. Note that you can replace any tensor in your computation graph using `feed_dict` -- it's not restricted to just `placeholder`s.

### Evaluate the Model

How well did our model do?

First we'll figure out where we predicted the correct label. `tf.argmax` is an extremely useful function which gives you the index of the highest entry in a tensor along some axis. For example, `tf.argmax(y,1)` is the label our model thinks is most likely for each input, while `tf.argmax(y_,1)` is the true label. We can use `tf.equal` to check if our prediction matches the truth.
```python
correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))
```

That gives us a list of booleans. To determine what fraction are correct, we cast to floating point numbers and then take the mean. For example, `[True, False, True, True]` would become `[1,0,1,1]` which would become `0.75`.

```python
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
```

Finally, we can evaluate our accuracy on the test data. This should be about 91% correct.

```python
print(accuracy.eval(feed_dict={x: mnist.test.images, y_: mnist.test.labels}))
```

## Build a Multilayer Convolutional Network

Getting 91% accuracy on MNIST is bad. It's almost embarrassingly bad. In this section, we'll fix that, jumping from a very simple model to something moderately sophisticated: a small convolutional neural network. This will get us to around 99.2% accuracy -- not state of the art, but respectable.

### Weight Initialization

To create this model, we're going to need to create a lot of weights and biases. One should generally initialize weights with a small amount of noise for symmetry breaking, and to prevent 0 gradients. Since we're using ReLU neurons, it is also good practice to initialize them with a slightly positive initial bias to avoid "dead neurons." Instead of doing this repeatedly while we build the model, let's create two handy functions to do it for us.

```python
def weight_variable(shape):
  initial = tf.truncated_normal(shape, stddev=0.1)
  return tf.Variable(initial)

def bias_variable(shape):
  initial = tf.constant(0.1, shape=shape)
  return tf.Variable(initial)
```

### Convolution and Pooling

TensorFlow also gives us a lot of flexibility in convolution and pooling operations. How do we handle the boundaries? What is our stride size? In this example, we're always going to choose the vanilla version. Our convolutions use a stride of one and are zero padded so that the output is the same size as the input. Our pooling is plain old max pooling over 2x2 blocks.
To keep our code cleaner, let's also abstract those operations into functions.

```python
def conv2d(x, W):
  return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

def max_pool_2x2(x):
  return tf.nn.max_pool(x, ksize=[1, 2, 2, 1],
                        strides=[1, 2, 2, 1], padding='SAME')
```

### First Convolutional Layer

We can now implement our first layer. It will consist of convolution, followed by max pooling. The convolution will compute 32 features for each 5x5 patch. Its weight tensor will have a shape of `[5, 5, 1, 32]`. The first two dimensions are the patch size, the next is the number of input channels, and the last is the number of output channels. We will also have a bias vector with a component for each output channel.

```python
W_conv1 = weight_variable([5, 5, 1, 32])
b_conv1 = bias_variable([32])
```

To apply the layer, we first reshape `x` to a 4d tensor, with the second and third dimensions corresponding to image width and height, and the final dimension corresponding to the number of color channels.

```python
x_image = tf.reshape(x, [-1,28,28,1])
```

We then convolve `x_image` with the weight tensor, add the bias, apply the ReLU function, and finally max pool.

```python
h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
h_pool1 = max_pool_2x2(h_conv1)
```

### Second Convolutional Layer

In order to build a deep network, we stack several layers of this type. The second layer will have 64 features for each 5x5 patch.

```python
W_conv2 = weight_variable([5, 5, 32, 64])
b_conv2 = bias_variable([64])

h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
h_pool2 = max_pool_2x2(h_conv2)
```

### Densely Connected Layer

Now that the image size has been reduced to 7x7, we add a fully-connected layer with 1024 neurons to allow processing on the entire image. We reshape the tensor from the pooling layer into a batch of vectors, multiply by a weight matrix, add a bias, and apply a ReLU.
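The 7x7 size quoted above follows from the two 2x2 poolings. As a rough check, the sizes can be traced in plain Python, assuming the `'SAME'` padding rule that the output size is the input size divided by the stride, rounded up. The helper name `same_padding_output_size` is made up for this sketch.

```python
import math

def same_padding_output_size(input_size, stride):
    # Assumed rule for 'SAME' padding: output = ceil(input / stride).
    return int(math.ceil(input_size / float(stride)))

size = 28                    # MNIST images are 28x28
sizes = [size]
for stride in (1, 2, 1, 2):  # conv1, pool1, conv2, pool2
    size = same_padding_output_size(size, stride)
    sizes.append(size)

# The second convolutional layer produces 64 feature maps, so each image
# flattens to size * size * 64 values before the densely connected layer.
flattened = size * size * 64
```

The stride-one convolutions leave the spatial size unchanged, so only the two pooling steps shrink the image, from 28 to 14 and then to 7.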
```python W_fc1 = weight_variable([7 * 7 * 64, 1024]) b_fc1 = bias_variable([1024]) h_pool2_flat = tf.reshape(h_pool2, [-1, 7*7*64]) h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1) ``` #### Dropout To reduce overfitting, we will apply dropout before the readout layer. We create a `placeholder` for the probability that a neuron's output is kept during dropout. This allows us to turn dropout on during training, and turn it off during testing. TensorFlow's `tf.nn.dropout` op automatically handles scaling neuron outputs in addition to masking them, so dropout just works without any additional scaling. ```python keep_prob = tf.placeholder("float") h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob) ``` ### Readout Layer Finally, we add a softmax layer, just like for the one layer softmax regression above. ```python W_fc2 = weight_variable([1024, 10]) b_fc2 = bias_variable([10]) y_conv=tf.nn.softmax(tf.matmul(h_fc1_drop, W_fc2) + b_fc2) ``` ### Train and Evaluate the Model How well does this model do? To train and evaluate it we will use code that is nearly identical to that for the simple one layer SoftMax network above. The differences are that: we will replace the steepest gradient descent optimizer with the more sophisticated ADAM optimizer; we will include the additional parameter `keep_prob` in `feed_dict` to control the dropout rate; and we will add logging to every 100th iteration in the training process. 
```python
cross_entropy = -tf.reduce_sum(y_*tf.log(y_conv))
train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
correct_prediction = tf.equal(tf.argmax(y_conv,1), tf.argmax(y_,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
sess.run(tf.initialize_all_variables())
for i in range(20000):
  batch = mnist.train.next_batch(50)
  if i%100 == 0:
    # Report accuracy on the current batch with dropout switched off.
    train_accuracy = accuracy.eval(feed_dict={
        x:batch[0], y_: batch[1], keep_prob: 1.0})
    print("step %d, training accuracy %g"%(i, train_accuracy))
  train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})

print("test accuracy %g"%accuracy.eval(feed_dict={
    x: mnist.test.images, y_: mnist.test.labels, keep_prob: 1.0}))
```

The final test set accuracy after running this code should be approximately 99.2%.

We have learned how to quickly and easily build, train, and evaluate a fairly sophisticated deep learning model using TensorFlow.

================================================
FILE: SOURCE/tutorials/mnist/tf/index.md
================================================
# TensorFlow Mechanics 101

Code: [tensorflow/g3doc/tutorials/mnist/](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/g3doc/tutorials/mnist/)

The goal of this tutorial is to show how to use TensorFlow to train and evaluate a simple feed-forward neural network for handwritten digit classification using the (classic) MNIST data set. The intended audience for this tutorial is experienced machine learning users interested in using TensorFlow.

These tutorials are not intended for teaching Machine Learning in general.

Please ensure you have followed the instructions to [install TensorFlow](../../../get_started/os_setup.md).

## Tutorial Files

This tutorial references the following files:

File | Purpose
--- | ---
[`mnist.py`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/g3doc/tutorials/mnist/mnist.py) | The code to build a fully-connected MNIST model.
[`fully_connected_feed.py`](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/g3doc/tutorials/mnist/fully_connected_feed.py) | The main code to train the built MNIST model against the downloaded dataset using a feed dictionary. Simply run the `fully_connected_feed.py` file directly to start training: `python fully_connected_feed.py` ## Prepare the Data MNIST is a classic problem in machine learning. The problem is to look at greyscale 28x28 pixel images of handwritten digits and determine which digit the image represents, for all the digits from zero to nine. ![MNIST Digits](./mnist_digits.png "MNIST Digits") For more information, refer to [Yann LeCun's MNIST page](http://yann.lecun.com/exdb/mnist/) or [Chris Olah's visualizations of MNIST](http://colah.github.io/posts/2014-10-Visualizing-MNIST/). ### Download At the top of the `run_training()` method, the `input_data.read_data_sets()` function will ensure that the correct data has been downloaded to your local training folder and then unpack that data to return a dictionary of `DataSet` instances. ```python data_sets = input_data.read_data_sets(FLAGS.train_dir, FLAGS.fake_data) ``` **NOTE**: The `fake_data` flag is used for unit-testing purposes and may be safely ignored by the reader. Dataset | Purpose --- | --- `data_sets.train` | 55000 images and labels, for primary training. `data_sets.validation` | 5000 images and labels, for iterative validation of training accuracy. `data_sets.test` | 10000 images and labels, for final testing of trained accuracy. For more information about the data, please read the [Download](../../../tutorials/mnist/download/index.md) tutorial. ### Inputs and Placeholders The `placeholder_inputs()` function creates two [`tf.placeholder`](../../../api_docs/python/io_ops.md#placeholder) ops that define the shape of the inputs, including the `batch_size`, to the rest of the graph and into which the actual training examples will be fed. 
```python images_placeholder = tf.placeholder(tf.float32, shape=(batch_size, IMAGE_PIXELS)) labels_placeholder = tf.placeholder(tf.int32, shape=(batch_size)) ``` Further down, in the training loop, the full image and label datasets are sliced to fit the `batch_size` for each step, matched with these placeholder ops, and then passed into the `sess.run()` function using the `feed_dict` parameter. ## Build the Graph After creating placeholders for the data, the graph is built from the `mnist.py` file according to a 3-stage pattern: `inference()`, `loss()`, and `training()`. 1. `inference()` - Builds the graph as far as is required for running the network forward to make predictions. 1. `loss()` - Adds to the inference graph the ops required to generate loss. 1. `training()` - Adds to the loss graph the ops required to compute and apply gradients.
### Inference

The `inference()` function builds the graph as far as needed to return the tensor that would contain the output predictions.

It takes the images placeholder as input and builds on top of it a pair of fully connected layers with ReLU activation followed by a ten node linear layer specifying the output logits.

Each layer is created beneath a unique [`tf.name_scope`](../../../api_docs/python/framework.md#name_scope) that acts as a prefix to the items created within that scope.

```python
with tf.name_scope('hidden1') as scope:
```

Within the defined scope, the weights and biases to be used by each of these layers are generated into [`tf.Variable`](../../../api_docs/python/state_ops.md#Variable) instances, with their desired shapes:

```python
weights = tf.Variable(
    tf.truncated_normal([IMAGE_PIXELS, hidden1_units],
                        stddev=1.0 / math.sqrt(float(IMAGE_PIXELS))),
    name='weights')
biases = tf.Variable(tf.zeros([hidden1_units]),
                     name='biases')
```

When, for instance, these are created under the `hidden1` scope, the unique name given to the weights variable would be "`hidden1/weights`".

Each variable is given initializer ops as part of its construction.

In this most common case, the weights are initialized with [`tf.truncated_normal`](../../../api_docs/python/constant_op.md#truncated_normal) and given the shape of a 2-D tensor, with the first dim representing the number of units in the layer from which the weights connect and the second dim representing the number of units in the layer to which the weights connect. For the first layer, named `hidden1`, the dimensions are `[IMAGE_PIXELS, hidden1_units]` because the weights are connecting the image inputs to the hidden1 layer. The `tf.truncated_normal` initializer generates a random distribution with a given mean and standard deviation.
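As a rough illustration of what a truncated normal initializer does, the NumPy sketch below re-draws any sample that lands more than two standard deviations from the mean, which is the behaviour the `tf.truncated_normal` documentation describes. The `truncated_normal` function here is a stand-in written for this sketch, not TensorFlow's implementation, and `hidden1_units = 128` is an assumed layer size.

```python
import math
import numpy as np

def truncated_normal(shape, mean=0.0, stddev=1.0, rng=None):
    # Stand-in for illustration only: draw normal samples, then re-draw
    # every sample further than two standard deviations from the mean.
    if rng is None:
        rng = np.random.RandomState(0)
    values = rng.normal(mean, stddev, size=shape)
    out_of_range = np.abs(values - mean) > 2 * stddev
    while out_of_range.any():
        values[out_of_range] = rng.normal(
            mean, stddev, size=int(out_of_range.sum()))
        out_of_range = np.abs(values - mean) > 2 * stddev
    return values

IMAGE_PIXELS = 28 * 28
hidden1_units = 128  # assumed size of the first hidden layer

# Same scaling as in the tutorial code: stddev shrinks with the fan-in.
stddev = 1.0 / math.sqrt(float(IMAGE_PIXELS))
weights = truncated_normal([IMAGE_PIXELS, hidden1_units], stddev=stddev)
```

Cutting off the tails keeps every initial weight within a known range, so no unit starts out with an unusually large weight.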
Then the biases are initialized with [`tf.zeros`](../../../api_docs/python/constant_op.md#zeros) to ensure they start with all zero values, and their shape is simply the number of units in the layer to which they connect. The graph's three primary ops -- two [`tf.nn.relu`](../../../api_docs/python/nn.md#relu) ops wrapping [`tf.matmul`](../../../api_docs/python/math_ops.md#matmul) for the hidden layers and one extra `tf.matmul` for the logits -- are then created, each in turn, with their `tf.Variable` instances connected to the input placeholder or the output tensor of the layer beneath each. ```python hidden1 = tf.nn.relu(tf.matmul(images, weights) + biases) ``` ```python hidden2 = tf.nn.relu(tf.matmul(hidden1, weights) + biases) ``` ```python logits = tf.matmul(hidden2, weights) + biases ``` Finally, the `logits` tensor that will contain the output is returned. ### Loss The `loss()` function further builds the graph by adding the required loss ops. First, the values from the `labels_placeholder` are encoded as a tensor of 1-hot values. For example, if the class identifier is '3' the value is converted to:
`[0, 0, 0, 1, 0, 0, 0, 0, 0, 0]` ```python batch_size = tf.size(labels) labels = tf.expand_dims(labels, 1) indices = tf.expand_dims(tf.range(0, batch_size, 1), 1) concated = tf.concat(1, [indices, labels]) onehot_labels = tf.sparse_to_dense( concated, tf.pack([batch_size, NUM_CLASSES]), 1.0, 0.0) ``` A [`tf.nn.softmax_cross_entropy_with_logits`](../../../api_docs/python/nn.md#softmax_cross_entropy_with_logits) op is then added to compare the output logits from the `inference()` function and the 1-hot labels. ```python cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits, onehot_labels, name='xentropy') ``` It then uses [`tf.reduce_mean`](../../../api_docs/python/math_ops.md#reduce_mean) to average the cross entropy values across the batch dimension (the first dimension) as the total loss. ```python loss = tf.reduce_mean(cross_entropy, name='xentropy_mean') ``` And the tensor that will then contain the loss value is returned. > Note: Cross-entropy is an idea from information theory that allows us > to describe how bad it is to believe the predictions of the neural network, > given what is actually true. For more information, read the blog post Visual > Information Theory (http://colah.github.io/posts/2015-09-Visual-Information/) ### Training The `training()` function adds the operations needed to minimize the loss via gradient descent. Firstly, it takes the loss tensor from the `loss()` function and hands it to a [`tf.scalar_summary`](../../../api_docs/python/train.md#scalar_summary), an op for generating summary values into the events file when used with a `SummaryWriter` (see below). In this case, it will emit the snapshot value of the loss every time the summaries are written out. ```python tf.scalar_summary(loss.op.name, loss) ``` Next, we instantiate a [`tf.train.GradientDescentOptimizer`](../../../api_docs/python/train.md#GradientDescentOptimizer) responsible for applying gradients with the requested learning rate. 
```python optimizer = tf.train.GradientDescentOptimizer(FLAGS.learning_rate) ``` We then generate a single variable to contain a counter for the global training step and the [`minimize()`](../../../api_docs/python/train.md#Optimizer.minimize) op is used to both update the trainable weights in the system and increment the global step. This is, by convention, known as the `train_op` and is what must be run by a TensorFlow session in order to induce one full step of training (see below). ```python global_step = tf.Variable(0, name='global_step', trainable=False) train_op = optimizer.minimize(loss, global_step=global_step) ``` The tensor containing the outputs of the training op is returned. ## Train the Model Once the graph is built, it can be iteratively trained and evaluated in a loop controlled by the user code in `fully_connected_feed.py`. ### The Graph At the top of the `run_training()` function is a python `with` command that indicates all of the built ops are to be associated with the default global [`tf.Graph`](../../../api_docs/python/framework.md#Graph) instance. ```python with tf.Graph().as_default(): ``` A `tf.Graph` is a collection of ops that may be executed together as a group. Most TensorFlow uses will only need to rely on the single default graph. More complicated uses with multiple graphs are possible, but beyond the scope of this simple tutorial. ### The Session Once all of the build preparation has been completed and all of the necessary ops generated, a [`tf.Session`](../../../api_docs/python/client.md#Session) is created for running the graph. ```python sess = tf.Session() ``` Alternately, a `Session` may be generated into a `with` block for scoping: ```python with tf.Session() as sess: ``` The empty parameter to session indicates that this code will attach to (or create if not yet created) the default local session. 
Immediately after creating the session, all of the `tf.Variable` instances are initialized by calling [`sess.run()`](../../../api_docs/python/client.md#Session.run) on their initialization op. ```python init = tf.initialize_all_variables() sess.run(init) ``` The [`sess.run()`](../../../api_docs/python/client.md#Session.run) method will run the complete subset of the graph that corresponds to the op(s) passed as parameters. In this first call, the `init` op is a [`tf.group`](../../../api_docs/python/control_flow_ops.md#group) that contains only the initializers for the variables. None of the rest of the graph is run here; that happens in the training loop below. ### Train Loop After initializing the variables with the session, training may begin. The user code controls the training per step, and the simplest loop that can do useful training is: ```python for step in xrange(max_steps): sess.run(train_op) ``` However, this tutorial is slightly more complicated in that it must also slice up the input data for each step to match the previously generated placeholders. #### Feed the Graph For each step, the code will generate a feed dictionary that will contain the set of examples on which to train for the step, keyed by the placeholder ops they represent. In the `fill_feed_dict()` function, the given `DataSet` is queried for its next `batch_size` set of images and labels, and tensors matching the placeholders are filled containing the next images and labels. ```python images_feed, labels_feed = data_set.next_batch(FLAGS.batch_size) ``` A python dictionary object is then generated with the placeholders as keys and the representative feed tensors as values. ```python feed_dict = { images_placeholder: images_feed, labels_placeholder: labels_feed, } ``` This is passed into the `sess.run()` function's `feed_dict` parameter to provide the input examples for this step of training. #### Check the Status The code specifies two values to fetch in its run call: `[train_op, loss]`. 
```python
for step in xrange(FLAGS.max_steps):
    feed_dict = fill_feed_dict(data_sets.train,
                               images_placeholder,
                               labels_placeholder)
    _, loss_value = sess.run([train_op, loss],
                             feed_dict=feed_dict)
```

Because there are two values to fetch, `sess.run()` returns a tuple with two items. Each `Tensor` in the list of values to fetch corresponds to a numpy array in the returned tuple, filled with the value of that tensor during this step of training. Since `train_op` is an `Operation` with no output value, the corresponding element in the returned tuple is `None` and, thus, discarded. However, the value of the `loss` tensor may become NaN if the model diverges during training, so we capture this value for logging.

Assuming that the training runs fine without NaNs, the training loop also prints a simple status text every 100 steps to let the user know the state of training.

```python
if step % 100 == 0:
    print('Step %d: loss = %.2f (%.3f sec)' % (step, loss_value, duration))
```

#### Visualize the Status

In order to emit the events files used by [TensorBoard](../../../how_tos/summaries_and_tensorboard/index.md), all of the summaries (in this case, only one) are collected into a single op during the graph building phase.

```python
summary_op = tf.merge_all_summaries()
```

And then after the session is created, a [`tf.train.SummaryWriter`](../../../api_docs/python/train.md#SummaryWriter) may be instantiated to write the events files, which contain both the graph itself and the values of the summaries.

```python
summary_writer = tf.train.SummaryWriter(FLAGS.train_dir,
                                        graph_def=sess.graph_def)
```

Lastly, the events file will be updated with new summary values every time the `summary_op` is run and the output passed to the writer's `add_summary()` function.
```python summary_str = sess.run(summary_op, feed_dict=feed_dict) summary_writer.add_summary(summary_str, step) ``` When the events files are written, TensorBoard may be run against the training folder to display the values from the summaries. ![MNIST TensorBoard](./mnist_tensorboard.png "MNIST TensorBoard") **NOTE**: For more info about how to build and run Tensorboard, please see the accompanying tutorial [Tensorboard: Visualizing Your Training](../../../how_tos/summaries_and_tensorboard/index.md). #### Save a Checkpoint In order to emit a checkpoint file that may be used to later restore a model for further training or evaluation, we instantiate a [`tf.train.Saver`](../../../api_docs/python/state_ops.md#Saver). ```python saver = tf.train.Saver() ``` In the training loop, the [`saver.save()`](../../../api_docs/python/state_ops.md#Saver.save) method will periodically be called to write a checkpoint file to the training directory with the current values of all the trainable variables. ```python saver.save(sess, FLAGS.train_dir, global_step=step) ``` At some later point in the future, training might be resumed by using the [`saver.restore()`](../../../api_docs/python/state_ops.md#Saver.restore) method to reload the model parameters. ```python saver.restore(sess, FLAGS.train_dir) ``` ## Evaluate the Model Every thousand steps, the code will attempt to evaluate the model against both the training and test datasets. The `do_eval()` function is called thrice, for the training, validation, and test datasets. 
```python print 'Training Data Eval:' do_eval(sess, eval_correct, images_placeholder, labels_placeholder, data_sets.train) print 'Validation Data Eval:' do_eval(sess, eval_correct, images_placeholder, labels_placeholder, data_sets.validation) print 'Test Data Eval:' do_eval(sess, eval_correct, images_placeholder, labels_placeholder, data_sets.test) ``` > Note that more complicated usage would usually sequester the `data_sets.test` > to only be checked after significant amounts of hyperparameter tuning. For > the sake of a simple little MNIST problem, however, we evaluate against all of > the data. ### Build the Eval Graph Before opening the default Graph, the test data should have been fetched by calling the `get_data(train=False)` function with the parameter set to grab the test dataset. ```python test_all_images, test_all_labels = get_data(train=False) ``` Before entering the training loop, the Eval op should have been built by calling the `evaluation()` function from `mnist.py` with the same logits/labels parameters as the `loss()` function. ```python eval_correct = mnist.evaluation(logits, labels_placeholder) ``` The `evaluation()` function simply generates a [`tf.nn.in_top_k`](../../../api_docs/python/nn.md#in_top_k) op that can automatically score each model output as correct if the true label can be found in the K most-likely predictions. In this case, we set the value of K to 1 to only consider a prediction correct if it is for the true label. ```python eval_correct = tf.nn.in_top_k(logits, labels, 1) ``` ### Eval Output One can then create a loop for filling a `feed_dict` and calling `sess.run()` against the `eval_correct` op to evaluate the model on the given dataset. 
```python
for step in xrange(steps_per_epoch):
    feed_dict = fill_feed_dict(data_set,
                               images_placeholder,
                               labels_placeholder)
    true_count += sess.run(eval_correct, feed_dict=feed_dict)
```

The `true_count` variable simply accumulates all of the predictions that the `in_top_k` op has determined to be correct. From there, the precision may be calculated by simply dividing by the total number of examples.

```python
precision = float(true_count) / float(num_examples)
print(' Num examples: %d Num correct: %d Precision @ 1: %0.02f' % (
    num_examples, true_count, precision))
```

================================================
FILE: SOURCE/tutorials/mnist_beginners.md
================================================
# MNIST For ML Beginners

*This tutorial is intended for readers who are new to both machine learning and TensorFlow. If you already know what MNIST and softmax regression are, you may prefer this [faster-paced tutorial](./mnist_pros.md).*

When one learns how to program, the first thing to do is usually to print "Hello World". Just as programming has Hello World, machine learning has MNIST.

MNIST is an entry-level computer vision dataset. It consists of images of handwritten digits like these:
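It also includes a label for each image, telling us which digit it is. For example, the labels for the four images above are 5, 0, 4, and 1.

In this tutorial, we will train a machine learning model to predict the digit in an image. Our goal is not to design an elaborate, state-of-the-art model -- although we will give you code for that later -- but to introduce how TensorFlow is used. So we start with a very simple mathematical model, called a Softmax Regression.

The code for this tutorial is very short, and all the interesting parts fit in just three lines. However, it is important to understand the ideas behind that code: both how TensorFlow works and the core machine learning concepts. For this reason, the tutorial works through the code in detail.

## The MNIST Data

The MNIST data is hosted on [Yann LeCun's website](http://yann.lecun.com/exdb/mnist/). We provide a Python script that downloads and installs the data automatically. You can download [this code](https://raw.githubusercontent.com/tensorflow/tensorflow/master/tensorflow/examples/tutorials/mnist/input_data.py) and import it into your project as shown below, or simply copy and paste it into your own code file.

```python
import tensorflow.examples.tutorials.mnist.input_data as input_data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
```

The downloaded data is split into three parts: 55,000 training examples (`mnist.train`), 10,000 test examples (`mnist.test`), and 5,000 validation examples (`mnist.validation`). This split matters: in machine learning we must keep separate data that is never used for training, so that we can evaluate the model on it and check that what it has learned actually generalizes to other data.

As mentioned earlier, every MNIST data point has two parts: an image of a handwritten digit and a corresponding label. We will call the images "xs" and the labels "ys". Both the training set and the test set contain xs and ys; for example, the training images are `mnist.train.images` and the training labels are `mnist.train.labels`.

Each image is 28 pixels by 28 pixels. We can represent it as an array of numbers: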
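We can flatten this array into a vector of 28x28 = 784 numbers. It doesn't matter how we flatten the array (the order of the numbers), as long as we flatten every image the same way. From this perspective, the MNIST images are just points in a 784-dimensional vector space, with a [very rich structure](http://colah.github.io/posts/2014-10-Visualizing-MNIST/) (warning: the visualizations are computationally intensive).

Flattening the array throws away information about the 2D structure of the image. That is clearly not ideal: the best computer vision methods do exploit this structure, and we will cover them in later tutorials. In this tutorial we ignore the structure, and the simple model we use here, a softmax regression, does not make use of it.

The result is that `mnist.train.images` is a tensor with a shape of `[55000, 784]`. The first dimension indexes the images and the second dimension indexes the pixels in each image. Each entry in the tensor is the intensity of a particular pixel in a particular image, with a value between 0 and 1.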
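As a minimal sketch of the flattening step, the NumPy snippet below reshapes a small batch of 28x28 arrays into 784-dimensional rows and back. The `images` array is random stand-in data rather than real MNIST digits, and NumPy's default row-major order is just one consistent choice of ordering.

```python
import numpy as np

# Random stand-in for a batch of 3 grayscale images with values in [0, 1).
images = np.random.RandomState(0).rand(3, 28, 28)

# Flatten every image the same way: one row of 784 values per image.
flat = images.reshape(len(images), 28 * 28)

# Because the ordering is consistent, the 2D layout can be recovered.
restored = flat.reshape(-1, 28, 28)
```

With row-major flattening, the pixel at row `r` and column `c` of an image ends up at position `28 * r + c` of its flattened vector.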
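The corresponding labels in MNIST are numbers between 0 and 9, describing which digit a given image shows. For the purposes of this tutorial, we want our labels to be "one-hot vectors". A one-hot vector is 0 in most dimensions and 1 in a single dimension. In this case, the digit n is represented as a 10-dimensional vector that is 1 only in the nth dimension (counting from 0). For example, the label 0 is represented as `[1,0,0,0,0,0,0,0,0,0]`. Consequently, `mnist.train.labels` is a `[60000, 10]` array of numbers.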
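As a small sketch of this encoding, the snippet below builds a one-hot vector for a digit and recovers the digit from it. The helper names `one_hot` and `decode` are hypothetical and chosen only for this illustration; in the tutorial itself, the `one_hot=True` argument of `read_data_sets` produces the encoded labels.

```python
def one_hot(label, num_classes=10):
    """Return a list that is 1.0 at index `label` and 0.0 elsewhere."""
    vec = [0.0] * num_classes
    vec[label] = 1.0
    return vec


def decode(vec):
    """Recover the class index: the position of the largest entry."""
    return vec.index(max(vec))


print(one_hot(0))          # label 0: the 1.0 sits in position 0
print(decode(one_hot(3)))  # 3
```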
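We are now ready to build our model!

## Softmax Regressions

We know that every image in MNIST shows a digit, from 0 to 9. We want to be able to look at an image and give the probability that it shows each digit. For example, our model might look at a picture of a 9 and be 80% sure it is a 9, give a 5% chance to it being an 8 (because both 8 and 9 have a loop at the top), and assign smaller probabilities to all the other digits.

This is a classic case for softmax regression. A softmax model is a natural way to assign probabilities to several different outcomes. Even later on, when we train more sophisticated models, the final step will still be a layer of softmax.

A softmax regression has two steps. In the first step, to tally up the evidence that a given image belongs to a particular digit class, we take a weighted sum of the pixel intensities. The weight is negative if a pixel having high intensity is evidence against the image being in that class, and positive if it is evidence in favor.

The following diagram shows the weights one model learned for each pixel and each digit class. Red represents negative weights and blue represents positive weights.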
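We also add some extra evidence called a bias, because the input usually carries some amount of irrelevant variation. The evidence that a given input image **x** shows the digit **i** can then be written as

![](../images/mnist1.png)

where ![](../images/mnist2.png) are the weights, ![](../images/mnist3.png) is the bias for digit class **i**, and **j** is an index for summing over the pixels of the input image **x**. We then convert the evidence tallies into probabilities **y** using the softmax function:

![](../images/mnist4.png)

Here softmax serves as an "activation" or "link" function, shaping the output of our linear function into the form we want, which in this case is a probability distribution over the 10 digit classes. In other words, given an image, the tally of evidence for each digit is converted by softmax into a probability. The softmax function is defined as:

![](../images/mnist5.png)

If you expand that equation out, you get:

![](../images/mnist6.png)

It is often more helpful, though, to think of softmax in the first form: exponentiate the inputs, then normalize the results. The exponentiation means that one more unit of evidence multiplies the weight given to a hypothesis, and conversely one less unit of evidence leaves the hypothesis with only a fraction of its earlier weight. No hypothesis ever has zero or negative weight. Softmax then normalizes these weights so that they add up to one, forming a valid probability distribution. (For more about the softmax function, see the [section](http://neuralnetworksanddeeplearning.com/chap3.html#softmax) on it in Michael Nielsen's book, which includes an interactive visualization.)

You can picture the softmax regression as in the diagram below: compute a weighted sum of the input `xs`, add a bias to each sum, and then apply softmax: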
If we write that out as equations, we get:
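We can also "vectorize" this procedure, turning it into a matrix multiplication and a vector addition. This helps computational efficiency. (It is also a useful way to think.)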
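The weighted-sum-plus-bias step followed by softmax can be sketched in plain Python. This is a minimal sketch under assumptions, not the TensorFlow implementation: the three-class weights `W`, the biases `b`, and the two-pixel input `x` are invented for illustration, and `softmax` and `predict` are hypothetical helper names.

```python
import math


def softmax(evidence):
    """Exponentiate each entry and normalize so the results sum to 1."""
    shift = max(evidence)  # subtracting the max avoids overflow; the result is unchanged
    exps = [math.exp(e - shift) for e in evidence]
    total = sum(exps)
    return [e / total for e in exps]


def predict(W, b, x):
    """Compute softmax(Wx + b); W holds one row of weights per class."""
    evidence = [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b_i
                for row, b_i in zip(W, b)]
    return softmax(evidence)


# Made-up parameters: 3 classes, 2 input "pixels".
W = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
b = [0.0, 0.0, 0.1]
x = [2.0, 1.0]

y = predict(W, b, x)
print(y)       # three positive probabilities
print(sum(y))  # approximately 1.0
```

Class 0 collects the most evidence here (2.0 versus 1.0 and 1.6), so it also receives the largest probability.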
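More compactly, we can just write:

![](../images/mnist7.png)

## Implementing the Regression

To do efficient numerical computing in Python, we typically use libraries like NumPy that carry out expensive operations such as matrix multiplication outside Python, in highly efficient code implemented in another language. Unfortunately, there is still considerable overhead in switching back to Python after every operation. This overhead is especially bad if you run the computations on a GPU, and in a distributed setting a lot of resources also go into transferring data.

TensorFlow also does its heavy lifting outside Python, but it goes a step further to avoid this overhead. Instead of running a single expensive operation independently from Python, TensorFlow lets us describe a graph of interacting operations that then run entirely outside Python. (Approaches like this can be seen in a few other machine learning libraries.)

To use TensorFlow, we first need to import it.

```python
import tensorflow as tf
```

We describe these interacting operations by manipulating symbolic variables. Let's create one:

```python
x = tf.placeholder(tf.float32, [None, 784])
```

`x` is not a specific value. It is a `placeholder`, a value that we will input when we ask TensorFlow to run a computation. We want to be able to input any number of MNIST images, each flattened into a 784-dimensional vector. We represent this as a 2-D tensor of floating-point numbers with shape `[None, 784]`. (Here `None` means that the first dimension can be of any length.)

Our model also needs the weights and biases. We could treat them as additional inputs (using placeholders), but TensorFlow has a better way to handle them: `Variable`.

A `Variable` is a modifiable tensor that lives in TensorFlow's graph of interacting operations. It can be used by the computation and also modified by it. For machine learning applications, the model parameters are generally represented as `Variable`s.

```python
W = tf.Variable(tf.zeros([784,10]))
b = tf.Variable(tf.zeros([10]))
```

We create these `Variable`s by giving `tf.Variable` their initial values: in this case, we initialize both `W` and `b` as tensors full of zeros. Since we are going to learn `W` and `b`, it does not matter very much what they are initially.

Notice that `W` has a shape of [784, 10] because we want to multiply the 784-dimensional image vectors by it to produce 10-dimensional vectors of evidence, one entry per digit class. `b` has a shape of [10] so we can add it directly to the output.

We can now implement our model. It only takes one line!

```python
y = tf.nn.softmax(tf.matmul(x,W) + b)
```

First, we multiply `x` by `W` with the expression `tf.matmul(x,W)`. This corresponds to ![](../images/mnist8.png) in the earlier equation; here `x` is a 2-D tensor holding multiple inputs. We then add `b`, and finally apply `tf.nn.softmax` to the sum.

That's it. After a few short lines of setup, it took only one line to define our model. TensorFlow does not only make softmax regression particularly easy: it is a very flexible way to describe many kinds of numerical computation, from machine learning models to physics simulations. Once defined, our model can be run on different devices: your computer's CPU, GPUs, and even phones!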
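## Training

In order to train our model, we first need to define what it means for the model to be good. Actually, in machine learning we typically define what it means for a model to be bad, a measure called the cost or loss, and then try to minimize it. The two approaches are equivalent.

One very common, very nice cost function is "cross-entropy". Cross-entropy arose from thinking about information-compressing codes in information theory, but it has since become an important idea in many areas, from gambling to machine learning. It is defined as:

![](../images/mnist10.png)

where **y** is our predicted probability distribution and **y'** is the true distribution (the one-hot vector we will input). In a rough sense, cross-entropy measures how inefficient our predictions are for describing the truth. A more detailed explanation of cross-entropy is beyond the scope of this tutorial, but it is well worth [understanding](http://colah.github.io/posts/2015-09-Visual-Information/).

To compute the cross-entropy, we first need to add a new placeholder to input the correct answers:

```python
y_ = tf.placeholder("float", [None,10])
```

Then we can compute the cross-entropy, ![](../images/mnist9.png):

```python
cross_entropy = -tf.reduce_sum(y_*tf.log(y))
```

First, `tf.log` computes the logarithm of each element of `y`. Next, we multiply each element of `y_` by the corresponding element of `tf.log(y)`. Finally, `tf.reduce_sum` adds up all the elements of the tensor. (Note that this is not the cross-entropy of a single prediction and its true value, but the sum of the cross-entropies of all 100 images in a batch. How well we do on 100 data points describes the quality of our model much better than a single data point.)

Now that we know what we want our model to do, it is very easy to have TensorFlow train it. Because TensorFlow knows the entire graph of your computations, it can automatically use the [backpropagation algorithm](http://colah.github.io/posts/2015-08-Backprop/) to efficiently determine how your variables affect the cost you ask it to minimize. It can then apply your choice of optimization algorithm to modify the variables and reduce the cost.

```python
train_step = tf.train.GradientDescentOptimizer(0.01).minimize(cross_entropy)
```

In this case, we ask TensorFlow to minimize `cross_entropy` using the gradient descent algorithm with a learning rate of 0.01. Gradient descent is a simple procedure in which TensorFlow shifts each variable a little bit in the direction that reduces the cost. TensorFlow also provides [many other optimization algorithms](../api_docs/python/train.md#optimizers): using one is as simple as changing one line.

What TensorFlow actually does here, behind the scenes, is add new operations to your graph that implement backpropagation and gradient descent. It then gives you back a single operation which, when run, performs one step of gradient descent training, slightly tweaking your variables to reduce the cost.

Our model is now set up. Before running any computation, we have to add an operation that initializes the variables we created:

```python
init = tf.initialize_all_variables()
```

We can now launch the model in a `Session` and run the operation that initializes the variables:

```python
sess = tf.Session()
sess.run(init)
```

Let's train: we will run the training step 1000 times!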
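```python
for i in range(1000):
  batch_xs, batch_ys = mnist.train.next_batch(100)
  sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})
```

In each step of the loop, we get a "batch" of one hundred random data points from the training set. We then run `train_step`, feeding in the batch data to replace the placeholders.

Using small batches of random data is called stochastic training; in this case, stochastic gradient descent. Ideally, we would use all our data for every step of training, because that would give us a better training result, but it is computationally expensive. So instead we use a different subset each time. Doing this is cheap and still captures much of the overall character of the data set.

## Evaluating Our Model

How well does our model do?

First, let's figure out where we predicted the correct label. `tf.argmax` is an extremely useful function that gives you the index of the highest entry in a tensor along some axis. Because a label vector consists of zeros and a single 1, the index of that 1 is the class label. For example, `tf.argmax(y,1)` is the label our model thinks is most likely for each input, while `tf.argmax(y_,1)` is the correct label. We can use `tf.equal` to check whether our prediction matches the truth (matching indices mean a match).

```python
correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))
```

That gives us a list of booleans. To determine what fraction is correct, we cast the booleans to floating-point numbers and then take the mean. For example, `[True, False, True, True]` becomes `[1,0,1,1]`, whose mean is `0.75`.

```python
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
```

Finally, we compute the accuracy of the learned model on the test data.

```python
print sess.run(accuracy, feed_dict={x: mnist.test.images, y_: mnist.test.labels})
```

This should be about 91%.

Is that good? Well, not really. In fact, it is pretty bad, because we are using a very simple model. With some small changes, we can get to 97%. The best models can reach over 99.7% accuracy! (For more information, have a look at this [list of results](http://rodrigob.github.io/are_we_there_yet/build/classification_datasets_results.html).)

What matters more than the result is what we learned from this model. Still, if you are feeling a bit down about these results, check out [the next tutorial](./mnist_pros.md), where you will learn how to build more sophisticated models with TensorFlow and get much better performance!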
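The evaluation logic described above (take the arg-max of each prediction, compare it with the arg-max of the one-hot label, then average) can be replayed in plain Python. The sketch below uses made-up two-class predictions and labels; `argmax` and `accuracy` are hypothetical helpers standing in for `tf.argmax`, `tf.equal`, and `tf.reduce_mean`.

```python
def argmax(values):
    """Index of the largest entry (the first one if there is a tie)."""
    return max(range(len(values)), key=lambda i: values[i])


def accuracy(predictions, labels):
    """Fraction of rows whose predicted class equals the true class."""
    correct = [argmax(p) == argmax(t) for p, t in zip(predictions, labels)]
    return sum(1.0 for c in correct if c) / len(correct)


# Made-up model outputs and one-hot labels for four examples.
predictions = [[0.1, 0.9], [0.8, 0.2], [0.3, 0.7], [0.6, 0.4]]
labels = [[0, 1], [0, 1], [0, 1], [1, 0]]

print(accuracy(predictions, labels))  # 0.75: three of the four rows match
```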
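Original: [MNIST For ML Beginners](http://tensorflow.org/tutorials/mnist/beginners/index.md) Translation: [linbojin](https://github.com/linbojin) Proofreading:

================================================
FILE: SOURCE/tutorials/mnist_download.md
================================================
# MNIST Data Download

Code: [tensorflow/g3doc/tutorials/mnist/](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/tutorials/mnist)

The goal of this tutorial is to show how to download the (classic) MNIST data set used for the handwritten digit classification problem.

## Tutorial Files

This tutorial references the following file:

File | Purpose
--- | ---
[`input_data.py`](https://raw.githubusercontent.com/tensorflow/tensorflow/master/tensorflow/examples/tutorials/mnist/input_data.py) | The code to download the MNIST dataset for training and evaluation.

## Prepare the Data

MNIST is a classic problem in machine learning. The problem is to look at greyscale 28x28 pixel images of handwritten digits and determine which digit each image represents, for all the digits from 0 to 9.

![MNIST Digits](../images/mnist_digits.png "MNIST Digits")

For more information, refer to [Yann LeCun's MNIST page](http://yann.lecun.com/exdb/mnist/) or [Chris Olah's visualizations of MNIST](http://colah.github.io/posts/2014-10-Visualizing-MNIST/).

### Download

[Yann LeCun's MNIST page](http://yann.lecun.com/exdb/mnist/) also hosts the training and test data for download.

File | Contents
--- | ---
[`train-images-idx3-ubyte.gz`](http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz) | Training set images: 55000 training images, 5000 validation images
[`train-labels-idx1-ubyte.gz`](http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz) | Training set labels matching the images
[`t10k-images-idx3-ubyte.gz`](http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz) | Test set images: 10000 images
[`t10k-labels-idx1-ubyte.gz`](http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz) | Test set labels matching the images

In the `input_data.py` file, the `maybe_download()` function ensures these files are downloaded into a local data folder.

The folder name is specified by a flag variable at the top of the `fully_connected_feed.py` file and may be changed to fit your needs.

### Unpack and Reshape

The files themselves are not stored in any standard image format. They are unpacked manually (following the instructions available on the website) by the `extract_images()` and `extract_labels()` functions in `input_data.py`.

The image data is extracted into a 2-D tensor of `[image index, pixel index]`, where each entry is the intensity value of a specific pixel in a specific image, rescaled from `[0, 255]` to `[-0.5, 0.5]`. The "image index" identifies an image in the data set, counting up from 0 to the size of the data set. The "pixel index" identifies a specific pixel in that image, ranging from 0 to the number of pixels in the image.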
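To make the rescaling concrete, here is one linear mapping that is consistent with the range stated above. It is an assumption made for illustration only: the text gives the source and target ranges but not the formula, and `rescale` is a hypothetical helper rather than a function from `input_data.py`.

```python
def rescale(pixel):
    """Map a raw intensity in [0, 255] linearly onto [-0.5, 0.5] (assumed formula)."""
    return pixel / 255.0 - 0.5


raw_pixels = [0, 51, 255]  # made-up raw byte values
scaled = [rescale(p) for p in raw_pixels]
print(scaled)  # the two endpoints map to -0.5 and 0.5
```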
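The files starting with `train-*` contain 60000 examples, of which 55000 are split off as the training set and the remaining 5000 as the validation set. Because every 28x28 pixel greyscale image in the data sets flattens to 784 values, the training set images form a tensor of shape `[55000, 784]`.

The label data is extracted into a 1-D tensor of `[image index]`, holding the class identifier of each example. For the training set labels, this tensor has shape `[55000]`.

### DataSet Object

The underlying code downloads, unpacks, and reshapes images and labels into the following data sets:

Dataset | Purpose
--- | ---
`data_sets.train` | 55000 images and labels, for primary training.
`data_sets.validation` | 5000 images and labels, for iterative validation of training accuracy.
`data_sets.test` | 10000 images and labels, for final testing of trained accuracy.

The `read_data_sets()` function returns a collection holding one `DataSet` instance for each of the three data sets above. The `DataSet.next_batch()` method fetches a tuple of `batch_size` images and labels to be fed into the running TensorFlow session.

```python
images_feed, labels_feed = data_set.next_batch(FLAGS.batch_size)
```

Original: [MNIST Data Download](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/g3doc/tutorials/mnist/download/index.md) Translation: [btpeter](https://github.com/btpeter) Proofreading: waiwaizheng

================================================
FILE: SOURCE/tutorials/mnist_pros.md
================================================
# Deep MNIST for Experts

TensorFlow is a powerful library for large-scale numerical computation. One of the tasks at which it excels is implementing and training deep neural networks. In this tutorial we will learn the basic building blocks of a TensorFlow model while constructing a deep convolutional network for MNIST.

*This tutorial assumes you are already familiar with neural networks and the MNIST dataset. If you are not, see the [introduction for beginners](./mnist_beginners.md).*

## Setup

Before we create our model, we first load the MNIST dataset and start a TensorFlow session.

### Load MNIST Data

For your convenience, we have included [a script](https://raw.githubusercontent.com/tensorflow/tensorflow/master/tensorflow/examples/tutorials/mnist/input_data.py) that automatically downloads and imports the MNIST dataset. It creates a directory `'MNIST_data'` in which to store the data files.

```python
import input_data
mnist = input_data.read_data_sets('MNIST_data', one_hot=True)
```

Here `mnist` is a lightweight class that stores the training, validation, and test sets as NumPy arrays. It also provides a function for iterating through minibatches of data, which we will use below.

### Start TensorFlow InteractiveSession

TensorFlow relies on a highly efficient C++ backend to do its computation. The connection to this backend is called a session. The common usage for TensorFlow programs is to first create a graph and then launch it in a session.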
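Here we instead use the convenient `InteractiveSession` class, which makes TensorFlow more flexible about how you structure your code. It allows you to interleave operations that build a [computation graph](../get_started/basic_usage.md#the-computation-graph) with operations that run the graph. This is particularly convenient when working in interactive contexts such as IPython. If you are not using an `InteractiveSession`, you should build the entire computation graph before starting a session and [launching the graph](../get_started/basic_usage.md#launching-the-graph-in-a-session).

```python
import tensorflow as tf
sess = tf.InteractiveSession()
```

#### Computation Graph

To do efficient numerical computing in Python, we typically use libraries like NumPy that carry out expensive operations such as matrix multiplication outside Python, in highly efficient code implemented in another language. Unfortunately, there is still considerable overhead in switching back to Python after every operation. This overhead is especially bad if you want to run computations on GPUs or in a distributed manner, where much of the cost lies in transferring data.

TensorFlow also does its heavy lifting outside Python, but it takes things a step further to avoid this overhead. Instead of running a single expensive operation independently from Python, TensorFlow lets us describe a graph of interacting operations that then run entirely outside Python. This approach is similar to that used in Theano or Torch.

The role of the Python code is therefore to build this external computation graph and to dictate which parts of the graph should be run. See the [Computation Graph](../../../get_started/basic_usage.md#the-computation-graph) section of [Basic Usage](../../../get_started/basic_usage.md) for more detail.

## Build a Softmax Regression Model

In this section we build a softmax regression model with a single linear layer. In the next section, we extend it to a softmax regression with a multilayer convolutional network.

### Placeholders

We start building the computation graph by creating nodes for the input images and target output classes.

```python
x = tf.placeholder("float", shape=[None, 784])
y_ = tf.placeholder("float", shape=[None, 10])
```

Here `x` and `y_` are not specific values. Each is a `placeholder`, a value that we input when we ask TensorFlow to run a computation.

The input images `x` are a 2-D tensor of floating-point numbers. We assign it a `shape` of `[None, 784]`, where `784` is the dimensionality of a single flattened MNIST image. `None` indicates that the first dimension, corresponding to the batch size, can be of any size, so the number of images in `x` is not fixed. The target output classes `y_` are also a 2-D tensor, where each row is a 10-dimensional one-hot vector indicating the digit class of the corresponding MNIST image.

The `shape` argument to `placeholder` is optional, but it allows TensorFlow to automatically catch bugs stemming from inconsistent tensor shapes.

### Variables

We now define the weights `W` and biases `b` for our model. We could treat these as additional inputs, but TensorFlow has a better way to handle them: `Variable`. A `Variable` is a value that lives in TensorFlow's computation graph. It can be used and even modified by the computation. In machine learning applications, the model parameters are generally represented as `Variable`s.

```python
W = tf.Variable(tf.zeros([784,10]))
b = tf.Variable(tf.zeros([10]))
```

We pass the initial value for each parameter in the call to `tf.Variable`. In this case, we initialize both `W` and `b` as tensors full of zeros. `W` is a 784x10 matrix (because we have 784 input features and 10 outputs) and `b` is a 10-dimensional vector (because we have 10 classes).

Before `Variable`s can be used within a session, they must be initialized using that session.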
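This step takes the initial values (in this case tensors full of zeros) that have already been specified, and assigns them to each `Variable`. This can be done for all `Variables` at once.

```python
sess.run(tf.initialize_all_variables())
```

### Predicted Class and Cost Function

We can now implement our regression model. It only takes one line! We multiply the vectorized input images `x` by the weight matrix `W`, add the bias `b`, and compute the softmax probabilities assigned to each class.

```python
y = tf.nn.softmax(tf.matmul(x,W) + b)
```

The cost function to be minimized during training can be specified just as easily. Our cost function is the cross-entropy between the target and the model's prediction.

```python
cross_entropy = -tf.reduce_sum(y_*tf.log(y))
```

Note that `tf.reduce_sum` sums the cross-entropy of every image in the minibatch, so we are computing the cross-entropy for the entire minibatch.

## Train the Model

Now that we have defined our model and training cost function, it is straightforward to train using TensorFlow. Because TensorFlow knows the entire computation graph, it can use automatic differentiation to find the gradients of the cost with respect to each of the variables. TensorFlow has a variety of [built-in optimization algorithms](../../../api_docs/python/train.md#optimizers). For this example, we use steepest gradient descent, with a step length of 0.01, to descend the cross-entropy.

```python
train_step = tf.train.GradientDescentOptimizer(0.01).minimize(cross_entropy)
```

What this one line actually does is add new operations to the computation graph: operations to compute gradients, compute parameter update steps, and apply the update steps to the parameters.

The returned operation `train_step`, when run, applies the gradient descent updates to the parameters. Training the model can therefore be accomplished by repeatedly running `train_step`.

```python
for i in range(1000):
  batch = mnist.train.next_batch(50)
  train_step.run(feed_dict={x: batch[0], y_: batch[1]})
```

In each training iteration we load 50 training examples. We then run the `train_step` operation, using `feed_dict` to replace the `placeholder` tensors `x` and `y_` with the training examples.

Note that you can replace any tensor in your computation graph using `feed_dict`; it is not restricted to `placeholder`s.

### Evaluate the Model

How well did our model do?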
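First, let's figure out where we predicted the correct label. `tf.argmax` is an extremely useful function that gives you the index of the highest entry in a tensor along some axis. Because a label vector consists of zeros and a single 1, the index of that 1 is the class label. For example, `tf.argmax(y,1)` is the label our model thinks is most likely for each input, while `tf.argmax(y_,1)` is the true label. We can use `tf.equal` to check whether our prediction matches the truth (matching indices mean a match).

```python
correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))
```

That gives us a list of booleans. To compute the accuracy of our classification, we cast the booleans to floating-point numbers that stand for right and wrong, and then take the mean. For example, `[True, False, True, True]` becomes `[1,0,1,1]`, whose mean is `0.75`.

```python
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
```

Finally, we can evaluate our accuracy on the test data. This should be about 91% correct.

```python
print accuracy.eval(feed_dict={x: mnist.test.images, y_: mnist.test.labels})
```

## Build a Multilayer Convolutional Network

Getting only 91% accuracy on MNIST is bad. It is almost embarrassingly bad. In this section, we fix that by moving to something moderately more sophisticated: a small convolutional neural network. This will get us to around 99.2% accuracy. That is not state of the art, but it is respectable.

### Weight Initialization

To create this model, we need to create a lot of weights and biases. The weights should be initialized with a small amount of noise to break symmetry and to prevent zero gradients. Since we are using ReLU neurons, it is also good practice to initialize the biases with a small positive value to avoid neurons whose output is stuck at 0 ("dead neurons"). Instead of doing this repeatedly while we build the model, we define two helper functions for initialization.

```python
def weight_variable(shape):
  initial = tf.truncated_normal(shape, stddev=0.1)
  return tf.Variable(initial)

def bias_variable(shape):
  initial = tf.constant(0.1, shape=shape)
  return tf.Variable(initial)
```

### Convolution and Pooling

TensorFlow gives us a lot of flexibility in convolution and pooling operations. How do we handle the boundaries? What is our stride size? In this example, we always choose the vanilla version. Our convolutions use a stride of one and are zero-padded so that the output is the same size as the input. Our pooling is plain old max pooling over 2x2 blocks. To keep our code cleaner, we abstract those operations into functions.

```python
def conv2d(x, W):
  return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

def max_pool_2x2(x):
  return tf.nn.max_pool(x, ksize=[1, 2, 2, 1],
                        strides=[1, 2, 2, 1], padding='SAME')
```

### First Convolutional Layer

We can now implement our first layer. It consists of a convolution followed by max pooling. The convolution computes 32 features for each 5x5 patch. Its weight tensor has a shape of `[5, 5, 1, 32]`: the first two dimensions are the patch size, the next is the number of input channels, and the last is the number of output channels. There is also a bias for each output channel.

```python
W_conv1 = weight_variable([5, 5, 1, 32])
b_conv1 = bias_variable([32])
```

To apply the layer, we first reshape `x` to a 4-D tensor, with the second and third dimensions corresponding to image width and height, and the final dimension corresponding to the number of color channels (1 here because the images are greyscale; it would be 3 for RGB color images).

```python
x_image = tf.reshape(x, [-1,28,28,1])
```

We then convolve `x_image` with the weight tensor, add the bias, apply the ReLU function, and finally max pool.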
```python
h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
h_pool1 = max_pool_2x2(h_conv1)
```

### Second Convolutional Layer

To build a deeper network, we stack several layers of this type. The second layer computes 64 features for each 5x5 patch.

```python
W_conv2 = weight_variable([5, 5, 32, 64])
b_conv2 = bias_variable([64])

h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
h_pool2 = max_pool_2x2(h_conv2)
```

### Densely Connected Layer

Now that the image size has been reduced to 7x7, we add a fully connected layer with 1024 neurons to allow processing on the entire image. We reshape the tensor from the pooling layer into a batch of vectors, multiply by a weight matrix, add a bias, and apply a ReLU.

```python
W_fc1 = weight_variable([7 * 7 * 64, 1024])
b_fc1 = bias_variable([1024])

h_pool2_flat = tf.reshape(h_pool2, [-1, 7*7*64])
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)
```

#### Dropout

To reduce overfitting, we apply dropout before the output layer. We create a `placeholder` for the probability that a neuron's output is kept during dropout. This allows us to turn dropout on during training and turn it off during testing. In addition to masking neuron outputs, TensorFlow's `tf.nn.dropout` op automatically handles scaling them, so dropout works without any additional scaling.

```python
keep_prob = tf.placeholder("float")
h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)
```

### Readout Layer

Finally, we add a softmax layer, just like the one-layer softmax regression above.

```python
W_fc2 = weight_variable([1024, 10])
b_fc2 = bias_variable([10])

y_conv=tf.nn.softmax(tf.matmul(h_fc1_drop, W_fc2) + b_fc2)
```

### Train and Evaluate the Model

How well does this model do?
To train and evaluate it, we use code that is nearly identical to that for the simple one-layer softmax network above. The differences are that we replace the steepest gradient descent optimizer with the more sophisticated ADAM optimizer, that we include the additional parameter `keep_prob` in `feed_dict` to control the dropout rate, and that we log every 100th iteration of training.

```python
cross_entropy = -tf.reduce_sum(y_*tf.log(y_conv))
train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
correct_prediction = tf.equal(tf.argmax(y_conv,1), tf.argmax(y_,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
sess.run(tf.initialize_all_variables())
for i in range(20000):
  batch = mnist.train.next_batch(50)
  if i%100 == 0:
    train_accuracy = accuracy.eval(feed_dict={
        x:batch[0], y_: batch[1], keep_prob: 1.0})
    print "step %d, training accuracy %g"%(i, train_accuracy)
  train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})

print "test accuracy %g"%accuracy.eval(feed_dict={
    x: mnist.test.images, y_: mnist.test.labels, keep_prob: 1.0})
```

The final test set accuracy after running this code should be approximately 99.2%.

We have now learned how to quickly build, train, and evaluate a fairly sophisticated deep learning model using TensorFlow.

Original: [Deep MNIST for Experts](http://tensorflow.org/tutorials/mnist/pros/index.html) Translation: [chenweican](https://github.com/chenweican) Proofreading: [HongyangWang](https://github.com/wanghong-yang)

================================================
FILE: SOURCE/tutorials/mnist_tf.md
================================================
# TensorFlow Mechanics 101

Code: [tensorflow/g3doc/tutorials/mnist/](https://github.com/tensorflow/tensorflow/tree/master/tensorflow/examples/tutorials/mnist)

The goal of this tutorial is to show how to use TensorFlow to train and evaluate a simple feed-forward neural network for handwritten digit classification using the (classic) MNIST data set. The intended audience is experienced machine learning users interested in using TensorFlow.

These tutorials are therefore not intended for teaching the basics of machine learning.

Before starting, please make sure you have followed the instructions in [Install TensorFlow](../get_started/os_setup.md).

## Tutorial Files

This tutorial references the following files:

File | Purpose
--- | ---
[`mnist.py`](https://raw.githubusercontent.com/tensorflow/tensorflow/master/tensorflow/examples/tutorials/mnist/mnist.py) | The code to build a fully connected MNIST model.
[`fully_connected_feed.py`](https://raw.githubusercontent.com/tensorflow/tensorflow/master/tensorflow/examples/tutorials/mnist/fully_connected_feed.py) | The main code to train the built MNIST model against the downloaded dataset, using a feed dictionary to supply the input.

Simply run the `fully_connected_feed.py` file directly to start training:

`python fully_connected_feed.py`

## Prepare the Data

MNIST is a classic problem in machine learning. The problem is to look at greyscale 28x28 pixel images of handwritten digits and determine which of the digits 0 to 9 each image represents.

![MNIST handwritten digits](.././images/mnist_digits.png "MNIST handwritten digits")

For more information, refer to [Yann LeCun's MNIST page](http://yann.lecun.com/exdb/mnist/) or [Chris Olah's visualizations of MNIST](http://colah.github.io/posts/2014-10-Visualizing-MNIST/).

### Download

At the top of the `run_training()` method, the `input_data.read_data_sets()` function ensures that the correct data has been downloaded to your local training folder, then unpacks that data and returns a dictionary of `DataSet` instances.

```python
data_sets = input_data.read_data_sets(FLAGS.train_dir, FLAGS.fake_data)
```

**NOTE**: The `fake_data` flag is used for unit testing and may be safely ignored by the reader.

Dataset | Purpose
--- | ---
`data_sets.train` | 55000 images and labels, for primary training.
`data_sets.validation` | 5000 images and labels, for iterative validation of training accuracy.
`data_sets.test` | 10000 images and labels, for final testing of trained accuracy.

For more information about the data, please read the [Download](mnist/download/index.md) tutorial in this series.

### Inputs and Placeholders

The `placeholder_inputs()` function creates two [`tf.placeholder`](../api_docs/python/io_ops.md#placeholder) ops that define the shape of the inputs to the graph, including the `batch_size`, and into which the actual training examples will later be fed.

```python
images_placeholder = tf.placeholder(tf.float32, shape=(batch_size,
                                                       IMAGE_PIXELS))
labels_placeholder = tf.placeholder(tf.int32, shape=(batch_size))
```

Further down, in the training loop, the full image and label datasets are sliced to fit the `batch_size` for each step and matched with these placeholder ops. The data is then passed into the `sess.run()` function using the `feed_dict` parameter.

## Build the Graph

After creating placeholders for the data, the graph is built from the `mnist.py` file according to a three-stage pattern: `inference()`, `loss()`, and `training()`.

1. `inference()` - Builds the graph as far as is required for running the network forward to make predictions.
2. `loss()` - Adds to the inference graph the ops required to generate loss.
3. `training()` - Adds to the loss graph the ops required to compute and apply gradients.
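The three-stage split can be illustrated with a deliberately tiny plain-Python analogy. Everything below is an assumption made for illustration: the functions merely share names with the three stages, they compute numbers directly instead of adding ops to a TensorFlow graph, and the one-weight linear model and squared-error loss are stand-ins for the network and cross-entropy in `mnist.py`.

```python
def inference(x, weight, bias):
    """Stage 1: run the (toy) model forward to get a prediction."""
    return weight * x + bias


def loss(prediction, target):
    """Stage 2: measure how bad the prediction is (squared error here)."""
    return (prediction - target) ** 2


def training(x, target, weight, bias, learning_rate):
    """Stage 3: compute gradients of the loss and apply one descent step."""
    error = inference(x, weight, bias) - target
    grad_weight = 2 * error * x
    grad_bias = 2 * error
    return (weight - learning_rate * grad_weight,
            bias - learning_rate * grad_bias)


weight, bias = 0.0, 0.0
for _ in range(200):
    weight, bias = training(2.0, 4.0, weight, bias, learning_rate=0.05)

final_loss = loss(inference(2.0, weight, bias), 4.0)
print(final_loss)  # very close to zero after 200 steps
```

Each stage builds on the output of the previous one, which mirrors how `loss()` extends the inference graph and `training()` extends the loss graph.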
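### Inference

The `inference()` function builds the graph as far as needed to return the tensor that contains the output predictions.

It takes the images placeholder as input and builds on top of it a pair of fully connected layers with ReLu (Rectified Linear Units) activation, followed by a ten-node linear layer specifying the output logits.

Each layer is created beneath a unique [`tf.name_scope`](../api_docs/python/framework.md#name_scope) that acts as a prefix to the items created within that scope.

```python
with tf.name_scope('hidden1') as scope:
```

Within the defined scope, the weights and biases used by each layer are generated as [`tf.Variable`](../api_docs/python/state_ops.md#Variable) instances with their desired shapes.

```python
weights = tf.Variable(
    tf.truncated_normal([IMAGE_PIXELS, hidden1_units],
                        stddev=1.0 / math.sqrt(float(IMAGE_PIXELS))),
    name='weights')
biases = tf.Variable(tf.zeros([hidden1_units]),
                     name='biases')
```

When, for instance, these are created under the `hidden1` scope, the unique name given to the weights variable is "`hidden1/weights`".

Each variable is given initializer ops as part of its construction.

In this most common case, the weights are initialized with [`tf.truncated_normal`](../api_docs/python/constant_op.md#truncated_normal) and given a 2-D tensor shape, where the first dimension is the number of units in the layer the weights connect from, and the second dimension is the number of units in the layer the weights connect to. For the first layer, named `hidden1`, the dimensions are `[IMAGE_PIXELS, hidden1_units]`, because the weights connect the image inputs to the `hidden1` layer. The `tf.truncated_normal` initializer generates a random distribution with a given mean and standard deviation.

The biases are then initialized with [`tf.zeros`](../api_docs/python/constant_op.md#zeros) to ensure that they all start at zero, and their shape is simply the number of units in the layer to which they connect.

The graph's three primary ops are then created: two [`tf.nn.relu`](../api_docs/python/nn.md#relu) ops wrapping the [`tf.matmul`](../api_docs/python/math_ops.md#matmul) for the hidden layers, and one extra `tf.matmul` for the logits. They are created in turn, with their `tf.Variable` instances connected to the input placeholder or to the output tensor of the layer beneath each.

```python
hidden1 = tf.nn.relu(tf.matmul(images, weights) + biases)
```

```python
hidden2 = tf.nn.relu(tf.matmul(hidden1, weights) + biases)
```

```python
logits = tf.matmul(hidden2, weights) + biases
```

Finally, the `logits` tensor that contains the output is returned.

### Loss

The `loss()` function further builds the graph by adding the required loss ops.

First, the values from `labels_placeholder` are encoded as a tensor of 1-hot values. For example, if the class identifier is "3", the value is converted to: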
`[0, 0, 0, 1, 0, 0, 0, 0, 0, 0]`

```python
batch_size = tf.size(labels)
labels = tf.expand_dims(labels, 1)
indices = tf.expand_dims(tf.range(0, batch_size, 1), 1)
concated = tf.concat(1, [indices, labels])
onehot_labels = tf.sparse_to_dense(
    concated, tf.pack([batch_size, NUM_CLASSES]), 1.0, 0.0)
```

A [`tf.nn.softmax_cross_entropy_with_logits`](../api_docs/python/nn.md#softmax_cross_entropy_with_logits) op is then added to compare the output logits from the `inference()` function with the 1-hot labels.

```python
cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits,
                                                        onehot_labels,
                                                        name='xentropy')
```

It then uses [`tf.reduce_mean`](../api_docs/python/math_ops.md#reduce_mean) to average the cross-entropy values across the batch dimension (the first dimension) and takes that value as the total loss.

```python
loss = tf.reduce_mean(cross_entropy, name='xentropy_mean')
```

Finally, the tensor that contains the loss value is returned.

> Note: Cross-entropy is an idea from information theory that lets us describe how bad it is to believe the predictions of the neural network, given what is actually true. For more information, read the blog post Visual Information Theory (http://colah.github.io/posts/2015-09-Visual-Information/)

### Training

The `training()` function adds the operations needed to minimize the loss via gradient descent.

First, it takes the loss tensor from the `loss()` function and hands it to a [`tf.scalar_summary`](../api_docs/python/train.md#scalar_summary), an op that generates summary values into the events file when used with a `SummaryWriter` (see below). In this tutorial, it emits the snapshot value of the loss tensor every time the summaries are written out.

```python
tf.scalar_summary(loss.op.name, loss)
```

Next, we instantiate a [`tf.train.GradientDescentOptimizer`](../api_docs/python/train.md#GradientDescentOptimizer) responsible for applying gradients with the requested learning rate.

```python
optimizer = tf.train.GradientDescentOptimizer(FLAGS.learning_rate)
```

We then create a variable that holds a counter for the global training step, and use the [`minimize()`](../api_docs/python/train.md#Optimizer.minimize) op both to update the trainable weights in the system and to increment the global step. By convention, this op is known as the `train_op`, and it is what a TensorFlow session must run in order to induce one full step of training (see below).

```python
global_step = tf.Variable(0, name='global_step', trainable=False)
train_op = optimizer.minimize(loss, global_step=global_step)
```

Finally, the tensor containing the output of the training op is returned.

## Train the Model

Once the graph is built, it can be iteratively trained and evaluated in a loop controlled by the user code in `fully_connected_feed.py`.

### The Graph
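At the top of the `run_training()` function is a Python `with` command indicating that all of the built ops are to be associated with the default global [`tf.Graph`](../api_docs/python/framework.md#Graph) instance.

```python
with tf.Graph().as_default():
```

A `tf.Graph` is a collection of ops that may be executed together as a group. Most TensorFlow uses only need to rely on the single default graph.

More complicated uses with multiple graphs are possible, but they are beyond the scope of this tutorial.

### The Session

Once all of the build preparation has been completed and all of the necessary ops have been generated, a [`tf.Session`](../api_docs/python/client.md#Session) is created for running the graph.

```python
sess = tf.Session()
```

Alternatively, a `Session` may be generated in a `with` block to limit its scope:

```python
with tf.Session() as sess:
```

Passing no parameters to `Session` indicates that this code will attach to (or create, if it does not yet exist) the default local session.

Immediately after creating the session, all of the `tf.Variable` instances are initialized by calling [`sess.run()`](../api_docs/python/client.md#Session.run) on their initialization op.

```python
init = tf.initialize_all_variables()
sess.run(init)
```

The [`sess.run()`](../api_docs/python/client.md#Session.run) method runs the complete subset of the graph that corresponds to the op(s) passed as parameters. In this first call, the `init` op is a [`tf.group`](../api_docs/python/control_flow_ops.md#group) that contains only the initializers for the variables. None of the rest of the graph is run here; that happens in the training loop below.

### Train Loop

After initializing the variables within the session, training may begin.

The user code controls the training per step, and the simplest loop that can do useful training is:

```python
for step in xrange(max_steps):
    sess.run(train_op)
```

The example in this tutorial is slightly more complicated, because it must also slice up the input data for each step to match the previously generated placeholders.

#### Feed the Graph

For each step, the code generates a feed dictionary that contains the set of examples on which to train for that step, keyed by the placeholder ops they represent.

The `fill_feed_dict` function queries the given `DataSet` for its next `batch_size` set of images and labels, and tensors matching the placeholders are filled with those images and labels.

```python
images_feed, labels_feed = data_set.next_batch(FLAGS.batch_size)
```

A Python dictionary object is then created with the placeholders as keys and the corresponding feed tensors as values.

```python
feed_dict = {
    images_placeholder: images_feed,
    labels_placeholder: labels_feed,
}
```

This dictionary is passed into the `sess.run()` function as the `feed_dict` parameter to provide the input examples for this step of training.

#### Check the Status

The code specifies two values to fetch in its `sess.run` call: `[train_op, loss]`.

```python
for step in xrange(FLAGS.max_steps):
    feed_dict = fill_feed_dict(data_sets.train,
                               images_placeholder,
                               labels_placeholder)
    _, loss_value = sess.run([train_op, loss],
                             feed_dict=feed_dict)
```

Because there are two values to fetch, `sess.run()` returns a tuple with two items. Each `Tensor` in the list of values to fetch corresponds to a numpy array in the returned tuple, filled with the value of that tensor during this step of training. Since `train_op` is an op with no output value, the corresponding element in the returned tuple is `None` and is discarded. The value of the `loss` tensor, however, may become NaN if the model diverges during training, so we capture this value for logging.

Assuming that training runs fine without NaNs, the training loop also prints a simple status line every 100 steps to let the user know the state of training.

```python
if step % 100 == 0:
    print 'Step %d: loss = %.2f (%.3f sec)' % (step, loss_value, duration)
```

#### Visualize the Status

In order to emit the events files used by [TensorBoard](../how_tos/summaries_and_tensorboard.md), all of the summaries (in this case, only one) are collected into a single op during the graph building phase.

```python
summary_op = tf.merge_all_summaries()
```

Then, after the session is created, a [`tf.train.SummaryWriter`](../api_docs/python/train.md#SummaryWriter) may be instantiated to write the events files, which contain both the graph itself and the values of the summaries.

```python
summary_writer = tf.train.SummaryWriter(FLAGS.train_dir,
                                        graph_def=sess.graph_def)
```

Lastly, the events file is updated with new summary values every time `summary_op` is run and its output is passed to the writer's `add_summary()` function.

```python
summary_str = sess.run(summary_op, feed_dict=feed_dict)
summary_writer.add_summary(summary_str, step)
```

When the events files are written, TensorBoard may be run against the training folder to display the values from the summaries.

![MNIST TensorBoard](../images/mnist_tensorboard.png "MNIST TensorBoard")

**NOTE**: For more information about how to build and run TensorBoard, please see the accompanying tutorial [TensorBoard: Visualizing Your Training](../how_tos/summaries_and_tensorboard.md).

#### Save a Checkpoint

In order to emit a checkpoint file that may be used later to restore a model for further training or evaluation, we instantiate a [`tf.train.Saver`](../api_docs/python/state_ops.md#Saver).

```python
saver = tf.train.Saver()
```

In the training loop, the [`saver.save()`](../api_docs/python/state_ops.md#Saver.save) method is called periodically to write a checkpoint file to the training directory with the current values of all the trainable variables.

```python
saver.save(sess, FLAGS.train_dir, global_step=step)
```

At some later point, training can be resumed by using the [`saver.restore()`](../api_docs/python/state_ops.md#Saver.restore) method to reload the model parameters.

```python
saver.restore(sess, FLAGS.train_dir)
```

## Evaluate the Model

Every thousand steps, the code attempts to evaluate the model against both the training and test datasets. The `do_eval` function is called three times, for the training, validation, and test datasets.

```python
print 'Training Data Eval:'
do_eval(sess,
        eval_correct,
        images_placeholder,
        labels_placeholder,
        data_sets.train)
print 'Validation Data Eval:'
do_eval(sess,
        eval_correct,
        images_placeholder,
        labels_placeholder,
        data_sets.validation)
print 'Test Data Eval:'
do_eval(sess,
        eval_correct,
        images_placeholder,
        labels_placeholder,
        data_sets.test)
```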
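> Note that more complicated usage would usually sequester `data_sets.test`, checking it only after significant amounts of hyperparameter tuning. Because MNIST is a relatively simple problem, however, we evaluate against all of the data here.

### Build the Eval Graph

Before opening the default graph, the test data should be fetched by calling the `get_data(train=False)` function.

```python
test_all_images, test_all_labels = get_data(train=False)
```

Before entering the training loop, the Eval op should be built by calling the `evaluation` function from `mnist.py` with the same logits and labels parameters as the `loss` function.

```python
eval_correct = mnist.evaluation(logits, labels_placeholder)
```

The `evaluation` function generates a [`tf.nn.in_top_k`](../api_docs/python/nn.md#in_top_k) op, which marks a model output as correct if the true label can be found among the K most likely predictions. In this case, we set the value of K to 1, so a prediction is considered correct only if it is the true label.

```python
eval_correct = tf.nn.in_top_k(logits, labels, 1)
```

### Eval Output

We can then create a loop that fills a `feed_dict` and calls `sess.run()` against the `eval_correct` op to evaluate the model on the given dataset.

```python
for step in xrange(steps_per_epoch):
    feed_dict = fill_feed_dict(data_set,
                               images_placeholder,
                               labels_placeholder)
    true_count += sess.run(eval_correct, feed_dict=feed_dict)
```

The `true_count` variable accumulates all of the predictions that the `in_top_k` op has determined to be correct. From there, the precision is calculated by dividing the number of correct predictions by the total number of examples.

```python
precision = float(true_count) / float(num_examples)
print '  Num examples: %d  Num correct: %d  Precision @ 1: %0.02f' % (
    num_examples, true_count, precision)
```

> Original: [TensorFlow Mechanics 101](http://www.tensorflow.org/tutorials/mnist/tf/index.md)
> Translation: [bingjin](https://github.com/bingjin)
> Proofreading: [LichAmnesia](https://github.com/LichAmnesia)

================================================
FILE: SOURCE/tutorials/overview.md
================================================
# Overview

## MNIST For ML Beginners

If you are new to machine learning, we recommend starting here. By working through a classic problem, handwritten digit classification (MNIST), you will get an intuitive feel for multiclass classification.

[View Tutorial](../tutorials/mnist_beginners.md)

## Deep MNIST for Experts

If you are already familiar with other deep learning software packages, and are already familiar with MNIST, this tutorial gives you a first introduction to TensorFlow.

[View Tutorial](../tutorials/mnist_pros.md)

## TensorFlow Mechanics 101

This is a technical tutorial that walks through the details of using TensorFlow infrastructure to train models at scale. It continues to use MNIST as the example.

[View Tutorial](../tutorials/mnist_tf.md)

## Convolutional Neural Networks

This tutorial shows how to use TensorFlow to train a convolutional neural network on the CIFAR-10 data set.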
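Convolutional nets are a model tailored to image recognition. Compared with other models, they exploit translation invariance to yield more compact and effective representations of visual content.

[View Tutorial](../tutorials/deep_cnn.md)

## Vector Representations of Words

This tutorial motivates why it is useful to learn to represent words as vectors, called word embeddings. It introduces the word2vec model as an efficient method for learning embeddings. It also covers the high-level details behind noise-contrastive training methods, the biggest recent advance in training embeddings.

[View Tutorial](../tutorials/word2vec.md)

## Recurrent Neural Networks (RNN)

An introduction to RNNs, in which we train an LSTM network to predict the next word in an English sentence (a task sometimes called language modeling).

[View Tutorial](../tutorials/recurrent.md)

## Sequence-to-Sequence Models

A follow-on to the RNN tutorial, in which we use a sequence-to-sequence model for machine translation. You will learn to build a fully machine-learned, end-to-end `English-French` translator.

[View Tutorial](../tutorials/seq2seq.md)

## Mandelbrot Set

TensorFlow can be used for computation that has nothing to do with machine learning. Here is a naive implementation of Mandelbrot set visualization.

[View Tutorial](../tutorials/mandelbrot.md)

## Partial Differential Equations

As another example of non-machine-learning computation, we use a naive PDE implementation to simulate raindrops landing on a pond.

[View Tutorial](../tutorials/pdes.md)

## MNIST Data Download

A detailed tutorial on downloading the MNIST handwritten digits data set.

[View Tutorial](../tutorials/mnist_download.md)

## Visual Object Recognition

We will be releasing our already trained, state-of-the-art Inception object recognition model. Stay tuned...

## Deep Dream Visual Hallucinations

We will release a TensorFlow version of [Deep Dream](https://github.com/google/deepdream), a neural network visual hallucination software built on the Inception recognition model. Stay tuned...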
> Original: [Overview](http://tensorflow.org/tutorials) Translation: [@doc001](https://github.com/PFZheng) Proofreading: [@eric_xu](https://github.com/ericxk)

================================================
FILE: SOURCE/tutorials/pdes/index.md
================================================
# Partial Differential Equations

***TensorFlow*** is not just for machine learning; it can also be used for simulation. Here we walk through the basics of simulating a partial differential equation in ***TensorFlow***, using the example of a few raindrops landing on the surface of a square pond.

>Note: This tutorial was originally prepared as an **IPython** notebook.
>>Translator's note: for background on partial differential equations, the translator recommends the course [**MIT OpenCourseWare: Multivariable Calculus**](http://open.163.com/special/opencourse/multivariable.html) on [**NetEase Open Courses**](http://open.163.com/).

## Basic Setup

First, we need a few imports.

```python
#Import libraries for simulation
import tensorflow as tf
import numpy as np

#Imports for visualization
import PIL.Image
from cStringIO import StringIO
from IPython.display import clear_output, Image, display
```

Next, we need a function that displays the state of the pond's surface as an image.

```python
def DisplayArray(a, fmt='jpeg', rng=[0,1]):
  """Display an array as a picture."""
  a = (a - rng[0])/float(rng[1] - rng[0])*255
  a = np.uint8(np.clip(a, 0, 255))
  f = StringIO()
  PIL.Image.fromarray(a).save(f, fmt)
  display(Image(data=f.getvalue()))
```

Finally, for convenience in playing around, we open an interactive ***TensorFlow*** session. If the code is to be reused later, it could equally be placed in an executable ***Python*** file.

```python
sess = tf.InteractiveSession()
```

## Computational Convenience Functions

```python
def make_kernel(a):
  """Transform a 2D array into a convolution kernel"""
  a = np.asarray(a)
  a = a.reshape(list(a.shape) + [1,1])
  return tf.constant(a, dtype=1)

def simple_conv(x, k):
  """A simplified 2D convolution operation"""
  x = tf.expand_dims(tf.expand_dims(x, 0), -1)
  y = tf.nn.depthwise_conv2d(x, k, [1, 1, 1, 1], padding='SAME')
  return y[0, :, :, 0]

def laplace(x):
  """Compute the 2D laplacian of an array"""
  laplace_k = make_kernel([[0.5, 1.0, 0.5],
                           [1.0, -6., 1.0],
                           [0.5, 1.0, 0.5]])
  return simple_conv(x, laplace_k)
```

## Define the PDE

First, we create a perfect 500 × 500 square pond, just like the ones found in nature.
```python
N = 500
```

Next, we create the pond and hit it with a few raindrops.

```python
# Initial Conditions -- some rain drops hit a pond

# Set everything to zero
u_init = np.zeros([N, N], dtype="float32")
ut_init = np.zeros([N, N], dtype="float32")

# Some rain drops hit a pond at random points
for n in range(40):
  a,b = np.random.randint(0, N, 2)
  u_init[a,b] = np.random.uniform()

DisplayArray(u_init, rng=[-0.1, 0.1])
```

![jpeg](pde_output_1.jpg)

Now let's specify the details of the differential equation.

```python
# Parameters:
# eps -- time resolution
# damping -- wave damping
eps = tf.placeholder(tf.float32, shape=())
damping = tf.placeholder(tf.float32, shape=())

# Create variables for simulation state
U  = tf.Variable(u_init)
Ut = tf.Variable(ut_init)

# Discretized PDE update rules
U_ = U + eps * Ut
Ut_ = Ut + eps * (laplace(U) - damping * Ut)

# Operation to update the state
step = tf.group(
  U.assign(U_),
  Ut.assign(Ut_))
```

## Running the Simulation

To watch the simulation unfold, we run it with a simple **for** loop.

```python
# Initialize state to initial conditions
tf.initialize_all_variables().run()

# Run 1000 steps of PDE
for i in range(1000):
  # Step simulation
  step.run({eps: 0.03, damping: 0.04})
  # Visualize every 50 steps
  if i % 50 == 0:
    clear_output()
    DisplayArray(U.eval(), rng=[-0.1, 0.1])
```

![jpeg](pde_output_2.jpg)

Look!
The raindrops land on the pond and countless ripples spread out, just as in real life.

> Original: [http://tensorflow.org/tutorials/pdes/index.md](http://tensorflow.org/tutorials/pdes/index.md) Translation: [@wangaicc](https://github.com/wangaicc) Proofreading:

================================================
FILE: SOURCE/tutorials/pdes.md
================================================
# Partial Differential Equations

***TensorFlow*** is not only for machine learning; it can also be used for simulation. Here we walk through a basic example of using ***TensorFlow*** to simulate a partial differential equation: the surface of a square pond as a few raindrops land on it.

> Note: This tutorial was originally prepared as an **IPython** notebook.
>
>> Translator's note: For background on partial differential equations, the translator recommends the course [**MIT OpenCourseWare: Multivariable Calculus**](http://open.163.com/special/opencourse/multivariable.html) on [**NetEase Open Courses**](http://open.163.com/).

## Basic Setup

First, we need a few imports.

```python
# Import libraries for simulation
import tensorflow as tf
import numpy as np

# Imports for visualization
import PIL.Image
from cStringIO import StringIO
from IPython.display import clear_output, Image, display
```

Next, we need a function that displays the state of the pond's surface as an image.

```python
def DisplayArray(a, fmt='jpeg', rng=[0,1]):
  """Display an array as a picture."""
  a = (a - rng[0])/float(rng[1] - rng[0])*255
  a = np.uint8(np.clip(a, 0, 255))
  f = StringIO()
  PIL.Image.fromarray(a).save(f, fmt)
  display(Image(data=f.getvalue()))
```

Finally, for convenience we open a ***TensorFlow*** interactive session. To make the code easier to reuse later, it can also be placed in an executable ***Python*** file.

```python
sess = tf.InteractiveSession()
```

## Defining the Computation Functions

```python
def make_kernel(a):
  """Transform a 2D array into a convolution kernel"""
  a = np.asarray(a)
  a = a.reshape(list(a.shape) + [1,1])
  return tf.constant(a, dtype=tf.float32)

def simple_conv(x, k):
  """A simplified 2D convolution operation"""
  x = tf.expand_dims(tf.expand_dims(x, 0), -1)
  y = tf.nn.depthwise_conv2d(x, k, [1, 1, 1, 1], padding='SAME')
  return y[0, :, :, 0]

def laplace(x):
  """Compute the 2D laplacian of an array"""
  laplace_k = make_kernel([[0.5, 1.0, 0.5],
                           [1.0, -6., 1.0],
                           [0.5, 1.0, 0.5]])
  return simple_conv(x, laplace_k)
```

## Defining the PDE

First, we create a perfect 500 × 500 square pond, just like the ones found in nature.

```python
N = 500
```

Next, we create the pond and hit it with a few raindrops.

```python
# Initial Conditions -- some rain drops hit a pond

# Set everything to
# zero
u_init = np.zeros([N, N], dtype="float32")
ut_init = np.zeros([N, N], dtype="float32")

# Some rain drops hit a pond at random points
for n in range(40):
  a,b = np.random.randint(0, N, 2)
  u_init[a,b] = np.random.uniform()

DisplayArray(u_init, rng=[-0.1, 0.1])
```

![jpeg](../images/pde_output_1.jpg)

Now let's specify the details of the differential equation.

```python
# Parameters:
# eps -- time resolution
# damping -- wave damping
eps = tf.placeholder(tf.float32, shape=())
damping = tf.placeholder(tf.float32, shape=())

# Create variables for simulation state
U  = tf.Variable(u_init)
Ut = tf.Variable(ut_init)

# Discretized PDE update rules
U_ = U + eps * Ut
Ut_ = Ut + eps * (laplace(U) - damping * Ut)

# Operation to update the state
step = tf.group(
  U.assign(U_),
  Ut.assign(Ut_))
```

## Running the Simulation

To watch the simulation unfold, we run it with a simple **for** loop.

```python
# Initialize state to initial conditions
tf.initialize_all_variables().run()

# Run 1000 steps of PDE
for i in range(1000):
  # Step simulation
  step.run({eps: 0.03, damping: 0.04})
  # Visualize every 50 steps
  if i % 50 == 0:
    clear_output()
    DisplayArray(U.eval(), rng=[-0.1, 0.1])
```

![jpeg](../images/pde_output_2.jpg)

Look! The raindrops land on the pond and ripples spread out, just as in real life.

> Original: [http://tensorflow.org/tutorials/pdes/index.md](http://tensorflow.org/tutorials/pdes/index.md) Translation: [@wangaicc](https://github.com/wangaicc) Proofreading: [@tensorfly](https://github.com/tensorfly)

================================================
FILE: SOURCE/tutorials/recurrent/index.md
================================================
# Recurrent Neural Networks

## Introduction

Take a look at [this great article](http://colah.github.io/posts/2015-08-Understanding-LSTMs/) for an introduction to recurrent neural networks and LSTMs in particular.

## Language Modeling

In this tutorial we will show how to train a recurrent neural network on the challenging task of language modeling. The goal of the problem is to fit a probabilistic model which assigns probabilities to sentences.
It does so by predicting the next words in a text given a history of previous words. For this purpose we will use the Penn Tree Bank (PTB) dataset, which is a popular benchmark for measuring the quality of these models, whilst being small and relatively fast to train.

Language modeling is key to many interesting problems such as speech recognition, machine translation, or image captioning. It is also fun; take a look [here](http://karpathy.github.io/2015/05/21/rnn-effectiveness/).

For the purpose of this tutorial, we will reproduce the results from [Zaremba et al., 2014](http://arxiv.org/abs/1409.2329), which achieves very good results on the PTB dataset.

## Tutorial Files

This tutorial references the following files from `models/rnn/ptb`:

File | Purpose
--- | ---
`ptb_word_lm.py` | The code to train a language model on the PTB dataset.
`reader.py` | The code to read the dataset.

## Download and Prepare the Data

The data required for this tutorial is in the data/ directory of the PTB dataset from Tomas Mikolov's webpage: http://www.fit.vutbr.cz/~imikolov/rnnlm/simple-examples.tgz

The dataset is already preprocessed and contains 10000 different words overall, including the end-of-sentence marker and a special symbol (`<unk>`) for rare words. In `reader.py` we convert all of them to unique integer identifiers to make them easy for the neural network to process.

## The Model

### LSTM

The core of the model consists of an LSTM cell that processes one word at a time and computes probabilities of the possible continuations of the sentence. The memory state of the network is initialized with a vector of zeros and gets updated after reading each word. Also, for computational reasons, we will process data in mini-batches of size `batch_size`.

The basic pseudocode looks as follows:

```python
lstm = rnn_cell.BasicLSTMCell(lstm_size)
# Initial state of the LSTM memory.
state = tf.zeros([batch_size, lstm.state_size])

loss = 0.0
for current_batch_of_words in words_in_dataset:
    # The value of state is updated after processing each batch of words.
    output, state = lstm(current_batch_of_words, state)

    # The LSTM output can be used to make next word predictions
    logits = tf.matmul(output, softmax_w) + softmax_b
    probabilities = tf.nn.softmax(logits)
    loss += loss_function(probabilities, target_words)
```

### Truncated Backpropagation

In order to make the learning process tractable, it is common practice to truncate the gradients for backpropagation to a fixed number (`num_steps`) of unrolled steps. This is easy to implement by feeding inputs of length `num_steps` at a time and doing a backward pass after each iteration.

A simplified version of the code for the graph creation for truncated backpropagation:

```python
# Placeholder for the inputs in a given iteration.
words = tf.placeholder(tf.int32, [batch_size, num_steps])

lstm = rnn_cell.BasicLSTMCell(lstm_size)
# Initial state of the LSTM memory.
initial_state = state = tf.zeros([batch_size, lstm.state_size])

for i in range(num_steps):
    # The value of state is updated after processing each batch of words.
    output, state = lstm(words[:, i], state)

    # The rest of the code.
    # ...

final_state = state
```

And this is how to implement an iteration over the whole dataset:

```python
# A numpy array holding the state of LSTM after each batch of words.
numpy_state = initial_state.eval()
total_loss = 0.0
for current_batch_of_words in words_in_dataset:
    numpy_state, current_loss = session.run([final_state, loss],
        # Initialize the LSTM state from the previous iteration.
        feed_dict={initial_state: numpy_state, words: current_batch_of_words})
    total_loss += current_loss
```

### Inputs

The word IDs will be embedded into a dense representation (see the [Vector Representations Tutorial](../../tutorials/word2vec/index.md)) before feeding to the LSTM.
This allows the model to efficiently represent the knowledge about particular words. It is also easy to write:

```python
# embedding_matrix is a tensor of shape [vocabulary_size, embedding size]
word_embeddings = tf.nn.embedding_lookup(embedding_matrix, word_ids)
```

The embedding matrix will be initialized randomly and the model will learn to differentiate the meaning of words just by looking at the data.

### Loss Function

We want to minimize the average negative log probability of the target words:

$$ \text{loss} = -\frac{1}{N}\sum_{i=1}^{N} \ln p_{\text{target}_i} $$

It is not very difficult to implement but the function `sequence_loss_by_example` is already available, so we can just use it here.

The typical measure reported in the papers is average per-word perplexity (often just called perplexity), which is equal to

$$e^{-\frac{1}{N}\sum_{i=1}^{N} \ln p_{\text{target}_i}} = e^{\text{loss}} $$

and we will monitor its value throughout the training process.

### Stacking multiple LSTMs

To give the model more expressive power, we can add multiple layers of LSTMs to process the data. The output of the first layer will become the input of the second and so on.

We have a class called `MultiRNNCell` that makes the implementation seamless:

```python
lstm = rnn_cell.BasicLSTMCell(lstm_size)
stacked_lstm = rnn_cell.MultiRNNCell([lstm] * number_of_layers)

initial_state = state = stacked_lstm.zero_state(batch_size, tf.float32)
for i in range(num_steps):
    # The value of state is updated after processing each batch of words.
    output, state = stacked_lstm(words[:, i], state)

    # The rest of the code.
    # ...

final_state = state
```

## Compile and Run the Code

First, the library needs to be built.
To compile it on CPU:

```
bazel build -c opt tensorflow/models/rnn/ptb:ptb_word_lm
```

And if you have a fast GPU, run the following:

```
bazel build -c opt --config=cuda tensorflow/models/rnn/ptb:ptb_word_lm
```

Now we can run the model:

```
bazel-bin/tensorflow/models/rnn/ptb/ptb_word_lm \
  --data_path=/tmp/simple-examples/data/ --alsologtostderr --model small
```

There are 3 supported model configurations in the tutorial code: "small", "medium" and "large". They differ in the size of the LSTMs and in the set of hyperparameters used for training.

The larger the model, the better the results it should get. The `small` model should be able to reach perplexity below 120 on the test set and the `large` one below 80, though it might take several hours to train.

## What Next?

There are several tricks that we haven't mentioned that make the model better, including:

* decreasing learning rate schedule,
* dropout between the LSTM layers.

Study the code and modify it to improve the model even further.

================================================
FILE: SOURCE/tutorials/recurrent.md
================================================
# Recurrent Neural Networks

## Introduction

See [this great article](http://colah.github.io/posts/2015-08-Understanding-LSTMs/) for an introduction to recurrent neural networks (RNNs) and LSTMs.

## Language Modeling

This tutorial shows how to train a recurrent neural network on the challenging task of language modeling. The goal of the problem is to obtain a probabilistic model that assigns probabilities to sentences. It does so by predicting the next words given the words that came before. We will use the PTB (Penn Tree Bank) dataset, a popular benchmark for measuring such models that is also small and relatively fast to train.

Language modeling is key to many interesting problems such as speech recognition, machine translation, and image captioning. It is also fun; take a look [here](http://karpathy.github.io/2015/05/21/rnn-effectiveness/).

The aim of this tutorial is to reproduce the results of [Zaremba et al., 2014](http://arxiv.org/abs/1409.2329), who achieved very good results on the PTB dataset.

## Tutorial Files

This tutorial uses the following files from `models/rnn/ptb`:

File | Purpose
--- | ---
`ptb_word_lm.py` | Trains a language model on the PTB dataset.
`reader.py` | Reads the dataset.
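## Download and Prepare the Data

The data required for this tutorial is in the data/ directory of the PTB dataset from Tomas Mikolov's webpage: `http://www.fit.vutbr.cz/~imikolov/rnnlm/simple-examples.tgz`.

The dataset is already preprocessed and contains 10000 different words overall, including the end-of-sentence marker and a special symbol (`<unk>`) for rare words. In `reader.py` we convert all of the words to unique integer identifiers so that the neural network can process them easily.

## The Model

### LSTM

The core of the model consists of an LSTM cell that processes one word at a time and computes the probabilities of the possible continuations of the sentence. The memory state of the network is initialized with a vector of zeros and is updated after each word is read. Also, for computational reasons, we process the data in mini-batches of size `batch_size`.

The basic pseudocode looks like this:

```python
lstm = rnn_cell.BasicLSTMCell(lstm_size)
# Initial state of the LSTM memory.
state = tf.zeros([batch_size, lstm.state_size])

loss = 0.0
for current_batch_of_words in words_in_dataset:
    # The value of state is updated after processing each batch of words.
    output, state = lstm(current_batch_of_words, state)

    # The LSTM output can be used to make next word predictions
    logits = tf.matmul(output, softmax_w) + softmax_b
    probabilities = tf.nn.softmax(logits)
    loss += loss_function(probabilities, target_words)
```

### Truncated Backpropagation

To keep the learning process tractable, it is common practice to truncate the gradients for backpropagation to a fixed number (`num_steps`) of steps unrolled in time. This is easy to implement by feeding inputs of length `num_steps` at a time and doing a backward pass after each iteration.

A simplified version of the graph-creation code for truncated backpropagation:

```python
# Placeholder for the inputs in a given iteration.
words = tf.placeholder(tf.int32, [batch_size, num_steps])

lstm = rnn_cell.BasicLSTMCell(lstm_size)
# Initial state of the LSTM memory.
initial_state = state = tf.zeros([batch_size, lstm.state_size])

for i in range(num_steps):
    # The value of state is updated after processing each batch of words.
    output, state = lstm(words[:, i], state)

    # The rest of the code.
    # ...

final_state = state
```

And this is how to iterate over the whole dataset:

```python
# A numpy array holding the state of LSTM after each batch of words.
numpy_state = initial_state.eval()
total_loss = 0.0
for current_batch_of_words in words_in_dataset:
    numpy_state, current_loss = session.run([final_state, loss],
        # Initialize the LSTM state from the previous iteration.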
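        feed_dict={initial_state: numpy_state, words: current_batch_of_words})
    total_loss += current_loss
```

### Inputs

Before being fed to the LSTM, the word IDs are embedded into a dense representation (see the [Vector Representations Tutorial](tensorflow-zh/SOURCE/tutorials/word2vec/index.md)). This lets the model represent words efficiently, and it is also easy to write:

```python
# embedding_matrix is a tensor of shape [vocabulary_size, embedding_size]
word_embeddings = tf.nn.embedding_lookup(embedding_matrix, word_ids)
```

The embedding matrix is initialized randomly, and the model learns to differentiate the meanings of words from the data.

### Loss Function

We want to minimize the average negative log probability of the target words:

![](../images/re.png)

This is not hard to implement, and the function `sequence_loss_by_example` is already available, so we can use it directly.

The typical measure reported in papers is the average per-word perplexity, which is computed as

![](../images/re1.png)

and we will monitor the perplexity throughout the training process.

### Stacking Multiple LSTMs

To give the model more expressive power, we can add multiple layers of LSTMs to process the data. The output of the first layer becomes the input of the second, and so on.

The `MultiRNNCell` class makes the implementation seamless:

```python
lstm = rnn_cell.BasicLSTMCell(lstm_size)
stacked_lstm = rnn_cell.MultiRNNCell([lstm] * number_of_layers)

initial_state = state = stacked_lstm.zero_state(batch_size, tf.float32)
for i in range(num_steps):
    # The value of state is updated after processing each batch of words.
    output, state = stacked_lstm(words[:, i], state)

    # The rest of the code.
    # ...

final_state = state
```

## Compile and Run the Code

First the library needs to be built. To compile it on CPU:

```
bazel build -c opt tensorflow/models/rnn/ptb:ptb_word_lm
```

If you have a powerful GPU, you can run:

```
bazel build -c opt --config=cuda tensorflow/models/rnn/ptb:ptb_word_lm
```

Run the model:

```
bazel-bin/tensorflow/models/rnn/ptb/ptb_word_lm \
  --data_path=/tmp/simple-examples/data/ --alsologtostderr --model small
```

The tutorial code supports 3 model configurations: "small", "medium" and "large". They differ in the size of the LSTMs and in the set of hyperparameters used for training.

The larger the model, the better the results it should get. The `small` model should be able to reach a perplexity below 120 on the test set and the `large` one below 80, though it might take several hours to train.

## What Next?

There are several tricks for improving the model that we have not mentioned, including:

* decreasing the learning rate over time,
* dropout between the LSTM layers.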
Keep studying the code and modify it to improve the model even further.

Original: [Recurrent Neural Networks](http://tensorflow.org/tutorials/recurrent/index.md) Translation: [Warln](https://github.com/Warln) Proofreading: [HongyangWang](https://github.com/wanghong-yang)

================================================
FILE: SOURCE/tutorials/seq2seq/index.md
================================================
# Sequence-to-Sequence Models

Recurrent neural networks can learn to model language, as already discussed in the [RNN Tutorial](../../tutorials/recurrent/index.md) (if you did not read it, please go through it before proceeding with this one). This raises an interesting question: could we condition the generated words on some input and generate a meaningful response? For example, could we train a neural network to translate from English to French? It turns out that the answer is *yes*.

This tutorial will show you how to build and train such a system end-to-end. You can start by running this binary.

```
bazel run -c opt <...>/models/rnn/translate/translate.py
  --data_dir [your_data_directory]
```

It will download English-to-French translation data from the [WMT'15 Website](http://www.statmt.org/wmt15/translation-task.html), prepare it for training, and train. It takes about 20GB of disk space, and a while to download and prepare (see [later](#run_it) for details), so you can start and leave it running while reading this tutorial.

This tutorial references the following files from `models/rnn`.

File | What's in it?
--- | ---
`seq2seq.py` | Library for building sequence-to-sequence models.
`translate/seq2seq_model.py` | Neural translation sequence-to-sequence model.
`translate/data_utils.py` | Helper functions for preparing translation data.
`translate/translate.py` | Binary that trains and runs the translation model.
## Sequence-to-Sequence Basics

A basic sequence-to-sequence model, as introduced in [Cho et al., 2014](http://arxiv.org/pdf/1406.1078v3.pdf), consists of two recurrent neural networks (RNNs): an *encoder* that processes the input and a *decoder* that generates the output. This basic architecture is depicted below.
Each box in the picture above represents a cell of the RNN, most commonly a GRU cell or an LSTM cell (see the [RNN Tutorial](../../tutorials/recurrent/index.md) for an explanation of those). Encoder and decoder can share weights or, as is more common, use a different set of parameters. Multi-layer cells have been successfully used in sequence-to-sequence models too, e.g. for translation [Sutskever et al., 2014](http://arxiv.org/abs/1409.3215).

In the basic model depicted above, every input has to be encoded into a fixed-size state vector, as that is the only thing passed to the decoder. To allow the decoder more direct access to the input, an *attention* mechanism was introduced in [Bahdanau et al., 2014](http://arxiv.org/abs/1409.0473). We will not go into the details of the attention mechanism (see the paper); suffice it to say that it allows the decoder to peek into the input at every decoding step. A multi-layer sequence-to-sequence network with LSTM cells and attention mechanism in the decoder looks like this.
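
As a numerical companion to the description above, here is a minimal sketch of a single attention step in plain Python. It is an illustration under stated assumptions, not the code in `seq2seq.py`: the function name `attend` is hypothetical, and it scores the encoder states with a plain dot product for brevity, whereas the paper cited above computes the scores with a small learned network.

```python
import math


def attend(decoder_state, encoder_states):
    """One attention step (hypothetical helper, dot-product scoring assumed).

    Returns softmax weights over the encoder states and the weighted
    average of those states (the "context" the decoder gets to peek at).
    """
    # Score each encoder state by its similarity to the decoder state.
    scores = [sum(d * e for d, e in zip(decoder_state, enc))
              for enc in encoder_states]
    # Numerically stable softmax over the scores.
    max_score = max(scores)
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    weights = [x / total for x in exps]
    # Context vector: weighted sum of the encoder states.
    dim = len(encoder_states[0])
    context = [sum(w * enc[i] for w, enc in zip(weights, encoder_states))
               for i in range(dim)]
    return weights, context


# Three toy encoder states (think of inputs A, B, C) and one decoder state.
weights, context = attend([2.0, 0.0], [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
```

The weights sum to one, and encoder states that resemble the current decoder state contribute more to the context vector. Recomputing them at every decoding step is what lets the decoder look at different parts of the input over time.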
## TensorFlow seq2seq Library

As you can see above, there are many different sequence-to-sequence models. Each of these models can use different RNN cells, but all of them accept encoder inputs and decoder inputs. This motivates the interfaces in the TensorFlow seq2seq library (`models/rnn/seq2seq.py`). The basic RNN encoder-decoder sequence-to-sequence model works as follows.

```python
outputs, states = basic_rnn_seq2seq(encoder_inputs, decoder_inputs, cell)
```

In the above call, `encoder_inputs` is a list of tensors representing inputs to the encoder, i.e., corresponding to the letters *A, B, C* in the first picture above. Similarly, `decoder_inputs` are tensors representing inputs to the decoder, *GO, W, X, Y, Z* in the first picture.

The `cell` argument is an instance of the `models.rnn.rnn_cell.RNNCell` class that determines which cell will be used inside the model. You can use an existing cell, such as `GRUCell` or `LSTMCell`, or you can write your own. Moreover, `rnn_cell` provides wrappers to construct multi-layer cells, add dropout to cell inputs or outputs, or do other transformations; see the [RNN Tutorial](../../tutorials/recurrent/index.md) for examples.

The call to `basic_rnn_seq2seq` returns two values: `outputs` and `states`. Both of them are lists of tensors of the same length as `decoder_inputs`. Naturally, `outputs` correspond to the outputs of the decoder at each time-step; in the first picture above that would be *W, X, Y, Z, EOS*. The returned `states` represent the internal state of the decoder at every time-step.

In many applications of sequence-to-sequence models, the output of the decoder at time t is fed back and becomes the input of the decoder at time t+1. At test time, when decoding a sequence, this is how the sequence is constructed. During training, on the other hand, it is common to provide the correct input to the decoder at every time-step, even if the decoder made a mistake before.
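
The two ways of driving the decoder described above can be sketched with a toy decoder in plain Python. Everything here is hypothetical and exists only to show the control flow: `toy_decoder_step` is a made-up arithmetic stand-in for an RNN cell, tokens are small integers, and the boolean flag simply switches between feeding the provided inputs and feeding back the previous output. It is not the implementation in `seq2seq.py`.

```python
def toy_decoder_step(token, state):
    """Hypothetical stand-in for one decoder step (not a real RNN cell).

    Returns a deterministic "predicted token" and the next state.
    """
    new_state = state + token
    prediction = (new_state * 7 + 3) % 10
    return prediction, new_state


def decode(decoder_inputs, initial_state, feed_previous):
    """Run the toy decoder for len(decoder_inputs) steps.

    feed_previous=False: every step consumes the provided input token.
    feed_previous=True:  only the first input (the GO symbol) is used;
                         later steps consume the previous output instead.
    """
    outputs = []
    state = initial_state
    prev_output = None
    for i, provided_token in enumerate(decoder_inputs):
        if feed_previous and i > 0:
            token = prev_output
        else:
            token = provided_token
        prev_output, state = toy_decoder_step(token, state)
        outputs.append(prev_output)
    return outputs


GO = 0
training_style = decode([GO, 4, 5, 6], initial_state=2, feed_previous=False)
decoding_style = decode([GO, 4, 5, 6], initial_state=2, feed_previous=True)
```

Both runs agree on the first output, since both start from the GO symbol, and then diverge: in the second run the remaining provided inputs are never read, so replacing them with any other values gives the same result.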
Functions in `seq2seq.py` support both modes using the `feed_previous` argument. For example, let's analyze the following use of an embedding RNN model.

```python
outputs, states = embedding_rnn_seq2seq(
    encoder_inputs, decoder_inputs, cell,
    num_encoder_symbols, num_decoder_symbols,
    output_projection=None, feed_previous=False)
```

In the `embedding_rnn_seq2seq` model, all inputs (both `encoder_inputs` and `decoder_inputs`) are integer-tensors that represent discrete values. They will be embedded into a dense representation (see the [Vectors Representations Tutorial](../../tutorials/word2vec/index.md) for more details on embeddings), but to construct these embeddings we need to specify the maximum number of discrete symbols that will appear: `num_encoder_symbols` on the encoder side, and `num_decoder_symbols` on the decoder side.

In the above invocation, we set `feed_previous` to False. This means that the decoder will use `decoder_inputs` tensors as provided. If we set `feed_previous` to True, the decoder would only use the first element of `decoder_inputs`. All other tensors from this list would be ignored, and instead the previous output of the decoder would be used. This is used for decoding translations in our translation model, but it can also be used during training, to make the model more robust to its own mistakes, similar to [Bengio et al., 2015](http://arxiv.org/pdf/1506.03099v2.pdf).

One more important argument used above is `output_projection`. If not specified, the outputs of the embedding model will be tensors of shape batch-size by `num_decoder_symbols` as they represent the logits for each generated symbol. When training models with large output vocabularies, i.e., when `num_decoder_symbols` is large, it is not practical to store these large tensors. Instead, it is better to return smaller output tensors, which will later be projected onto a large output tensor using `output_projection`.
This allows us to use our seq2seq models with a sampled softmax loss, as described in [Jean et al., 2015](http://arxiv.org/pdf/1412.2007v2.pdf).

In addition to `basic_rnn_seq2seq` and `embedding_rnn_seq2seq` there are a few more sequence-to-sequence models in `seq2seq.py`; take a look there. They all have similar interfaces, so we will not describe them in detail. We will use `embedding_attention_seq2seq` for our translation model below.

## Neural Translation Model

While the core of the sequence-to-sequence model is constructed by the functions in `models/rnn/seq2seq.py`, our translation model in `models/rnn/translate/seq2seq_model.py` uses a few additional tricks that are worth mentioning.

### Sampled softmax and output projection

For one, as already mentioned above, we want to use sampled softmax to handle a large output vocabulary. To decode from it, we need to keep track of the output projection. Both the sampled softmax loss and the output projections are constructed by the following code in `seq2seq_model.py`.

```python
  if num_samples > 0 and num_samples < self.target_vocab_size:
    w = tf.get_variable("proj_w", [size, self.target_vocab_size])
    w_t = tf.transpose(w)
    b = tf.get_variable("proj_b", [self.target_vocab_size])
    output_projection = (w, b)

    def sampled_loss(inputs, labels):
      labels = tf.reshape(labels, [-1, 1])
      return tf.nn.sampled_softmax_loss(w_t, b, inputs, labels, num_samples,
                                        self.target_vocab_size)
```

First, note that we only construct a sampled softmax if the number of samples (512 by default) is smaller than the target vocabulary size. For vocabularies smaller than 512 it might be a better idea to just use a standard softmax loss.

Then, as you can see, we construct an output projection. It is a pair, consisting of a weight matrix and a bias vector. If used, the rnn cell will return vectors of shape batch-size by `size`, rather than batch-size by `target_vocab_size`.
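
To make these shapes concrete, here is a small plain-Python sketch of what applying a `(w, b)` projection pair does to a batch of cell outputs. The helper name `project_to_logits` and the toy sizes are assumptions chosen for illustration; the tutorial code itself operates on TensorFlow tensors.

```python
def project_to_logits(cell_outputs, w, b):
    """Map [batch, size] cell outputs to [batch, vocab] logits.

    Computes cell_outputs @ w + b with nested lists, where w has shape
    [size, vocab] and b has shape [vocab]. Hypothetical helper.
    """
    logits = []
    for row in cell_outputs:
        logits.append([
            sum(x * w[k][j] for k, x in enumerate(row)) + b[j]
            for j in range(len(b))
        ])
    return logits


# Toy sizes (illustrative only): batch-size 2, size 3, target_vocab_size 4.
cell_outputs = [[1.0, 0.0, 2.0],
                [0.0, 1.0, 1.0]]
w = [[1.0, 0.0, 0.0, 1.0],
     [0.0, 1.0, 0.0, 1.0],
     [0.0, 0.0, 1.0, 1.0]]
b = [0.5, 0.5, 0.5, 0.5]

logits = project_to_logits(cell_outputs, w, b)
```

Each row of `logits` has one entry per target symbol, while the cell only had to produce `size` numbers per example. This is the saving that matters when the vocabulary is large.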
To recover logits, we need to multiply by the weight matrix and add the biases, as is done in lines 124-126 in `seq2seq_model.py`.

```python
if output_projection is not None:
  self.outputs[b] = [tf.matmul(output, output_projection[0]) +
                     output_projection[1] for ...]
```

### Bucketing and padding

In addition to sampled softmax, our translation model also makes use of *bucketing*, which is a method to efficiently handle sentences of different lengths. Let us first clarify the problem. When translating English to French, we will have English sentences of different lengths L1 on input, and French sentences of different lengths L2 on output. Since the English sentence is passed as `encoder_inputs`, and the French sentence comes as `decoder_inputs` (prefixed by a GO symbol), we should in principle create a seq2seq model for every pair (L1, L2+1) of lengths of an English and French sentence. This would result in an enormous graph consisting of many very similar subgraphs. On the other hand, we could just pad every sentence with a special PAD symbol. Then we'd need only one seq2seq model, for the padded lengths. But on shorter sentences our model would be inefficient, encoding and decoding many PAD symbols that are useless.

As a compromise between constructing a graph for every pair of lengths and padding to a single length, we use a number of *buckets* and pad each sentence to the length of the bucket above it. In `translate.py` we use the following default buckets.

```python
buckets = [(5, 10), (10, 15), (20, 25), (40, 50)]
```

This means that if the input is an English sentence with 3 tokens, and the corresponding output is a French sentence with 6 tokens, then they will be put in the first bucket and padded to length 5 for encoder inputs, and length 10 for decoder inputs. If we have an English sentence with 8 tokens and the corresponding French sentence has 18 tokens, then they will not fit into the (10, 15) bucket, and so the (20, 25) bucket will be used, i.e.
the English sentence will be padded to 20, and the French one to 25.

Remember that when constructing decoder inputs we prepend the special `GO` symbol to the input data. This is done in the `get_batch()` function in `seq2seq_model.py`, which also reverses the input English sentence. Reversing the inputs was shown to improve results for the neural translation model in [Sutskever et al., 2014](http://arxiv.org/abs/1409.3215). To put it all together, imagine we have the sentence "I go.", tokenized as `["I", "go", "."]` as input and the sentence "Je vais." as output, tokenized `["Je", "vais", "."]`. It will be put in the (5, 10) bucket, with encoder inputs representing `[PAD PAD "." "go" "I"]` and decoder inputs `[GO "Je" "vais" "." EOS PAD PAD PAD PAD PAD]`.

## Let's Run It

To train the model described above, we need a large English-French corpus. We will use the *10^9-French-English corpus* from the [WMT'15 Website](http://www.statmt.org/wmt15/translation-task.html) for training, and the 2013 news test from the same site as the development set. Both data-sets will be downloaded to `data_dir` and training will start, saving checkpoints in `train_dir`, when this command is run.

```
bazel run -c opt <...>/models/rnn/translate:translate
  --data_dir [your_data_directory] --train_dir [checkpoints_directory]
  --en_vocab_size=40000 --fr_vocab_size=40000
```

It takes about 18GB of disk space and several hours to prepare the training corpus. It is unpacked, vocabulary files are created in `data_dir`, and then the corpus is tokenized and converted to integer ids. Note the parameters that determine vocabulary sizes. In the example above, all words outside the 40K most common ones will be converted to an `UNK` token representing unknown words. So if you change vocabulary size, the binary will re-map the corpus to token-ids again.

After the data is prepared, training starts. Default parameters in `translate` are set to quite large values.
Large models trained over a long time give good results, but it might take too long or use too much memory for your GPU. You can request to train a smaller model as in the following example.

```
bazel run -c opt <...>/models/rnn/translate:translate
  --data_dir [your_data_directory] --train_dir [checkpoints_directory]
  --size=256 --num_layers=2 --steps_per_checkpoint=50
```

The above command will train a model with 2 layers (the default is 3), each layer with 256 units (default is 1024), and will save a checkpoint every 50 steps (the default is 200). You can play with these parameters to find out how large a model can be to fit into the memory of your GPU. During training, every `steps_per_checkpoint` steps the binary will print out statistics from recent steps. With the default parameters (3 layers of size 1024), the first messages look like this.

```
global step 200 learning rate 0.5000 step-time 1.39 perplexity 1720.62
  eval: bucket 0 perplexity 184.97
  eval: bucket 1 perplexity 248.81
  eval: bucket 2 perplexity 341.64
  eval: bucket 3 perplexity 469.04
global step 400 learning rate 0.5000 step-time 1.38 perplexity 379.89
  eval: bucket 0 perplexity 151.32
  eval: bucket 1 perplexity 190.36
  eval: bucket 2 perplexity 227.46
  eval: bucket 3 perplexity 238.66
```

The output shows that each step takes just under 1.4 seconds, along with the perplexity on the training set and the perplexities on the development set for each bucket. After about 30K steps, we see perplexities on short sentences (bucket 0 and 1) going into single digits. Since the training corpus contains ~22M sentences, one epoch (going through the training data once) takes about 340K steps with a batch size of 64. At this point the model can be used for translating English sentences to French using the `--decode` option.
```
bazel run -c opt <...>/models/rnn/translate:translate --decode
  --data_dir [your_data_directory] --train_dir [checkpoints_directory]

Reading model parameters from /tmp/translate.ckpt-340000
>  Who is the president of the United States?
 Qui est le président des États-Unis ?
```

## What Next?

The example above shows how you can build your own English-to-French translator, end-to-end. Run it and see how the model performs for yourself. While it has reasonable quality, the default parameters will not give you the best translation model. Here are a few things you can improve.

First of all, we use a very primitive tokenizer, the `basic_tokenizer` function in `data_utils`. A better tokenizer can be found on the [WMT'15 Website](http://www.statmt.org/wmt15/translation-task.html). Using that tokenizer, and a larger vocabulary, should improve your translations.

Also, the default parameters of the translation model are not tuned. You can try changing the learning rate, decay, or initializing the weights of your model in a different way. You can also change the default `GradientDescentOptimizer` in `seq2seq_model.py` to a more advanced one, such as `AdagradOptimizer`. Try these things and see how they improve your results!

Finally, the model presented above can be used for any sequence-to-sequence task, not only for translation. Even if you want to transform a sequence to a tree, for example to generate a parse tree, the same model as above can give state-of-the-art results, as demonstrated in [Vinyals & Kaiser et al., 2015](http://arxiv.org/abs/1412.7449). So you can not only build your own translator, you can also build a parser, a chat-bot, or any program that comes to your mind. Experiment!
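
The bucketing and padding scheme from the "Bucketing and padding" section can be sketched in plain Python as follows. This is a simplified illustration, not the code in `translate.py` or `seq2seq_model.py`: the helper names `pick_bucket` and `make_inputs` are hypothetical, and the fit rule used here (the decoder side must have room for `GO` and `EOS` around the target tokens) is an assumption that may differ in detail from the real implementation.

```python
PAD, GO, EOS = "PAD", "GO", "EOS"
BUCKETS = [(5, 10), (10, 15), (20, 25), (40, 50)]


def pick_bucket(source, target, buckets=BUCKETS):
    """Return the first (encoder_size, decoder_size) bucket the pair fits.

    Assumed fit rule: the source must fit the encoder size, and the target
    plus GO and EOS must fit the decoder size. Returns None if nothing fits.
    """
    for encoder_size, decoder_size in buckets:
        if len(source) <= encoder_size and len(target) + 2 <= decoder_size:
            return encoder_size, decoder_size
    return None


def make_inputs(source, target, buckets=BUCKETS):
    """Build padded encoder and decoder inputs for one sentence pair."""
    bucket = pick_bucket(source, target, buckets)
    if bucket is None:
        raise ValueError("sentence pair does not fit any bucket")
    encoder_size, decoder_size = bucket
    # Pad the source to the bucket length, then reverse it so that the
    # PAD symbols come first and the first source word comes last.
    padded_source = source + [PAD] * (encoder_size - len(source))
    encoder_inputs = list(reversed(padded_source))
    # Prepend GO, append EOS, then pad the decoder side.
    decoder_body = [GO] + target + [EOS]
    decoder_inputs = decoder_body + [PAD] * (decoder_size - len(decoder_body))
    return encoder_inputs, decoder_inputs


encoder_inputs, decoder_inputs = make_inputs(["I", "go", "."],
                                             ["Je", "vais", "."])
```

For the pair `["I", "go", "."]` and `["Je", "vais", "."]` this reproduces the encoder and decoder inputs given in that section's example, and a pair with 8 source tokens and 18 target tokens lands in the (20, 25) bucket.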
================================================
FILE: SOURCE/tutorials/seq2seq.md
================================================
# Sequence-to-Sequence Models

Recurrent neural networks can learn to model language, as already discussed in the [RNN Tutorial](tensorflow-zh/SOURCE/tutorials/recurrent/index.md) (if you did not read it, please go through it before proceeding with this one). This raises an interesting question: could we condition the generated words on some input and generate a meaningful response? For example, could we train a neural network to translate from English to French? It turns out that the answer is *yes*.

This tutorial will show you how to build and train such a system end-to-end. You can start by running this binary.

```
bazel run -c opt <...>/models/rnn/translate/translate.py
  --data_dir [your_data_directory]
```

It will download English-to-French translation data from the [WMT'15 Website](http://www.statmt.org/wmt15/translation-task.html), prepare it for training, and train. It takes about 20GB of disk space, and a while to download and prepare (see [later](#run_it) for details), so you can start and leave it running while reading this tutorial.

This tutorial references the following files from `models/rnn`.

File | What's in it?
--- | ---
`seq2seq.py` | Library for building sequence-to-sequence models.
`translate/seq2seq_model.py` | Neural translation sequence-to-sequence model.
`translate/data_utils.py` | Helper functions for preparing translation data.
`translate/translate.py` | Binary that trains and runs the translation model.

## Sequence-to-Sequence Basics

A basic sequence-to-sequence model, as introduced in [Cho et al., 2014](http://arxiv.org/pdf/1406.1078v3.pdf), consists of two recurrent neural networks (RNNs): an *encoder* that processes the input and a *decoder* that generates the output. This basic architecture is depicted below.
Each box in the picture above represents a cell of the RNN, most commonly a GRU cell or an LSTM cell (see the [RNN Tutorial](tensorflow-zh/SOURCE/tutorials/recurrent/index.md) for an explanation of those). Encoder and decoder can share weights or, as is more common, use a different set of parameters. Multi-layer cells have been successfully used in sequence-to-sequence models too, e.g. for translation [Sutskever et al., 2014](http://arxiv.org/abs/1409.3215).

In the basic model depicted above, every input has to be encoded into a fixed-size state vector, as that is the only thing passed to the decoder. To allow the decoder more direct access to the input, an *attention* mechanism was introduced in [Bahdanau et al., 2014](http://arxiv.org/abs/1409.0473). We will not go into the details of the attention mechanism (see the paper); suffice it to say that it allows the decoder to peek into the input at every decoding step. A multi-layer sequence-to-sequence network with LSTM cells and attention mechanism in the decoder looks like this.

## TensorFlow seq2seq Library

As you can see above, there are many different sequence-to-sequence models. Each of these models can use different RNN cells, but all of them accept encoder inputs and decoder inputs. This motivates the interfaces in the TensorFlow seq2seq library (`models/rnn/seq2seq.py`). The basic RNN encoder-decoder sequence-to-sequence model works as follows.

```python
outputs, states = basic_rnn_seq2seq(encoder_inputs, decoder_inputs, cell)
```

In the above call, `encoder_inputs` is a list of tensors representing inputs to the encoder, i.e., corresponding to the letters *A, B, C* in the first picture above. Similarly, `decoder_inputs` are tensors representing inputs to the decoder, *GO, W, X, Y, Z* in the first picture.

The `cell` argument is an instance of the `models.rnn.rnn_cell.RNNCell` class that determines which cell will be used inside the model. You can use an existing cell, such as `GRUCell` or `LSTMCell`, or you can write your own. Moreover, `rnn_cell` provides wrappers to construct multi-layer cells, add dropout to cell inputs or outputs, or to do other transformations; see the [RNN Tutorial](tensorflow-zh/SOURCE/tutorials/recurrent/index.md) for examples.

The call to `basic_rnn_seq2seq` returns two values: `outputs` and `states`. Both of them are lists of tensors of the same length as `decoder_inputs`. Naturally, `outputs` correspond to the outputs of the decoder in each time-step; in the first picture above that would be *W, X, Y, Z, EOS*. The returned `states` represent the internal state of the decoder at every time-step.

In many applications of sequence-to-sequence models, the output of the decoder at time t is fed back and becomes the input of the decoder at time t+1. At test time, when decoding a sequence, this is how the sequence is constructed. During training, on the other hand, it is common to provide the correct input to the decoder at every time-step, even if the decoder made a mistake before.
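
The two decoding modes can be sketched in plain Python. The loop below is a toy, not the library code: `run_decoder`, `toy_step`, and the integer "symbols" are hypothetical names made up for illustration, and the flag is called `feed_previous` only to mirror the library argument of the same name.

```python
def run_decoder(step_fn, decoder_inputs, feed_previous):
    """Toy decoder loop.

    step_fn(state, symbol) -> (output_symbol, new_state) stands in for one
    RNN cell step. decoder_inputs[0] plays the role of the GO symbol.
    """
    state = 0
    outputs = []
    prev_output = None
    for i, given in enumerate(decoder_inputs):
        if feed_previous and i > 0:
            symbol = prev_output   # decoding: feed back our own last output
        else:
            symbol = given         # training: use the provided (correct) input
        prev_output, state = step_fn(state, symbol)
        outputs.append(prev_output)
    return outputs


def toy_step(state, symbol):
    # A stand-in "cell" that outputs symbol + 1 and leaves the state unchanged.
    return symbol + 1, state


print(run_decoder(toy_step, [0, 10, 20], feed_previous=False))
print(run_decoder(toy_step, [0, 10, 20], feed_previous=True))
```

With `feed_previous=False` every element of `decoder_inputs` is consumed; with `feed_previous=True` only the first element is read and the rest are ignored.
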
Functions in `seq2seq.py` support both modes using the `feed_previous` argument. For example, let's analyze the following use of an embedding RNN model.

```python
outputs, states = embedding_rnn_seq2seq(
    encoder_inputs, decoder_inputs, cell,
    num_encoder_symbols, num_decoder_symbols,
    output_projection=None, feed_previous=False)
```

In the `embedding_rnn_seq2seq` model, all inputs (both `encoder_inputs` and `decoder_inputs`) are integer-tensors that represent discrete values. They will be embedded into a dense representation (see the [Vector Representations Tutorial](tensorflow-zh/SOURCE/tutorials/word2vec/index.md) for more details on embeddings), but to construct these embeddings we need to specify the maximum number of discrete symbols that will appear: `num_encoder_symbols` on the encoder side, and `num_decoder_symbols` on the decoder side.

In the above invocation, we set `feed_previous` to False. This means that the decoder will use `decoder_inputs` tensors as provided. If we set `feed_previous` to True, the decoder would only use the first element of `decoder_inputs`. All other tensors from this list would be ignored, and instead the previous output of the decoder would be used. This is used for decoding translations in our translation model, but it can also be used during training, to make the model more robust to its own mistakes, similar to [Bengio et al., 2015](http://arxiv.org/pdf/1506.03099v2.pdf).

One more important argument used above is `output_projection`. If not specified, the outputs of the embedding model will be tensors of shape batch-size by `num_decoder_symbols`, as they represent the logits for each generated symbol. When training models with large output vocabularies, i.e., when `num_decoder_symbols` is large, it is not practical to store these large tensors. Instead, it is better to return smaller output tensors, which will later be projected onto a large output tensor using `output_projection`.
This allows us to use our seq2seq models with a sampled softmax loss, as described in [Jean et al., 2015](http://arxiv.org/pdf/1412.2007v2.pdf).

In addition to `basic_rnn_seq2seq` and `embedding_rnn_seq2seq` there are a few more sequence-to-sequence models in `seq2seq.py`; take a look there. They all have similar interfaces, so we will not describe them in detail. We will use `embedding_attention_seq2seq` for our translation model below.

## Neural Translation Model

While the core of the sequence-to-sequence model is constructed by the functions in `models/rnn/seq2seq.py`, there are still a few tricks worth mentioning that are used in our translation model in `models/rnn/translate/seq2seq_model.py`.

### Sampled softmax and output projection

For one, as already mentioned above, we want to use sampled softmax to handle a large output vocabulary. To decode from it, we need to keep track of the output projection. Both the sampled softmax loss and the output projections are constructed by the following code in `seq2seq_model.py`.

```python
  if num_samples > 0 and num_samples < self.target_vocab_size:
    w = tf.get_variable("proj_w", [size, self.target_vocab_size])
    w_t = tf.transpose(w)
    b = tf.get_variable("proj_b", [self.target_vocab_size])
    output_projection = (w, b)

    def sampled_loss(inputs, labels):
      labels = tf.reshape(labels, [-1, 1])
      return tf.nn.sampled_softmax_loss(w_t, b, inputs, labels, num_samples,
                                        self.target_vocab_size)
```

First, note that we only construct a sampled softmax if the number of samples (512 by default) is smaller than the target vocabulary size. For vocabularies smaller than 512 it might be a better idea to just use a standard softmax loss.

Then, as you can see, we construct an output projection. It is a pair, consisting of a weight matrix and a bias vector. If used, the RNN cell will return vectors of shape batch-size by `size`, rather than batch-size by `target_vocab_size`.
To recover logits, we need to multiply by the weight matrix and add the biases, as is done in lines 124-126 in `seq2seq_model.py`.

```python
if output_projection is not None:
  self.outputs[b] = [tf.matmul(output, output_projection[0]) +
                     output_projection[1] for ...]
```

### Bucketing and padding

In addition to sampled softmax, our translation model also makes use of *bucketing*, which is a method to efficiently handle sentences of different lengths. Let us first clarify the problem. When translating English to French, we will have English sentences of different lengths L1 on input, and French sentences of different lengths L2 on output. Since the English sentence is passed as `encoder_inputs`, and the French sentence comes as `decoder_inputs` (prefixed by a GO symbol), we should in principle create a seq2seq model for every pair (L1, L2+1) of lengths of an English and French sentence. This would result in an enormous graph consisting of many very similar subgraphs. On the other hand, we could just pad every sentence with a special PAD symbol. Then we'd need only one seq2seq model, for the padded lengths. But on shorter sentences our model would be inefficient, encoding and decoding many PAD symbols that are useless.

As a compromise between constructing a graph for every pair of lengths and padding to a single length, we use a number of *buckets* and pad each sentence to the length of the bucket above it. In `translate.py` we use the following default buckets.

```python
buckets = [(5, 10), (10, 15), (20, 25), (40, 50)]
```

This means that if the input is an English sentence with 3 tokens, and the corresponding output is a French sentence with 6 tokens, then they will be put in the first bucket and padded to length 5 for encoder inputs, and length 10 for decoder inputs. If we have an English sentence with 8 tokens and the corresponding French sentence has 18 tokens, then they will not fit into the (10, 15) bucket, and so the (20, 25) bucket will be used, i.e.
the English sentence will be padded to 20, and the French one to 25.

Remember that when constructing decoder inputs we prepend the special `GO` symbol to the input data. This is done in the `get_batch()` function in `seq2seq_model.py`, which also reverses the input English sentence. Reversing the inputs was shown to improve results for the neural translation model in [Sutskever et al., 2014](http://arxiv.org/abs/1409.3215). To put it all together, imagine we have the sentence "I go.", tokenized as `["I", "go", "."]` as input and the sentence "Je vais." as output, tokenized as `["Je", "vais", "."]`. It will be put in the (5, 10) bucket, with encoder inputs representing `[PAD PAD "." "go" "I"]` and decoder inputs `[GO "Je" "vais" "." EOS PAD PAD PAD PAD PAD]`.

## Let's Run It

To train the model described above, we need a large English-French corpus. We will use the *10^9-French-English corpus* from the [WMT'15 Website](http://www.statmt.org/wmt15/translation-task.html) for training, and the 2013 news test from the same site as the development set. Both data-sets will be downloaded to `data_dir` and training will start, saving checkpoints in `train_dir`, when this command is run.

```
bazel run -c opt <...>/models/rnn/translate:translate
  --data_dir [your_data_directory] --train_dir [checkpoints_directory]
  --en_vocab_size=40000 --fr_vocab_size=40000
```

It takes about 18GB of disk space and several hours to prepare the training corpus. It is unpacked, vocabulary files are created in `data_dir`, and then the corpus is tokenized and converted to integer ids. Note the parameters that determine vocabulary sizes. In the example above, all words outside the 40K most common ones will be converted to an `UNK` token representing unknown words. So if you change vocabulary size, the binary will re-map the corpus to token-ids again.

After the data is prepared, training starts. Default parameters in `translate` are set to quite large values.
Large models trained over a long time give good results, but it might take too long or use too much memory for your GPU. You can request to train a smaller model as in the following example.

```
bazel run -c opt <...>/models/rnn/translate:translate
  --data_dir [your_data_directory] --train_dir [checkpoints_directory]
  --size=256 --num_layers=2 --steps_per_checkpoint=50
```

The above command will train a model with 2 layers (the default is 3), each layer with 256 units (default is 1024), and will save a checkpoint every 50 steps (the default is 200). You can play with these parameters to find out how large a model can be to fit into the memory of your GPU.

During training, every `steps_per_checkpoint` steps the binary will print out statistics from recent steps. With the default parameters (3 layers of size 1024), the first messages look like this.

```
global step 200 learning rate 0.5000 step-time 1.39 perplexity 1720.62
  eval: bucket 0 perplexity 184.97
  eval: bucket 1 perplexity 248.81
  eval: bucket 2 perplexity 341.64
  eval: bucket 3 perplexity 469.04
global step 400 learning rate 0.5000 step-time 1.38 perplexity 379.89
  eval: bucket 0 perplexity 151.32
  eval: bucket 1 perplexity 190.36
  eval: bucket 2 perplexity 227.46
  eval: bucket 3 perplexity 238.66
```

Each message shows the time per step (just under 1.4 seconds here), the perplexity on the training set, and the perplexities on the development set for each bucket. After about 30K steps, we see perplexities on short sentences (bucket 0 and 1) going into single digits. Since the training corpus contains ~22M sentences, one epoch (going through the training data once) takes about 340K steps with batch-size of 64. At this point the model can be used for translating English sentences to French using the `--decode` option.
```
bazel run -c opt <...>/models/rnn/translate:translate --decode
  --data_dir [your_data_directory] --train_dir [checkpoints_directory]

Reading model parameters from /tmp/translate.ckpt-340000
>  Who is the president of the United States?
 Qui est le président des États-Unis ?
```

## What Next?

The example above shows how you can build your own English-to-French translator, end-to-end. Run it and see how the model performs for yourself. While it has reasonable quality, the default parameters will not give you the best translation model. Here are a few things you can improve.

First of all, we use a very primitive tokenizer, the `basic_tokenizer` function in `data_utils`. A better tokenizer can be found on the [WMT'15 Website](http://www.statmt.org/wmt15/translation-task.html). Using that tokenizer, and a larger vocabulary, should improve your translations.

Also, the default parameters of the translation model are not tuned. You can try changing the learning rate, decay, or initializing the weights of your model in a different way. You can also change the default `GradientDescentOptimizer` in `seq2seq_model.py` to a more advanced one, such as `AdagradOptimizer`. Try these things and see how they improve your results!

Finally, the model presented above can be used for any sequence-to-sequence task, not only for translation. Even if you want to transform a sequence to a tree, for example to generate a parsing tree, the same model as above can give state-of-the-art results, as demonstrated in [Vinyals & Kaiser et al., 2015](http://arxiv.org/abs/1412.7449). So you can not only build your own translator, you can also build a parser, a chat-bot, or any program that comes to your mind. Experiment!
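
If you want to experiment with bucketing in isolation, the scheme from the "Bucketing and padding" section can be sketched in plain Python. This is a minimal illustration, not the code in `translate.py` or `seq2seq_model.py`: the helper names `pick_bucket` and `make_inputs` are hypothetical, and the fit rule (source length at most the encoder size, target length plus `GO` and `EOS` at most the decoder size) is an assumption chosen to reproduce the examples given in that section.

```python
PAD, GO, EOS = "PAD", "GO", "EOS"
BUCKETS = [(5, 10), (10, 15), (20, 25), (40, 50)]


def pick_bucket(source, target, buckets=BUCKETS):
    """Return the first (encoder_size, decoder_size) bucket the pair fits in."""
    for enc_size, dec_size in buckets:
        # Assumed rule: the decoder side also needs room for GO and EOS.
        if len(source) <= enc_size and len(target) + 2 <= dec_size:
            return enc_size, dec_size
    return None


def make_inputs(source, target, buckets=BUCKETS):
    """Pad a tokenized sentence pair to its bucket and reverse the source."""
    bucket = pick_bucket(source, target, buckets)
    if bucket is None:
        raise ValueError("sentence pair does not fit any bucket")
    enc_size, dec_size = bucket
    # Padding goes in front because the source sentence is fed reversed.
    encoder_inputs = [PAD] * (enc_size - len(source)) + list(reversed(source))
    decoder_inputs = [GO] + list(target) + [EOS]
    decoder_inputs += [PAD] * (dec_size - len(decoder_inputs))
    return encoder_inputs, decoder_inputs


enc, dec = make_inputs(["I", "go", "."], ["Je", "vais", "."])
print(enc)
print(dec)
```

Changing `BUCKETS` and checking how many `PAD` symbols end up in a typical batch is a quick way to judge whether a different set of bucket sizes would waste less computation on your data.
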
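================================================
FILE: SOURCE/tutorials/word2vec/__init__.py
================================================

================================================
FILE: SOURCE/tutorials/word2vec/index.md
================================================
# Vector Representations of Words

In this tutorial we look at the word2vec model by [Mikolov et al](http://papers.nips.cc/paper/5021-distributed-representations-of-words-and-phrases-and-their-compositionality.pdf). This model is used for learning vector representations of words, called "word embeddings".

## Highlights

This tutorial is meant to highlight the interesting, substantive parts of building a word2vec model in TensorFlow.

* We start with the motivation for representing words as vectors.
* We look at the intuition behind the model and how it is trained (with a little math for good measure).
* We also show a simple implementation of the model in TensorFlow.
* Finally, we look at ways to make this simple version perform better.

We walk through the code step by step later in the tutorial, but if you would rather dive straight in, you can look at the minimal implementation in [tensorflow/examples/tutorials/word2vec/word2vec_basic.py](https://www.tensorflow.org/code/tensorflow/examples/tutorials/word2vec/word2vec_basic.py). This basic example contains the code needed to download some data, train on it briefly, and show the result. Once you are comfortable with the basic version, you can move on to [tensorflow/models/embedding/word2vec.py](https://www.tensorflow.org/code/tensorflow/models/embedding/word2vec.py), a more serious implementation that also showcases some more advanced TensorFlow features, such as how to use threads to move data into a text model efficiently and how to checkpoint during training.

But first, let's look at why we would want to learn word embeddings at all. If you are already an expert on word embeddings, feel free to skip this section and get your hands dirty with the details.

## Motivation: Why Learn Word Embeddings?

Image and audio processing systems work with rich, high-dimensional datasets encoded as vectors of the individual raw pixel intensities for image data, or power spectral density coefficients for audio data. For tasks like object or speech recognition, all the information required to perform the task is already encoded in the raw data (after all, humans perform these tasks from the raw data). Natural language processing systems, however, traditionally treat words as discrete atomic symbols, so "cat" may be represented as `Id537` and "dog" as `Id143`. These encodings are arbitrary and provide no information about the relationships that may exist between individual words. In other words, when processing data about "dogs", the model cannot leverage what it has learned about "cats" (for example, that both are animals, have four legs, and can be pets). Representing words as unique, discrete ids furthermore leads to data sparsity, which usually means that we need more data to train statistical models successfully. Vector representations of words can overcome some of these obstacles.
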
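[Vector space models](https://en.wikipedia.org/wiki/Vector_space_model) (VSMs) represent (embed) words in a continuous vector space where semantically similar words are mapped to nearby points. VSMs have a long, rich history in NLP, but almost all methods that use them depend on the [Distributional Hypothesis](https://en.wikipedia.org/wiki/Distributional_semantics#Distributional_Hypothesis), which states that words that appear in the same contexts share semantic meaning. The approaches that leverage this principle fall into two categories: *count-based methods* (e.g. [Latent Semantic Analysis](https://en.wikipedia.org/wiki/Latent_semantic_analysis)), and *predictive methods* (e.g. [neural probabilistic language models](http://www.scholarpedia.org/article/Neural_net_language_models)).

The distinction is elaborated in much more detail by [Baroni et al.](http://clic.cimec.unitn.it/marco/publications/acl2014/baroni-etal-countpredict-acl2014.pdf), but in a nutshell: count-based methods compute statistics of how often a word co-occurs with its neighbor words in a large text corpus, and then map these statistics down to a small, dense vector for each word. Predictive methods instead try to predict a word directly from its neighbors, in terms of learned small, dense *embedding vectors*.

Word2vec is a computationally efficient predictive model for learning word embeddings. It comes in two flavors, the Continuous Bag-of-Words model (CBOW) and the Skip-Gram model. Algorithmically, these models are very similar, except that CBOW predicts target words (e.g. 'mat') from source context words ('the cat sits on the'), while Skip-Gram does the inverse and predicts source context words from the target words. The motivation for this inversion is that CBOW smooths over a lot of the distributional information (by treating an entire context as one observation). For the most part, this turns out to be useful for smaller datasets. Skip-Gram, by contrast, treats each context-target pair as a new observation, and this tends to do better on larger datasets. We will focus on the Skip-Gram model in the rest of this tutorial.

## Scaling up with Noise-Contrastive Training

Neural probabilistic language models are traditionally trained using the [maximum likelihood](https://en.wikipedia.org/wiki/Maximum_likelihood) (ML) principle to maximize the probability of the next word \\(w_t\\) (for "target") given the previous words \\(h\\) (for "history") in terms of a [*softmax* function](https://en.wikipedia.org/wiki/Softmax_function),

$$
\begin{align}
P(w_t | h) &= \text{softmax}(\text{score}(w_t, h)) \\
           &= \frac{\exp \{ \text{score}(w_t, h) \} }
             {\sum_\text{Word w' in Vocab} \exp \{ \text{score}(w', h) \} }.
\end{align}
$$

where \\(\text{score}(w\_t, h)\\) computes the compatibility of word \\(w\_t\\) with the context \\(h\\) (a dot product is commonly used). We train this model by maximizing its log-likelihood on the training set, i.e. by maximizing

$$
\begin{align}
 J_\text{ML} &= \log P(w_t | h) \\
  &= \text{score}(w_t, h) -
     \log \left( \sum_\text{Word w' in Vocab} \exp \{ \text{score}(w', h) \} \right)
\end{align}
$$

This yields a properly normalized probabilistic model for language modeling. However, it is very expensive in practice, because we need to compute and normalize the scores of all other \\(V\\) words \\(w'\\) in the current context \\(h\\), *at every training step*.
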
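
To make the cost of the normalization concrete, here is a minimal sketch of the log-likelihood \\(J_\text{ML}\\) for one training example in plain Python. The function name `log_likelihood` and the list-of-scores representation are assumptions made for illustration; in a real model the scores would come from dot products with learned vectors.

```python
import math


def log_likelihood(scores, target):
    """J_ML = score(w_t, h) - log(sum over w' of exp(score(w', h))).

    scores[i] is score(w_i, h) for every word w_i in the vocabulary and
    target is the index of w_t. The sum touches every vocabulary entry,
    so the work per training step grows linearly with the vocabulary size.
    """
    m = max(scores)  # subtract the max before exponentiating, for stability
    log_norm = m + math.log(sum(math.exp(s - m) for s in scores))
    return scores[target] - log_norm


# With identical scores, each of the 4 words gets probability 1/4.
print(log_likelihood([0.0, 0.0, 0.0, 0.0], target=2))
```
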
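On the other hand, for feature learning in word2vec we do not need a full probabilistic model. The CBOW and Skip-Gram models are instead trained with a binary classification objective (logistic regression) to discriminate the real target words \\(w_t\\) from \\(k\\) imaginary (noise) words \\(\tilde w\\) in the same context. We illustrate this below for a CBOW model. For Skip-Gram the direction is simply inverted.
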
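
The binary classification idea can be sketched for a single example as follows. This is an illustration under stated assumptions, not TensorFlow's implementation: the classifier is taken to be a logistic function of the dot product between a word vector and a context vector, and `neg_sampling_objective`, `log_sigmoid`, and `dot` are hypothetical helper names.

```python
import math


def log_sigmoid(x):
    """Numerically stable log(1 / (1 + exp(-x)))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def neg_sampling_objective(context_vec, target_vec, noise_vecs):
    """log Q(D=1 | w_t, h) plus the sum of log Q(D=0 | noise word, h).

    Only the k sampled noise vectors are touched, not the whole vocabulary.
    """
    objective = log_sigmoid(dot(target_vec, context_vec))
    for noise_vec in noise_vecs:
        # log(1 - sigmoid(x)) equals log(sigmoid(-x)).
        objective += log_sigmoid(-dot(noise_vec, context_vec))
    return objective


# The objective is higher when the real word scores high and the noise low.
good = neg_sampling_objective([1.0, 0.0], [5.0, 0.0], [[-5.0, 0.0]])
bad = neg_sampling_objective([1.0, 0.0], [-5.0, 0.0], [[5.0, 0.0]])
print(good, bad)
```
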
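Mathematically, the objective (for each example) is to maximize

$$J_\text{NEG} = \log Q_\theta(D=1 |w_t, h) +
  k \mathop{\mathbb{E}}_{\tilde w \sim P_\text{noise}}
     \left[ \log Q_\theta(D = 0 |\tilde w, h) \right]$$

where \\(Q_\theta(D=1 | w, h)\\) is the binary logistic regression probability, under the model, of seeing the word \\(w\\) in the context \\(h\\) in the dataset \\(D\\), calculated in terms of the learned embedding vectors \\(\theta\\). In practice we approximate the expectation by drawing \\(k\\) contrastive words from the noise distribution (i.e. we compute a [Monte Carlo average](https://en.wikipedia.org/wiki/Monte_Carlo_integration)).

This objective is maximized when the model assigns high probabilities to the real target words and low probabilities to the noise words. Technically, this is called [Negative Sampling](http://papers.nips.cc/paper/5021-distributed-representations-of-words-and-phrases-and-their-compositionality.pdf), and there is good mathematical motivation for using this loss function: the updates it proposes approximate the updates of the softmax function. Computationally it is especially appealing, because computing the loss now scales only with the number of *noise words* that we select (\\(k\\)), and not with all words in the vocabulary (\\(V\\)). This makes training much faster. We will actually use the very similar [noise-contrastive estimation (NCE)](http://papers.nips.cc/paper/5165-learning-word-embeddings-efficiently-with-noise-contrastive-estimation.pdf) loss, for which TensorFlow has a handy helper function `tf.nn.nce_loss()`.

Let's get an intuitive feel for how this works in practice!

## The Skip-gram Model

As an example, consider the dataset

`the quick brown fox jumped over the lazy dog`

We first form a dataset of words and the contexts in which they appear. We could define 'context' in any way that makes sense, and in fact people have looked at syntactic contexts (i.e. the syntactic dependents of the current target word, see e.g. [Levy et al.](https://levyomer.files.wordpress.com/2014/04/dependency-based-word-embeddings-acl-2014.pdf)), words to the left of the target, words to the right of the target, and so on. For now, let's define 'context' as the window of words to the left and to the right of a target word. Using a window size of 1, we then have the dataset

`([the, brown], quick), ([quick, fox], brown), ([brown, jumped], fox), ...`

of `(context, target)` pairs. Recall that Skip-Gram inverts contexts and targets, and tries to predict each context word from its target word, so the task becomes to predict 'the' and 'brown' from 'quick', 'quick' and 'fox' from 'brown', and so on. Therefore our dataset becomes

`(quick, the), (quick, brown), (brown, quick), (brown, fox), ...`

of `(input, output)` pairs. The objective function is defined over the entire dataset, but we typically optimize it with [stochastic gradient descent](https://en.wikipedia.org/wiki/Stochastic_gradient_descent) (SGD) using one example at a time (or a 'minibatch' of `batch_size` examples, where typically `16 <= batch_size <= 512`). So let's look at one step of this process.

Suppose that at training step \\(t\\) we observe the first training case above, where the goal is to predict `the` from `quick`. We select `num_noise` noisy (contrastive) examples by drawing from some noise distribution, typically the unigram distribution, \\(P(w)\\). For simplicity let's say `num_noise=1` and we select `sheep` as a noisy example. Next we compute the loss for this pair of observed and noisy examples, i.e. the objective at time step \\(t\\) becomes

$$J^{(t)}_\text{NEG} = \log Q_\theta(D=1 | \text{the, quick}) +
  \log(Q_\theta(D=0 | \text{sheep, quick}))$$

The goal is to update the embedding parameters \\(\theta\\) to improve (in this case, maximize) this objective function. We do this by deriving the gradient of the loss with respect to the embedding parameters \\(\theta\\), i.e. \\(\frac{\partial}{\partial \theta} J_\text{NEG}\\) (luckily TensorFlow provides easy helper functions for doing this!). When the parameters are updated repeatedly over the entire training set, this has the effect of 'moving' the embedding vector of each word around until the model is successful at discriminating real words from noise words.

We can visualize the learned vectors by projecting them down to 2 dimensions, using for instance the [t-SNE dimensionality reduction technique](http://lvdmaaten.github.io/tsne/). When we inspect these visualizations it becomes apparent that the vectors capture general, and in fact quite useful, semantic information about words and their relationships to one another. It was very interesting when we first discovered that certain directions in the induced vector space specialize towards certain semantic relationships, e.g. *male-female*, *gender* and even *country-capital* relationships between words, as illustrated in the figure below (see also for example [Mikolov et al., 2013](http://www.aclweb.org/anthology/N13-1090)).
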
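This explains why these vectors are also useful as features for many canonical NLP prediction tasks, such as part-of-speech tagging or named entity recognition (see for example the work by [Collobert et al.](http://arxiv.org/pdf/1103.0398v1.pdf) or [Turian et al.](http://www.aclweb.org/anthology/P10-1040)).

But for now, let's just use them to draw pretty pictures!

## Building the Graph

This is all about embeddings, so let's define our embedding matrix. This is just a big matrix that we initialize with uniformly distributed random values.

```python
embeddings = tf.Variable(
    tf.random_uniform([vocabulary_size, embedding_size], -1.0, 1.0))
```

The noise-contrastive estimation loss is defined in terms of a logistic regression model. For this, we need to define the weights and biases for each word in the vocabulary (also called the `output weights`, as opposed to the `input embeddings`). So let's define them.

```python
nce_weights = tf.Variable(
  tf.truncated_normal([vocabulary_size, embedding_size],
                      stddev=1.0 / math.sqrt(embedding_size)))
nce_biases = tf.Variable(tf.zeros([vocabulary_size]))
```

Now that we have the parameters in place, we can define our Skip-Gram model. For simplicity, let's suppose we have already integerized our text corpus with a vocabulary so that each word is represented as an integer (see [tensorflow/g3doc/tutorials/word2vec/word2vec_basic.py](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/g3doc/tutorials/word2vec/word2vec_basic.py) for the details). The Skip-Gram model takes two inputs. One is a batch of integers representing the source context words, the other is for the target words. Let's create placeholder nodes for these inputs, so that we can feed in data later.

```python
# Placeholders for inputs
train_inputs = tf.placeholder(tf.int32, shape=[batch_size])
train_labels = tf.placeholder(tf.int32, shape=[batch_size, 1])
```

Next we need to look up the embedding vector for each of the source words in the batch. TensorFlow has a handy helper for this.

```python
embed = tf.nn.embedding_lookup(embeddings, train_inputs)
```

Now that we have the embedding for each word, we would like to predict the target word using the noise-contrastive training objective.

```python
# Compute the NCE loss, using a sample of the negative labels each time.
loss = tf.reduce_mean(
  tf.nn.nce_loss(nce_weights, nce_biases, embed, train_labels,
                 num_sampled, vocabulary_size))
```

Now that we have a loss node, we need to add the nodes required to compute gradients and update the parameters. Here we use stochastic gradient descent, and TensorFlow has a handy helper for this as well.

```python
# We use the SGD optimizer.
optimizer = tf.train.GradientDescentOptimizer(learning_rate=1.0).minimize(loss)
```

## Training the Model

Training the model is then as simple as using a `feed_dict` to push data into the placeholders and calling [`session.run`](../../api_docs/python/client.md#Session.run) with this new data in a loop.

```python
for inputs, labels in generate_batch(...):
  feed_dict = {train_inputs: inputs, train_labels: labels}
  _, cur_loss = session.run([optimizer, loss], feed_dict=feed_dict)
```

See the full example code in [tensorflow/g3doc/tutorials/word2vec/word2vec_basic.py](./word2vec_basic.py).
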
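
To make the data fed into the training loop concrete, the `(input, output)` pairs from the Skip-gram section can be generated with a few lines of plain Python. This is an illustrative sketch: `skipgram_pairs` is a hypothetical helper, not the tutorial's `generate_batch`, and it only uses positions that have a full window on both sides so that its output starts with the same pairs listed in that section.

```python
def skipgram_pairs(words, window=1):
    """Return (input, output) pairs: each center word paired with each
    word inside its window, left neighbors first."""
    pairs = []
    # Skip positions that do not have `window` words on both sides.
    for i in range(window, len(words) - window):
        for j in range(i - window, i + window + 1):
            if j != i:
                pairs.append((words[i], words[j]))
    return pairs


sentence = "the quick brown fox jumped over the lazy dog".split()
for pair in skipgram_pairs(sentence, window=1)[:4]:
    print(pair)
```

In a real pipeline the words would first be mapped to integer ids, and the pairs would be grouped into arrays of `batch_size` inputs and labels before being passed through `feed_dict`.
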
## Visualizing the Learned Embeddings

After training has finished we can visualize the learned embeddings using t-SNE.

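Et voila! As expected, words that are similar end up clustering near each other. For a more heavyweight implementation of word2vec that showcases more of the advanced features of TensorFlow, see [tensorflow/models/embedding/word2vec.py](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/models/embedding/word2vec.py).

## Evaluating Embeddings: Analogical Reasoning

Embeddings are useful for a wide variety of prediction tasks in NLP. Short of training a full-blown part-of-speech model or named-entity model, one simple way to evaluate embeddings is to use them directly to predict syntactic and semantic relationships like `king is to queen as father is to ?`. This is called *analogical reasoning*; the task was introduced by [Mikolov and colleagues](http://msr-waypoint.com/en-us/um/people/gzweig/Pubs/NAACL2013Regularities.pdf), and the dataset can be downloaded from https://word2vec.googlecode.com/svn/trunk/questions-words.txt.

To see how this evaluation is done, have a look at the `build_eval_graph()` and `eval()` functions in [tensorflow/models/embedding/word2vec.py](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/models/embedding/word2vec.py).

The choice of hyperparameters can strongly influence the accuracy on this task. Achieving good performance requires training on a very large dataset, carefully tuning the hyperparameters, and making use of tricks like subsampling the data, which is out of the scope of this tutorial.

## Optimizing the Implementation

Our simple implementation showcases the flexibility of TensorFlow. For example, changing the training objective is as easy as swapping the call to `tf.nn.nce_loss()` for an off-the-shelf alternative such as `tf.nn.sampled_softmax_loss()`. If you have a new idea for a loss function, you can write an expression for the new objective in TensorFlow by hand and let the optimizer do the rest. This flexibility is invaluable in the exploratory phase of machine learning model development, where we try out several different ideas and iterate quickly.

Once you have a model structure you are satisfied with, it may be worth optimizing your implementation to run more efficiently (and cover more data in less time). For example, the simple code used in this tutorial runs slower than it could, because we use Python to read and feed the data items, each of which requires very little work on the TensorFlow back-end. If you find that your model is seriously bottlenecked on input data, you may want to implement a custom data reader for your problem, as described in [New Data Formats](../../how_tos/new_data_formats/index.md). For the Skip-Gram model, we have already done this as an example in [tensorflow/models/embedding/word2vec.py](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/models/embedding/word2vec.py).

If your model is no longer I/O bound but you still want more performance, you can take things further by writing your own TensorFlow Ops, as described in [Adding a New Op](../../how_tos/adding_an_op/index.md). Again we provide an example of this for the Skip-Gram model in [tensorflow/models/embedding/word2vec_optimized.py](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/models/embedding/word2vec_optimized.py). Feel free to benchmark these implementations against each other to measure the performance improvement at each stage.

## Conclusion

In this tutorial we covered the word2vec model, an efficient model for learning word embeddings. We explained why embeddings are useful and discussed how to train the model efficiently in TensorFlow. Overall, we hope this example has shown how TensorFlow gives you the flexibility you need for early experimentation and the control you need later for optimizing your model.

Original article: [Vector Representation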
of Words](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/g3doc/tutorials/word2vec/index.md)
Translation: [btpeter](https://github.com/btpeter)
Proofreading: waiwaizheng

================================================
FILE: SOURCE/tutorials/word2vec/word2vec_basic.py
================================================
from __future__ import print_function
import tensorflow.python.platform

import collections
import math
import numpy as np
import os
import random
import tensorflow as tf
import urllib
import zipfile

# Step 1: Download the data.
url = 'http://mattmahoney.net/dc/'

def maybe_download(filename, expected_bytes):
  """Download a file if not present, and make sure it's the right size."""
  if not os.path.exists(filename):
    filename, _ = urllib.urlretrieve(url + filename, filename)
  statinfo = os.stat(filename)
  if statinfo.st_size == expected_bytes:
    print('Found and verified', filename)
  else:
    print(statinfo.st_size)
    raise Exception(
        'Failed to verify ' + filename + '. Can you get to it with a browser?')
  return filename

filename = maybe_download('text8.zip', 31344016)

# Read the data into a list of words.
def read_data(filename):
  """Return the words of the first file in the zip archive."""
  # The context manager closes the archive; the original f.close() came
  # after a return statement and was never reached.
  with zipfile.ZipFile(filename) as f:
    return f.read(f.namelist()[0]).split()

words = read_data(filename)
print('Data size', len(words))

# Step 2: Build the dictionary and replace rare words with UNK token.
vocabulary_size = 50000

def build_dataset(words):
  count = [['UNK', -1]]
  count.extend(collections.Counter(words).most_common(vocabulary_size - 1))
  dictionary = dict()
  for word, _ in count:
    dictionary[word] = len(dictionary)
  data = list()
  unk_count = 0
  for word in words:
    if word in dictionary:
      index = dictionary[word]
    else:
      index = 0  # dictionary['UNK']
      unk_count = unk_count + 1
    data.append(index)
  count[0][1] = unk_count
  reverse_dictionary = dict(zip(dictionary.values(), dictionary.keys()))
  return data, count, dictionary, reverse_dictionary

data, count, dictionary, reverse_dictionary = build_dataset(words)
del words  # Hint to reduce memory.
print('Most common words (+UNK)', count[:5])
print('Sample data', data[:10])

data_index = 0

# Step 4: Function to generate a training batch for the skip-gram model.
def generate_batch(batch_size, num_skips, skip_window):
  global data_index
  assert batch_size % num_skips == 0
  assert num_skips <= 2 * skip_window
  batch = np.ndarray(shape=(batch_size), dtype=np.int32)
  labels = np.ndarray(shape=(batch_size, 1), dtype=np.int32)
  span = 2 * skip_window + 1  # [ skip_window target skip_window ]
  buffer = collections.deque(maxlen=span)
  for _ in range(span):
    buffer.append(data[data_index])
    data_index = (data_index + 1) % len(data)
  # Integer division, so that range() also gets an int under true division.
  for i in range(batch_size // num_skips):
    target = skip_window  # target label at the center of the buffer
    targets_to_avoid = [ skip_window ]
    for j in range(num_skips):
      while target in targets_to_avoid:
        target = random.randint(0, span - 1)
      targets_to_avoid.append(target)
      batch[i * num_skips + j] = buffer[skip_window]
      labels[i * num_skips + j, 0] = buffer[target]
    buffer.append(data[data_index])
    data_index = (data_index + 1) % len(data)
  return batch, labels

batch, labels = generate_batch(batch_size=8, num_skips=2, skip_window=1)
for i in range(8):
  print(batch[i], '->', labels[i, 0])
  print(reverse_dictionary[batch[i]], '->', reverse_dictionary[labels[i, 0]])

# Step 5: Build and train a skip-gram model.
batch_size = 128
embedding_size = 128  # Dimension of the embedding vector.
skip_window = 1       # How many words to consider left and right.
num_skips = 2         # How many times to reuse an input to generate a label.

# We pick a random validation set to sample nearest neighbors. Here we limit the
# validation samples to the words that have a low numeric ID, which by
# construction are also the most frequent.
valid_size = 16     # Random set of words to evaluate similarity on.
valid_window = 100  # Only pick dev samples in the head of the distribution.
valid_examples = np.array(random.sample(xrange(valid_window), valid_size))
num_sampled = 64    # Number of negative examples to sample.

graph = tf.Graph()

with graph.as_default():

  # Input data.
  train_inputs = tf.placeholder(tf.int32, shape=[batch_size])
  train_labels = tf.placeholder(tf.int32, shape=[batch_size, 1])
  valid_dataset = tf.constant(valid_examples, dtype=tf.int32)

  # Construct the variables.
  embeddings = tf.Variable(
      tf.random_uniform([vocabulary_size, embedding_size], -1.0, 1.0))
  nce_weights = tf.Variable(
      tf.truncated_normal([vocabulary_size, embedding_size],
                          stddev=1.0 / math.sqrt(embedding_size)))
  nce_biases = tf.Variable(tf.zeros([vocabulary_size]))

  # Look up embeddings for inputs.
  embed = tf.nn.embedding_lookup(embeddings, train_inputs)

  # Compute the average NCE loss for the batch.
  # tf.nn.nce_loss automatically draws a new sample of the negative labels each
  # time we evaluate the loss. labels and inputs are passed by keyword because
  # their positional order differs between TensorFlow releases.
  loss = tf.reduce_mean(
      tf.nn.nce_loss(nce_weights, nce_biases,
                     labels=train_labels, inputs=embed,
                     num_sampled=num_sampled, num_classes=vocabulary_size))

  # Construct the SGD optimizer using a learning rate of 1.0.
  optimizer = tf.train.GradientDescentOptimizer(1.0).minimize(loss)

  # Compute the cosine similarity between minibatch examples and all embeddings.
  norm = tf.sqrt(tf.reduce_sum(tf.square(embeddings), 1, keep_dims=True))
  normalized_embeddings = embeddings / norm
  valid_embeddings = tf.nn.embedding_lookup(
      normalized_embeddings, valid_dataset)
  similarity = tf.matmul(
      valid_embeddings, normalized_embeddings, transpose_b=True)

# Step 6: Begin training
num_steps = 100001

with tf.Session(graph=graph) as session:
  # We must initialize all variables before we use them.
  tf.initialize_all_variables().run()
  print("Initialized")

  average_loss = 0
  for step in xrange(num_steps):
    batch_inputs, batch_labels = generate_batch(
        batch_size, num_skips, skip_window)
    feed_dict = {train_inputs : batch_inputs, train_labels : batch_labels}

    # We perform one update step by evaluating the optimizer op (including it
    # in the list of returned values for session.run()
    _, loss_val = session.run([optimizer, loss], feed_dict=feed_dict)
    average_loss += loss_val

    if step % 2000 == 0:
      if step > 0:
        average_loss = average_loss / 2000
      # The average loss is an estimate of the loss over the last 2000 batches.
      print("Average loss at step ", step, ": ", average_loss)
      average_loss = 0

    # note that this is expensive (~20% slowdown if computed every 500 steps)
    if step % 10000 == 0:
      sim = similarity.eval()
      for i in xrange(valid_size):
        valid_word = reverse_dictionary[valid_examples[i]]
        top_k = 8 # number of nearest neighbors
        nearest = (-sim[i, :]).argsort()[1:top_k+1]
        log_str = "Nearest to %s:" % valid_word
        for k in xrange(top_k):
          close_word = reverse_dictionary[nearest[k]]
          log_str = "%s %s," % (log_str, close_word)
        print(log_str)
  final_embeddings = normalized_embeddings.eval()

# Step 7: Visualize the embeddings.
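def plot_with_labels(low_dim_embs, labels, filename='tsne.png'):
  assert low_dim_embs.shape[0] >= len(labels), "More labels than embeddings"
  plt.figure(figsize=(18, 18))  #in inches
  for i, label in enumerate(labels):
    x, y = low_dim_embs[i,:]
    plt.scatter(x, y)
    plt.annotate(label,
                 xy=(x, y),
                 xytext=(5, 2),
                 textcoords='offset points',
                 ha='right',
                 va='bottom')

  plt.savefig(filename)

try:
  from sklearn.manifold import TSNE
  import matplotlib.pyplot as plt

  tsne = TSNE(perplexity=30, n_components=2, init='pca', n_iter=5000)
  plot_only = 500
  low_dim_embs = tsne.fit_transform(final_embeddings[:plot_only,:])
  # Row i of final_embeddings belongs to word id i, so labels must be looked
  # up by id; dictionary.keys() is not guaranteed to be ordered by id.
  labels = [reverse_dictionary[i] for i in range(plot_only)]
  plot_with_labels(low_dim_embs, labels)

except ImportError:
  print("Please install sklearn and matplotlib to visualize embeddings.")

================================================
FILE: SOURCE/tutorials/word2vec.md
================================================
# Vector Representations of Words

In this tutorial we look at the word2vec model by [Mikolov et al](http://papers.nips.cc/paper/5021-distributed-representations-of-words-and-phrases-and-their-compositionality.pdf). This model is used for learning vector representations of words, called "word embeddings".

## Highlights

This tutorial is meant to highlight the interesting, substantive parts of building a word2vec model in TensorFlow.

* We start with the motivation for representing words as vectors.
* We look at the intuition behind the model and how it is trained (with a little math for good measure).
* We also show a simple implementation of the model in TensorFlow.
* Finally, we look at ways to make this simple version perform better.

We walk through the code step by step later in the tutorial, but if you would rather dive straight in, you can look at the minimal implementation in [tensorflow/g3doc/tutorials/word2vec/word2vec_basic.py](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/g3doc/tutorials/word2vec/word2vec_basic.py). This basic example contains the code needed to download some data, train on it briefly, and show the result. Once you are comfortable with the basic version, you can move on to [tensorflow/models/embedding/word2vec.py](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/models/embedding/word2vec.py), a more serious implementation that also showcases some more advanced TensorFlow features, such as how to use threads to move data into a text model efficiently and how to checkpoint during training.

But first, let's look at why we would want to learn word embeddings at all. If you are already an expert on word embeddings, feel free to skip this section and get your hands dirty with the details.

## Motivation: Why Learn Word Embeddings?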
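Image and audio processing systems work with rich, high-dimensional datasets encoded as vectors of the individual raw pixel intensities for images, or of power spectral density coefficients for audio. For tasks like object or speech recognition, all the information required to perform the task is already encoded in the raw data (humans, after all, perform these tasks from the raw data). Natural language processing systems, however, traditionally treat words as discrete atomic symbols, so "cat" may be represented as `Id537` and "dog" as `Id143`. These encodings are arbitrary and carry no information about the relationships that may exist between the individual symbols. In other words, when the model processes data about "dogs" it can use very little of what it has learned about "cats" (for example, that both are animals, four-legged, and kept as pets). Representing words as unique, discrete IDs also leads to data sparsity, which usually means that we need more data to train statistical models successfully. Vector representations of words can overcome some of these obstacles.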
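[Vector space models](https://en.wikipedia.org/wiki/Vector_space_model) (VSMs) represent (embed) words in a continuous vector space where semantically similar words are mapped to nearby points. VSMs have a long, rich history in NLP, but nearly all methods that use them depend in some way on the [Distributional Hypothesis](https://en.wikipedia.org/wiki/Distributional_semantics#Distributional_Hypothesis), which states that words appearing in the same contexts share semantic meaning. The approaches that build on this principle fall into two categories: *count-based methods* (e.g. [Latent Semantic Analysis](https://en.wikipedia.org/wiki/Latent_semantic_analysis)) and *predictive methods* (e.g. [neural probabilistic language models](http://www.scholarpedia.org/article/Neural_net_language_models)).

The distinction is discussed in detail by [Baroni et al.](http://clic.cimec.unitn.it/marco/publications/acl2014/baroni-etal-countpredict-acl2014.pdf), but in a nutshell: count-based methods compute statistics of how often a word co-occurs with its neighboring words in a large text corpus, and then map these statistics down to a small, dense vector for each word. Predictive methods instead try to predict a word directly from its neighbors in terms of learned small, dense *embedding vectors*.

Word2vec is a computationally efficient predictive model for learning word embeddings. It comes in two flavors: the Continuous Bag-of-Words model (CBOW) and the Skip-Gram model. Algorithmically the two are very similar. CBOW predicts the target word (e.g. 'mat') from the source context words ('the cat sits on the'), while Skip-Gram does the inverse and predicts the source context words from the target word. This inversion has a statistical effect: CBOW smooths over a lot of the distributional information (by treating an entire context as a single observation), which in most cases is helpful for smaller datasets. Skip-Gram, by contrast, treats each context-target pair as a new observation, which tends to work better on larger datasets. The rest of this tutorial focuses on the Skip-Gram model.

## Scaling up with Noise-Contrastive Training

Neural probabilistic language models are traditionally trained using the [maximum likelihood](https://en.wikipedia.org/wiki/Maximum_likelihood) (ML) principle to maximize the probability of the next word ![](../images/vr1.png) (for "target") given the previous words **h** (for "history"), in terms of a [*softmax* function](https://en.wikipedia.org/wiki/Softmax_function),

![](../images/vr2.png)

where **score(w_t,h)** computes the compatibility of word **w_t** with the context **h** (a dot product is commonly used). We train this model by maximizing its log-likelihood on the training set, i.e. by maximizing

![](../images/word2vec2.png)

This yields a properly normalized probabilistic model for language modeling. In practice, however, it is very expensive, because we need to compute and normalize each probability using the scores of all other **V** words **w'** in the current context **h**, *at every training step*.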
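
To make that cost concrete, here is a minimal sketch of the softmax log-probability over a toy vocabulary. It is not taken from the tutorial code: the scores are made-up numbers standing in for **score(w, h)**, and `softmax_log_prob` is a hypothetical helper name. The point to notice is that the normalizer sums over every word in the vocabulary, so each training step costs work proportional to **V**.

```python
import math


def softmax_log_prob(scores, target):
    """Return log P(target | h) under a full softmax over all scores.

    `scores[w]` plays the role of score(w, h) for every word w in the
    vocabulary. The normalizer touches all of them, which is the
    per-step cost that grows with the vocabulary size V.
    """
    m = max(scores)  # subtract the max for numerical stability
    log_norm = m + math.log(sum(math.exp(s - m) for s in scores))
    return scores[target] - log_norm


# Hypothetical scores for a toy 5-word vocabulary (illustrative only).
scores = [2.0, 0.5, -1.0, 0.0, 1.0]
log_p = softmax_log_prob(scores, target=0)
probs = [math.exp(softmax_log_prob(scores, w)) for w in range(len(scores))]
print(log_p, probs)
```

With a realistic vocabulary of tens of thousands of words, the sum inside `softmax_log_prob` would run over all of them for every example, which is what sampling-based objectives avoid.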
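Word2vec, on the other hand, does not need to learn a full probabilistic model. Instead, the CBOW and Skip-Gram models are trained with a binary classification objective (logistic regression) that discriminates the real target word ![](../images/rw4.png) from **k** imaginary (noise) words ![](../images/rw5.png) in the same context. We illustrate this below for the CBOW model; for Skip-Gram the direction is simply inverted.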
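Mathematically, the objective (for each example) is to maximize

![](../images/rw6.png)

where ![](../images/rw7.png) is the binary logistic regression probability, under the model, of seeing the word **w** in the context **h** in the dataset, calculated in terms of the learned embedding vectors ![](../images/theta.png). In practice we approximate the expectation by drawing contrastive words from the noise distribution (i.e. we compute a [Monte Carlo average](https://en.wikipedia.org/wiki/Monte_Carlo_integration)).

This objective is maximized when the model assigns high probabilities to the real target words and low probabilities to the noise words. Technically, this is called [Negative Sampling](http://papers.nips.cc/paper/5021-distributed-representations-of-words-and-phrases-and-their-compositionality.pdf), and there is good mathematical motivation for using this loss function: the updates it proposes approximate the updates of the softmax function. It is also computationally appealing, because computing the loss now scales only with the **k** *noise words* that we select, and not with all the words in the vocabulary **V**. This makes training much faster. We will actually use the very similar [noise-contrastive estimation (NCE)](http://papers.nips.cc/paper/5165-learning-word-embeddings-efficiently-with-noise-contrastive-estimation.pdf) loss, for which TensorFlow provides a handy helper function, `tf.nn.nce_loss()`.

Let's get an intuitive feel for how this works in practice!

## The Skip-gram Model

As an example, consider the dataset

`the quick brown fox jumped over the lazy dog`

We first form a dataset of words and the contexts in which they appear. We could define 'context' in any way that makes sense, and people have in fact looked at syntactic contexts (i.e. the syntactic dependents of the current target word, see e.g. [Levy et al.](https://levyomer.files.wordpress.com/2014/04/dependency-based-word-embeddings-acl-2014.pdf)), the words to the left of the target, the words to the right of the target, and so on. For now, we define 'context' as the window of words to the left and to the right of a target word. Using a window size of 1, we get the following dataset of `(context, target)` pairs:

`([the, brown], quick), ([quick, fox], brown), ([brown, jumped], fox), ...`

Recall that Skip-Gram inverts contexts and targets and tries to predict each context word from its target word, so the task becomes predicting 'the' and 'brown' from 'quick', 'quick' and 'fox' from 'brown', and so on. The dataset therefore becomes the following `(input, output)` pairs:

`(quick, the), (quick, brown), (brown, quick), (brown, fox), ...`

The objective function is defined over the entire dataset, but we typically optimize it with [stochastic gradient descent](https://en.wikipedia.org/wiki/Stochastic_gradient_descent) (SGD), using one example at a time (or a minibatch of `batch_size` examples, where typically `16 <= batch_size <= 512`). Let's look at one step of this process.

Suppose that at training step **t** we observe the first training case above, where the goal is to predict `the` from `quick`. We select `num_noise` noisy (contrastive) examples by drawing from some noise distribution, typically the unigram distribution **P(w)**. For simplicity, let's say `num_noise=1` and we select `sheep` as the noisy example. Next we compute the loss for this pair of observed and noisy examples, i.e. the objective at time step **t** becomes

![](../images/vr4.png)

The goal is to update the embedding parameters ![](../images/theta.png) to improve (in this case, maximize) this objective function. We do this by deriving the gradient of the loss with respect to the embedding parameters ![](../images/theta.png), i.e.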
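![](../images/vr5.png) (luckily TensorFlow provides easy helper functions for doing this!). As gradient descent repeatedly updates the parameters over the entire dataset, the effect is to keep moving the embedding vector of each word around until the model can successfully discriminate real words from noise words.

We can visualize the learned vectors by projecting them down to 2 dimensions, using for instance the [t-SNE dimensionality reduction technique](http://lvdmaaten.github.io/tsne/). When we inspect these visualizations, it becomes apparent that the vectors capture general, and in fact quite useful, semantic information about words and their relationships to one another. It was very interesting when we first discovered that certain directions in the induced vector space specialize towards certain semantic relationships, e.g. *male-female*, *gender* and even *country-capital* relationships between words, as illustrated in the figure below (see also, for example, [Mikolov et al., 2013](http://www.aclweb.org/anthology/N13-1090)).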
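
The skip-gram walkthrough in this section can also be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the tutorial's implementation: `skipgram_pairs`, `sigmoid` and `neg_sampling_objective` are hypothetical helper names, the scores passed in are made-up numbers, and, unlike the listing in the text, the words at the edges of the sentence also contribute a pair.

```python
import math


def skipgram_pairs(words, window=1):
    """Build (input, output) pairs: each word predicts its neighbors."""
    pairs = []
    for i, target in enumerate(words):
        lo = max(0, i - window)
        hi = min(len(words), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((target, words[j]))
    return pairs


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def neg_sampling_objective(true_score, noise_scores):
    """log Q(D=1 | true pair) + sum of log Q(D=0 | noise pair).

    Q(D=1) is modeled as sigmoid(score), so Q(D=0) = sigmoid(-score).
    The value rises when the true pair scores high and noise pairs low.
    """
    return math.log(sigmoid(true_score)) + sum(
        math.log(sigmoid(-s)) for s in noise_scores)


sentence = "the quick brown fox jumped over the lazy dog".split()
pairs = skipgram_pairs(sentence, window=1)

# Hypothetical scores: (quick, the) is the observed pair,
# (quick, sheep) is the single noise pair (num_noise = 1).
untrained = neg_sampling_objective(0.0, [0.0])
trained = neg_sampling_objective(5.0, [-5.0])
print(pairs[1:5], untrained, trained)
```

A model that separates the real word from the noise word well gets an objective close to 0, while an untrained model that scores everything 0 sits at `2 * log(0.5)`.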
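This also explains why these vectors are useful as features for many canonical NLP prediction tasks, such as part-of-speech tagging or named entity recognition (see for example [Collobert et al.](http://arxiv.org/pdf/1103.0398v1.pdf) or [Turian et al.](http://www.aclweb.org/anthology/P10-1040)).

But for now, let's just use them to draw pretty pictures!

## Building the Graph

This is all about embeddings, so let's start by defining the embedding matrix. It is just a big matrix that we initialize with uniform random values.

```python
embeddings = tf.Variable(
    tf.random_uniform([vocabulary_size, embedding_size], -1.0, 1.0))
```

The noise-contrastive estimation loss is defined in terms of a logistic regression model. For this, we need to define a weight vector and a bias for each word in the vocabulary (also called the `output weights`, as opposed to the `input embeddings`). We define them as follows.

```python
nce_weights = tf.Variable(
  tf.truncated_normal([vocabulary_size, embedding_size],
                      stddev=1.0 / math.sqrt(embedding_size)))
nce_biases = tf.Variable(tf.zeros([vocabulary_size]))
```

Now that the parameters are in place, we can define the Skip-Gram model graph. For simplicity, suppose we have already integerized the text corpus with a vocabulary, so that each word is represented by an integer (for details see [tensorflow/g3doc/tutorials/word2vec/word2vec_basic.py](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/g3doc/tutorials/word2vec/word2vec_basic.py)). The Skip-Gram model takes two inputs. One is a batch of integers representing the source context words, the other is for the target words. Let's create placeholder nodes for these inputs so that we can feed in data later.

```python
# Placeholders for inputs
train_inputs = tf.placeholder(tf.int32, shape=[batch_size])
train_labels = tf.placeholder(tf.int32, shape=[batch_size, 1])
```

Next we need to look up the embedding vector for each of the source words in the batch. TensorFlow has a handy helper that makes this easy.

```python
embed = tf.nn.embedding_lookup(embeddings, train_inputs)
```

Now that we have the embedding for each word, we would like to predict the target word using the noise-contrastive training objective.

```python
# Compute the NCE loss, using a sample of the negative labels each time.
loss = tf.reduce_mean(
  tf.nn.nce_loss(nce_weights, nce_biases, embed, train_labels,
                 num_sampled, vocabulary_size))
```

Now that we have a loss node, we need to add the nodes required to compute gradients and update the parameters. For this we use stochastic gradient descent, which TensorFlow also wraps in a convenient helper.

```python
# We use the SGD optimizer.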
optimizer = tf.train.GradientDescentOptimizer(learning_rate=1.0).minimize(loss)
```

## Training the Model

Training the model is then as simple as using a `feed_dict` to push data into the placeholders and calling [`session.run`](tensorflow-zh/SOURCE/api_docs/python/client.md#Session.run) with this new data in a loop.

```python
for inputs, labels in generate_batch(...):
  # The keys must be the placeholders defined above.
  feed_dict = {train_inputs: inputs, train_labels: labels}
  _, cur_loss = session.run([optimizer, loss], feed_dict=feed_dict)
```

See the full example code in [tensorflow/g3doc/tutorials/word2vec/word2vec_basic.py](./word2vec_basic.py).

## Visualizing the Learned Embeddings

After training has finished, we can visualize the learned embeddings using t-SNE.
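Et voila! As expected, words that are similar end up clustered near each other. For a more heavyweight implementation of word2vec that showcases more of the advanced features of TensorFlow, see [tensorflow/models/embedding/word2vec.py](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/models/embedding/word2vec.py).

## Evaluating Embeddings: Analogical Reasoning

Embeddings are useful for a wide variety of prediction tasks in NLP. Short of training a full-blown part-of-speech model or named-entity model, one simple way to evaluate embeddings is to use them directly to predict syntactic and semantic relationships, such as `king is to queen as father is to ?`. This is called *analogical reasoning*; the task was introduced by [Mikolov and colleagues](http://msr-waypoint.com/en-us/um/people/gzweig/Pubs/NAACL2013Regularities.pdf), and the dataset can be downloaded from https://word2vec.googlecode.com/svn/trunk/questions-words.txt.

To see how this evaluation is done, have a look at the `build_eval_graph()` and `eval()` functions in [tensorflow/models/embedding/word2vec.py](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/models/embedding/word2vec.py).

The choice of hyperparameters can strongly influence the accuracy on this task. Achieving strong performance requires training on a very large dataset, tuning the hyperparameters carefully, and using tricks such as subsampling the data, all of which are beyond the scope of this tutorial.

## Optimizing the Implementation

The simple implementation above showcases the flexibility of TensorFlow. For example, changing the training objective is as easy as swapping the call to `tf.nn.nce_loss()` for an off-the-shelf alternative such as `tf.nn.sampled_softmax_loss()`. If you have a new idea for a loss function, you can write an expression for the new objective by hand in TensorFlow and let the optimizer compute its derivatives. This flexibility is valuable in the exploratory phase of machine learning model development, when we are trying out several different ideas and iterating quickly.

Once you have a model structure you are satisfied with, it may be worth optimizing the implementation to run more efficiently (and cover more data in less time). For example, the simple code used in this tutorial suffers from compromised speed, because we use Python to read and feed data items, each of which requires very little work on the TensorFlow back-end. If you find that your model is seriously bottlenecked on input data, you may want to implement a custom data reader for your problem, as described in [New Data Formats](tensorflow-zh/SOURCE/how_tos/new_data_formats/index.md). For the Skip-Gram model, we have already done this as an example in [tensorflow/models/embedding/word2vec.py](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/models/embedding/word2vec.py).

If your model is no longer I/O bound but you still want more performance, you can go further and write your own TensorFlow Ops, as described in [Adding a New Op](tensorflow-zh/SOURCE/how_tos/adding_an_op/index.md). Again, we provide an example for the Skip-Gram case in [tensorflow/models/embedding/word2vec_optimized.py](https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/models/embedding/word2vec_optimized.py). Feel free to benchmark these implementations against each other to measure the performance improvement at each stage.

## Conclusion
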
In this tutorial we covered the word2vec model, a computationally efficient model for learning word embeddings. We explained why embeddings are useful and discussed how to train the model efficiently and implement it in TensorFlow. Overall, we hope this example has shown how TensorFlow gives you the flexibility you need for early experimentation, and the control over model internals you need later when optimizing the implementation.

Original: [Vector Representation of Words](https://github.com/tensorflow/tensorflow/blob/master/tensorflow/g3doc/tutorials/word2vec/index.md) Translation: [btpeter](https://github.com/btpeter) Proofreading: waiwaizheng

================================================
FILE: TOC.md
================================================
- Getting Started
  - [Introduction](SOURCE/get_started/introduction.md)
  - [Download and Setup](SOURCE/get_started/os_setup.md)
  - [Basic Usage](SOURCE/get_started/basic_usage.md)
- Tutorials
  - [Overview](SOURCE/tutorials/overview.md)
  - [MNIST for ML Beginners](SOURCE/tutorials/mnist_beginners.md)
  - [Deep MNIST](SOURCE/tutorials/mnist_pros.md)
  - [TensorFlow Mechanics 101](SOURCE/tutorials/mnist_tf.md)
  - [Convolutional Neural Networks](SOURCE/tutorials/deep_cnn.md)
  - [Vector Representations of Words](SOURCE/tutorials/word2vec.md)
  - [Recurrent Neural Networks](SOURCE/tutorials/recurrent.md)
  - [Mandelbrot Set](SOURCE/tutorials/mandelbrot.md)
  - [Partial Differential Equations](SOURCE/tutorials/pdes.md)
  - [MNIST Data Download](SOURCE/tutorials/mnist_download.md)
- How-Tos
  - [Overview](SOURCE/how_tos/overview.md)
  - [Variables: Creation, Initialization, Saving, and Loading](SOURCE/how_tos/variables.md)
  - [TensorBoard: Visualizing Learning](SOURCE/how_tos/summaries_and_tensorboard.md)
  - [TensorBoard: Graph Visualization](SOURCE/how_tos/graph_viz.md)
  - [Reading Data](SOURCE/how_tos/reading_data.md)
  - [Threading and Queues](SOURCE/how_tos/threading_and_queues.md)
  - [Adding a New Op](SOURCE/how_tos/adding_an_op.md)
  - [Custom Data Readers](SOURCE/how_tos/new_data_formats.md)
  - [Using GPUs](SOURCE/how_tos/using_gpu.md)
  - [Sharing Variables](SOURCE/how_tos/variable_scope.md)
- Resources
  - [Overview](SOURCE/resources/overview.md)
  - [BibTex Citation](SOURCE/resources/bib.md)
  - [Example Uses](SOURCE/resources/uses.md)
  - [FAQ](SOURCE/resources/faq.md)
  - [Glossary](SOURCE/resources/glossary.md)
  - [Tensor Ranks, Shapes, and Types](SOURCE/resources/dims_types.md)
- Other
  - [Frequently Asked Questions](SOURCE/faq.md)
  - [Related Resources](SOURCE/resource.md)
  - [Personal Learning Notes](SOURCE/personal.md)

================================================
FILE: config.json
================================================
{
    "name": "TensorFlow 官方文档中文版",
"introduction": "Google 人工智能系统 TensorFlow 官方文档中文版协同翻译。", "path": { "content": "SOURCE", "images": "SOURCE/images" } } ================================================ FILE: learn-github.md ================================================ ## 注册 GitHub 账号 ## 创建仓库 ## 工具安装,克隆仓库到本地 Windows 用户详见文档: Mac 用户详见文档: >注意:你克隆到本地的文件内至少应包含一个 README.md 文件。![](images/begin.png) 至此,你已经拥有了一个 GitHub 账号,一个远程仓库,一个本地仓库。下一步,你将要花费 5 分钟的时间了解下 [Markdown 的简单语法](markdown.md) ================================================ FILE: markdown.md ================================================ ## Markdown 工具下载 - Windows 用户推荐使用编辑器 MarkdownPad 下载地址: - Mac 用户推荐使用编辑器 Mou 下载地址: ## 标题 在标题内容前添加 `# + 空格` 就可以实现标题样式的添加: - 一级标题就添加一个 `#` + 空格,例如:# 我是一级标题 - 二级标题就添加两个 `#` + 空格,例如:## 我是二级标题。 - 三级标题就添加三个 `#` + 空格,例如:### 我是三级标题。 依次类推... ## 特殊文字处理 ### 换行 书写示例: ``` 我是第一行(后面有两个空格) 我是第二行 ``` 效果: 我是第一行(后面有两个空格) 我是第二行 ### 加粗 书写示例: ``` **我是粗体** ``` 效果: **我是粗体** ### 标亮 书写示例: ``` `请把我标亮` ``` 效果: `请把我标亮` ## 列表 ### 一级列表(不带序号) 书写示例: ``` - 列表 1(减号 + 空格) - 列表 2 ``` 效果: - 列表 1(减号 + 空格) - 列表 2 ### 一级列表(带序号) 书写示例: ``` 1. 列表 1(自然数 + 点 + 空格) 2. 列表 2 ``` 效果: 1. 列表 1(自然数 + 点 + 空格) 2. 列表 2 ### 二级列表(不带序号) 书写示例: ``` - 列表 1(一级列表:减号 + 空格) - 列表 1.1(二级列表:空格 + 减号 + 空格) - 列表 1.2 - 列表 2 ``` 效果: - 列表 1(一级列表:减号 + 空格) - 列表 1.1(二级列表:空格 + 减号 + 空格) - 列表 1.2 - 列表 2 ## 添加图片 书写示例: ``` ![图片描述性文字](图片相对地址) ``` 效果: ![我是示例图片的描述](http://s1.jikexueyuan.com/common/images/logo_c8caff4.png) ## 添加超链接 书写示例: ``` [链接描述文字(可见)](链接地址) ``` 效果: [点击跳转到极客学院](http://www.jikexueyuan.com) ## 引用 书写示例: ``` >注意: 我是引用的内容(空一行就结束引用)。 ``` 效果: >注意: 我是引用的内容(空一行就结束引用)。 ## 添加代码 书写示例:
```
// 代码区域的上下分别用三个 ` 括起来
public class Person {
    // 代码缩进请使用 四个空格,不要使用 Tab
}
```
效果: ``` // 代码区域的上下分别用三个 ` 括起来 public class Person { // 代码缩进请使用 四个空格,不要使用 Tab } ``` ## 锚点使用 书写示例: ``` 极客学院是中国最大的 IT 职业在线教育平台。[[锚点名]](#footnode) [锚点名]我的示例。 ``` 效果: 极客学院是中国最大的 IT 职业在线教育平台。[[锚点名]](#footnode) [锚点名]我的示例。 ## 特别注意 1. 英文字母与中文之间需要两边空格(英文与符号之间不空格) 2. 高亮英文之间需要两边空格 3. 代码符号\`\`\`上面需要空一行,结束```下面需要空一行 4. 中文之间和短英文之间使用中文标点 5. 一篇文章中只能出现一个#号标题 6. 标题#号之后需要空一格 ## 表格 书写示例: ``` | Prefix | Framework | | ------------- |:-------------:| -----:| | NS | Foundation (OS X and iOS) and Application Kit (OS X) | | UI | UIKit (iOS) | | AB | Address Book | | CA | Core Animation | | CI | Core Image | ``` 效果: | Prefix | Framework | | ------------- |:-------------:| -----:| | NS | Foundation (OS X and iOS) and Application Kit (OS X) | | UI | UIKit (iOS) | | AB | Address Book | | CA | Core Animation | | CI | Core Image | ================================================ FILE: tex_pdf/api/c4s00.tex ================================================ \section{Overview}\label{api_overview} TensorFlow has APIs available in several languages both for constructing and executing a TensorFlow graph. The Python API is at present the most complete and the easiest to use, but the C++ API may offer some performance advantages in graph execution, and supports deployment to small devices such as Android. Over time, we hope that the TensorFlow community will develop front ends for languages like Go, Java, JavaScript, Lua R, and perhaps others. With \href{http://swig.org}{SWIG}, it's relatively easy to develop a TensorFlow interface for your favorite language. Note: Many practical aspects of usage are covered in the Mechanics tab, and some additional documentation not specific to any particular language API is available in the Resources tab. 
================================================ FILE: tex_pdf/api/cc/ClassEnv.md ================================================ # Class `tensorflow::Env` An interface used by the tensorflow implementation to access operating system functionality like the filesystem etc. Callers may wish to provide a custom Env object to get fine grain control. All Env implementations are safe for concurrent access from multiple threads without any external synchronization. ##Member Summary * [`tensorflow::Env::Env()`](#tensorflow_Env_Env) * [`virtual tensorflow::Env::~Env()`](#virtual_tensorflow_Env_Env) * [`virtual Status tensorflow::Env::NewRandomAccessFile(const string &fname, RandomAccessFile **result)=0`](#virtual_Status_tensorflow_Env_NewRandomAccessFile) * Creates a brand new random access read-only file with the specified name. * [`virtual Status tensorflow::Env::NewWritableFile(const string &fname, WritableFile **result)=0`](#virtual_Status_tensorflow_Env_NewWritableFile) * Creates an object that writes to a new file with the specified name. * [`virtual Status tensorflow::Env::NewAppendableFile(const string &fname, WritableFile **result)=0`](#virtual_Status_tensorflow_Env_NewAppendableFile) * Creates an object that either appends to an existing file, or writes to a new file (if the file does not exist to begin with). * [`virtual bool tensorflow::Env::FileExists(const string &fname)=0`](#virtual_bool_tensorflow_Env_FileExists) * Returns true iff the named file exists. * [`virtual Status tensorflow::Env::GetChildren(const string &dir, std::vector< string > *result)=0`](#virtual_Status_tensorflow_Env_GetChildren) * Stores in *result the names of the children of the specified directory. The names are relative to "dir". * [`virtual Status tensorflow::Env::DeleteFile(const string &fname)=0`](#virtual_Status_tensorflow_Env_DeleteFile) * Deletes the named file. 
* [`virtual Status tensorflow::Env::CreateDir(const string &dirname)=0`](#virtual_Status_tensorflow_Env_CreateDir) * Creates the specified directory. * [`virtual Status tensorflow::Env::DeleteDir(const string &dirname)=0`](#virtual_Status_tensorflow_Env_DeleteDir) * Deletes the specified directory. * [`virtual Status tensorflow::Env::GetFileSize(const string &fname, uint64 *file_size)=0`](#virtual_Status_tensorflow_Env_GetFileSize) * Stores the size of fname in *file_size. * [`virtual Status tensorflow::Env::RenameFile(const string &src, const string &target)=0`](#virtual_Status_tensorflow_Env_RenameFile) * Renames file src to target. If target already exists, it will be replaced. * [`virtual uint64 tensorflow::Env::NowMicros()=0`](#virtual_uint64_tensorflow_Env_NowMicros) * Returns the number of micro-seconds since some fixed point in time. Only useful for computing deltas of time. * [`virtual void tensorflow::Env::SleepForMicroseconds(int micros)=0`](#virtual_void_tensorflow_Env_SleepForMicroseconds) * Sleeps/delays the thread for the prescribed number of micro-seconds. * [`virtual Thread* tensorflow::Env::StartThread(const ThreadOptions &thread_options, const string &name, std::function< void()> fn) TF_MUST_USE_RESULT=0`](#virtual_Thread_tensorflow_Env_StartThread) * Returns a new thread that is running fn() and is identified (for debugging/performance-analysis) by "name". * [`static Env* tensorflow::Env::Default()`](#static_Env_tensorflow_Env_Default) * Returns a default environment suitable for the current operating system. ##Member Details #### `tensorflow::Env::Env()` #### `virtual tensorflow::Env::~Env()` #### `virtual Status tensorflow::Env::NewRandomAccessFile(const string &fname, RandomAccessFile **result)=0` Creates a brand new random access read-only file with the specified name. On success, stores a pointer to the new file in *result and returns OK. On failure stores NULL in *result and returns non-OK. 
If the file does not exist, returns a non-OK status. The returned file may be concurrently accessed by multiple threads. #### `virtual Status tensorflow::Env::NewWritableFile(const string &fname, WritableFile **result)=0` Creates an object that writes to a new file with the specified name. Deletes any existing file with the same name and creates a new file. On success, stores a pointer to the new file in *result and returns OK. On failure stores NULL in *result and returns non-OK. The returned file will only be accessed by one thread at a time. #### `virtual Status tensorflow::Env::NewAppendableFile(const string &fname, WritableFile **result)=0` Creates an object that either appends to an existing file, or writes to a new file (if the file does not exist to begin with). On success, stores a pointer to the new file in *result and returns OK. On failure stores NULL in *result and returns non-OK. The returned file will only be accessed by one thread at a time. #### `virtual bool tensorflow::Env::FileExists(const string &fname)=0` Returns true iff the named file exists. #### `virtual Status tensorflow::Env::GetChildren(const string &dir, std::vector< string > *result)=0` Stores in *result the names of the children of the specified directory. The names are relative to "dir". Original contents of *results are dropped. #### `virtual Status tensorflow::Env::DeleteFile(const string &fname)=0` Deletes the named file. #### `virtual Status tensorflow::Env::CreateDir(const string &dirname)=0` Creates the specified directory. #### `virtual Status tensorflow::Env::DeleteDir(const string &dirname)=0` Deletes the specified directory. #### `virtual Status tensorflow::Env::GetFileSize(const string &fname, uint64 *file_size)=0` Stores the size of fname in *file_size. #### `virtual Status tensorflow::Env::RenameFile(const string &src, const string &target)=0` Renames file src to target. If target already exists, it will be replaced. 
#### `virtual uint64 tensorflow::Env::NowMicros()=0` Returns the number of micro-seconds since some fixed point in time. Only useful for computing deltas of time. #### `virtual void tensorflow::Env::SleepForMicroseconds(int micros)=0` Sleeps/delays the thread for the prescribed number of micro-seconds. #### `virtual Thread* tensorflow::Env::StartThread(const ThreadOptions &thread_options, const string &name, std::function< void()> fn) TF_MUST_USE_RESULT=0` Returns a new thread that is running fn() and is identified (for debugging/performance-analysis) by "name". Caller takes ownership of the result and must delete it eventually (the deletion will block until fn() stops running). #### `static Env* tensorflow::Env::Default()` Returns a default environment suitable for the current operating system. Sophisticated users may wish to provide their own Env implementation instead of relying on this default environment. The result of Default() belongs to this library and must never be deleted. ================================================ FILE: tex_pdf/api/cc/ClassEnvWrapper.md ================================================ # Class `tensorflow::EnvWrapper` An implementation of Env that forwards all calls to another Env . May be useful to clients who wish to override just part of the functionality of another Env . ##Member Summary * [`tensorflow::EnvWrapper::EnvWrapper(Env *t)`](#tensorflow_EnvWrapper_EnvWrapper) * Initializes an EnvWrapper that delegates all calls to *t. * [`virtual tensorflow::EnvWrapper::~EnvWrapper()`](#virtual_tensorflow_EnvWrapper_EnvWrapper) * [`Env* tensorflow::EnvWrapper::target() const`](#Env_tensorflow_EnvWrapper_target) * Returns the target to which this Env forwards all calls. * [`Status tensorflow::EnvWrapper::NewRandomAccessFile(const string &f, RandomAccessFile **r) override`](#Status_tensorflow_EnvWrapper_NewRandomAccessFile) * Creates a brand new random access read-only file with the specified name. 
* [`Status tensorflow::EnvWrapper::NewWritableFile(const string &f, WritableFile **r) override`](#Status_tensorflow_EnvWrapper_NewWritableFile) * Creates an object that writes to a new file with the specified name. * [`Status tensorflow::EnvWrapper::NewAppendableFile(const string &f, WritableFile **r) override`](#Status_tensorflow_EnvWrapper_NewAppendableFile) * Creates an object that either appends to an existing file, or writes to a new file (if the file does not exist to begin with). * [`bool tensorflow::EnvWrapper::FileExists(const string &f) override`](#bool_tensorflow_EnvWrapper_FileExists) * Returns true iff the named file exists. * [`Status tensorflow::EnvWrapper::GetChildren(const string &dir, std::vector< string > *r) override`](#Status_tensorflow_EnvWrapper_GetChildren) * Stores in *result the names of the children of the specified directory. The names are relative to "dir". * [`Status tensorflow::EnvWrapper::DeleteFile(const string &f) override`](#Status_tensorflow_EnvWrapper_DeleteFile) * Deletes the named file. * [`Status tensorflow::EnvWrapper::CreateDir(const string &d) override`](#Status_tensorflow_EnvWrapper_CreateDir) * Creates the specified directory. * [`Status tensorflow::EnvWrapper::DeleteDir(const string &d) override`](#Status_tensorflow_EnvWrapper_DeleteDir) * Deletes the specified directory. * [`Status tensorflow::EnvWrapper::GetFileSize(const string &f, uint64 *s) override`](#Status_tensorflow_EnvWrapper_GetFileSize) * Stores the size of fname in *file_size. * [`Status tensorflow::EnvWrapper::RenameFile(const string &s, const string &t) override`](#Status_tensorflow_EnvWrapper_RenameFile) * Renames file src to target. If target already exists, it will be replaced. * [`uint64 tensorflow::EnvWrapper::NowMicros() override`](#uint64_tensorflow_EnvWrapper_NowMicros) * Returns the number of micro-seconds since some fixed point in time. Only useful for computing deltas of time. 
* [`void tensorflow::EnvWrapper::SleepForMicroseconds(int micros) override`](#void_tensorflow_EnvWrapper_SleepForMicroseconds) * Sleeps/delays the thread for the prescribed number of micro-seconds. * [`Thread* tensorflow::EnvWrapper::StartThread(const ThreadOptions &thread_options, const string &name, std::function< void()> fn) override`](#Thread_tensorflow_EnvWrapper_StartThread) * Returns a new thread that is running fn() and is identified (for debugging/performance-analysis) by "name". ##Member Details #### `tensorflow::EnvWrapper::EnvWrapper(Env *t)` Initializes an EnvWrapper that delegates all calls to *t. #### `virtual tensorflow::EnvWrapper::~EnvWrapper()` #### `Env* tensorflow::EnvWrapper::target() const` Returns the target to which this Env forwards all calls. #### `Status tensorflow::EnvWrapper::NewRandomAccessFile(const string &f, RandomAccessFile **r) override` Creates a brand new random access read-only file with the specified name. On success, stores a pointer to the new file in *result and returns OK. On failure stores NULL in *result and returns non-OK. If the file does not exist, returns a non-OK status. The returned file may be concurrently accessed by multiple threads. #### `Status tensorflow::EnvWrapper::NewWritableFile(const string &f, WritableFile **r) override` Creates an object that writes to a new file with the specified name. Deletes any existing file with the same name and creates a new file. On success, stores a pointer to the new file in *result and returns OK. On failure stores NULL in *result and returns non-OK. The returned file will only be accessed by one thread at a time. #### `Status tensorflow::EnvWrapper::NewAppendableFile(const string &f, WritableFile **r) override` Creates an object that either appends to an existing file, or writes to a new file (if the file does not exist to begin with). On success, stores a pointer to the new file in *result and returns OK. On failure stores NULL in *result and returns non-OK. 
The returned file will only be accessed by one thread at a time. #### `bool tensorflow::EnvWrapper::FileExists(const string &f) override` Returns true iff the named file exists. #### `Status tensorflow::EnvWrapper::GetChildren(const string &dir, std::vector< string > *r) override` Stores in *result the names of the children of the specified directory. The names are relative to "dir". Original contents of *results are dropped. #### `Status tensorflow::EnvWrapper::DeleteFile(const string &f) override` Deletes the named file. #### `Status tensorflow::EnvWrapper::CreateDir(const string &d) override` Creates the specified directory. #### `Status tensorflow::EnvWrapper::DeleteDir(const string &d) override` Deletes the specified directory. #### `Status tensorflow::EnvWrapper::GetFileSize(const string &f, uint64 *s) override` Stores the size of fname in *file_size. #### `Status tensorflow::EnvWrapper::RenameFile(const string &s, const string &t) override` Renames file src to target. If target already exists, it will be replaced. #### `uint64 tensorflow::EnvWrapper::NowMicros() override` Returns the number of micro-seconds since some fixed point in time. Only useful for computing deltas of time. #### `void tensorflow::EnvWrapper::SleepForMicroseconds(int micros) override` Sleeps/delays the thread for the prescribed number of micro-seconds. #### `Thread* tensorflow::EnvWrapper::StartThread(const ThreadOptions &thread_options, const string &name, std::function< void()> fn) override` Returns a new thread that is running fn() and is identified (for debugging/performance-analysis) by "name". Caller takes ownership of the result and must delete it eventually (the deletion will block until fn() stops running). ================================================ FILE: tex_pdf/api/cc/ClassRandomAccessFile.md ================================================ # Class `tensorflow::RandomAccessFile` A file abstraction for randomly reading the contents of a file. 
##Member Summary

* [`tensorflow::RandomAccessFile::RandomAccessFile()`](#tensorflow_RandomAccessFile_RandomAccessFile)
* [`virtual tensorflow::RandomAccessFile::~RandomAccessFile()`](#virtual_tensorflow_RandomAccessFile_RandomAccessFile)
* [`virtual Status tensorflow::RandomAccessFile::Read(uint64 offset, size_t n, StringPiece *result, char *scratch) const =0`](#virtual_Status_tensorflow_RandomAccessFile_Read)
  * Reads up to "n" bytes from the file starting at "offset".

##Member Details

#### `tensorflow::RandomAccessFile::RandomAccessFile()`

#### `virtual tensorflow::RandomAccessFile::~RandomAccessFile()`

#### `virtual Status tensorflow::RandomAccessFile::Read(uint64 offset, size_t n, StringPiece *result, char *scratch) const =0`

Reads up to "n" bytes from the file starting at "offset".

"scratch[0..n-1]" may be written by this routine. Sets "*result" to the data that was read (including if fewer than "n" bytes were successfully read). May set "*result" to point at data in "scratch[0..n-1]", so "scratch[0..n-1]" must be live when "*result" is used.

On OK returned status: "n" bytes have been stored in "*result". On non-OK returned status: [0..n] bytes have been stored in "*result".

Returns `OUT_OF_RANGE` if fewer than n bytes were stored in "*result" because of EOF.

Safe for concurrent use by multiple threads.

================================================
FILE: tex_pdf/api/cc/ClassSession.md
================================================
# Class `tensorflow::Session`

A Session instance lets a caller drive a TensorFlow graph computation.

When a Session is created with a given target, a new Session object is bound to the universe of resources specified by that target. Those resources are available to this session to perform computation described in the GraphDef. After extending the session with a graph, the caller uses the Run() API to perform the computation and potentially fetch outputs as Tensors.

Example:

```c++
tensorflow::GraphDef graph;
// ... Create or load graph into "graph".

// This example uses the default options which connects
// to a local runtime.
tensorflow::SessionOptions options;
std::unique_ptr<tensorflow::Session>
session(tensorflow::NewSession(options));

// Create the session with this graph.
tensorflow::Status s = session->Create(graph);
if (!s.ok()) { ... }

// Run the graph and fetch the first output of the "output"
// operation, and also run to but do not return anything
// for the "update_state" operation.
std::vector<tensorflow::Tensor> outputs;
s = session->Run({}, {"output:0"}, {"update_state"}, &outputs);
if (!s.ok()) { ... }

// Map the output as a flattened float tensor, and do something
// with it.
auto output_tensor = outputs[0].flat<float>();
if (output_tensor(0) > 0.5) { ... }

// Close the session to release the resources associated with
// this session.
session->Close();
```

A Session allows concurrent calls to Run(), though a Session must be created / extended by a single thread.

Only one thread must call Close(), and Close() must only be called after all other calls to Run() have returned.

##Member Summary

* [`virtual Status tensorflow::Session::Create(const GraphDef &graph)=0`](#virtual_Status_tensorflow_Session_Create)
  * Create the graph to be used for the session.
* [`virtual Status tensorflow::Session::Extend(const GraphDef &graph)=0`](#virtual_Status_tensorflow_Session_Extend)
  * Adds operations to the graph that is already registered with the Session.
* [`virtual Status tensorflow::Session::Run(const std::vector< std::pair< string, Tensor > > &inputs, const std::vector< string > &output_tensor_names, const std::vector< string > &target_node_names, std::vector< Tensor > *outputs)=0`](#virtual_Status_tensorflow_Session_Run)
  * Runs the graph with the provided input tensors and fills `outputs` for the endpoints specified in `output_tensor_names`. Runs to but does not return Tensors for the nodes in `target_node_names`.
* [`virtual Status tensorflow::Session::Close()=0`](#virtual_Status_tensorflow_Session_Close) * Closes this session. * [`virtual tensorflow::Session::~Session()`](#virtual_tensorflow_Session_Session) ##Member Details #### `virtual Status tensorflow::Session::Create(const GraphDef &graph)=0` Create the graph to be used for the session. Returns an error if this session has already been created with a graph. To re-use the session with a different graph, the caller must Close() the session first. #### `virtual Status tensorflow::Session::Extend(const GraphDef &graph)=0` Adds operations to the graph that is already registered with the Session . The names of new operations in "graph" must not exist in the graph that is already registered. #### `virtual Status tensorflow::Session::Run(const std::vector< std::pair< string, Tensor > > &inputs, const std::vector< string > &output_tensor_names, const std::vector< string > &target_node_names, std::vector< Tensor > *outputs)=0` Runs the graph with the provided input tensors and fills `outputs` for the endpoints specified in `output_tensor_names`. Runs to but does not return Tensors for the nodes in `target_node_names`. The order of tensors in `outputs` will match the order provided by `output_tensor_names`. If `Run` returns `OK()`, then `outputs->size()` will be equal to `output_tensor_names.size()`. If `Run` does not return `OK()`, the state of `outputs` is undefined. REQUIRES: The name of each Tensor of the input or output must match a "Tensor endpoint" in the `GraphDef` passed to ` Create() `. REQUIRES: outputs is not nullptr if `output_tensor_names` is non-empty. #### `virtual Status tensorflow::Session::Close()=0` Closes this session. Closing a session releases the resources used by this session on the TensorFlow runtime (specified during session creation by the ` SessionOptions::target ` field). 
#### `virtual tensorflow::Session::~Session()` ================================================ FILE: tex_pdf/api/cc/ClassStatus.md ================================================ # Class `tensorflow::Status` ##Member Summary * [`tensorflow::Status::Status()`](#tensorflow_Status_Status) * Create a success status. * [`tensorflow::Status::~Status()`](#tensorflow_Status_Status) * [`tensorflow::Status::Status(tensorflow::error::Code code, tensorflow::StringPiece msg)`](#tensorflow_Status_Status) * Create a status with the specified error code and msg as a human-readable string containing more detailed information. * [`tensorflow::Status::Status(const Status &s)`](#tensorflow_Status_Status) * Copy the specified status. * [`void tensorflow::Status::operator=(const Status &s)`](#void_tensorflow_Status_operator_) * [`bool tensorflow::Status::ok() const`](#bool_tensorflow_Status_ok) * Returns true iff the status indicates success. * [`tensorflow::error::Code tensorflow::Status::code() const`](#tensorflow_error_Code_tensorflow_Status_code) * [`const string& tensorflow::Status::error_message() const`](#const_string_tensorflow_Status_error_message) * [`bool tensorflow::Status::operator==(const Status &x) const`](#bool_tensorflow_Status_operator_) * [`bool tensorflow::Status::operator!=(const Status &x) const`](#bool_tensorflow_Status_operator_) * [`void tensorflow::Status::Update(const Status &new_status)`](#void_tensorflow_Status_Update) * If ` ok() `, stores `new_status` into `*this`. If `!ok()`, preserves the current status, but may augment with additional information about `new_status`. * [`string tensorflow::Status::ToString() const`](#string_tensorflow_Status_ToString) * Return a string representation of this status suitable for printing. Returns the string `"OK"` for success. * [`static Status tensorflow::Status::OK()`](#static_Status_tensorflow_Status_OK) ##Member Details #### `tensorflow::Status::Status()` Create a success status. 
#### `tensorflow::Status::~Status()` #### `tensorflow::Status::Status(tensorflow::error::Code code, tensorflow::StringPiece msg)` Create a status with the specified error code and msg as a human-readable string containing more detailed information. #### `tensorflow::Status::Status(const Status &s)` Copy the specified status. #### `void tensorflow::Status::operator=(const Status &s)` #### `bool tensorflow::Status::ok() const` Returns true iff the status indicates success. #### `tensorflow::error::Code tensorflow::Status::code() const` #### `const string& tensorflow::Status::error_message() const` #### `bool tensorflow::Status::operator==(const Status &x) const` #### `bool tensorflow::Status::operator!=(const Status &x) const` #### `void tensorflow::Status::Update(const Status &new_status)` If ` ok() `, stores `new_status` into `*this`. If `!ok()`, preserves the current status, but may augment with additional information about `new_status`. Convenient way of keeping track of the first error encountered. Instead of: `if (overall_status.ok()) overall_status = new_status` Use: `overall_status.Update(new_status);` #### `string tensorflow::Status::ToString() const` Return a string representation of this status suitable for printing. Returns the string `"OK"` for success. #### `static Status tensorflow::Status::OK()` ================================================ FILE: tex_pdf/api/cc/ClassTensor.md ================================================ # Class `tensorflow::Tensor` Represents an n-dimensional array of values. ##Member Summary * [`tensorflow::Tensor::Tensor()`](#tensorflow_Tensor_Tensor) * Default Tensor constructor. Creates a 1-dimension, 0-element float tensor. * [`tensorflow::Tensor::Tensor(DataType type, const TensorShape &shape)`](#tensorflow_Tensor_Tensor) * Creates a Tensor of the given `type` and `shape`. 
* [`tensorflow::Tensor::Tensor(Allocator *a, DataType type, const TensorShape &shape)`](#tensorflow_Tensor_Tensor) * Creates a tensor with the input `type` and `shape`, using the allocator `a` to allocate the underlying buffer. * [`tensorflow::Tensor::Tensor(DataType type)`](#tensorflow_Tensor_Tensor) * Creates an uninitialized Tensor of the given data type. * [`tensorflow::Tensor::Tensor(const Tensor &other)`](#tensorflow_Tensor_Tensor) * [`tensorflow::Tensor::~Tensor()`](#tensorflow_Tensor_Tensor) * Copy constructor. * [`DataType tensorflow::Tensor::dtype() const`](#DataType_tensorflow_Tensor_dtype) * Returns the data type. * [`const TensorShape& tensorflow::Tensor::shape() const`](#const_TensorShape_tensorflow_Tensor_shape) * Returns the shape of the tensor. * [`int tensorflow::Tensor::dims() const`](#int_tensorflow_Tensor_dims) * Convenience accessor for the tensor shape. * [`int64 tensorflow::Tensor::dim_size(int d) const`](#int64_tensorflow_Tensor_dim_size) * Convenience accessor for the tensor shape. * [`int64 tensorflow::Tensor::NumElements() const`](#int64_tensorflow_Tensor_NumElements) * Convenience accessor for the tensor shape. * [`bool tensorflow::Tensor::IsSameSize(const Tensor &b) const`](#bool_tensorflow_Tensor_IsSameSize) * [`bool tensorflow::Tensor::IsInitialized() const`](#bool_tensorflow_Tensor_IsInitialized) * Has this Tensor been initialized? * [`size_t tensorflow::Tensor::TotalBytes() const`](#size_t_tensorflow_Tensor_TotalBytes) * Returns the estimated memory usage of this tensor. * [`Tensor& tensorflow::Tensor::operator=(const Tensor &other)`](#Tensor_tensorflow_Tensor_operator_) * Assign operator. This tensor shares other's underlying storage. * [`bool tensorflow::Tensor::CopyFrom(const Tensor &other, const TensorShape &shape) TF_MUST_USE_RESULT`](#bool_tensorflow_Tensor_CopyFrom) * Copy the other tensor into this tensor and reshape it. 
* [`Tensor tensorflow::Tensor::Slice(int64 dim0_start, int64 dim0_limit) const`](#Tensor_tensorflow_Tensor_Slice)
  * Slice this tensor along the 1st dimension.
* [`bool tensorflow::Tensor::FromProto(const TensorProto &other) TF_MUST_USE_RESULT`](#bool_tensorflow_Tensor_FromProto)
  * Parse `other` and construct the tensor.
* [`bool tensorflow::Tensor::FromProto(Allocator *a, const TensorProto &other) TF_MUST_USE_RESULT`](#bool_tensorflow_Tensor_FromProto)
* [`void tensorflow::Tensor::AsProtoField(TensorProto *proto) const`](#void_tensorflow_Tensor_AsProtoField)
  * Fills in `proto` with `*this` tensor's content.
* [`void tensorflow::Tensor::AsProtoTensorContent(TensorProto *proto) const`](#void_tensorflow_Tensor_AsProtoTensorContent)
* [`TTypes<T>::Vec tensorflow::Tensor::vec()`](#TTypes_T_Vec_tensorflow_Tensor_vec)
  * Return the tensor data as an `Eigen::Tensor` with the type and sizes of this `Tensor`.
* [`TTypes<T>::Matrix tensorflow::Tensor::matrix()`](#TTypes_T_Matrix_tensorflow_Tensor_matrix)
* [`TTypes< T, NDIMS >::Tensor tensorflow::Tensor::tensor()`](#TTypes_T_NDIMS_Tensor_tensorflow_Tensor_tensor)
* [`TTypes<T>::Flat tensorflow::Tensor::flat()`](#TTypes_T_Flat_tensorflow_Tensor_flat)
  * Return the tensor data as an `Eigen::Tensor` of the data type and a specified shape.
* [`TTypes<T>::UnalignedFlat tensorflow::Tensor::unaligned_flat()`](#TTypes_T_UnalignedFlat_tensorflow_Tensor_unaligned_flat)
* [`TTypes<T>::Matrix tensorflow::Tensor::flat_inner_dims()`](#TTypes_T_Matrix_tensorflow_Tensor_flat_inner_dims)
* [`TTypes<T>::Matrix tensorflow::Tensor::flat_outer_dims()`](#TTypes_T_Matrix_tensorflow_Tensor_flat_outer_dims)
* [`TTypes< T, NDIMS >::Tensor tensorflow::Tensor::shaped(gtl::ArraySlice< int64 > new_sizes)`](#TTypes_T_NDIMS_Tensor_tensorflow_Tensor_shaped)
* [`TTypes< T, NDIMS >::UnalignedTensor tensorflow::Tensor::unaligned_shaped(gtl::ArraySlice< int64 > new_sizes)`](#TTypes_T_NDIMS_UnalignedTensor_tensorflow_Tensor_unaligned_shaped)
* [`TTypes< T >::Scalar tensorflow::Tensor::scalar()`](#TTypes_T_Scalar_tensorflow_Tensor_scalar)
  * Return the Tensor data as a `TensorMap` of fixed size 1: `TensorMap<TensorFixedSize<T, 1>>`.
* [`TTypes<T>::ConstVec tensorflow::Tensor::vec() const`](#TTypes_T_ConstVec_tensorflow_Tensor_vec)
  * Const versions of all the methods above.
* [`TTypes<T>::ConstMatrix tensorflow::Tensor::matrix() const`](#TTypes_T_ConstMatrix_tensorflow_Tensor_matrix)
* [`TTypes< T, NDIMS >::ConstTensor tensorflow::Tensor::tensor() const`](#TTypes_T_NDIMS_ConstTensor_tensorflow_Tensor_tensor)
* [`TTypes<T>::ConstFlat tensorflow::Tensor::flat() const`](#TTypes_T_ConstFlat_tensorflow_Tensor_flat)
* [`TTypes<T>::UnalignedConstFlat tensorflow::Tensor::unaligned_flat() const`](#TTypes_T_UnalignedConstFlat_tensorflow_Tensor_unaligned_flat)
* [`TTypes<T>::ConstMatrix tensorflow::Tensor::flat_inner_dims() const`](#TTypes_T_ConstMatrix_tensorflow_Tensor_flat_inner_dims)
* [`TTypes<T>::ConstMatrix tensorflow::Tensor::flat_outer_dims() const`](#TTypes_T_ConstMatrix_tensorflow_Tensor_flat_outer_dims)
* [`TTypes< T, NDIMS >::ConstTensor tensorflow::Tensor::shaped(gtl::ArraySlice< int64 > new_sizes) const`](#TTypes_T_NDIMS_ConstTensor_tensorflow_Tensor_shaped)
* [`TTypes< T, NDIMS >::UnalignedConstTensor tensorflow::Tensor::unaligned_shaped(gtl::ArraySlice< int64 > new_sizes)
const`](#TTypes_T_NDIMS_UnalignedConstTensor_tensorflow_Tensor_unaligned_shaped) * [`TTypes< T >::ConstScalar tensorflow::Tensor::scalar() const`](#TTypes_T_ConstScalar_tensorflow_Tensor_scalar) * [`string tensorflow::Tensor::SummarizeValue(int64 max_entries) const`](#string_tensorflow_Tensor_SummarizeValue) * Render the first `max_entries` values in `*this` into a string. * [`string tensorflow::Tensor::DebugString() const`](#string_tensorflow_Tensor_DebugString) * A human-readable summary of the tensor suitable for debugging. * [`void tensorflow::Tensor::FillDescription(TensorDescription *description) const`](#void_tensorflow_Tensor_FillDescription) * [`StringPiece tensorflow::Tensor::tensor_data() const`](#StringPiece_tensorflow_Tensor_tensor_data) * Returns a `StringPiece` mapping the current tensor's buffer. ##Member Details #### `tensorflow::Tensor::Tensor()` Default Tensor constructor. Creates a 1-dimension, 0-element float tensor. #### `tensorflow::Tensor::Tensor(DataType type, const TensorShape &shape)` Creates a Tensor of the given `type` and `shape`. The underlying buffer is allocated using a `CPUAllocator`. #### `tensorflow::Tensor::Tensor(Allocator *a, DataType type, const TensorShape &shape)` Creates a tensor with the input `type` and `shape`, using the allocator `a` to allocate the underlying buffer. `a` must outlive the lifetime of this Tensor . #### `tensorflow::Tensor::Tensor(DataType type)` Creates an uninitialized Tensor of the given data type. #### `tensorflow::Tensor::Tensor(const Tensor &other)` #### `tensorflow::Tensor::~Tensor()` Copy constructor. #### `DataType tensorflow::Tensor::dtype() const` Returns the data type. #### `const TensorShape& tensorflow::Tensor::shape() const` Returns the shape of the tensor. #### `int tensorflow::Tensor::dims() const` Convenience accessor for the tensor shape. For all shape accessors, see comments for relevant methods of ` TensorShape ` in ` tensor_shape.h `. 
#### `int64 tensorflow::Tensor::dim_size(int d) const`

Convenience accessor for the tensor shape.

#### `int64 tensorflow::Tensor::NumElements() const`

Convenience accessor for the tensor shape.

#### `bool tensorflow::Tensor::IsSameSize(const Tensor &b) const`

#### `bool tensorflow::Tensor::IsInitialized() const`

Has this Tensor been initialized?

#### `size_t tensorflow::Tensor::TotalBytes() const`

Returns the estimated memory usage of this tensor.

#### `Tensor& tensorflow::Tensor::operator=(const Tensor &other)`

Assign operator. This tensor shares other's underlying storage.

#### `bool tensorflow::Tensor::CopyFrom(const Tensor &other, const TensorShape &shape) TF_MUST_USE_RESULT`

Copy the other tensor into this tensor and reshape it.

This tensor shares other's underlying storage. Returns `true` iff `other.shape()` has the same number of elements as the given `shape`.

#### `Tensor tensorflow::Tensor::Slice(int64 dim0_start, int64 dim0_limit) const`

Slice this tensor along the 1st dimension.

I.e., the returned tensor satisfies `returned[i, ...] == this[dim0_start + i, ...]`. The returned tensor shares the underlying tensor buffer with this tensor.

NOTE: Depending on the shape, the returned tensor may not satisfy the same alignment requirement as this tensor. The caller must check the returned tensor's alignment before calling certain methods that have an alignment requirement (e.g., `flat()`, `tensor()`).

REQUIRES: `dims()` >= 1

REQUIRES: `0 <= dim0_start <= dim0_limit <= dim_size(0)`

#### `bool tensorflow::Tensor::FromProto(const TensorProto &other) TF_MUST_USE_RESULT`

Parse `other` and construct the tensor.

Returns `true` iff the parsing succeeds. If the parsing fails, the state of `*this` is unchanged.

#### `bool tensorflow::Tensor::FromProto(Allocator *a, const TensorProto &other) TF_MUST_USE_RESULT`

#### `void tensorflow::Tensor::AsProtoField(TensorProto *proto) const`

Fills in `proto` with `*this` tensor's content.
`AsProtoField()` fills in the repeated field for `proto.dtype()`, while `AsProtoTensorContent()` encodes the content in `proto.tensor_content()` in a compact form.

#### `void tensorflow::Tensor::AsProtoTensorContent(TensorProto *proto) const`

#### `TTypes<T>::Vec tensorflow::Tensor::vec()`

Return the tensor data as an `Eigen::Tensor` with the type and sizes of this `Tensor`.

Use these methods when you know the data type and the number of dimensions of the Tensor and you want an `Eigen::Tensor` automatically sized to the `Tensor` sizes. The implementation check fails if either type or sizes mismatch.

Example:

```c++
typedef float T;
Tensor my_mat(...built with Shape{rows: 3, cols: 5}...);
auto mat = my_mat.matrix<T>();     // 2D Eigen::Tensor, 3 x 5.
auto mat = my_mat.tensor<T, 2>();  // 2D Eigen::Tensor, 3 x 5.
auto vec = my_mat.vec<T>();        // CHECK fails as my_mat is 2D.
auto vec = my_mat.tensor<T, 3>();  // CHECK fails as my_mat is 2D.
auto mat = my_mat.matrix<int32>(); // CHECK fails as type mismatch.
```

#### `TTypes<T>::Matrix tensorflow::Tensor::matrix()`

#### `TTypes< T, NDIMS >::Tensor tensorflow::Tensor::tensor()`

#### `TTypes<T>::Flat tensorflow::Tensor::flat()`

Return the tensor data as an `Eigen::Tensor` of the data type and a specified shape.

These methods allow you to access the data with the dimensions and sizes of your choice. You do not need to know the number of dimensions of the Tensor to call them. However, they `CHECK` that the type matches and that the requested dimensions create an `Eigen::Tensor` with the same number of elements as the tensor.
Example:

```c++
typedef float T;
Tensor my_ten(...built with Shape{planes: 4, rows: 3, cols: 5}...);
// 1D Eigen::Tensor, size 60:
auto flat = my_ten.flat<T>();
// 2D Eigen::Tensor 12 x 5:
auto inner = my_ten.flat_inner_dims<T>();
// 2D Eigen::Tensor 4 x 15:
auto outer = my_ten.shaped<T, 2>({4, 15});
// CHECK fails, bad num elements:
auto outer = my_ten.shaped<T, 2>({4, 8});
// 3D Eigen::Tensor 6 x 5 x 2:
auto weird = my_ten.shaped<T, 3>({6, 5, 2});
// CHECK fails, type mismatch:
auto bad = my_ten.flat<int32>();
```

#### `TTypes<T>::UnalignedFlat tensorflow::Tensor::unaligned_flat()`

#### `TTypes<T>::Matrix tensorflow::Tensor::flat_inner_dims()`

Returns the data as an `Eigen::Tensor` with 2 dimensions, collapsing all Tensor dimensions but the last one into the first dimension of the result.

#### `TTypes<T>::Matrix tensorflow::Tensor::flat_outer_dims()`

Returns the data as an `Eigen::Tensor` with 2 dimensions, collapsing all Tensor dimensions but the first one into the last dimension of the result.

#### `TTypes< T, NDIMS >::Tensor tensorflow::Tensor::shaped(gtl::ArraySlice< int64 > new_sizes)`

#### `TTypes< T, NDIMS >::UnalignedTensor tensorflow::Tensor::unaligned_shaped(gtl::ArraySlice< int64 > new_sizes)`

#### `TTypes< T >::Scalar tensorflow::Tensor::scalar()`

Return the Tensor data as a `TensorMap` of fixed size 1: `TensorMap<TensorFixedSize<T, 1>>`.

Using `scalar()` allows the compiler to perform optimizations as the size of the tensor is known at compile time.

#### `TTypes<T>::ConstVec tensorflow::Tensor::vec() const`

Const versions of all the methods above.
#### `TTypes<T>::ConstMatrix tensorflow::Tensor::matrix() const`

#### `TTypes< T, NDIMS >::ConstTensor tensorflow::Tensor::tensor() const`

#### `TTypes<T>::ConstFlat tensorflow::Tensor::flat() const`

#### `TTypes<T>::UnalignedConstFlat tensorflow::Tensor::unaligned_flat() const`

#### `TTypes<T>::ConstMatrix tensorflow::Tensor::flat_inner_dims() const`

#### `TTypes<T>::ConstMatrix tensorflow::Tensor::flat_outer_dims() const`

#### `TTypes< T, NDIMS >::ConstTensor tensorflow::Tensor::shaped(gtl::ArraySlice< int64 > new_sizes) const`

#### `TTypes< T, NDIMS >::UnalignedConstTensor tensorflow::Tensor::unaligned_shaped(gtl::ArraySlice< int64 > new_sizes) const`

#### `TTypes< T >::ConstScalar tensorflow::Tensor::scalar() const`

#### `string tensorflow::Tensor::SummarizeValue(int64 max_entries) const`

Render the first `max_entries` values in `*this` into a string.

#### `string tensorflow::Tensor::DebugString() const`

A human-readable summary of the tensor suitable for debugging.

#### `void tensorflow::Tensor::FillDescription(TensorDescription *description) const`

Fill in the `TensorDescription` proto with metadata about the tensor that is useful for monitoring and debugging.

#### `StringPiece tensorflow::Tensor::tensor_data() const`

Returns a `StringPiece` mapping the current tensor's buffer.

The returned `StringPiece` may point to a memory location on devices that the CPU cannot address directly.

NOTE: The underlying tensor buffer is refcounted, so the lifetime of the contents mapped by the `StringPiece` matches the lifetime of the buffer; callers should arrange to make sure the buffer does not get destroyed while the `StringPiece` is still used.

REQUIRES: `DataTypeCanUseMemcpy(dtype())`.
================================================ FILE: tex_pdf/api/cc/ClassTensorBuffer.md ================================================ # Class `tensorflow::TensorBuffer` ##Member Summary * [`tensorflow::TensorBuffer::~TensorBuffer() override`](#tensorflow_TensorBuffer_TensorBuffer) * [`virtual void* tensorflow::TensorBuffer::data() const =0`](#virtual_void_tensorflow_TensorBuffer_data) * [`virtual size_t tensorflow::TensorBuffer::size() const =0`](#virtual_size_t_tensorflow_TensorBuffer_size) * [`virtual TensorBuffer* tensorflow::TensorBuffer::root_buffer()=0`](#virtual_TensorBuffer_tensorflow_TensorBuffer_root_buffer) * [`virtual void tensorflow::TensorBuffer::FillAllocationDescription(AllocationDescription *proto) const =0`](#virtual_void_tensorflow_TensorBuffer_FillAllocationDescription) * [`T* tensorflow::TensorBuffer::base() const`](#T_tensorflow_TensorBuffer_base) ##Member Details #### `tensorflow::TensorBuffer::~TensorBuffer() override` #### `virtual void* tensorflow::TensorBuffer::data() const =0` #### `virtual size_t tensorflow::TensorBuffer::size() const =0` #### `virtual TensorBuffer* tensorflow::TensorBuffer::root_buffer()=0` #### `virtual void tensorflow::TensorBuffer::FillAllocationDescription(AllocationDescription *proto) const =0` #### `T* tensorflow::TensorBuffer::base() const` ================================================ FILE: tex_pdf/api/cc/ClassTensorShape.md ================================================ # Class `tensorflow::TensorShape` Manages the dimensions of a Tensor and their sizes. ##Member Summary * [`tensorflow::TensorShape::TensorShape(gtl::ArraySlice< int64 > dim_sizes)`](#tensorflow_TensorShape_TensorShape) * Construct a ` TensorShape ` from the provided sizes. 
REQUIRES: `dim_sizes[i] >= 0`
* [`tensorflow::TensorShape::TensorShape(std::initializer_list< int64 > dim_sizes)`](#tensorflow_TensorShape_TensorShape)
* [`tensorflow::TensorShape::TensorShape(const TensorShapeProto &proto)`](#tensorflow_TensorShape_TensorShape)
  * REQUIRES: `IsValid(proto)`
* [`tensorflow::TensorShape::TensorShape()`](#tensorflow_TensorShape_TensorShape)
* [`void tensorflow::TensorShape::Clear()`](#void_tensorflow_TensorShape_Clear)
  * Clear a tensor shape.
* [`void tensorflow::TensorShape::AddDim(int64 size)`](#void_tensorflow_TensorShape_AddDim)
  * Add a dimension to the end ("inner-most"). REQUIRES: `size >= 0`
* [`void tensorflow::TensorShape::AppendShape(const TensorShape &shape)`](#void_tensorflow_TensorShape_AppendShape)
  * Appends all the dimensions from `shape`.
* [`void tensorflow::TensorShape::InsertDim(int d, int64 size)`](#void_tensorflow_TensorShape_InsertDim)
  * Insert a dimension somewhere in the `TensorShape`. REQUIRES: `0 <= d <= dims()` REQUIRES: `size >= 0`
* [`void tensorflow::TensorShape::set_dim(int d, int64 size)`](#void_tensorflow_TensorShape_set_dim)
  * Modifies the size of the dimension `d` to be `size`. REQUIRES: `0 <= d < dims()` REQUIRES: `size >= 0`
* [`void tensorflow::TensorShape::RemoveDim(int d)`](#void_tensorflow_TensorShape_RemoveDim)
  * Removes dimension `d` from the `TensorShape`. REQUIRES: `0 <= d < dims()`
* [`int tensorflow::TensorShape::dims() const`](#int_tensorflow_TensorShape_dims)
  * Return the number of dimensions in the tensor.
* [`int64 tensorflow::TensorShape::dim_size(int d) const`](#int64_tensorflow_TensorShape_dim_size)
  * Returns the number of elements in dimension `d`. REQUIRES: `0 <= d < dims()`
* [`gtl::ArraySlice<int64> tensorflow::TensorShape::dim_sizes() const`](#gtl_ArraySlice_int64_tensorflow_TensorShape_dim_sizes)
  * Returns sizes of all dimensions.
* [`int64 tensorflow::TensorShape::num_elements() const`](#int64_tensorflow_TensorShape_num_elements) * Returns the number of elements in the tensor. * [`bool tensorflow::TensorShape::IsSameSize(const TensorShape &b) const`](#bool_tensorflow_TensorShape_IsSameSize) * [`bool tensorflow::TensorShape::operator==(const TensorShape &b) const`](#bool_tensorflow_TensorShape_operator_) * [`void tensorflow::TensorShape::AsProto(TensorShapeProto *proto) const`](#void_tensorflow_TensorShape_AsProto) * Fill `*proto` from `*this`. * [`Eigen::DSizes< Eigen::DenseIndex, NDIMS > tensorflow::TensorShape::AsEigenDSizes() const`](#Eigen_DSizes_Eigen_DenseIndex_NDIMS_tensorflow_TensorShape_AsEigenDSizes) * Fill `*dsizes` from `*this`. * [`Eigen::DSizes< Eigen::DenseIndex, NDIMS > tensorflow::TensorShape::AsEigenDSizesWithPadding() const`](#Eigen_DSizes_Eigen_DenseIndex_NDIMS_tensorflow_TensorShape_AsEigenDSizesWithPadding) * [`TensorShapeIter tensorflow::TensorShape::begin() const`](#TensorShapeIter_tensorflow_TensorShape_begin) * For iterating through the dimensions. * [`TensorShapeIter tensorflow::TensorShape::end() const`](#TensorShapeIter_tensorflow_TensorShape_end) * [`string tensorflow::TensorShape::DebugString() const`](#string_tensorflow_TensorShape_DebugString) * For error messages. * [`string tensorflow::TensorShape::ShortDebugString() const`](#string_tensorflow_TensorShape_ShortDebugString) * [`static bool tensorflow::TensorShape::IsValid(const TensorShapeProto &proto)`](#static_bool_tensorflow_TensorShape_IsValid) * Returns `true` iff `proto` is a valid tensor shape. ##Member Details #### `tensorflow::TensorShape::TensorShape(gtl::ArraySlice< int64 > dim_sizes)` Construct a ` TensorShape ` from the provided sizes. 
REQUIRES: `dim_sizes[i] >= 0`

#### `tensorflow::TensorShape::TensorShape(std::initializer_list< int64 > dim_sizes)`

#### `tensorflow::TensorShape::TensorShape(const TensorShapeProto &proto)`

REQUIRES: `IsValid(proto)`

#### `tensorflow::TensorShape::TensorShape()`

Create a tensor shape with no dimensions and one element, which you can then call `AddDim()` on.

#### `void tensorflow::TensorShape::Clear()`

Clear a tensor shape.

#### `void tensorflow::TensorShape::AddDim(int64 size)`

Add a dimension to the end ("inner-most"). REQUIRES: `size >= 0`

#### `void tensorflow::TensorShape::AppendShape(const TensorShape &shape)`

Appends all the dimensions from `shape`.

#### `void tensorflow::TensorShape::InsertDim(int d, int64 size)`

Insert a dimension somewhere in the `TensorShape`. REQUIRES: `0 <= d <= dims()` REQUIRES: `size >= 0`

#### `void tensorflow::TensorShape::set_dim(int d, int64 size)`

Modifies the size of the dimension `d` to be `size`. REQUIRES: `0 <= d < dims()` REQUIRES: `size >= 0`

#### `void tensorflow::TensorShape::RemoveDim(int d)`

Removes dimension `d` from the `TensorShape`. REQUIRES: `0 <= d < dims()`

#### `int tensorflow::TensorShape::dims() const`

Return the number of dimensions in the tensor.

#### `int64 tensorflow::TensorShape::dim_size(int d) const`

Returns the number of elements in dimension `d`. REQUIRES: `0 <= d < dims()`

#### `gtl::ArraySlice<int64> tensorflow::TensorShape::dim_sizes() const`

Returns sizes of all dimensions.

#### `int64 tensorflow::TensorShape::num_elements() const`

Returns the number of elements in the tensor.

We use `int64` and not `size_t` to be compatible with `Eigen::Tensor` which uses `ptrdiff_t`.

#### `bool tensorflow::TensorShape::IsSameSize(const TensorShape &b) const`

Returns true if `*this` and `b` have the same sizes. Ignores dimension names.
#### `bool tensorflow::TensorShape::operator==(const TensorShape &b) const` #### `void tensorflow::TensorShape::AsProto(TensorShapeProto *proto) const` Fill `*proto` from `*this`. #### `Eigen::DSizes< Eigen::DenseIndex, NDIMS > tensorflow::TensorShape::AsEigenDSizes() const` Fill `*dsizes` from `*this`. #### `Eigen::DSizes< Eigen::DenseIndex, NDIMS > tensorflow::TensorShape::AsEigenDSizesWithPadding() const` Same as ` AsEigenDSizes() ` but allows for `NDIMS > dims() ` in which case we pad the rest of the sizes with 1. #### `TensorShapeIter tensorflow::TensorShape::begin() const` For iterating through the dimensions. #### `TensorShapeIter tensorflow::TensorShape::end() const` #### `string tensorflow::TensorShape::DebugString() const` For error messages. #### `string tensorflow::TensorShape::ShortDebugString() const` #### `static bool tensorflow::TensorShape::IsValid(const TensorShapeProto &proto)` Returns `true` iff `proto` is a valid tensor shape. ================================================ FILE: tex_pdf/api/cc/ClassTensorShapeIter.md ================================================ # Class `tensorflow::TensorShapeIter` ##Member Summary * [`tensorflow::TensorShapeIter::TensorShapeIter(const TensorShape *shape, int d)`](#tensorflow_TensorShapeIter_TensorShapeIter) * [`bool tensorflow::TensorShapeIter::operator==(const TensorShapeIter &rhs)`](#bool_tensorflow_TensorShapeIter_operator_) * [`bool tensorflow::TensorShapeIter::operator!=(const TensorShapeIter &rhs)`](#bool_tensorflow_TensorShapeIter_operator_) * [`void tensorflow::TensorShapeIter::operator++()`](#void_tensorflow_TensorShapeIter_operator_) * [`TensorShapeDim tensorflow::TensorShapeIter::operator*()`](#TensorShapeDim_tensorflow_TensorShapeIter_operator_) ##Member Details #### `tensorflow::TensorShapeIter::TensorShapeIter(const TensorShape *shape, int d)` #### `bool tensorflow::TensorShapeIter::operator==(const TensorShapeIter &rhs)` #### `bool tensorflow::TensorShapeIter::operator!=(const 
TensorShapeIter &rhs)` #### `void tensorflow::TensorShapeIter::operator++()` #### `TensorShapeDim tensorflow::TensorShapeIter::operator*()` ================================================ FILE: tex_pdf/api/cc/ClassTensorShapeUtils.md ================================================ # Class `tensorflow::TensorShapeUtils` Static helper routines for ` TensorShape `. Includes a few common predicates on a tensor shape. ##Member Summary * [`static bool tensorflow::TensorShapeUtils::IsScalar(const TensorShape &shape)`](#static_bool_tensorflow_TensorShapeUtils_IsScalar) * [`static bool tensorflow::TensorShapeUtils::IsVector(const TensorShape &shape)`](#static_bool_tensorflow_TensorShapeUtils_IsVector) * [`static bool tensorflow::TensorShapeUtils::IsLegacyScalar(const TensorShape &shape)`](#static_bool_tensorflow_TensorShapeUtils_IsLegacyScalar) * [`static bool tensorflow::TensorShapeUtils::IsLegacyVector(const TensorShape &shape)`](#static_bool_tensorflow_TensorShapeUtils_IsLegacyVector) * [`static bool tensorflow::TensorShapeUtils::IsVectorOrHigher(const TensorShape &shape)`](#static_bool_tensorflow_TensorShapeUtils_IsVectorOrHigher) * [`static bool tensorflow::TensorShapeUtils::IsMatrix(const TensorShape &shape)`](#static_bool_tensorflow_TensorShapeUtils_IsMatrix) * [`static bool tensorflow::TensorShapeUtils::IsMatrixOrHigher(const TensorShape &shape)`](#static_bool_tensorflow_TensorShapeUtils_IsMatrixOrHigher) * [`static TensorShape tensorflow::TensorShapeUtils::MakeShape(const T *dims, int n)`](#static_TensorShape_tensorflow_TensorShapeUtils_MakeShape) * Returns a ` TensorShape ` whose dimensions are `dims[0]`, `dims[1]`, ..., `dims[n-1]`. 
* [`static string tensorflow::TensorShapeUtils::ShapeListString(const gtl::ArraySlice< TensorShape > &shapes)`](#static_string_tensorflow_TensorShapeUtils_ShapeListString) * [`static bool tensorflow::TensorShapeUtils::StartsWith(const TensorShape &shape0, const TensorShape &shape1)`](#static_bool_tensorflow_TensorShapeUtils_StartsWith) ##Member Details #### `static bool tensorflow::TensorShapeUtils::IsScalar(const TensorShape &shape)` #### `static bool tensorflow::TensorShapeUtils::IsVector(const TensorShape &shape)` #### `static bool tensorflow::TensorShapeUtils::IsLegacyScalar(const TensorShape &shape)` #### `static bool tensorflow::TensorShapeUtils::IsLegacyVector(const TensorShape &shape)` #### `static bool tensorflow::TensorShapeUtils::IsVectorOrHigher(const TensorShape &shape)` #### `static bool tensorflow::TensorShapeUtils::IsMatrix(const TensorShape &shape)` #### `static bool tensorflow::TensorShapeUtils::IsMatrixOrHigher(const TensorShape &shape)` #### `static TensorShape tensorflow::TensorShapeUtils::MakeShape(const T *dims, int n)` Returns a ` TensorShape ` whose dimensions are `dims[0]`, `dims[1]`, ..., `dims[n-1]`. #### `static string tensorflow::TensorShapeUtils::ShapeListString(const gtl::ArraySlice< TensorShape > &shapes)` #### `static bool tensorflow::TensorShapeUtils::StartsWith(const TensorShape &shape0, const TensorShape &shape1)` ================================================ FILE: tex_pdf/api/cc/ClassThread.md ================================================ # Class `tensorflow::Thread` ##Member Summary * [`tensorflow::Thread::Thread()`](#tensorflow_Thread_Thread) * [`virtual tensorflow::Thread::~Thread()`](#virtual_tensorflow_Thread_Thread) * Blocks until the thread of control stops running. ##Member Details #### `tensorflow::Thread::Thread()` #### `virtual tensorflow::Thread::~Thread()` Blocks until the thread of control stops running. 
================================================ FILE: tex_pdf/api/cc/ClassWritableFile.md ================================================ # Class `tensorflow::WritableFile` A file abstraction for sequential writing. The implementation must provide buffering since callers may append small fragments at a time to the file. ##Member Summary * [`tensorflow::WritableFile::WritableFile()`](#tensorflow_WritableFile_WritableFile) * [`virtual tensorflow::WritableFile::~WritableFile()`](#virtual_tensorflow_WritableFile_WritableFile) * [`virtual Status tensorflow::WritableFile::Append(const StringPiece &data)=0`](#virtual_Status_tensorflow_WritableFile_Append) * [`virtual Status tensorflow::WritableFile::Close()=0`](#virtual_Status_tensorflow_WritableFile_Close) * [`virtual Status tensorflow::WritableFile::Flush()=0`](#virtual_Status_tensorflow_WritableFile_Flush) * [`virtual Status tensorflow::WritableFile::Sync()=0`](#virtual_Status_tensorflow_WritableFile_Sync) ##Member Details #### `tensorflow::WritableFile::WritableFile()` #### `virtual tensorflow::WritableFile::~WritableFile()` #### `virtual Status tensorflow::WritableFile::Append(const StringPiece &data)=0` #### `virtual Status tensorflow::WritableFile::Close()=0` #### `virtual Status tensorflow::WritableFile::Flush()=0` #### `virtual Status tensorflow::WritableFile::Sync()=0` ================================================ FILE: tex_pdf/api/cc/StructSessionOptions.md ================================================ # Struct `tensorflow::SessionOptions` Configuration information for a Session . ##Member Summary * [`Env* tensorflow::SessionOptions::env`](#Env_tensorflow_SessionOptions_env) * The environment to use. * [`string tensorflow::SessionOptions::target`](#string_tensorflow_SessionOptions_target) * The TensorFlow runtime to connect to. * [`ConfigProto tensorflow::SessionOptions::config`](#ConfigProto_tensorflow_SessionOptions_config) * Configuration options. 
* [`tensorflow::SessionOptions::SessionOptions()`](#tensorflow_SessionOptions_SessionOptions)

##Member Details

#### `Env* tensorflow::SessionOptions::env`

The environment to use.

#### `string tensorflow::SessionOptions::target`

The TensorFlow runtime to connect to.

If 'target' is empty or unspecified, the local TensorFlow runtime implementation will be used. Otherwise, the TensorFlow engine defined by 'target' will be used to perform all computations.

"target" can be either a single entry or a comma separated list of entries. Each entry is a resolvable address of the following format:

* `local`
* `ip:port`
* `host:port`
* ... other system-specific formats to identify tasks and jobs ...

NOTE: at the moment 'local' maps to an in-process service-based runtime.

Upon creation, a single session affines itself to one of the remote processes, with possible load balancing choices when the "target" resolves to a list of possible processes.

If the session disconnects from the remote process during its lifetime, session calls may fail immediately.

#### `ConfigProto tensorflow::SessionOptions::config`

Configuration options.
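
The entry formats listed under `target` can be illustrated with a small parser. This is only a sketch in Python (the language used to build graphs), not part of the TensorFlow API: `parse_target` and the `(kind, host, port)` tuples are hypothetical, and the sketch assumes that anything other than `local` or a `name:number` pair is a system-specific entry to be passed through unchanged.

```python
def parse_target(target):
    """Split a SessionOptions-style target string into entries.

    Hypothetical helper: returns [] for an empty target (meaning the local
    runtime is used), otherwise one (kind, host, port) tuple per entry.
    """
    entries = []
    for raw in target.split(","):
        entry = raw.strip()
        if not entry:
            continue
        if entry == "local":
            entries.append(("local", None, None))
            continue
        # Split on the last colon so the port is the trailing component.
        host, sep, port = entry.rpartition(":")
        if sep and host and port.isdigit():
            entries.append(("address", host, int(port)))
        else:
            # Other system-specific formats are passed through untouched.
            entries.append(("other", entry, None))
    return entries
```

A target that resolves to several entries, such as `"10.0.0.1:2222, worker0:2223"`, yields one tuple per entry; which of them a session attaches to is decided by the runtime and is not modelled here.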
#### `tensorflow::SessionOptions::SessionOptions()`

================================================
FILE: tex_pdf/api/cc/StructState.md
================================================
# Struct `tensorflow::Status::State`

##Member Summary

* [`tensorflow::error::Code tensorflow::Status::State::code`](#tensorflow_error_Code_tensorflow_Status_State_code)
* [`string tensorflow::Status::State::msg`](#string_tensorflow_Status_State_msg)

##Member Details

#### `tensorflow::error::Code tensorflow::Status::State::code`

#### `string tensorflow::Status::State::msg`

================================================
FILE: tex_pdf/api/cc/StructTensorShapeDim.md
================================================
# Struct `tensorflow::TensorShapeDim`

##Member Summary

* [`int tensorflow::TensorShapeDim::size`](#int_tensorflow_TensorShapeDim_size)
* [`tensorflow::TensorShapeDim::TensorShapeDim(int64 s)`](#tensorflow_TensorShapeDim_TensorShapeDim)

##Member Details

#### `int tensorflow::TensorShapeDim::size`

#### `tensorflow::TensorShapeDim::TensorShapeDim(int64 s)`

================================================
FILE: tex_pdf/api/cc/StructThreadOptions.md
================================================
# Struct `tensorflow::ThreadOptions`

Options to configure a `Thread`.

Note that the options are all hints, and the underlying implementation may choose to ignore them.

##Member Summary

* [`size_t tensorflow::ThreadOptions::stack_size`](#size_t_tensorflow_ThreadOptions_stack_size)
  * Thread stack size to use (in bytes).
* [`size_t tensorflow::ThreadOptions::guard_size`](#size_t_tensorflow_ThreadOptions_guard_size)
  * Guard area size to use near thread stacks (in bytes).

##Member Details

#### `size_t tensorflow::ThreadOptions::stack_size`

Thread stack size to use (in bytes).
#### `size_t tensorflow::ThreadOptions::guard_size`

Guard area size to use near thread stacks (in bytes).

================================================
FILE: tex_pdf/api/cc/index.md
================================================
# TensorFlow C++ Session API reference documentation

TensorFlow's public C++ API includes only the API for executing graphs, as of version 0.5. To control the execution of a graph from C++:

1. Build the computation graph using the [Python API](../../api_docs/python/).
1. Use [tf.train.write_graph()](../../api_docs/python/train.md#write_graph) to write the graph to a file.
1. Load the graph using the C++ Session API. For example:

   ```c++
   // Reads a model graph definition from disk, and creates a session object you
   // can use to run it.
   Status LoadGraph(string graph_file_name, Session** session) {
     GraphDef graph_def;
     TF_RETURN_IF_ERROR(
         ReadBinaryProto(Env::Default(), graph_file_name, &graph_def));
     TF_RETURN_IF_ERROR(NewSession(SessionOptions(), session));
     TF_RETURN_IF_ERROR((*session)->Create(graph_def));
     return Status::OK();
   }
   ```

1. Run the graph with a call to `session->Run()`.

##Classes

* [tensorflow::Env](../../api_docs/cc/ClassEnv.md)
* [tensorflow::EnvWrapper](../../api_docs/cc/ClassEnvWrapper.md)
* [tensorflow::RandomAccessFile](../../api_docs/cc/ClassRandomAccessFile.md)
* [tensorflow::Session](../../api_docs/cc/ClassSession.md)
* [tensorflow::Status](../../api_docs/cc/ClassStatus.md)
* [tensorflow::Tensor](../../api_docs/cc/ClassTensor.md)
* [tensorflow::TensorBuffer](../../api_docs/cc/ClassTensorBuffer.md)
* [tensorflow::TensorShape](../../api_docs/cc/ClassTensorShape.md)
* [tensorflow::TensorShapeIter](../../api_docs/cc/ClassTensorShapeIter.md)
* [tensorflow::TensorShapeUtils](../../api_docs/cc/ClassTensorShapeUtils.md)
* [tensorflow::Thread](../../api_docs/cc/ClassThread.md)
* [tensorflow::WritableFile](../../api_docs/cc/ClassWritableFile.md)

##Structs

* [tensorflow::SessionOptions](../../api_docs/cc/StructSessionOptions.md)
* [tensorflow::Status::State](../../api_docs/cc/StructState.md)
* [tensorflow::TensorShapeDim](../../api_docs/cc/StructTensorShapeDim.md)
* [tensorflow::ThreadOptions](../../api_docs/cc/StructThreadOptions.md)

================================================
FILE: tex_pdf/api/python/c4s01_framework.tex
================================================
\section{Building Graphs} \label{building-graphs}

\subsection{Contents}\label{contents}

\subsubsection{\texorpdfstring{\protect\hyperlink{AUTOGENERATED-building-graphs}{Building Graphs}}{Building Graphs}}\label{building-graphs-1}

\begin{itemize}
\setlength{\parsep}{-5pt}
\setlength{\itemsep}{-5pt}
\item \hyperref[core-graph-data-structures]{Core graph data structures}
\item \hyperref[class-tf.graph]{class \lstinline{tf.Graph}}
\item \hyperref[class-tf.operation]{class \lstinline{tf.Operation}}
\item \protect\hyperlink{Tensor}{class \lstinline{tf.Tensor}}
\item \protect\hyperlink{AUTOGENERATED-tensor-types}{Tensor types}
\item \protect\hyperlink{DType}{\lstinline{class tf.DType}}
\item
\protect\hyperlink{asux5fdtype}{\lstinline{tf.as_dtype(type_value)}} \item \protect\hyperlink{AUTOGENERATED-utility-functions}{Utility functions} \item \protect\hyperlink{device}{\lstinline{tf.device(dev)}} \item \protect\hyperlink{nameux5fscope}{\lstinline{tf.name_scope(name)}} \item \protect\hyperlink{controlux5fdependencies}{\lstinline{tf.control_dependencies(control_inputs)}} \item \protect\hyperlink{convertux5ftoux5ftensor}{\lstinline{tf.convert_to_tensor(value, dtype=None, name=None)}} \item \protect\hyperlink{getux5fdefaultux5fgraph}{\lstinline{tf.get_default_graph()}} \item \protect\hyperlink{importux5fgraphux5fdef}{\lstinline{tf.import_graph_def(graph_def, input_map=None, return_elements=None, name=None, op_dict=None)}} \item \protect\hyperlink{AUTOGENERATED-graph-collections}{Graph collections} \item \protect\hyperlink{addux5ftoux5fcollection}{\lstinline{tf.add_to_collection(name, value)}} \item \protect\hyperlink{getux5fcollection}{\lstinline{tf.get_collection(key, scope=None)}} \item \protect\hyperlink{GraphKeys}{\lstinline{class tf.GraphKeys}} \item \protect\hyperlink{AUTOGENERATED-defining-new-operations}{Defining new operations} \item \protect\hyperlink{RegisterGradient}{\lstinline{class tf.RegisterGradient}} \item \protect\hyperlink{NoGradient}{\lstinline{tf.NoGradient(op_type)}} \item \protect\hyperlink{RegisterShape}{\lstinline{class tf.RegisterShape}} \item \protect\hyperlink{TensorShape}{\lstinline{class tf.TensorShape}} \item \protect\hyperlink{Dimension}{\lstinline{class tf.Dimension}} \item \protect\hyperlink{opux5fscope}{\lstinline{tf.op_scope(values, name, default_name)}} \item \protect\hyperlink{getux5fseed}{\lstinline{tf.get_seed(op_seed)}} \end{itemize} Classes and functions for building TensorFlow graphs. 
\subsection{Core graph data structures}\label{core-graph-data-structures} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\lstinline{class tf.Graph}}{class tf.Graph }}\label{class-tf.graph} A TensorFlow computation, represented as a dataflow graph. A \lstinline{Graph} contains a set of \href{../../api_docs/python/framework.md\#Operation}{\lstinline{Operation}} objects, which represent units of computation; and \href{../../api_docs/python/framework.md\#Tensor}{\lstinline{Tensor}} objects, which represent the units of data that flow between operations. A default \lstinline{Graph} is always registered, and accessible by calling \href{../../api_docs/python/framework.md\#get_default_graph}{\lstinline{tf.get_default_graph()}}. To add an operation to the default graph, simply call one of the functions that defines a new \lstinline{Operation}: \begin{lstlisting} c = tf.constant(4.0) assert c.graph is tf.get_default_graph() \end{lstlisting} Another typical usage involves the \href{../../api_docs/python/framework.md\#Graph.as_default}{\lstinline{Graph.as_default()}} context manager, which overrides the current default graph for the lifetime of the context: \begin{Shaded} \begin{Highlighting}[] \NormalTok{g }\OperatorTok{=} \NormalTok{tf.Graph()} \ControlFlowTok{with} \NormalTok{g.as_default():} \CommentTok{# Define operations and tensors in `g`.} \NormalTok{c }\OperatorTok{=} \NormalTok{tf.constant(}\FloatTok{30.0}\NormalTok{)} \ControlFlowTok{assert} \NormalTok{c.graph }\OperatorTok{is} \NormalTok{g} \end{Highlighting} \end{Shaded} Important note: This class \emph{is not} thread-safe for graph construction. All operations should be created from a single thread, or external synchronization must be provided. Unless otherwise specified, all methods are not thread-safe. 
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\paragraph{\texorpdfstring{\lstinline{tf.Graph.__init__()} }{tf.Graph.__init__() }}\label{tf.graph.ux5fux5finitux5fux5f}

Creates a new, empty Graph.

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\paragraph{\texorpdfstring{\lstinline{tf.Graph.as_default()} }{tf.Graph.as_default() }}\label{tf.graph.asux5fdefault}

Returns a context manager that makes this \lstinline{Graph} the default graph.

This method should be used if you want to create multiple graphs in the same process. For convenience, a global default graph is provided, and all ops will be added to this graph if you do not create a new graph explicitly. Use this method with the \lstinline{with} keyword to specify that ops created within the scope of a block should be added to this graph.

The default graph is a property of the current thread. If you create a new thread, and wish to use the default graph in that thread, you must explicitly add a \lstinline{with g.as_default():} in that thread's function.

The following code examples are equivalent:

\begin{Shaded}
\begin{Highlighting}[]
\CommentTok{# 1. Using Graph.as_default():}
\NormalTok{g }\OperatorTok{=} \NormalTok{tf.Graph()}
\ControlFlowTok{with} \NormalTok{g.as_default():}
  \NormalTok{c }\OperatorTok{=} \NormalTok{tf.constant(}\FloatTok{5.0}\NormalTok{)}
  \ControlFlowTok{assert} \NormalTok{c.graph }\OperatorTok{is} \NormalTok{g}

\CommentTok{# 2. Constructing and making default:}
\ControlFlowTok{with} \NormalTok{tf.Graph().as_default() }\ImportTok{as} \NormalTok{g:}
  \NormalTok{c }\OperatorTok{=} \NormalTok{tf.constant(}\FloatTok{5.0}\NormalTok{)}
  \ControlFlowTok{assert} \NormalTok{c.graph }\OperatorTok{is} \NormalTok{g}
\end{Highlighting}
\end{Shaded}

\subparagraph{Returns: }\label{returns}

A context manager for using this graph as the default graph.
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.Graph.as_graph_def(from_version=None)} }{tf.Graph.as_graph_def(from_version=None) }}\label{tf.graph.asux5fgraphux5fdeffromux5fversionnone} Returns a serialized \lstinline{GraphDef} representation of this graph. The serialized \lstinline{GraphDef} can be imported into another \lstinline{Graph} (using \protect\hyperlink{importux5fgraphux5fdef}{\lstinline{import_graph_def()}}) or used with the \href{../../api_docs/cc/index.md}{C++ Session API}. This method is thread-safe. \subparagraph{Args: }\label{args} \begin{itemize} \tightlist \item \lstinline{from_version}: Optional. If this is set, returns a \lstinline{GraphDef} containing only the nodes that were added to this graph since its \lstinline{version} property had the given value. \end{itemize} \subparagraph{Returns: }\label{returns-1} A \href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/graph.proto}{\lstinline{GraphDef}} protocol buffer. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.Graph.finalize()} }{tf.Graph.finalize() }}\label{tf.graph.finalize} Finalizes this graph, making it read-only. After calling \lstinline{g.finalize()}, no new operations can be added to \lstinline{g}. This method is used to ensure that no operations are added to a graph when it is shared between multiple threads, for example when using a \href{../../api_docs/python/train.md\#QueueRunner}{\lstinline{QueueRunner}}. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.Graph.finalized} }{tf.Graph.finalized }}\label{tf.graph.finalized} True if this graph has been finalized. 
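
The read-only guarantee of \lstinline{finalize()} can be modelled with a small stand-in class. This is a sketch under stated assumptions rather than the library's code: \lstinline{FinalizableGraph} and \lstinline{add_op} are hypothetical, and raising \lstinline{RuntimeError} on a late addition is a choice made for this illustration.

```python
class FinalizableGraph:
    """Toy graph that refuses new operations once finalized."""

    def __init__(self):
        self._ops = []
        self._finalized = False

    @property
    def finalized(self):
        # True once finalize() has been called.
        return self._finalized

    def finalize(self):
        # After this call the list of operations can no longer grow.
        self._finalized = True

    def add_op(self, name):
        if self._finalized:
            # Assumption of this sketch: a late addition is an error.
            raise RuntimeError("Graph is finalized and cannot be modified.")
        self._ops.append(name)
        return name
```

Finalizing before handing the graph to worker threads turns an accidental late addition into an immediate error instead of a silent change to a shared structure.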
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.Graph.control_dependencies(control_inputs)} }{tf.Graph.control_dependencies(control_inputs) }}\label{tf.graph.controlux5fdependenciescontrolux5finputs} Returns a context manager that specifies control dependencies. Use with the \lstinline{with} keyword to specify that all operations constructed within the context should have control dependencies on \lstinline{control_inputs}. For example: \begin{Shaded} \begin{Highlighting}[] \ControlFlowTok{with} \NormalTok{g.control_dependencies([a, b, c]):} \CommentTok{# `d` and `e` will only run after `a`, `b`, and `c` have executed.} \NormalTok{d }\OperatorTok{=} \NormalTok{...} \NormalTok{e }\OperatorTok{=} \NormalTok{...} \end{Highlighting} \end{Shaded} Multiple calls to \lstinline{control_dependencies()} can be nested, and in that case a new \lstinline{Operation} will have control dependencies on the union of \lstinline{control_inputs} from all active contexts. \begin{Shaded} \begin{Highlighting}[] \ControlFlowTok{with} \NormalTok{g.control_dependencies([a, b]):} \CommentTok{# Ops declared here run after `a` and `b`.} \ControlFlowTok{with} \NormalTok{g.control_dependencies([c, d]):} \CommentTok{# Ops declared here run after `a`, `b`, `c`, and `d`.} \end{Highlighting} \end{Shaded} \emph{N.B.} The control dependencies context applies \emph{only} to ops that are constructed within the context. Merely using an op or tensor in the context does not add a control dependency. 
The following example illustrates this point: \begin{Shaded} \begin{Highlighting}[] \CommentTok{# WRONG} \KeywordTok{def} \NormalTok{my_func(pred, tensor):} \NormalTok{t }\OperatorTok{=} \NormalTok{tf.matmul(tensor, tensor)} \ControlFlowTok{with} \NormalTok{tf.control_dependencies([pred]):} \CommentTok{# The matmul op is created outside the context, so no control} \CommentTok{# dependency will be added.} \ControlFlowTok{return} \NormalTok{t} \CommentTok{# RIGHT} \KeywordTok{def} \NormalTok{my_func(pred, tensor):} \ControlFlowTok{with} \NormalTok{tf.control_dependencies([pred]):} \CommentTok{# The matmul op is created in the context, so a control dependency} \CommentTok{# will be added.} \ControlFlowTok{return} \NormalTok{tf.matmul(tensor, tensor)} \end{Highlighting} \end{Shaded} \subparagraph{Args: }\label{args-1} \begin{itemize} \tightlist \item \lstinline{control_inputs}: A list of \lstinline{Operation} or \lstinline{Tensor} objects, which must be executed or computed before running the operations defined in the context. \end{itemize} \subparagraph{Returns: }\label{returns-2} A context manager that specifies control dependencies for all operations constructed within the context. \subparagraph{Raises: }\label{raises} \begin{itemize} \tightlist \item \lstinline{TypeError}: If \lstinline{control_inputs} is not a list of \lstinline{Operation} or \lstinline{Tensor} objects. \end{itemize} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.Graph.device(device_name_or_function)} }{tf.Graph.device(device_name_or_function) }}\label{tf.graph.devicedeviceux5fnameux5forux5ffunction} Returns a context manager that specifies the default device to use. The \lstinline{device_name_or_function} argument may either be a device name string, a device function, or None: \begin{itemize} \tightlist \item If it is a device name string, all operations constructed in this context will be assigned to the device with that name. 
\item If it is a function, it will be treated as function from Operation objects to device name strings, and invoked each time a new Operation is created. The Operation will be assigned to the device with the returned name. \item If it is None, the default device will be cleared. \end{itemize} For example: \begin{Shaded} \begin{Highlighting}[] \ControlFlowTok{with} \NormalTok{g.device(}\StringTok{'/gpu:0'}\NormalTok{):} \CommentTok{# All operations constructed in this context will be placed} \CommentTok{# on GPU 0.} \ControlFlowTok{with} \NormalTok{g.device(}\VariableTok{None}\NormalTok{):} \CommentTok{# All operations constructed in this context will have no} \CommentTok{# assigned device.} \CommentTok{# Defines a function from `Operation` to device string.} \KeywordTok{def} \NormalTok{matmul_on_gpu(n):} \ControlFlowTok{if} \NormalTok{n.}\BuiltInTok{type} \OperatorTok{==} \StringTok{"MatMul"}\NormalTok{:} \ControlFlowTok{return} \StringTok{"/gpu:0"} \ControlFlowTok{else}\NormalTok{:} \ControlFlowTok{return} \StringTok{"/cpu:0"} \ControlFlowTok{with} \NormalTok{g.device(matmul_on_gpu):} \CommentTok{# All operations of type "MatMul" constructed in this context} \CommentTok{# will be placed on GPU 0; all other operations will be placed} \CommentTok{# on CPU 0.} \end{Highlighting} \end{Shaded} \subparagraph{Args: }\label{args-2} \begin{itemize} \tightlist \item \lstinline{device_name_or_function}: The device name or function to use in the context. \end{itemize} \subparagraph{Returns: }\label{returns-3} A context manager that specifies the default device to use for newly created ops. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.Graph.name_scope(name)} }{tf.Graph.name_scope(name) }}\label{tf.graph.nameux5fscopename} Returns a context manager that creates hierarchical names for operations. A graph maintains a stack of name scopes. 
A \lstinline{with name_scope(...):} statement pushes a new name onto the stack for the lifetime of the context.

The \lstinline{name} argument will be interpreted as follows:

\begin{itemize}
\tightlist
\item A string (not ending with `/') will create a new name scope, in which \lstinline{name} is appended to the prefix of all operations created in the context. If \lstinline{name} has been used before, it will be made unique by calling \lstinline{self.unique_name(name)}.
\item A scope previously captured from a \lstinline{with g.name_scope(...) as scope:} statement will be treated as an ``absolute'' name scope, which makes it possible to re-enter existing scopes.
\item A value of \lstinline{None} or the empty string will reset the current name scope to the top-level (empty) name scope.
\end{itemize}

For example:

\begin{Shaded}
\begin{Highlighting}[]
\ControlFlowTok{with} \NormalTok{tf.Graph().as_default() }\ImportTok{as} \NormalTok{g:}
  \NormalTok{c }\OperatorTok{=} \NormalTok{tf.constant(}\FloatTok{5.0}\NormalTok{, name}\OperatorTok{=}\StringTok{"c"}\NormalTok{)}
  \ControlFlowTok{assert} \NormalTok{c.op.name }\OperatorTok{==} \StringTok{"c"}
  \NormalTok{c_1 }\OperatorTok{=} \NormalTok{tf.constant(}\FloatTok{6.0}\NormalTok{, name}\OperatorTok{=}\StringTok{"c"}\NormalTok{)}
  \ControlFlowTok{assert} \NormalTok{c_1.op.name }\OperatorTok{==} \StringTok{"c_1"}

  \CommentTok{# Creates a scope called "nested"}
  \ControlFlowTok{with} \NormalTok{g.name_scope(}\StringTok{"nested"}\NormalTok{) }\ImportTok{as} \NormalTok{scope:}
    \NormalTok{nested_c }\OperatorTok{=} \NormalTok{tf.constant(}\FloatTok{10.0}\NormalTok{, name}\OperatorTok{=}\StringTok{"c"}\NormalTok{)}
    \ControlFlowTok{assert} \NormalTok{nested_c.op.name }\OperatorTok{==} \StringTok{"nested/c"}

    \CommentTok{# Creates a nested scope called "inner".}
    \ControlFlowTok{with} \NormalTok{g.name_scope(}\StringTok{"inner"}\NormalTok{):}
      \NormalTok{nested_inner_c }\OperatorTok{=} \NormalTok{tf.constant(}\FloatTok{20.0}\NormalTok{, name}\OperatorTok{=}\StringTok{"c"}\NormalTok{)}
      \ControlFlowTok{assert} \NormalTok{nested_inner_c.op.name }\OperatorTok{==} \StringTok{"nested/inner/c"}

    \CommentTok{# Create a nested scope called "inner_1".}
    \ControlFlowTok{with} \NormalTok{g.name_scope(}\StringTok{"inner"}\NormalTok{):}
      \NormalTok{nested_inner_1_c }\OperatorTok{=} \NormalTok{tf.constant(}\FloatTok{30.0}\NormalTok{, name}\OperatorTok{=}\StringTok{"c"}\NormalTok{)}
      \ControlFlowTok{assert} \NormalTok{nested_inner_1_c.op.name }\OperatorTok{==} \StringTok{"nested/inner_1/c"}

      \CommentTok{# Treats `scope` as an absolute name scope, and}
      \CommentTok{# switches to the "nested/" scope.}
      \ControlFlowTok{with} \NormalTok{g.name_scope(scope):}
        \NormalTok{nested_d }\OperatorTok{=} \NormalTok{tf.constant(}\FloatTok{40.0}\NormalTok{, name}\OperatorTok{=}\StringTok{"d"}\NormalTok{)}
        \ControlFlowTok{assert} \NormalTok{nested_d.op.name }\OperatorTok{==} \StringTok{"nested/d"}

        \ControlFlowTok{with} \NormalTok{g.name_scope(}\StringTok{""}\NormalTok{):}
          \NormalTok{e }\OperatorTok{=} \NormalTok{tf.constant(}\FloatTok{50.0}\NormalTok{, name}\OperatorTok{=}\StringTok{"e"}\NormalTok{)}
          \ControlFlowTok{assert} \NormalTok{e.op.name }\OperatorTok{==} \StringTok{"e"}
\end{Highlighting}
\end{Shaded}

The name of the scope itself can be captured by \lstinline{with g.name_scope(...) as scope:}, which stores the name of the scope in the variable \lstinline{scope}. This value can be used to name an operation that represents the overall result of executing the ops in a scope.
For example: \begin{Shaded} \begin{Highlighting}[] \NormalTok{inputs }\OperatorTok{=} \NormalTok{tf.constant(...)} \ControlFlowTok{with} \NormalTok{g.name_scope(}\StringTok{'my_layer'}\NormalTok{) }\ImportTok{as} \NormalTok{scope:} \NormalTok{weights }\OperatorTok{=} \NormalTok{tf.Variable(..., name}\OperatorTok{=}\StringTok{"weights"}\NormalTok{)} \NormalTok{biases }\OperatorTok{=} \NormalTok{tf.Variable(..., name}\OperatorTok{=}\StringTok{"biases"}\NormalTok{)} \NormalTok{affine }\OperatorTok{=} \NormalTok{tf.matmul(inputs, weights) }\OperatorTok{+} \NormalTok{biases} \NormalTok{output }\OperatorTok{=} \NormalTok{tf.nn.relu(affine, name}\OperatorTok{=}\NormalTok{scope)} \end{Highlighting} \end{Shaded} \subparagraph{Args: }\label{args-3} \begin{itemize} \tightlist \item \lstinline{name}: A name for the scope. \end{itemize} \subparagraph{Returns: }\label{returns-4} A context manager that installs \lstinline{name} as a new name scope. A \lstinline{Graph} instance supports an arbitrary number of ``collections'' that are identified by name. For convenience when building a large graph, collections can store groups of related objects: for example, the \lstinline{tf.Variable} uses a collection (named \href{../../api_docs/python/framework.md\#GraphKeys}{\lstinline{tf.GraphKeys.VARIABLES}}) for all variables that are created during the construction of a graph. The caller may define additional collections by specifying a new name. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.Graph.add_to_collection(name, value)} }{tf.Graph.add_to_collection(name, value) }}\label{tf.graph.addux5ftoux5fcollectionname-value} Stores \lstinline{value} in the collection with the given \lstinline{name}. \subparagraph{Args: }\label{args-4} \begin{itemize} \tightlist \item \lstinline{name}: The key for the collection. For example, the \lstinline{GraphKeys} class contains many standard names for collections. 
\item \lstinline{value}: The value to add to the collection.
\end{itemize}

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\paragraph{\texorpdfstring{\lstinline{tf.Graph.get_collection(name, scope=None)} }{tf.Graph.get_collection(name, scope=None) }}\label{tf.graph.getux5fcollectionname-scopenone}

Returns a list of values in the collection with the given \lstinline{name}.

\subparagraph{Args: }\label{args-5}

\begin{itemize}
\tightlist
\item \lstinline{name}: The key for the collection. For example, the \lstinline{GraphKeys} class contains many standard names for collections.
\item \lstinline{scope}: (Optional.) If supplied, the resulting list is filtered to include only items whose name begins with this string.
\end{itemize}

\subparagraph{Returns: }\label{returns-5}

The list of values in the collection with the given \lstinline{name}, or an empty list if no value has been added to that collection. The list contains the values in the order in which they were collected.

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\paragraph{\texorpdfstring{\lstinline{tf.Graph.as_graph_element(obj, allow_tensor=True, allow_operation=True)} }{tf.Graph.as_graph_element(obj, allow_tensor=True, allow_operation=True) }}\label{tf.graph.asux5fgraphux5felementobj-allowux5ftensortrue-allowux5foperationtrue}

Returns the object referred to by \lstinline{obj}, as an \lstinline{Operation} or \lstinline{Tensor}.

This function validates that \lstinline{obj} represents an element of this graph, and gives an informative error message if it is not.

This function is the canonical way to get/validate an object of one of the allowed types from an external argument reference in the Session API.

This method may be called concurrently from multiple threads.

\subparagraph{Args: }\label{args-6}

\begin{itemize}
\tightlist
\item \lstinline{obj}: A \lstinline{Tensor}, an \lstinline{Operation}, or the name of a tensor or operation. Can also be any object with an \lstinline{_as_graph_element()} method that returns a value of one of these types.
\item \lstinline{allow_tensor}: If true, \lstinline{obj} may refer to a \lstinline{Tensor}.
\item \lstinline{allow_operation}: If true, \lstinline{obj} may refer to an \lstinline{Operation}.
\end{itemize}

\subparagraph{Returns: }\label{returns-6}

The \lstinline{Tensor} or \lstinline{Operation} in the Graph corresponding to \lstinline{obj}.

\subparagraph{Raises: }\label{raises-1}

\begin{itemize}
\tightlist
\item \lstinline{TypeError}: If \lstinline{obj} is not of a type that can be converted to a \lstinline{Tensor} or \lstinline{Operation}.
\item \lstinline{ValueError}: If \lstinline{obj} is of an appropriate type but invalid. For example, an invalid string.
\item \lstinline{KeyError}: If \lstinline{obj} is not an object in the graph.
\end{itemize}

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\paragraph{\texorpdfstring{\lstinline{tf.Graph.get_operation_by_name(name)} }{tf.Graph.get_operation_by_name(name) }}\label{tf.graph.getux5foperationux5fbyux5fnamename}

Returns the \lstinline{Operation} with the given \lstinline{name}.

This method may be called concurrently from multiple threads.

\subparagraph{Args: }\label{args-7}

\begin{itemize}
\item \lstinline{name}: The name of the \lstinline{Operation} to return.
\end{itemize}

\subparagraph{Returns: }\label{returns-7}

The \lstinline{Operation} with the given \lstinline{name}.

\subparagraph{Raises: }\label{raises-2}

\begin{itemize}
\tightlist
\item \lstinline{TypeError}: If \lstinline{name} is not a string.
\item \lstinline{KeyError}: If \lstinline{name} does not correspond to an operation in this graph.
\end{itemize}

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\paragraph{\texorpdfstring{\lstinline{tf.Graph.get_tensor_by_name(name)} }{tf.Graph.get_tensor_by_name(name) }}\label{tf.graph.getux5ftensorux5fbyux5fnamename}

Returns the \lstinline{Tensor} with the given \lstinline{name}.

This method may be called concurrently from multiple threads.

\subparagraph{Args: }\label{args-8}

\begin{itemize}
\tightlist
\item \lstinline{name}: The name of the \lstinline{Tensor} to return.
\end{itemize}

\subparagraph{Returns: }\label{returns-8}

The \lstinline{Tensor} with the given \lstinline{name}.

\subparagraph{Raises: }\label{raises-3}

\begin{itemize}
\tightlist
\item \lstinline{TypeError}: If \lstinline{name} is not a string.
\item \lstinline{KeyError}: If \lstinline{name} does not correspond to a tensor in this graph.
\end{itemize}

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\paragraph{\texorpdfstring{\lstinline{tf.Graph.get_operations()} }{tf.Graph.get_operations() }}\label{tf.graph.getux5foperations}

Returns the list of operations in the graph.

You can modify the operations in place, but modifications to the returned list, such as inserts or deletes, have no effect on the list of operations known to the graph.

This method may be called concurrently from multiple threads.

\subparagraph{Returns: }\label{returns-9}

A list of Operations.

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\paragraph{\texorpdfstring{\lstinline{tf.Graph.get_default_device()} }{tf.Graph.get_default_device() }}\label{tf.graph.getux5fdefaultux5fdevice}

Returns the default device.

\subparagraph{Returns: }\label{returns-10}

A string.

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\paragraph{\texorpdfstring{\lstinline{tf.Graph.seed} }{tf.Graph.seed }}\label{tf.graph.seed}

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\paragraph{\texorpdfstring{\lstinline{tf.Graph.unique_name(name)} }{tf.Graph.unique_name(name) }}\label{tf.graph.uniqueux5fnamename}

Returns a unique Operation name for ``name''.

Note: You rarely need to call \lstinline{unique_name()} directly. Most of the time you just need to create \lstinline{with g.name_scope()} blocks to generate structured names.
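
How unique names and name scopes might interact can be sketched without TensorFlow. The class below is a hypothetical model, not the library's implementation: \lstinline{NameRegistry} is an invented name, the numeric-suffix rule is assumed from the \lstinline{"c"} and \lstinline{"c_1"} names shown in the \lstinline{name_scope} example, and re-entering a captured scope or resetting with the empty string is left out.

```python
from contextlib import contextmanager


class NameRegistry:
    """Toy model of unique operation names inside nested name scopes."""

    def __init__(self):
        self._used = set()   # every full name handed out so far
        self._scope = ""     # current prefix: "" or a string ending in "/"

    def unique_name(self, name):
        full = self._scope + name
        candidate = full
        suffix = 0
        # Append _1, _2, ... until the name has not been seen before.
        while candidate in self._used:
            suffix += 1
            candidate = "%s_%d" % (full, suffix)
        self._used.add(candidate)
        return candidate

    @contextmanager
    def name_scope(self, name):
        previous = self._scope
        # The scope name itself is made unique, then used as a prefix.
        self._scope = self.unique_name(name) + "/"
        try:
            yield self._scope
        finally:
            self._scope = previous
```

Because the scope name goes through the same uniqueness check as operation names, opening \lstinline{"inner"} twice under \lstinline{"nested"} gives the prefixes \lstinline{"nested/inner/"} and \lstinline{"nested/inner_1/"} in this model.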
\lstinline{unique_name} is used to generate structured names, separated by ``/'', to help identify Operations when debugging a Graph. Operation names are displayed in error messages reported by the TensorFlow runtime, and in various visualization tools such as TensorBoard. \subparagraph{Args: }\label{args-9} \begin{itemize} \tightlist \item \lstinline{name}: The name for an \lstinline{Operation}. \end{itemize} \subparagraph{Returns: }\label{returns-11} A string to be passed to \lstinline{create_op()} that will be used to name the operation being created. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.Graph.version} }{tf.Graph.version }}\label{tf.graph.version} Returns a version number that increases as ops are added to the graph. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.Graph.create_op(op_type, inputs, dtypes, input_types=None, name=None, attrs=None, op_def=None, compute_shapes=True)} }{tf.Graph.create_op(op_type, inputs, dtypes, input_types=None, name=None, attrs=None, op_def=None, compute_shapes=True) }}\label{tf.graph.createux5fopopux5ftype-inputs-dtypes-inputux5ftypesnone-namenone-attrsnone-opux5fdefnone-computeux5fshapestrue} Creates an \lstinline{Operation} in this graph. This is a low-level interface for creating an \lstinline{Operation}. Most programs will not call this method directly, and instead use the Python op constructors, such as \lstinline{tf.constant()}, which add ops to the default graph. \subparagraph{Args: }\label{args-10} \begin{itemize} \tightlist \item \lstinline{op_type}: The \lstinline{Operation} type to create. This corresponds to the \lstinline{OpDef.name} field for the proto that defines the operation. \item \lstinline{inputs}: A list of \lstinline{Tensor} objects that will be inputs to the \lstinline{Operation}. 
\item \lstinline{dtypes}: A list of \lstinline{DType} objects that will be the types of the tensors that the operation produces. \item \lstinline{input_types}: (Optional.) A list of \lstinline{DType}s that will be the types of the tensors that the operation consumes. By default, uses the base \lstinline{DType} of each input in \lstinline{inputs}. Operations that expect reference-typed inputs must specify \lstinline{input_types} explicitly. \item \lstinline{name}: (Optional.) A string name for the operation. If not specified, a name is generated based on \lstinline{op_type}. \item \lstinline{attrs}: (Optional.) A list of \lstinline{AttrValue} protos for the \lstinline{attr} field of the \lstinline{NodeDef} proto that will represent the operation. \item \lstinline{op_def}: (Optional.) The \lstinline{OpDef} proto that describes the \lstinline{op_type} that the operation will have. \item \lstinline{compute_shapes}: (Optional.) If True, shape inference will be performed to compute the shapes of the outputs. \end{itemize} \subparagraph{Raises: }\label{raises-4} \begin{itemize} \tightlist \item \lstinline{TypeError}: if any of the inputs is not a \lstinline{Tensor}. \end{itemize} \subparagraph{Returns: }\label{returns-12} An \lstinline{Operation} object. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.Graph.gradient_override_map(op_type_map)} }{tf.Graph.gradient_override_map(op_type_map) }}\label{tf.graph.gradientux5foverrideux5fmapopux5ftypeux5fmap} EXPERIMENTAL: A context manager for overriding gradient functions. This context manager can be used to override the gradient function that will be used for ops within the scope of the context. 
For example:

\begin{Shaded}
\begin{Highlighting}[]
\AttributeTok{@tf.RegisterGradient}\NormalTok{(}\StringTok{"CustomSquare"}\NormalTok{)}
\KeywordTok{def} \NormalTok{_custom_square_grad(op, inputs):}
  \CommentTok{# ...}

\ControlFlowTok{with} \NormalTok{tf.Graph().as_default() }\ImportTok{as} \NormalTok{g:}
  \NormalTok{c }\OperatorTok{=} \NormalTok{tf.constant(}\FloatTok{5.0}\NormalTok{)}
  \NormalTok{s_1 }\OperatorTok{=} \NormalTok{tf.square(c)  }\CommentTok{# Uses the default gradient for tf.square.}
  \ControlFlowTok{with} \NormalTok{g.gradient_override_map(\{}\StringTok{"Square"}\NormalTok{: }\StringTok{"CustomSquare"}\NormalTok{\}):}
    \NormalTok{s_2 }\OperatorTok{=} \NormalTok{tf.square(c)  }\CommentTok{# Uses _custom_square_grad to compute the}
                      \CommentTok{# gradient of s_2.}
\end{Highlighting}
\end{Shaded}

\subparagraph{Args: }\label{args-11} \begin{itemize} \tightlist \item \lstinline{op_type_map}: A dictionary mapping op type strings to alternative op type strings. \end{itemize} \subparagraph{Returns: }\label{returns-13} A context manager that sets the alternative op type to be used for one or more ops created in that context. \subparagraph{Raises: }\label{raises-5} \begin{itemize} \tightlist \item \lstinline{TypeError}: If \lstinline{op_type_map} is not a dictionary mapping strings to strings. \end{itemize} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{class \lstinline{tf.Operation}} \label{class-tf.operation} Represents a graph node that performs computation on tensors. An \lstinline{Operation} is a node in a TensorFlow \lstinline{Graph} that takes zero or more \lstinline{Tensor} objects as input, and produces zero or more \lstinline{Tensor} objects as output. Objects of type \lstinline{Operation} are created by calling a Python op constructor (such as \href{../../api_docs/python/math_ops.md\#matmul}{\lstinline{tf.matmul()}}) or \href{../../api_docs/python/framework.md\#Graph.create_op}{\lstinline{Graph.create_op()}}.
For example \lstinline{c = tf.matmul(a, b)} creates an \lstinline{Operation} of type "MatMul" that takes tensors \lstinline{a} and \lstinline{b} as input, and produces \lstinline{c} as output. After the graph has been launched in a session, an \lstinline{Operation} can be executed by passing it to \href{../../api_docs/python/client.md\#Session.run}{\lstinline{Session.run()}}. \lstinline{op.run()} is a shortcut for calling \lstinline{tf.get_default_session().run(op)}. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.Operation.name} }{tf.Operation.name }}\label{tf.operation.name} The full name of this operation. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.Operation.type} }{tf.Operation.type }}\label{tf.operation.type} The type of the op (e.g. \lstinline{"MatMul"}). \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.Operation.inputs} }{tf.Operation.inputs }}\label{tf.operation.inputs} The list of \lstinline{Tensor} objects representing the data inputs of this op. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.Operation.control_inputs} }{tf.Operation.control_inputs }}\label{tf.operation.controlux5finputs} The \lstinline{Operation} objects on which this op has a control dependency. Before this op is executed, TensorFlow will ensure that the operations in \lstinline{self.control_inputs} have finished executing. This mechanism can be used to run ops sequentially for performance reasons, or to ensure that the side effects of an op are observed in the correct order. \subparagraph{Returns: }\label{returns-14} A list of \lstinline{Operation} objects. 
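The ordering guarantee described for \lstinline{control_inputs} can be illustrated without TensorFlow. The sketch below is a toy model, not TensorFlow's executor: \lstinline{ToyOp} and \lstinline{run} are hypothetical names introduced here, and the only property it demonstrates is that an op runs after both its data inputs and its control inputs have finished.

```python
class ToyOp:
    """Hypothetical stand-in for an Operation; not a TensorFlow class."""

    def __init__(self, name, inputs=(), control_inputs=()):
        self.name = name
        self.inputs = list(inputs)                  # ops whose outputs are consumed
        self.control_inputs = list(control_inputs)  # ops that must only finish first


def run(op, executed=None):
    """Run `op` after everything it depends on; return the execution order."""
    if executed is None:
        executed = []
    if op.name in executed:
        return executed
    # Data inputs and control inputs are both prerequisites for running `op`.
    for dep in op.inputs + op.control_inputs:
        run(dep, executed)
    executed.append(op.name)
    return executed


init = ToyOp("init")
read = ToyOp("read")
# `update` consumes the output of `read` and must additionally wait for `init`.
update = ToyOp("update", inputs=[read], control_inputs=[init])

order = run(update)
print(order)
```

In this model \lstinline{init} passes no data to \lstinline{update}, yet it still appears earlier in \lstinline{order}, which is the kind of side-effect ordering the text refers to.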
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.Operation.outputs} }{tf.Operation.outputs }}\label{tf.operation.outputs} The list of \lstinline{Tensor} objects representing the outputs of this op. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.Operation.device} }{tf.Operation.device }}\label{tf.operation.device} The name of the device to which this op has been assigned, if any. \subparagraph{Returns: }\label{returns-15} The string name of the device to which this op has been assigned, or None if it has not been assigned to a device. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.Operation.graph} }{tf.Operation.graph }}\label{tf.operation.graph} The \lstinline{Graph} that contains this operation. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.Operation.run(feed_dict=None, session=None)} }{tf.Operation.run(feed_dict=None, session=None) }}\label{tf.operation.run} Runs this operation in a \lstinline{Session}. Calling this method will execute all preceding operations that produce the inputs needed for this operation. \emph{N.B.} Before invoking \lstinline{Operation.run()}, its graph must have been launched in a session, and either a default session must be available, or \lstinline{session} must be specified explicitly. \subparagraph{Args: }\label{args-12} \begin{itemize} \tightlist \item \lstinline{feed_dict}: A dictionary that maps \lstinline{Tensor} objects to feed values. See \href{../../api_docs/python/client.md\#Session.run}{\lstinline{Session.run()}} for a description of the valid feed values. \item \lstinline{session}: (Optional.) The \lstinline{Session} to be used to run this operation. If none, the default session will be used.
\end{itemize} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.Operation.get_attr(name)} }{tf.Operation.get_attr(name) }}\label{tf.operation.getux5fattrname} Returns the value of the attr of this op with the given \lstinline{name}. \subparagraph{Args: }\label{args-13} \begin{itemize} \tightlist \item \lstinline{name}: The name of the attr to fetch. \end{itemize} \subparagraph{Returns: }\label{returns-16} The value of the attr, as a Python object. \subparagraph{Raises: }\label{raises-6} \begin{itemize} \tightlist \item \lstinline{ValueError}: If this op does not have an attr with the given \lstinline{name}. \end{itemize} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.Operation.traceback} }{tf.Operation.traceback }}\label{tf.operation.traceback} Returns the call stack from when this operation was constructed. \paragraph{Other Methods }\label{other-methods} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.Operation.__init__(node_def, g, inputs=None, output_types=None, control_inputs=None, input_types=None, original_op=None, op_def=None)} }{tf.Operation.__init__(node_def, g, inputs=None, output_types=None, control_inputs=None, input_types=None, original_op=None, op_def=None) }}\label{tf.operation.ux5fux5finitux5fux5fnodeux5fdef-g-inputsnone-outputux5ftypesnone-controlux5finputsnone-inputux5ftypesnone-originalux5fopnone-opux5fdefnone} Creates an \lstinline{Operation}. NOTE: This constructor validates the name of the Operation (passed as ``node_def.name''). Valid Operation names match the following regular expression: {[}A-Za-z0-9.{]}{[}A-Za-z0-9_.-/{]}* \subparagraph{Args: }\label{args-14} \begin{itemize} \tightlist \item \lstinline{node_def}: graph_pb2.NodeDef. NodeDef for the Operation. Used for attributes of graph_pb2.NodeDef, typically ``name'', ``op'', and ``device''. 
The ``input'' attribute is irrelevant here as it will be computed when generating the model. \item \lstinline{g}: Graph. The parent graph. \item \lstinline{inputs}: list of Tensor objects. The inputs to this Operation. \item \lstinline{output_types}: list of types_pb2.DataType. List of the types of the Tensors computed by this operation. The length of this list indicates the number of output endpoints of the Operation. \item \lstinline{control_inputs}: list of operations or tensors from which to have a control dependency. \item \lstinline{input_types}: List of types_pb2.DataType representing the types of the Tensors accepted by the Operation. By default uses {[}x.dtype.base_dtype for x in inputs{]}. Operations that expect reference-typed inputs must specify these explicitly. \item \lstinline{original_op}: Optional. Used to associate the new Operation with an existing Operation (for example, a replica with the op that was replicated). \item \lstinline{op_def}: Optional. The op_def_pb2.OpDef proto that describes the op type that this Operation represents. \end{itemize} \subparagraph{Raises: }\label{raises-7} \begin{itemize} \tightlist \item \lstinline{TypeError}: if control inputs are not Operations or Tensors, or if node_def is not a NodeDef, or if g is not a Graph, or if inputs are not Tensors, or if inputs and input_types are incompatible. \item \lstinline{ValueError}: if the node_def name is not valid. \end{itemize} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.Operation.node_def} }{tf.Operation.node_def }}\label{tf.operation.nodeux5fdef} Returns a serialized \lstinline{NodeDef} representation of this operation. \subparagraph{Returns: }\label{returns-17} A \href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/graph.proto}{\lstinline{NodeDef}} protocol buffer. 
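The naming rule stated for \lstinline{tf.Operation.__init__} can be tried out with Python's \lstinline{re} module. This is a sketch, not TensorFlow's own validation code: \lstinline{is_valid_op_name} is a hypothetical helper, and the pattern is copied as written in the text, so it inherits whatever that pattern accepts or rejects.

```python
import re

# Pattern copied from the text above. As written, `.-/` inside the second
# character class is read by Python as the range from '.' to '/', which
# contains exactly those two characters.
OP_NAME_PATTERN = re.compile(r"[A-Za-z0-9.][A-Za-z0-9_.-/]*")


def is_valid_op_name(name):
    """Return True if the whole string matches the documented pattern."""
    return OP_NAME_PATTERN.fullmatch(name) is not None


candidates = ["MatMul", "layer1/weights", "my_scope/add_1", "_hidden", "/leading", ""]
for candidate in candidates:
    print(repr(candidate), is_valid_op_name(candidate))
```

The last three candidates are rejected because the first character class allows neither an underscore nor a slash, and at least one character is required.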
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.Operation.op_def} }{tf.Operation.op_def }}\label{tf.operation.opux5fdef} Returns the \lstinline{OpDef} proto that represents the type of this op. \subparagraph{Returns: }\label{returns-18} An \href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/op_def.proto}{\lstinline{OpDef}} protocol buffer. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.Operation.values()} }{tf.Operation.values() }}\label{tf.operation.values} DEPRECATED: Use outputs. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\lstinline{class tf.Tensor} }{class tf.Tensor }}\label{class-tf.tensor} Represents a value produced by an \lstinline{Operation}. A \lstinline{Tensor} is a symbolic handle to one of the outputs of an \lstinline{Operation}. It does not hold the values of that operation's output, but instead provides a means of computing those values in a TensorFlow \href{../../api_docs/python/client.md\#Session}{\lstinline{Session}}. This class has two primary purposes: \begin{enumerate} \def\labelenumi{\arabic{enumi}.} \item A \lstinline{Tensor} can be passed as an input to another \lstinline{Operation}. This builds a dataflow connection between operations, which enables TensorFlow to execute an entire \lstinline{Graph} that represents a large, multi-step computation. \item After the graph has been launched in a session, the value of the \lstinline{Tensor} can be computed by passing it to \href{../../api_docs/python/client.md\#Session.run}{\lstinline{Session.run()}}. \lstinline{t.eval()} is a shortcut for calling \lstinline{tf.get_default_session().run(t)}. 
\end{enumerate} In the following example, \lstinline{c}, \lstinline{d}, and \lstinline{e} are symbolic \lstinline{Tensor} objects, whereas \lstinline{result} is a numpy array that stores a concrete value:

\begin{Shaded}
\begin{Highlighting}[]
\CommentTok{# Build a dataflow graph.}
\NormalTok{c }\OperatorTok{=} \NormalTok{tf.constant([[}\FloatTok{1.0}\NormalTok{, }\FloatTok{2.0}\NormalTok{], [}\FloatTok{3.0}\NormalTok{, }\FloatTok{4.0}\NormalTok{]])}
\NormalTok{d }\OperatorTok{=} \NormalTok{tf.constant([[}\FloatTok{1.0}\NormalTok{, }\FloatTok{1.0}\NormalTok{], [}\FloatTok{0.0}\NormalTok{, }\FloatTok{1.0}\NormalTok{]])}
\NormalTok{e }\OperatorTok{=} \NormalTok{tf.matmul(c, d)}

\CommentTok{# Construct a `Session` to execute the graph.}
\NormalTok{sess }\OperatorTok{=} \NormalTok{tf.Session()}

\CommentTok{# Execute the graph and store the value that `e` represents in `result`.}
\NormalTok{result }\OperatorTok{=} \NormalTok{sess.run(e)}
\end{Highlighting}
\end{Shaded}

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.Tensor.dtype} }{tf.Tensor.dtype }}\label{tf.tensor.dtype} The \lstinline{DType} of elements in this tensor. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.Tensor.name} }{tf.Tensor.name }}\label{tf.tensor.name} The string name of this tensor. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.Tensor.value_index} }{tf.Tensor.value_index }}\label{tf.tensor.valueux5findex} The index of this tensor in the outputs of its \lstinline{Operation}. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.Tensor.graph} }{tf.Tensor.graph }}\label{tf.tensor.graph} The \lstinline{Graph} that contains this tensor.
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.Tensor.op} }{tf.Tensor.op }}\label{tf.tensor.op} The \lstinline{Operation} that produces this tensor as an output. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.Tensor.consumers()} }{tf.Tensor.consumers() }}\label{tf.tensor.consumers} Returns a list of \lstinline{Operation}s that consume this tensor. \subparagraph{Returns: }\label{returns-19} A list of \lstinline{Operation}s. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.Tensor.eval(feed_dict=None, session=None)} }{tf.Tensor.eval(feed_dict=None, session=None) }}\label{tf.tensor.eval} Evaluates this tensor in a \lstinline{Session}. Calling this method will execute all preceding operations that produce the inputs needed for the operation that produces this tensor. \emph{N.B.} Before invoking \lstinline{Tensor.eval()}, its graph must have been launched in a session, and either a default session must be available, or \lstinline{session} must be specified explicitly. \subparagraph{Args: }\label{args-15} \begin{itemize} \tightlist \item \lstinline{feed_dict}: A dictionary that maps \lstinline{Tensor} objects to feed values. See \href{../../api_docs/python/client.md\#Session.run}{\lstinline{Session.run()}} for a description of the valid feed values. \item \lstinline{session}: (Optional.) The \lstinline{Session} to be used to evaluate this tensor. If none, the default session will be used. \end{itemize} \subparagraph{Returns: }\label{returns-20} A numpy array corresponding to the value of this tensor. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.Tensor.get_shape()} }{tf.Tensor.get_shape() }}\label{tf.tensor.getux5fshape} Returns the \lstinline{TensorShape} that represents the shape of this tensor. 
The shape is computed using shape inference functions that are registered for each \lstinline{Operation} type using \lstinline{tf.RegisterShape}. See \href{../../api_docs/python/framework.md\#TensorShape}{\lstinline{TensorShape}} for more details of what a shape represents. The inferred shape of a tensor is used to provide shape information without having to launch the graph in a session. This can be used for debugging, and providing early error messages. For example:

\begin{Shaded}
\begin{Highlighting}[]
\NormalTok{c }\OperatorTok{=} \NormalTok{tf.constant([[}\FloatTok{1.0}\NormalTok{, }\FloatTok{2.0}\NormalTok{, }\FloatTok{3.0}\NormalTok{], [}\FloatTok{4.0}\NormalTok{, }\FloatTok{5.0}\NormalTok{, }\FloatTok{6.0}\NormalTok{]])}
\BuiltInTok{print} \NormalTok{c.get_shape()}
\OperatorTok{==>} \NormalTok{TensorShape([Dimension(}\DecValTok{2}\NormalTok{), Dimension(}\DecValTok{3}\NormalTok{)])}

\NormalTok{d }\OperatorTok{=} \NormalTok{tf.constant([[}\FloatTok{1.0}\NormalTok{, }\FloatTok{0.0}\NormalTok{], [}\FloatTok{0.0}\NormalTok{, }\FloatTok{1.0}\NormalTok{], [}\FloatTok{1.0}\NormalTok{, }\FloatTok{0.0}\NormalTok{], [}\FloatTok{0.0}\NormalTok{, }\FloatTok{1.0}\NormalTok{]])}
\BuiltInTok{print} \NormalTok{d.get_shape()}
\OperatorTok{==>} \NormalTok{TensorShape([Dimension(}\DecValTok{4}\NormalTok{), Dimension(}\DecValTok{2}\NormalTok{)])}

\CommentTok{# Raises a ValueError, because `c` and `d` do not have compatible}
\CommentTok{# inner dimensions.}
\NormalTok{e }\OperatorTok{=} \NormalTok{tf.matmul(c, d)}

\NormalTok{f }\OperatorTok{=} \NormalTok{tf.matmul(c, d, transpose_a}\OperatorTok{=}\VariableTok{True}\NormalTok{, transpose_b}\OperatorTok{=}\VariableTok{True}\NormalTok{)}
\BuiltInTok{print} \NormalTok{f.get_shape()}
\OperatorTok{==>} \NormalTok{TensorShape([Dimension(}\DecValTok{3}\NormalTok{), Dimension(}\DecValTok{4}\NormalTok{)])}
\end{Highlighting}
\end{Shaded}

In some cases, the inferred shape may have unknown dimensions.
If the caller has additional information about the values of these dimensions, \lstinline{Tensor.set_shape()} can be used to augment the inferred shape. \subparagraph{Returns: }\label{returns-21} A \lstinline{TensorShape} representing the shape of this tensor. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.Tensor.set_shape(shape)} }{tf.Tensor.set_shape(shape) }}\label{tf.tensor.setux5fshapeshape} Updates the shape of this tensor. This method can be called multiple times, and will merge the given \lstinline{shape} with the current shape of this tensor. It can be used to provide additional information about the shape of this tensor that cannot be inferred from the graph alone. For example, this can be used to provide additional information about the shapes of images:

\begin{Shaded}
\begin{Highlighting}[]
\NormalTok{_, image_data }\OperatorTok{=} \NormalTok{tf.TFRecordReader(...).read(...)}
\NormalTok{image }\OperatorTok{=} \NormalTok{tf.image.decode_png(image_data, channels}\OperatorTok{=}\DecValTok{3}\NormalTok{)}

\CommentTok{# The height and width dimensions of `image` are data dependent, and}
\CommentTok{# cannot be computed without executing the op.}
\BuiltInTok{print} \NormalTok{image.get_shape()}
\OperatorTok{==>} \NormalTok{TensorShape([Dimension(}\VariableTok{None}\NormalTok{), Dimension(}\VariableTok{None}\NormalTok{), Dimension(}\DecValTok{3}\NormalTok{)])}

\CommentTok{# We know that each image in this dataset is 28 x 28 pixels.}
\NormalTok{image.set_shape([}\DecValTok{28}\NormalTok{, }\DecValTok{28}\NormalTok{, }\DecValTok{3}\NormalTok{])}
\BuiltInTok{print} \NormalTok{image.get_shape()}
\OperatorTok{==>} \NormalTok{TensorShape([Dimension(}\DecValTok{28}\NormalTok{), Dimension(}\DecValTok{28}\NormalTok{), Dimension(}\DecValTok{3}\NormalTok{)])}
\end{Highlighting}
\end{Shaded}

\subparagraph{Args: }\label{args-16} \begin{itemize} \tightlist \item \lstinline{shape}: A \lstinline{TensorShape}
representing the shape of this tensor. \end{itemize} \subparagraph{Raises: }\label{raises-8} \begin{itemize} \tightlist \item \lstinline{ValueError}: If \lstinline{shape} is not compatible with the current shape of this tensor. \end{itemize} \paragraph{Other Methods }\label{other-methods-1} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.Tensor.__init__(op, value_index, dtype)} }{tf.Tensor.__init__(op, value_index, dtype) }}\label{tf.tensor.ux5fux5finitux5fux5fop-valueux5findex-dtype} Creates a new \lstinline{Tensor}. \subparagraph{Args: }\label{args-17} \begin{itemize} \tightlist \item \lstinline{op}: An \lstinline{Operation}. \lstinline{Operation} that computes this tensor. \item \lstinline{value_index}: An \lstinline{int}. Index of the operation's endpoint that produces this tensor. \item \lstinline{dtype}: A \lstinline{types.DType}. Type of data stored in this tensor. \end{itemize} \subparagraph{Raises: }\label{raises-9} \begin{itemize} \tightlist \item \lstinline{TypeError}: If the op is not an \lstinline{Operation}. \end{itemize} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.Tensor.device} }{tf.Tensor.device }}\label{tf.tensor.device} The name of the device on which this tensor will be produced, or None. \subsection{Tensor types }\label{tensor-types} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\lstinline{class tf.DType} }{class tf.DType }}\label{class-tf.dtype} Represents the type of the elements in a \lstinline{Tensor}. The following \lstinline{DType} objects are defined: \begin{itemize} \item \lstinline{tf.float32}: 32-bit single-precision floating-point. \item \lstinline{tf.float64}: 64-bit double-precision floating-point. \item \lstinline{tf.bfloat16}: 16-bit truncated floating-point. \item \lstinline{tf.complex64}: 64-bit single-precision complex. 
\item \lstinline{tf.int8}: 8-bit signed integer. \item \lstinline{tf.uint8}: 8-bit unsigned integer. \item \lstinline{tf.int32}: 32-bit signed integer. \item \lstinline{tf.int64}: 64-bit signed integer. \item \lstinline{tf.bool}: Boolean. \item \lstinline{tf.string}: String. \item \lstinline{tf.qint8}: Quantized 8-bit signed integer. \item \lstinline{tf.quint8}: Quantized 8-bit unsigned integer. \item \lstinline{tf.qint32}: Quantized 32-bit signed integer. \end{itemize} In addition, variants of these types with the \lstinline{_ref} suffix are defined for reference-typed tensors. The \lstinline{tf.as_dtype()} function converts numpy types and string type names to a \lstinline{DType} object. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.DType.is_compatible_with(other)} }{tf.DType.is_compatible_with(other) }}\label{tf.dtype.isux5fcompatibleux5fwithother} Returns True if the \lstinline{other} DType will be converted to this DType. The conversion rules are as follows: \begin{lstlisting} DType(T) .is_compatible_with(DType(T)) == True DType(T) .is_compatible_with(DType(T).as_ref) == True DType(T).as_ref.is_compatible_with(DType(T)) == False DType(T).as_ref.is_compatible_with(DType(T).as_ref) == True \end{lstlisting} \subparagraph{Args: }\label{args-18} \begin{itemize} \tightlist \item \lstinline{other}: A \lstinline{DType} (or object that may be converted to a \lstinline{DType}). \end{itemize} \subparagraph{Returns: }\label{returns-22} True if a Tensor of the \lstinline{other} \lstinline{DType} will be implicitly converted to this \lstinline{DType}. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.DType.name} }{tf.DType.name }}\label{tf.dtype.name} Returns the string name for this \lstinline{DType}. 
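The four conversion rules listed for \lstinline{tf.DType.is_compatible_with} can be restated as a small truth table in plain Python. This is a sketch, not TensorFlow's implementation: \lstinline{ToyDType} is a hypothetical stand-in, and the rule for differing base types (never compatible) is an assumption, since the listed rules only cover a single type \lstinline{T}.

```python
from collections import namedtuple

# Hypothetical stand-in for a DType: a base type name plus a reference flag.
ToyDType = namedtuple("ToyDType", ["base", "is_ref"])


def is_compatible_with(this, other):
    """Mirror the four listed rules for a shared base type.

    Assumption (not stated in the rules): different base types are
    never compatible.
    """
    if this.base != other.base:
        return False
    # A reference type only accepts another reference type;
    # a non-reference type accepts both.
    return other.is_ref or not this.is_ref


t = ToyDType("float32", is_ref=False)
t_ref = ToyDType("float32", is_ref=True)

print(is_compatible_with(t, t))          # rule 1
print(is_compatible_with(t, t_ref))      # rule 2
print(is_compatible_with(t_ref, t))      # rule 3
print(is_compatible_with(t_ref, t_ref))  # rule 4
```

Only the third call returns \lstinline{False}, matching the one \lstinline{False} entry in the listed rules.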
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.DType.base_dtype} }{tf.DType.base_dtype }}\label{tf.dtype.baseux5fdtype} Returns a non-reference \lstinline{DType} based on this \lstinline{DType}. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.DType.is_ref_dtype} }{tf.DType.is_ref_dtype }}\label{tf.dtype.isux5frefux5fdtype} Returns \lstinline{True} if this \lstinline{DType} represents a reference type. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.DType.as_ref} }{tf.DType.as_ref }}\label{tf.dtype.asux5fref} Returns a reference \lstinline{DType} based on this \lstinline{DType}. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.DType.is_integer} }{tf.DType.is_integer }}\label{tf.dtype.isux5finteger} Returns whether this is a (non-quantized) integer type. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.DType.is_quantized} }{tf.DType.is_quantized }}\label{tf.dtype.isux5fquantized} Returns whether this is a quantized data type. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.DType.as_numpy_dtype} }{tf.DType.as_numpy_dtype }}\label{tf.dtype.asux5fnumpyux5fdtype} Returns a \lstinline{numpy.dtype} based on this \lstinline{DType}. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.DType.as_datatype_enum} }{tf.DType.as_datatype_enum }}\label{tf.dtype.asux5fdatatypeux5fenum} Returns a \lstinline{types_pb2.DataType} enum value based on this \lstinline{DType}. 
\paragraph{Other Methods }\label{other-methods-2} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.DType.__init__(type_enum)} }{tf.DType.__init__(type_enum) }}\label{tf.dtype.ux5fux5finitux5fux5ftypeux5fenum} Creates a new \lstinline{DType}. NOTE: In normal circumstances, you should not need to construct a \lstinline{DType} object directly. Instead, use the \lstinline{tf.as_dtype()} function. \subparagraph{Args: }\label{args-19} \begin{itemize} \tightlist \item \lstinline{type_enum}: A \lstinline{types_pb2.DataType} enum value. \end{itemize} \subparagraph{Raises: }\label{raises-10} \begin{itemize} \tightlist \item \lstinline{TypeError}: If \lstinline{type_enum} is not a valid \lstinline{types_pb2.DataType} value. \end{itemize} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.DType.max} }{tf.DType.max }}\label{tf.dtype.max} Returns the maximum representable value in this data type. \subparagraph{Raises: }\label{raises-11} \begin{itemize} \tightlist \item \lstinline{TypeError}: if this is a non-numeric, unordered, or quantized type. \end{itemize} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.DType.min} }{tf.DType.min }}\label{tf.dtype.min} Returns the minimum representable value in this data type. \subparagraph{Raises: }\label{raises-12} \begin{itemize} \tightlist \item \lstinline{TypeError}: if this is a non-numeric, unordered, or quantized type. \end{itemize} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\lstinline{tf.as_dtype(type_value)} }{tf.as_dtype(type_value) }}\label{tf.asux5fdtypetypeux5fvalue} Converts the given \lstinline{type_value} to a \lstinline{DType}. \subparagraph{Args: }\label{args-20} \begin{itemize} \tightlist \item \lstinline{type_value}: A value that can be converted to a \lstinline{tf.DType} object.
This may currently be a \lstinline{tf.DType} object, a \href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/types.proto}{\lstinline{DataType} enum}, a string type name, or a \lstinline{numpy.dtype}. \end{itemize} \subparagraph{Returns: }\label{returns-23} A \lstinline{DType} corresponding to \lstinline{type_value}. \subparagraph{Raises: }\label{raises-13} \begin{itemize} \tightlist \item \lstinline{TypeError}: If \lstinline{type_value} cannot be converted to a \lstinline{DType}. \end{itemize} \subsection{Utility functions }\label{utility-functions} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\lstinline{tf.device(dev)} }{tf.device(dev) }}\label{tf.devicedev} Wrapper for \lstinline{Graph.device()} using the default graph. See \href{../../api_docs/python/framework.md\#Graph.device}{\lstinline{Graph.device()}} for more details. \subparagraph{Args: }\label{args-21} \begin{itemize} \tightlist \item \lstinline{device_name_or_function}: The device name or function to use in the context. \end{itemize} \subparagraph{Returns: }\label{returns-24} A context manager that specifies the default device to use for newly created ops. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\lstinline{tf.name_scope(name)} }{tf.name_scope(name) }}\label{tf.nameux5fscopename} Wrapper for \lstinline{Graph.name_scope()} using the default graph. See \href{../../api_docs/python/framework.md\#Graph.name_scope}{\lstinline{Graph.name_scope()}} for more details. \subparagraph{Args: }\label{args-22} \begin{itemize} \tightlist \item \lstinline{name}: A name for the scope. \end{itemize} \subparagraph{Returns: }\label{returns-25} A context manager that installs \lstinline{name} as a new name scope in the default graph.
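How a name scope shapes op names can be sketched in plain Python. \lstinline{ToyGraph} is a hypothetical class, not part of TensorFlow, and it models a single behaviour only: names created inside nested scopes are prefixed with the enclosing scope names, joined by ``/''. Uniquifying repeated names and every other detail of the real \lstinline{Graph} are deliberately left out.

```python
from contextlib import contextmanager


class ToyGraph:
    """Hypothetical model of scoped naming; not a TensorFlow class."""

    def __init__(self):
        self._scopes = []  # stack of currently active scope names

    @contextmanager
    def name_scope(self, name):
        """Push `name` for the duration of the `with` block."""
        self._scopes.append(name)
        try:
            yield "/".join(self._scopes)
        finally:
            self._scopes.pop()

    def op_name(self, name):
        """Return `name` prefixed by all active scopes."""
        return "/".join(self._scopes + [name])


g = ToyGraph()
with g.name_scope("layer1"):
    with g.name_scope("weights") as scope:
        inner = g.op_name("MatMul")
outer = g.op_name("MatMul")

print(scope)
print(inner)
print(outer)
```

Once both \lstinline{with} blocks have exited, the scope stack is empty again, so \lstinline{outer} carries no prefix.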
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\lstinline{tf.control_dependencies(control_inputs)} }{tf.control_dependencies(control_inputs) }}\label{tf.controlux5fdependenciescontrolux5finputs} Wrapper for \lstinline{Graph.control_dependencies()} using the default graph. See \href{../../api_docs/python/framework.md\#Graph.control_dependencies}{\lstinline{Graph.control_dependencies()}} for more details. \subparagraph{Args: }\label{args-23} \begin{itemize} \tightlist \item \lstinline{control_inputs}: A list of \lstinline{Operation} or \lstinline{Tensor} objects, which must be executed or computed before running the operations defined in the context. \end{itemize} \subparagraph{Returns: }\label{returns-26} A context manager that specifies control dependencies for all operations constructed within the context. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\lstinline{tf.convert_to_tensor(value, dtype=None, name=None)} }{tf.convert_to_tensor(value, dtype=None, name=None) }}\label{tf.convertux5ftoux5ftensorvalue-dtypenone-namenone} Converts the given \lstinline{value} to a \lstinline{Tensor}. This function converts Python objects of various types to \lstinline{Tensor} objects. It accepts \lstinline{Tensor} objects, numpy arrays, Python lists, and Python scalars. 
For example:

\begin{Shaded}
\begin{Highlighting}[]
\ImportTok{import} \NormalTok{numpy }\ImportTok{as} \NormalTok{np}
\NormalTok{array }\OperatorTok{=} \NormalTok{np.random.rand(}\DecValTok{32}\NormalTok{, }\DecValTok{100}\NormalTok{, }\DecValTok{100}\NormalTok{)}

\KeywordTok{def} \NormalTok{my_func(arg):}
  \NormalTok{arg }\OperatorTok{=} \NormalTok{tf.convert_to_tensor(arg, dtype}\OperatorTok{=}\NormalTok{tf.float32)}
  \ControlFlowTok{return} \NormalTok{tf.matmul(arg, arg) }\OperatorTok{+} \NormalTok{arg}

\CommentTok{# The following calls are equivalent.}
\NormalTok{value_1 }\OperatorTok{=} \NormalTok{my_func(tf.constant([[}\FloatTok{1.0}\NormalTok{, }\FloatTok{2.0}\NormalTok{], [}\FloatTok{3.0}\NormalTok{, }\FloatTok{4.0}\NormalTok{]]))}
\NormalTok{value_2 }\OperatorTok{=} \NormalTok{my_func([[}\FloatTok{1.0}\NormalTok{, }\FloatTok{2.0}\NormalTok{], [}\FloatTok{3.0}\NormalTok{, }\FloatTok{4.0}\NormalTok{]])}
\NormalTok{value_3 }\OperatorTok{=} \NormalTok{my_func(np.array([[}\FloatTok{1.0}\NormalTok{, }\FloatTok{2.0}\NormalTok{], [}\FloatTok{3.0}\NormalTok{, }\FloatTok{4.0}\NormalTok{]], dtype}\OperatorTok{=}\NormalTok{np.float32))}
\end{Highlighting}
\end{Shaded}

This function can be useful when composing a new operation in Python (such as \lstinline{my_func} in the example above). All standard Python op constructors apply this function to each of their Tensor-valued inputs, which allows those ops to accept numpy arrays, Python lists, and scalars in addition to \lstinline{Tensor} objects. \subparagraph{Args: }\label{args-24} \begin{itemize} \tightlist \item \lstinline{value}: An object whose type has a registered \lstinline{Tensor} conversion function. \item \lstinline{dtype}: Optional element type for the returned tensor. If missing, the type is inferred from the type of \lstinline{value}. \item \lstinline{name}: Optional name to use if a new \lstinline{Tensor} is created.
\end{itemize} \subparagraph{Returns: }\label{returns-27} A \lstinline{Tensor} based on \lstinline{value}. \subparagraph{Raises: }\label{raises-14} \begin{itemize} \tightlist \item \lstinline{TypeError}: If no conversion function is registered for \lstinline{value}. \item \lstinline{RuntimeError}: If a registered conversion function returns an invalid value. \end{itemize} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\lstinline{tf.get_default_graph()} }{tf.get_default_graph() }}\label{tf.getux5fdefaultux5fgraph} Returns the default graph for the current thread. The returned graph will be the innermost graph on which a \lstinline{Graph.as_default()} context has been entered, or a global default graph if none has been explicitly created. \emph{N.B.} The default graph is a property of the current thread. If you create a new thread, and wish to use the default graph in that thread, you must explicitly add a \lstinline{with g.as_default():} in that thread's function. \subparagraph{Returns: }\label{returns-28} The default \lstinline{Graph} being used in the current thread. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\lstinline{tf.import_graph_def(graph_def, input_map=None, return_elements=None, name=None, op_dict=None)} }{tf.import_graph_def(graph_def, input_map=None, return_elements=None, name=None, op_dict=None) }}\label{tf.importux5fgraphux5fdefgraphux5fdef-inputux5fmapnone-returnux5felementsnone-namenone-opux5fdictnone} Imports the TensorFlow graph in \lstinline{graph_def} into the Python \lstinline{Graph}. 
This function provides a way to import a serialized TensorFlow \href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/graph.proto}{\lstinline{GraphDef}} protocol buffer, and extract individual objects in the \lstinline{GraphDef} as \protect\hyperlink{Tensor}{\lstinline{Tensor}} and \protect\hyperlink{Operation}{\lstinline{Operation}} objects. See \protect\hyperlink{Graph.asux5fgraphux5fdef}{\lstinline{Graph.as_graph_def()}} for a way to create a \lstinline{GraphDef} proto. \subparagraph{Args: }\label{args-25} \begin{itemize} \tightlist \item \lstinline{graph_def}: A \lstinline{GraphDef} proto containing operations to be imported into the default graph. \item \lstinline{input_map}: A dictionary mapping input names (as strings) in \lstinline{graph_def} to \lstinline{Tensor} objects. The values of the named input tensors in the imported graph will be re-mapped to the respective \lstinline{Tensor} values. \item \lstinline{return_elements}: A list of strings containing operation names in \lstinline{graph_def} that will be returned as \lstinline{Operation} objects; and/or tensor names in \lstinline{graph_def} that will be returned as \lstinline{Tensor} objects. \item \lstinline{name}: (Optional.) A prefix that will be prepended to the names in \lstinline{graph_def}. Defaults to \lstinline{"import"}. \item \lstinline{op_dict}: (Optional.) A dictionary mapping op type names to \lstinline{OpDef} protos. Must contain an \lstinline{OpDef} proto for each op type named in \lstinline{graph_def}. If omitted, uses the \lstinline{OpDef} protos registered in the global registry. \end{itemize} \subparagraph{Returns: }\label{returns-29} A list of \lstinline{Operation} and/or \lstinline{Tensor} objects from the imported graph, corresponding to the names in \lstinline{return_elements}.
\subparagraph{Raises: }\label{raises-15} \begin{itemize} \tightlist \item \lstinline{TypeError}: If \lstinline{graph_def} is not a \lstinline{GraphDef} proto, \lstinline{input_map} is not a dictionary mapping strings to \lstinline{Tensor} objects, or \lstinline{return_elements} is not a list of strings. \item \lstinline{ValueError}: If \lstinline{input_map} or \lstinline{return_elements} contains names that do not appear in \lstinline{graph_def}, or \lstinline{graph_def} is not well-formed (e.g.~it refers to an unknown tensor). \end{itemize} \subsection{Graph collections }\label{graph-collections} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\lstinline{tf.add_to_collection(name, value)} }{tf.add_to_collection(name, value) }}\label{tf.addux5ftoux5fcollectionname-value} Wrapper for \lstinline{Graph.add_to_collection()} using the default graph. See \href{../../api_docs/python/framework.md\#Graph.add_to_collection}{\lstinline{Graph.add_to_collection()}} for more details. \subparagraph{Args: }\label{args-26} \begin{itemize} \tightlist \item \lstinline{name}: The key for the collection. For example, the \lstinline{GraphKeys} class contains many standard names for collections. \item \lstinline{value}: The value to add to the collection. \end{itemize} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\lstinline{tf.get_collection(key, scope=None)} }{tf.get_collection(key, scope=None) }}\label{tf.getux5fcollectionkey-scopenone} Wrapper for \lstinline{Graph.get_collection()} using the default graph. See \href{../../api_docs/python/framework.md\#Graph.get_collection}{\lstinline{Graph.get_collection()}} for more details. \subparagraph{Args: }\label{args-27} \begin{itemize} \tightlist \item \lstinline{key}: The key for the collection. For example, the \lstinline{GraphKeys} class contains many standard names for collections. \item \lstinline{scope}: (Optional.)
If supplied, the resulting list is filtered to include only items whose name begins with this string. \end{itemize} \subparagraph{Returns: }\label{returns-30} The list of values in the collection with the given \lstinline{key}, or an empty list if no value has been added to that collection. The list contains the values in the order in which they were collected. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\lstinline{class tf.GraphKeys} }{class tf.GraphKeys }}\label{class-tf.graphkeys} Standard names to use for graph collections. The standard library uses various well-known names to collect and retrieve values associated with a graph. For example, the \lstinline{tf.train.Optimizer} subclasses default to optimizing the variables collected under \lstinline{tf.GraphKeys.TRAINABLE_VARIABLES} if none is specified, but it is also possible to pass an explicit list of variables. The following standard keys are defined: \begin{itemize} \tightlist \item \lstinline{VARIABLES}: the \lstinline{Variable} objects that comprise a model, and must be saved and restored together. See \href{../../api_docs/python/state_ops.md\#all_variables}{\lstinline{tf.all_variables()}} for more details. \item \lstinline{TRAINABLE_VARIABLES}: the subset of \lstinline{Variable} objects that will be trained by an optimizer. See \href{../../api_docs/python/state_ops.md\#trainable_variables}{\lstinline{tf.trainable_variables()}} for more details. \item \lstinline{SUMMARIES}: the summary \lstinline{Tensor} objects that have been created in the graph. See \href{../../api_docs/python/train.md\#merge_all_summaries}{\lstinline{tf.merge_all_summaries()}} for more details. \item \lstinline{QUEUE_RUNNERS}: the \lstinline{QueueRunner} objects that are used to produce input for a computation. See \href{../../api_docs/python/train.md\#start_queue_runners}{\lstinline{tf.start_queue_runners()}} for more details.
\end{itemize} \subsection{Defining new operations }\label{defining-new-operations} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\lstinline{class tf.RegisterGradient} }{class tf.RegisterGradient }}\label{class-tf.registergradient} A decorator for registering the gradient function for an op type. This decorator is only used when defining a new op type. For an op with \lstinline{m} inputs and \lstinline{n} outputs, the gradient function is a function that takes the original \lstinline{Operation} and \lstinline{n} \lstinline{Tensor} objects (representing the gradients with respect to each output of the op), and returns \lstinline{m} \lstinline{Tensor} objects (representing the partial gradients with respect to each input of the op). For example, assuming that operations of type \lstinline{"Sub"} take two inputs \lstinline{x} and \lstinline{y}, and return a single output \lstinline{x - y}, the following gradient function would be registered:
\begin{Shaded}
\begin{Highlighting}[]
\AttributeTok{@tf.RegisterGradient}\NormalTok{(}\StringTok{"Sub"}\NormalTok{)}
\KeywordTok{def} \NormalTok{_sub_grad(unused_op, grad):}
  \CommentTok{# d(x - y)/dx = 1 and d(x - y)/dy = -1.}
  \ControlFlowTok{return} \NormalTok{grad, tf.neg(grad)}
\end{Highlighting}
\end{Shaded}
The decorator argument \lstinline{op_type} is the string type of an operation. This corresponds to the \lstinline{OpDef.name} field for the proto that defines the operation. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.RegisterGradient.__init__(op_type)} }{tf.RegisterGradient.__init__(op_type) }}\label{tf.registergradient.ux5fux5finitux5fux5fopux5ftype} Creates a new decorator with \lstinline{op_type} as the Operation type. \subparagraph{Args: }\label{args-28} \begin{itemize} \tightlist \item \lstinline{op_type}: The string type of an operation. This corresponds to the \lstinline{OpDef.name} field for the proto that defines the operation.
\end{itemize} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\lstinline{tf.NoGradient(op_type)} }{tf.NoGradient(op_type) }}\label{tf.nogradientopux5ftype} Specifies that ops of type \lstinline{op_type} do not have a defined gradient. This function is only used when defining a new op type. It may be used for ops such as \lstinline{tf.size()} that are not differentiable. For example: \begin{Shaded} \begin{Highlighting}[] \NormalTok{tf.NoGradient(}\StringTok{"Size"}\NormalTok{)} \end{Highlighting} \end{Shaded} \subparagraph{Args: }\label{args-29} \begin{itemize} \tightlist \item \lstinline{op_type}: The string type of an operation. This corresponds to the \lstinline{OpDef.name} field for the proto that defines the operation. \end{itemize} \subparagraph{Raises: }\label{raises-16} \begin{itemize} \tightlist \item \lstinline{TypeError}: If \lstinline{op_type} is not a string. \end{itemize} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\lstinline{class tf.RegisterShape} }{class tf.RegisterShape }}\label{class-tf.registershape} A decorator for registering the shape function for an op type. This decorator is only used when defining a new op type. A shape function is a function from an \lstinline{Operation} object to a list of \lstinline{TensorShape} objects, with one \lstinline{TensorShape} for each output of the operation. 
For example, assuming that operations of type \lstinline{"Sub"} take two inputs \lstinline{x} and \lstinline{y}, and return a single output \lstinline{x - y}, all with the same shape, the following shape function would be registered:
\begin{Shaded}
\begin{Highlighting}[]
\AttributeTok{@tf.RegisterShape}\NormalTok{(}\StringTok{"Sub"}\NormalTok{)}
\KeywordTok{def} \NormalTok{_sub_shape(op):}
  \ControlFlowTok{return} \NormalTok{[op.inputs[}\DecValTok{0}\NormalTok{].get_shape().merge_with(op.inputs[}\DecValTok{1}\NormalTok{].get_shape())]}
\end{Highlighting}
\end{Shaded}
The decorator argument \lstinline{op_type} is the string type of an operation. This corresponds to the \lstinline{OpDef.name} field for the proto that defines the operation. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.RegisterShape.__init__(op_type)} }{tf.RegisterShape.__init__(op_type) }}\label{tf.registershape.ux5fux5finitux5fux5fopux5ftype} Saves \lstinline{op_type} as the Operation type. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\lstinline{class tf.TensorShape} }{class tf.TensorShape }}\label{class-tf.tensorshape} Represents the shape of a \lstinline{Tensor}. A \lstinline{TensorShape} represents a possibly-partial shape specification for a \lstinline{Tensor}. It may be one of the following: \begin{itemize} \tightlist \item \emph{Fully-known shape:} has a known number of dimensions and a known size for each dimension. \item \emph{Partially-known shape:} has a known number of dimensions, and an unknown size for one or more dimensions. \item \emph{Unknown shape:} has an unknown number of dimensions, and an unknown size in all dimensions. \end{itemize} If a tensor is produced by an operation of type \lstinline{"Foo"}, its shape may be inferred if there is a registered shape function for \lstinline{"Foo"}. See \href{../../api_docs/python/framework.md\#RegisterShape}{\lstinline{tf.RegisterShape()}} for details of shape functions and how to register them.
Alternatively, the shape may be set explicitly using \href{../../api_docs/python/framework.md\#Tensor.set_shape}{\lstinline{Tensor.set_shape()}}. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.TensorShape.merge_with(other)} }{tf.TensorShape.merge_with(other) }}\label{tf.tensorshape.mergeux5fwithother} Returns a \lstinline{TensorShape} combining the information in \lstinline{self} and \lstinline{other}. The dimensions in \lstinline{self} and \lstinline{other} are merged elementwise, according to the rules defined for \lstinline{Dimension.merge_with()}. \subparagraph{Args: }\label{args-30} \begin{itemize} \tightlist \item \lstinline{other}: Another \lstinline{TensorShape}. \end{itemize} \subparagraph{Returns: }\label{returns-31} A \lstinline{TensorShape} containing the combined information of \lstinline{self} and \lstinline{other}. \subparagraph{Raises: }\label{raises-17} \begin{itemize} \tightlist \item \lstinline{ValueError}: If \lstinline{self} and \lstinline{other} are not compatible. \end{itemize} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.TensorShape.concatenate(other)} }{tf.TensorShape.concatenate(other) }}\label{tf.tensorshape.concatenateother} Returns the concatenation of the dimensions in \lstinline{self} and \lstinline{other}. \emph{N.B.} If either \lstinline{self} or \lstinline{other} is completely unknown, concatenation will discard information about the other shape. In the future, we might support concatenation that preserves this information for use with slicing. \subparagraph{Args: }\label{args-31} \begin{itemize} \tightlist \item \lstinline{other}: Another \lstinline{TensorShape}. \end{itemize} \subparagraph{Returns: }\label{returns-32} A \lstinline{TensorShape} whose dimensions are the concatenation of the dimensions in \lstinline{self} and \lstinline{other}.
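The merging and concatenation rules described for \lstinline{merge_with} and \lstinline{concatenate} can be illustrated without TensorFlow. The following is a minimal sketch under stated assumptions, not the library's implementation: a shape is modelled as a Python list whose entries are integers or \lstinline{None} (unknown size), a completely unknown shape is modelled as \lstinline{None}, and the helper names \lstinline{merge_dims}, \lstinline{merge_shapes} and \lstinline{concatenate_shapes} are hypothetical.

```python
def merge_dims(a, b):
    """Merge two dimension sizes, where None means 'unknown'."""
    if a is None:
        return b
    if b is None:
        return a
    if a != b:
        raise ValueError("Dimensions %r and %r are not compatible" % (a, b))
    return a


def merge_shapes(self_shape, other_shape):
    """Elementwise merge of two shapes; None stands for an unknown shape."""
    if self_shape is None:
        return other_shape
    if other_shape is None:
        return self_shape
    if len(self_shape) != len(other_shape):
        raise ValueError("Shapes have different ranks")
    return [merge_dims(a, b) for a, b in zip(self_shape, other_shape)]


def concatenate_shapes(self_shape, other_shape):
    """Concatenate dimensions; an unknown operand makes the result unknown."""
    if self_shape is None or other_shape is None:
        return None
    return list(self_shape) + list(other_shape)


# Each operand contributes the dimension the other one is missing.
merged = merge_shapes([32, None], [None, 784])
# Known ranks are simply appended, unknown sizes included.
joined = concatenate_shapes([32, None], [3])
# A completely unknown operand discards what is known about the other one.
discarded = concatenate_shapes(None, [3])
print(merged, joined, discarded)
```

In this sketch, merging only ever adds information or fails, whereas concatenation with a completely unknown shape loses information, which matches the \emph{N.B.} in the description of \lstinline{concatenate}.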
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.TensorShape.ndims} }{tf.TensorShape.ndims }}\label{tf.tensorshape.ndims} Returns the rank of this shape, or None if it is unspecified. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.TensorShape.dims} }{tf.TensorShape.dims }}\label{tf.tensorshape.dims} Returns a list of Dimensions, or None if the shape is unspecified. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.TensorShape.as_list()} }{tf.TensorShape.as_list() }}\label{tf.tensorshape.asux5flist} Returns a list of integers or None for each dimension. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.TensorShape.is_compatible_with(other)} }{tf.TensorShape.is_compatible_with(other) }}\label{tf.tensorshape.isux5fcompatibleux5fwithother} Returns True iff \lstinline{self} is compatible with \lstinline{other}. Two possibly-partially-defined shapes are compatible if there exists a fully-defined shape that both shapes can represent. Thus, compatibility allows the shape inference code to reason about partially-defined shapes. For example: \begin{itemize} \item TensorShape(None) is compatible with all shapes. \item TensorShape({[}None, None{]}) is compatible with all two-dimensional shapes, such as TensorShape({[}32, 784{]}), and also TensorShape(None). It is not compatible with, for example, TensorShape({[}None{]}) or TensorShape({[}None, None, None{]}). \item TensorShape({[}32, None{]}) is compatible with all two-dimensional shapes with size 32 in the 0th dimension, and also TensorShape({[}None, None{]}) and TensorShape(None). It is not compatible with, for example, TensorShape({[}32{]}), TensorShape({[}32, None, 1{]}) or TensorShape({[}64, None{]}). 
\item TensorShape({[}32, 784{]}) is compatible with itself, and also TensorShape({[}32, None{]}), TensorShape({[}None, 784{]}), TensorShape({[}None, None{]}) and TensorShape(None). It is not compatible with, for example, TensorShape({[}32, 1, 784{]}) or TensorShape({[}None{]}). \end{itemize} The compatibility relation is reflexive and symmetric, but not transitive. For example, TensorShape({[}32, 784{]}) is compatible with TensorShape(None), and TensorShape(None) is compatible with TensorShape({[}4, 4{]}), but TensorShape({[}32, 784{]}) is not compatible with TensorShape({[}4, 4{]}). \subparagraph{Args: }\label{args-32} \begin{itemize} \tightlist \item \lstinline{other}: Another TensorShape. \end{itemize} \subparagraph{Returns: }\label{returns-33} True iff \lstinline{self} is compatible with \lstinline{other}. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.TensorShape.is_fully_defined()} }{tf.TensorShape.is_fully_defined() }}\label{tf.tensorshape.isux5ffullyux5fdefined} Returns True iff \lstinline{self} is fully defined in every dimension. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.TensorShape.with_rank(rank)} }{tf.TensorShape.with_rank(rank) }}\label{tf.tensorshape.withux5frankrank} Returns a shape based on \lstinline{self} with the given rank. This method promotes a completely unknown shape to one with a known rank. \subparagraph{Args: }\label{args-33} \begin{itemize} \tightlist \item \lstinline{rank}: An integer. \end{itemize} \subparagraph{Returns: }\label{returns-34} A shape that is at least as specific as \lstinline{self} with the given rank. \subparagraph{Raises: }\label{raises-18} \begin{itemize} \tightlist \item \lstinline{ValueError}: If \lstinline{self} does not represent a shape with the given \lstinline{rank}. 
\end{itemize} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.TensorShape.with_rank_at_least(rank)} }{tf.TensorShape.with_rank_at_least(rank) }}\label{tf.tensorshape.withux5frankux5fatux5fleastrank} Returns a shape based on \lstinline{self} with at least the given rank. \subparagraph{Args: }\label{args-34} \begin{itemize} \tightlist \item \lstinline{rank}: An integer. \end{itemize} \subparagraph{Returns: }\label{returns-35} A shape that is at least as specific as \lstinline{self} with at least the given rank. \subparagraph{Raises: }\label{raises-19} \begin{itemize} \tightlist \item \lstinline{ValueError}: If \lstinline{self} does not represent a shape with at least the given \lstinline{rank}. \end{itemize} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.TensorShape.with_rank_at_most(rank)} }{tf.TensorShape.with_rank_at_most(rank) }}\label{tf.tensorshape.withux5frankux5fatux5fmostrank} Returns a shape based on \lstinline{self} with at most the given rank. \subparagraph{Args: }\label{args-35} \begin{itemize} \tightlist \item \lstinline{rank}: An integer. \end{itemize} \subparagraph{Returns: }\label{returns-36} A shape that is at least as specific as \lstinline{self} with at most the given rank. \subparagraph{Raises: }\label{raises-20} \begin{itemize} \tightlist \item \lstinline{ValueError}: If \lstinline{self} does not represent a shape with at most the given \lstinline{rank}. \end{itemize} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.TensorShape.assert_has_rank(rank)} }{tf.TensorShape.assert_has_rank(rank) }}\label{tf.tensorshape.assertux5fhasux5frankrank} Raises an exception if \lstinline{self} is not compatible with the given \lstinline{rank}. \subparagraph{Args: }\label{args-36} \begin{itemize} \tightlist \item \lstinline{rank}: An integer. 
\end{itemize} \subparagraph{Raises: }\label{raises-21} \begin{itemize} \tightlist \item \lstinline{ValueError}: If \lstinline{self} does not represent a shape with the given \lstinline{rank}. \end{itemize} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.TensorShape.assert_same_rank(other)} }{tf.TensorShape.assert_same_rank(other) }}\label{tf.tensorshape.assertux5fsameux5frankother} Raises an exception if \lstinline{self} and \lstinline{other} do not have compatible ranks. \subparagraph{Args: }\label{args-37} \begin{itemize} \tightlist \item \lstinline{other}: Another \lstinline{TensorShape}. \end{itemize} \subparagraph{Raises: }\label{raises-22} \begin{itemize} \tightlist \item \lstinline{ValueError}: If \lstinline{self} and \lstinline{other} do not represent shapes with the same rank. \end{itemize} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.TensorShape.assert_is_compatible_with(other)} }{tf.TensorShape.assert_is_compatible_with(other) }}\label{tf.tensorshape.assertux5fisux5fcompatibleux5fwithother} Raises exception if \lstinline{self} and \lstinline{other} do not represent the same shape. This method can be used to assert that there exists a shape that both \lstinline{self} and \lstinline{other} represent. \subparagraph{Args: }\label{args-38} \begin{itemize} \tightlist \item \lstinline{other}: Another TensorShape. \end{itemize} \subparagraph{Raises: }\label{raises-23} \begin{itemize} \tightlist \item \lstinline{ValueError}: If \lstinline{self} and \lstinline{other} do not represent the same shape. \end{itemize} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.TensorShape.assert_is_fully_defined()} }{tf.TensorShape.assert_is_fully_defined() }}\label{tf.tensorshape.assertux5fisux5ffullyux5fdefined} Raises an exception if \lstinline{self} is not fully defined in every dimension. 
\subparagraph{Raises: }\label{raises-24} \begin{itemize} \tightlist \item \lstinline{ValueError}: If \lstinline{self} does not have a known value for every dimension. \end{itemize} \paragraph{Other Methods }\label{other-methods-3} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.TensorShape.__init__(dims)} }{tf.TensorShape.__init__(dims) }}\label{tf.tensorshape.ux5fux5finitux5fux5fdims} Creates a new TensorShape with the given dimensions. \subparagraph{Args: }\label{args-39} \begin{itemize} \tightlist \item \lstinline{dims}: A list of Dimensions, or None if the shape is unspecified. \item \lstinline{DEPRECATED}: A single integer is treated as a singleton list. \end{itemize} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.TensorShape.as_dimension_list()} }{tf.TensorShape.as_dimension_list() }}\label{tf.tensorshape.asux5fdimensionux5flist} DEPRECATED: use \lstinline{as_list()}. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.TensorShape.num_elements()} }{tf.TensorShape.num_elements() }}\label{tf.tensorshape.numux5felements} Returns the total number of elements, or \lstinline{None} for incomplete shapes. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\lstinline{class tf.Dimension} }{class tf.Dimension }}\label{class-tf.dimension} Represents the value of one dimension in a TensorShape. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.Dimension.__init__(value)} }{tf.Dimension.__init__(value) }}\label{tf.dimension.ux5fux5finitux5fux5fvalue} Creates a new Dimension with the given value.
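To make the behaviour of a Dimension concrete, here is a small stand-in class written in plain Python. It is an illustrative sketch, not TensorFlow's class: the name \lstinline{SketchDimension} is hypothetical, and it models only the \lstinline{value}, \lstinline{is_compatible_with} and \lstinline{merge_with} behaviour documented for \lstinline{tf.Dimension} in this section.

```python
class SketchDimension(object):
    """Plain-Python stand-in for a dimension; value None means 'unknown'."""

    def __init__(self, value):
        self.value = None if value is None else int(value)

    def is_compatible_with(self, other):
        # An unknown dimension is compatible with every dimension;
        # two known dimensions must have the same value.
        return (self.value is None or other.value is None
                or self.value == other.value)

    def merge_with(self, other):
        if not self.is_compatible_with(other):
            raise ValueError("Dimensions %s and %s are not compatible"
                             % (self.value, other.value))
        known = other.value if self.value is None else self.value
        return SketchDimension(known)


known_32 = SketchDimension(32)
known_4 = SketchDimension(4)
unknown = SketchDimension(None)

# Compatibility is not transitive: 32 and 4 are each compatible
# with the unknown dimension, but not with each other.
first_link = known_32.is_compatible_with(unknown)
second_link = unknown.is_compatible_with(known_4)
direct = known_32.is_compatible_with(known_4)

# Merging an unknown dimension with a known one yields the known value.
merged_value = unknown.merge_with(known_32).value
print(first_link, second_link, direct, merged_value)
```

Using \lstinline{None} for the unknown value keeps the compatibility test to a single boolean expression, which is the design choice this sketch assumes.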
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.Dimension.assert_is_compatible_with(other)} }{tf.Dimension.assert_is_compatible_with(other) }}\label{tf.dimension.assertux5fisux5fcompatibleux5fwithother} Raises an exception if \lstinline{other} is not compatible with this Dimension. \subparagraph{Args: }\label{args-40} \begin{itemize} \tightlist \item \lstinline{other}: Another Dimension. \end{itemize} \subparagraph{Raises: }\label{raises-25} \begin{itemize} \tightlist \item \lstinline{ValueError}: If \lstinline{self} and \lstinline{other} are not compatible (see \lstinline{is_compatible_with}). \end{itemize} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.Dimension.is_compatible_with(other)} }{tf.Dimension.is_compatible_with(other) }}\label{tf.dimension.isux5fcompatibleux5fwithother} Returns True if \lstinline{other} is compatible with this Dimension. Two known Dimensions are compatible if they have the same value. An unknown Dimension is compatible with all other Dimensions. \subparagraph{Args: }\label{args-41} \begin{itemize} \tightlist \item \lstinline{other}: Another Dimension. \end{itemize} \subparagraph{Returns: }\label{returns-37} True if this Dimension and \lstinline{other} are compatible. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.Dimension.merge_with(other)} }{tf.Dimension.merge_with(other) }}\label{tf.dimension.mergeux5fwithother} Returns a Dimension that combines the information in \lstinline{self} and \lstinline{other}.
Dimensions are combined as follows:
\begin{verbatim}
Dimension(n)   .merge_with(Dimension(n))    == Dimension(n)
Dimension(n)   .merge_with(Dimension(None)) == Dimension(n)
Dimension(None).merge_with(Dimension(n))    == Dimension(n)
Dimension(None).merge_with(Dimension(None)) == Dimension(None)
Dimension(n)   .merge_with(Dimension(m)) raises ValueError for n != m
\end{verbatim}
\subparagraph{Args: }\label{args-42} \begin{itemize} \tightlist \item \lstinline{other}: Another Dimension. \end{itemize} \subparagraph{Returns: }\label{returns-38} A Dimension containing the combined information of \lstinline{self} and \lstinline{other}. \subparagraph{Raises: }\label{raises-26} \begin{itemize} \tightlist \item \lstinline{ValueError}: If \lstinline{self} and \lstinline{other} are not compatible (see \lstinline{is_compatible_with}). \end{itemize} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.Dimension.value} }{tf.Dimension.value }}\label{tf.dimension.value} The value of this dimension, or None if it is unknown. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\lstinline{tf.op_scope(values, name, default_name)} }{tf.op_scope(values, name, default_name) }}\label{tf.opux5fscopevalues-name-defaultux5fname} Returns a context manager for use when defining a Python op. This context manager validates that the given \lstinline{values} are from the same graph, ensures that this graph is the default graph, and pushes a name scope.
For example, to define a new Python op called \lstinline{my_op}:
\begin{Shaded}
\begin{Highlighting}[]
\KeywordTok{def} \NormalTok{my_op(a, b, c, name}\OperatorTok{=}\VariableTok{None}\NormalTok{):}
  \ControlFlowTok{with} \NormalTok{tf.op_scope([a, b, c], name, }\StringTok{"MyOp"}\NormalTok{) }\ImportTok{as} \NormalTok{scope:}
    \NormalTok{a }\OperatorTok{=} \NormalTok{tf.convert_to_tensor(a, name}\OperatorTok{=}\StringTok{"a"}\NormalTok{)}
    \NormalTok{b }\OperatorTok{=} \NormalTok{tf.convert_to_tensor(b, name}\OperatorTok{=}\StringTok{"b"}\NormalTok{)}
    \NormalTok{c }\OperatorTok{=} \NormalTok{tf.convert_to_tensor(c, name}\OperatorTok{=}\StringTok{"c"}\NormalTok{)}
    \CommentTok{# Define some computation that uses `a`, `b`, and `c`.}
    \ControlFlowTok{return} \NormalTok{foo_op(..., name}\OperatorTok{=}\NormalTok{scope)}
\end{Highlighting}
\end{Shaded}
\subparagraph{Args: }\label{args-43} \begin{itemize} \tightlist \item \lstinline{values}: The list of \lstinline{Tensor} arguments that are passed to the op function. \item \lstinline{name}: The name argument that is passed to the op function. \item \lstinline{default_name}: The default name to use if the \lstinline{name} argument is \lstinline{None}. \end{itemize} \subparagraph{Returns: }\label{returns-39} A context manager for use in defining a Python op. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\lstinline{tf.get_seed(op_seed)} }{tf.get_seed(op_seed) }}\label{tf.getux5fseedopux5fseed} Returns the local seeds an operation should use given an op-specific seed. Given an operation-specific seed, \lstinline{op_seed}, this helper function returns two seeds derived from the graph-level and op-level seeds. Many random operations use the two seeds internally so that the user can change the seed globally for a graph, or only for specific operations.
For details on how the graph-level seed interacts with op seeds, see \href{../../api_docs/python/constant_op.md\#set_random_seed}{\lstinline{set_random_seed}}. \subparagraph{Args: }\label{args-44} \begin{itemize} \tightlist \item \lstinline{op_seed}: integer. \end{itemize} \subparagraph{Returns: }\label{returns-40} A tuple of two integers that should be used for the local seed of this operation. ================================================ FILE: tex_pdf/api/python/c4s02_constant_op.tex ================================================ \section{Constants, Sequences, and Random Values} \label{const_seq_rnd} \footnote{Note: Functions taking Tensor arguments can also take anything accepted by tf.convert_to_tensor.} \subsection{Contents}\label{contents} \subsubsection{\texorpdfstring{\protect\hyperlink{AUTOGENERATED-constants--sequences--and-random-values}{Constants, Sequences, and Random Values}}{Constants, Sequences, and Random Values}}\label{constants-sequences-and-random-values-1} \begin{itemize} \tightlist \item \protect\hyperlink{AUTOGENERATED-constant-value-tensors}{Constant Value Tensors} \item \protect\hyperlink{zeros}{\texttt{tf.zeros(shape,\ dtype=tf.float32,\ name=None)}} \item \protect\hyperlink{zerosux5flike}{\texttt{tf.zeros\_like(tensor,\ dtype=None,\ name=None)}} \item \protect\hyperlink{ones}{\texttt{tf.ones(shape,\ dtype=tf.float32,\ name=None)}} \item \protect\hyperlink{onesux5flike}{\texttt{tf.ones\_like(tensor,\ dtype=None,\ name=None)}} \item \protect\hyperlink{fill}{\texttt{tf.fill(dims,\ value,\ name=None)}} \item \protect\hyperlink{constant}{\texttt{tf.constant(value,\ dtype=None,\ shape=None,\ name=\textquotesingle{}Const\textquotesingle{})}} \item \protect\hyperlink{AUTOGENERATED-sequences}{Sequences} \item \protect\hyperlink{linspace}{\texttt{tf.linspace(start,\ stop,\ num,\ name=None)}} \item \protect\hyperlink{range}{\texttt{tf.range(start,\ limit,\ delta=1,\ name=\textquotesingle{}range\textquotesingle{})}} \item 
\protect\hyperlink{AUTOGENERATED-random-tensors}{Random Tensors} \item \protect\hyperlink{AUTOGENERATED-examples-}{Examples:} \item \protect\hyperlink{randomux5fnormal}{\texttt{tf.random\_normal(shape,\ mean=0.0,\ stddev=1.0,\ dtype=tf.float32,\ seed=None,\ name=None)}} \item \protect\hyperlink{truncatedux5fnormal}{\texttt{tf.truncated\_normal(shape,\ mean=0.0,\ stddev=1.0,\ dtype=tf.float32,\ seed=None,\ name=None)}} \item \protect\hyperlink{randomux5funiform}{\texttt{tf.random\_uniform(shape,\ minval=0.0,\ maxval=1.0,\ dtype=tf.float32,\ seed=None,\ name=None)}} \item \protect\hyperlink{randomux5fshuffle}{\texttt{tf.random\_shuffle(value,\ seed=None,\ name=None)}} \item \protect\hyperlink{setux5frandomux5fseed}{\texttt{tf.set\_random\_seed(seed)}} \end{itemize} \subsection{Constant Value Tensors }\label{constant-value-tensors} TensorFlow provides several operations that you can use to generate constants. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.zeros(shape,\ dtype=tf.float32,\ name=None)} }{tf.zeros(shape, dtype=tf.float32, name=None) }}\label{tf.zerosshape-dtypetf.float32-namenone} Creates a tensor with all elements set to zero. This operation returns a tensor of type \texttt{dtype} with shape \texttt{shape} and all elements set to zero. 
For example:
\begin{Shaded}
\begin{Highlighting}[]
\NormalTok{tf.zeros([}\DecValTok{3}\NormalTok{, }\DecValTok{4}\NormalTok{], tf.int32) }\OperatorTok{==>} \NormalTok{[[}\DecValTok{0}\NormalTok{, }\DecValTok{0}\NormalTok{, }\DecValTok{0}\NormalTok{, }\DecValTok{0}\NormalTok{], [}\DecValTok{0}\NormalTok{, }\DecValTok{0}\NormalTok{, }\DecValTok{0}\NormalTok{, }\DecValTok{0}\NormalTok{], [}\DecValTok{0}\NormalTok{, }\DecValTok{0}\NormalTok{, }\DecValTok{0}\NormalTok{, }\DecValTok{0}\NormalTok{]]}
\end{Highlighting}
\end{Shaded}
\subparagraph{Args: }\label{args} \begin{itemize} \tightlist \item \texttt{shape}: Either a list of integers, or a 1-D \texttt{Tensor} of type \texttt{int32}. \item \texttt{dtype}: The type of an element in the resulting \texttt{Tensor}. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns} A \texttt{Tensor} with all elements set to zero. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.zeros\_like(tensor,\ dtype=None,\ name=None)} }{tf.zeros\_like(tensor, dtype=None, name=None) }}\label{tf.zerosux5fliketensor-dtypenone-namenone} Creates a tensor with all elements set to zero. Given a single tensor (\texttt{tensor}), this operation returns a tensor of the same type and shape as \texttt{tensor} with all elements set to zero. Optionally, you can use \texttt{dtype} to specify a new type for the returned tensor. For example:
\begin{Shaded}
\begin{Highlighting}[]
\CommentTok{# 'tensor' is [[1, 2, 3], [4, 5, 6]]}
\NormalTok{tf.zeros_like(tensor) }\OperatorTok{==>} \NormalTok{[[}\DecValTok{0}\NormalTok{, }\DecValTok{0}\NormalTok{, }\DecValTok{0}\NormalTok{], [}\DecValTok{0}\NormalTok{, }\DecValTok{0}\NormalTok{, }\DecValTok{0}\NormalTok{]]}
\end{Highlighting}
\end{Shaded}
\subparagraph{Args: }\label{args-1} \begin{itemize} \item \texttt{tensor}: A \texttt{Tensor}. \item \texttt{dtype}: A type for the returned \texttt{Tensor}.
Must be \texttt{float32}, \texttt{float64}, \texttt{int8}, \texttt{int16}, \texttt{int32}, \texttt{int64}, \texttt{uint8}, or \texttt{complex64}.
\item \texttt{name}: A name for the operation (optional).
\end{itemize}

\subparagraph{Returns: }\label{returns-1}

A \texttt{Tensor} with all elements set to zero.

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\subsubsection{\texorpdfstring{\texttt{tf.ones(shape,\ dtype=tf.float32,\ name=None)} }{tf.ones(shape, dtype=tf.float32, name=None) }}\label{tf.onesshape-dtypetf.float32-namenone}

Creates a tensor with all elements set to 1.

This operation returns a tensor of type \texttt{dtype} with shape \texttt{shape} and all elements set to 1.

For example:

\begin{Shaded}
\begin{Highlighting}[]
\NormalTok{tf.ones([}\DecValTok{2}\NormalTok{, }\DecValTok{3}\NormalTok{], tf.int32) }\OperatorTok{==>} \NormalTok{[[}\DecValTok{1}\NormalTok{, }\DecValTok{1}\NormalTok{, }\DecValTok{1}\NormalTok{], [}\DecValTok{1}\NormalTok{, }\DecValTok{1}\NormalTok{, }\DecValTok{1}\NormalTok{]]}
\end{Highlighting}
\end{Shaded}

\subparagraph{Args: }\label{args-2}

\begin{itemize}
\tightlist
\item \texttt{shape}: Either a list of integers, or a 1-D \texttt{Tensor} of type \texttt{int32}.
\item \texttt{dtype}: The type of an element in the resulting \texttt{Tensor}.
\item \texttt{name}: A name for the operation (optional).
\end{itemize}

\subparagraph{Returns: }\label{returns-2}

A \texttt{Tensor} with all elements set to 1.

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\subsubsection{\texorpdfstring{\texttt{tf.ones\_like(tensor,\ dtype=None,\ name=None)} }{tf.ones\_like(tensor, dtype=None, name=None) }}\label{tf.onesux5fliketensor-dtypenone-namenone}

Creates a tensor with all elements set to 1.

Given a single tensor (\texttt{tensor}), this operation returns a tensor of the same type and shape as \texttt{tensor} with all elements set to 1. Optionally, you can specify a new type (\texttt{dtype}) for the returned tensor.
For example: \begin{Shaded} \begin{Highlighting}[] \CommentTok{# 'tensor' is [[1, 2, 3], [4, 5, 6]]} \NormalTok{tf.ones_like(tensor) }\OperatorTok{==>} \NormalTok{[[}\DecValTok{1}\NormalTok{, }\DecValTok{1}\NormalTok{, }\DecValTok{1}\NormalTok{], [}\DecValTok{1}\NormalTok{, }\DecValTok{1}\NormalTok{, }\DecValTok{1}\NormalTok{]]} \end{Highlighting} \end{Shaded} \subparagraph{Args: }\label{args-3} \begin{itemize} \item \texttt{tensor}: A \texttt{Tensor}. \item \texttt{dtype}: A type for the returned \texttt{Tensor}. Must be \texttt{float32}, \texttt{float64}, \texttt{int8}, \texttt{int16}, \texttt{int32}, \texttt{int64}, \texttt{uint8}, or \texttt{complex64}. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-3} A \texttt{Tensor} with all elements set to 1. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.fill(dims,\ value,\ name=None)} }{tf.fill(dims, value, name=None) }}\label{tf.filldims-value-namenone} Creates a tensor filled with a scalar value. This operation creates a tensor of shape \texttt{dims} and fills it with \texttt{value}. For example: \begin{verbatim} # output tensor shape needs to be [2, 3] # so 'dims' is [2, 3] fill(dims, 9) ==> [[9, 9, 9] [9, 9, 9]] \end{verbatim} \subparagraph{Args: }\label{args-4} \begin{itemize} \tightlist \item \texttt{dims}: A \texttt{Tensor} of type \texttt{int32}. 1-D. Represents the shape of the output tensor. \item \texttt{value}: A \texttt{Tensor}. 0-D (scalar). Value to fill the returned tensor. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-4} A \texttt{Tensor}. Has the same type as \texttt{value}. 
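
The shape semantics shared by \texttt{tf.zeros}, \texttt{tf.ones}, and \texttt{tf.fill} can be sketched in plain Python. This is only an illustrative model that uses nested lists in place of tensors; the helper names \texttt{fill}, \texttt{zeros}, and \texttt{ones} below are hypothetical stand-ins and do not call TensorFlow.

```python
def fill(dims, value):
    """Return a nested list of shape `dims` with every element set to `value`.

    Hypothetical stand-in for the behaviour described above; not TensorFlow code.
    """
    if not dims:
        return value  # an empty shape denotes a 0-D (scalar) result
    # Build dims[0] independent sub-lists, each of the remaining shape.
    return [fill(dims[1:], value) for _ in range(dims[0])]


def zeros(shape):
    """All-zero nested list of the given shape."""
    return fill(shape, 0)


def ones(shape):
    """All-one nested list of the given shape."""
    return fill(shape, 1)


print(fill([2, 3], 9))  # [[9, 9, 9], [9, 9, 9]]
print(zeros([3, 4]))
print(ones([2, 3]))
```

In this model, \texttt{zeros} and \texttt{ones} are just \texttt{fill} with a fixed scalar, which mirrors how the three operations are described in this section.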
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\subsubsection{\texorpdfstring{\texttt{tf.constant(value,\ dtype=None,\ shape=None,\ name=\textquotesingle{}Const\textquotesingle{})} }{tf.constant(value, dtype=None, shape=None, name='Const') }}\label{tf.constantvalue-dtypenone-shapenone-nameconst}

Creates a constant tensor.

The resulting tensor is populated with values of type \texttt{dtype}, as specified by arguments \texttt{value} and (optionally) \texttt{shape} (see examples below).

The argument \texttt{value} can be a constant value, or a list of values of type \texttt{dtype}. If \texttt{value} is a list, then the length of the list must be less than or equal to the number of elements implied by the \texttt{shape} argument (if specified). In the case where the list length is less than the number of elements specified by \texttt{shape}, the last element in the list will be used to fill the remaining entries.

The argument \texttt{shape} is optional. If present, it specifies the dimensions of the resulting tensor. If not present, then the tensor is a scalar (0-D) if \texttt{value} is a scalar, or 1-D otherwise.

If the argument \texttt{dtype} is not specified, then the type is inferred from the type of \texttt{value}.

For example:

\begin{verbatim}
# Constant 1-D Tensor populated with value list.
tensor = tf.constant([1, 2, 3, 4, 5, 6, 7]) => [1 2 3 4 5 6 7]

# Constant 2-D tensor populated with scalar value -1.
tensor = tf.constant(-1.0, shape=[2, 3]) => [[-1. -1. -1.]
                                             [-1. -1. -1.]]
\end{verbatim}

\subparagraph{Args: }\label{args-5}

\begin{itemize}
\item \texttt{value}: A constant value (or list) of output type \texttt{dtype}.
\item \texttt{dtype}: The type of the elements of the resulting tensor.
\item \texttt{shape}: Optional dimensions of resulting tensor.
\item \texttt{name}: Optional name for the tensor.
\end{itemize}

\subparagraph{Returns: }\label{returns-5}

A Constant Tensor.
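
The padding rule for \texttt{value} and \texttt{shape} described above can be illustrated with a small plain-Python sketch. It is a model of the documented behaviour only, not TensorFlow's implementation: the function name \texttt{constant\_like} is hypothetical, nested lists stand in for tensors, row-major element order is an assumption, and \texttt{value} is assumed to be a scalar or a non-empty flat list.

```python
from functools import reduce
from operator import mul


def _reshape(flat, shape):
    """Arrange a flat list into nested lists of the given shape (row-major)."""
    if not shape:
        return flat[0]
    step = len(flat) // shape[0]
    return [_reshape(flat[i * step:(i + 1) * step], shape[1:])
            for i in range(shape[0])]


def constant_like(value, shape=None):
    """Hypothetical model of the value/shape rule; assumes positive dimensions."""
    is_list = isinstance(value, (list, tuple))
    values = list(value) if is_list else [value]
    if shape is None:
        # No shape: a list stays 1-D, a scalar stays 0-D.
        return values if is_list else value
    total = reduce(mul, shape, 1)
    if len(values) > total:
        raise ValueError("value has more elements than shape allows")
    # Repeat the last element until the shape is full.
    flat = values + [values[-1]] * (total - len(values))
    return _reshape(flat, list(shape))


print(constant_like([1, 2, 3, 4, 5, 6, 7]))  # [1, 2, 3, 4, 5, 6, 7]
print(constant_like(-1.0, shape=[2, 3]))     # [[-1.0, -1.0, -1.0], [-1.0, -1.0, -1.0]]
print(constant_like([1, 2, 3], shape=[2, 3]))  # [[1, 2, 3], [3, 3, 3]]
```

The last call shows the case the prose singles out: a three-element list with a six-element shape is padded with its final element.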
\subsection{Sequences }\label{sequences}

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\subsubsection{\texorpdfstring{\texttt{tf.linspace(start,\ stop,\ num,\ name=None)} }{tf.linspace(start, stop, num, name=None) }}\label{tf.linspacestart-stop-num-namenone}

Generates values in an interval.

A sequence of \texttt{num} evenly-spaced values is generated beginning at \texttt{start}. If \texttt{num\ \textgreater{}\ 1}, the values in the sequence increase by \texttt{(stop\ -\ start)\ /\ (num\ -\ 1)}, so that the last one is exactly \texttt{stop}.

For example:

\begin{verbatim}
tf.linspace(10.0, 12.0, 3, name="linspace") => [ 10.0  11.0  12.0]
\end{verbatim}

\subparagraph{Args: }\label{args-6}

\begin{itemize}
\tightlist
\item \texttt{start}: A \texttt{Tensor}. Must be one of the following types: \texttt{float32}, \texttt{float64}. First entry in the range.
\item \texttt{stop}: A \texttt{Tensor}. Must have the same type as \texttt{start}. Last entry in the range.
\item \texttt{num}: A \texttt{Tensor} of type \texttt{int32}. Number of values to generate.
\item \texttt{name}: A name for the operation (optional).
\end{itemize}

\subparagraph{Returns: }\label{returns-6}

A \texttt{Tensor}. Has the same type as \texttt{start}. 1-D. The generated values.

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\subsubsection{\texorpdfstring{\texttt{tf.range(start,\ limit,\ delta=1,\ name=\textquotesingle{}range\textquotesingle{})} }{tf.range(start, limit, delta=1, name='range') }}\label{tf.rangestart-limit-delta1-namerange}

Creates a sequence of integers.

This operation creates a sequence of integers that begins at \texttt{start} and extends by increments of \texttt{delta} up to but not including \texttt{limit}.
For example:

\begin{verbatim}
# 'start' is 3
# 'limit' is 18
# 'delta' is 3
tf.range(start, limit, delta) ==> [3, 6, 9, 12, 15]
\end{verbatim}

\subparagraph{Args: }\label{args-7}

\begin{itemize}
\tightlist
\item \texttt{start}: A 0-D (scalar) of type \texttt{int32}. First entry in sequence.
\item \texttt{limit}: A 0-D (scalar) of type \texttt{int32}. Upper limit of sequence, exclusive.
\item \texttt{delta}: A 0-D \texttt{Tensor} (scalar) of type \texttt{int32}. Optional. Default is 1. Number that increments \texttt{start}.
\item \texttt{name}: A name for the operation (optional).
\end{itemize}

\subparagraph{Returns: }\label{returns-7}

A 1-D \texttt{int32} \texttt{Tensor}.

\subsection{Random Tensors }\label{random-tensors}

TensorFlow has several ops that create random tensors with different distributions. The random ops are stateful, and create new random values each time they are evaluated.

The \texttt{seed} keyword argument in these functions acts in conjunction with the graph-level random seed. Changing either the graph-level seed using \href{../../api_docs/python/constant_op.md\#set_random_seed}{\texttt{set\_random\_seed}} or the op-level seed will change the underlying seed of these operations. Setting neither the graph-level nor the op-level seed results in a random seed for all operations. See \href{../../api_docs/python/constant_op.md\#set_random_seed}{\texttt{set\_random\_seed}} for details on the interaction between operation-level and graph-level random seeds.
\subsubsection{Examples: }\label{examples}

\begin{Shaded}
\begin{Highlighting}[]
\CommentTok{# Create a tensor of shape [2, 3] consisting of random normal values, with mean}
\CommentTok{# -1 and standard deviation 4.}
\NormalTok{norm }\OperatorTok{=} \NormalTok{tf.random_normal([}\DecValTok{2}\NormalTok{, }\DecValTok{3}\NormalTok{], mean}\OperatorTok{=-}\DecValTok{1}\NormalTok{, stddev}\OperatorTok{=}\DecValTok{4}\NormalTok{)}

\CommentTok{# Shuffle the first dimension of a tensor}
\NormalTok{c }\OperatorTok{=} \NormalTok{tf.constant([[}\DecValTok{1}\NormalTok{, }\DecValTok{2}\NormalTok{], [}\DecValTok{3}\NormalTok{, }\DecValTok{4}\NormalTok{], [}\DecValTok{5}\NormalTok{, }\DecValTok{6}\NormalTok{]])}
\NormalTok{shuff }\OperatorTok{=} \NormalTok{tf.random_shuffle(c)}

\CommentTok{# Each time we run these ops, different results are generated}
\NormalTok{sess }\OperatorTok{=} \NormalTok{tf.Session()}
\BuiltInTok{print} \NormalTok{sess.run(norm)}
\BuiltInTok{print} \NormalTok{sess.run(norm)}

\CommentTok{# Set an op-level seed to generate repeatable sequences across sessions.}
\CommentTok{# The first argument is the output shape, here [2, 3].}
\NormalTok{sess }\OperatorTok{=} \NormalTok{tf.Session()}
\NormalTok{norm }\OperatorTok{=} \NormalTok{tf.random_normal([}\DecValTok{2}\NormalTok{, }\DecValTok{3}\NormalTok{], seed}\OperatorTok{=}\DecValTok{1234}\NormalTok{)}
\BuiltInTok{print} \NormalTok{sess.run(norm)}
\BuiltInTok{print} \NormalTok{sess.run(norm)}
\end{Highlighting}
\end{Shaded}

Another common use of random values is the initialization of variables. Also see the \href{../../how_tos/variables/index.md}{Variables How To}.

\begin{Shaded}
\begin{Highlighting}[]
\CommentTok{# Use random uniform values in [0, 1) as the initializer for a variable of shape}
\CommentTok{# [2, 3]. 
The default type is float32.} \NormalTok{var }\OperatorTok{=} \NormalTok{tf.Variable(tf.random_uniform([}\DecValTok{2}\NormalTok{, }\DecValTok{3}\NormalTok{]), name}\OperatorTok{=}\StringTok{"var"}\NormalTok{)} \NormalTok{init }\OperatorTok{=} \NormalTok{tf.initialize_all_variables()} \NormalTok{sess }\OperatorTok{=} \NormalTok{tf.Session()} \NormalTok{sess.run(init)} \BuiltInTok{print} \NormalTok{sess.run(var)} \end{Highlighting} \end{Shaded} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.random\_normal(shape,\ mean=0.0,\ stddev=1.0,\ dtype=tf.float32,\ seed=None,\ name=None)} }{tf.random\_normal(shape, mean=0.0, stddev=1.0, dtype=tf.float32, seed=None, name=None) }}\label{tf.randomux5fnormalshape-mean0.0-stddev1.0-dtypetf.float32-seednone-namenone} Outputs random values from a normal distribution. \subparagraph{Args: }\label{args-8} \begin{itemize} \tightlist \item \texttt{shape}: A 1-D integer Tensor or Python array. The shape of the output tensor. \item \texttt{mean}: A 0-D Tensor or Python value of type \texttt{dtype}. The mean of the normal distribution. \item \texttt{stddev}: A 0-D Tensor or Python value of type \texttt{dtype}. The standard deviation of the normal distribution. \item \texttt{dtype}: The type of the output. \item \texttt{seed}: A Python integer. Used to create a random seed for the distribution. See \href{../../api_docs/python/constant_op.md\#set_random_seed}{\texttt{set\_random\_seed}} for behavior. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-8} A tensor of the specified shape filled with random normal values. 
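
The effect of a fixed \texttt{seed} can be illustrated outside TensorFlow with Python's standard \texttt{random} module. This is only a sketch of the reproducibility idea: the function below is a hypothetical stand-in that returns nested lists, it creates a fresh generator on every call, and it does not model the interaction between graph-level and op-level seeds.

```python
import random


def random_normal(shape, mean=0.0, stddev=1.0, seed=None):
    """Return a nested list of the given shape filled with Gaussian samples.

    Hypothetical plain-Python stand-in; not the TensorFlow op.
    """
    rng = random.Random(seed)  # seed=None lets Python pick an unpredictable seed

    def build(dims):
        if not dims:
            return rng.gauss(mean, stddev)
        return [build(dims[1:]) for _ in range(dims[0])]

    return build(list(shape))


a = random_normal([2, 3], mean=-1.0, stddev=4.0, seed=1234)
b = random_normal([2, 3], mean=-1.0, stddev=4.0, seed=1234)
print(a == b)  # True: the same seed reproduces the same values
```

Calling the function without a seed would generally give different values on each call, which corresponds to the unseeded case described in this section.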
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.truncated\_normal(shape,\ mean=0.0,\ stddev=1.0,\ dtype=tf.float32,\ seed=None,\ name=None)} }{tf.truncated\_normal(shape, mean=0.0, stddev=1.0, dtype=tf.float32, seed=None, name=None) }}\label{tf.truncatedux5fnormalshape-mean0.0-stddev1.0-dtypetf.float32-seednone-namenone} Outputs random values from a truncated normal distribution. The generated values follow a normal distribution with specified mean and standard deviation, except that values whose magnitude is more than 2 standard deviations from the mean are dropped and re-picked. \subparagraph{Args: }\label{args-9} \begin{itemize} \tightlist \item \texttt{shape}: A 1-D integer Tensor or Python array. The shape of the output tensor. \item \texttt{mean}: A 0-D Tensor or Python value of type \texttt{dtype}. The mean of the truncated normal distribution. \item \texttt{stddev}: A 0-D Tensor or Python value of type \texttt{dtype}. The standard deviation of the truncated normal distribution. \item \texttt{dtype}: The type of the output. \item \texttt{seed}: A Python integer. Used to create a random seed for the distribution. See \href{../../api_docs/python/constant_op.md\#set_random_seed}{\texttt{set\_random\_seed}} for behavior. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-9} A tensor of the specified shape filled with random truncated normal values. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.random\_uniform(shape,\ minval=0.0,\ maxval=1.0,\ dtype=tf.float32,\ seed=None,\ name=None)} }{tf.random\_uniform(shape, minval=0.0, maxval=1.0, dtype=tf.float32, seed=None, name=None) }}\label{tf.randomux5funiformshape-minval0.0-maxval1.0-dtypetf.float32-seednone-namenone} Outputs random values from a uniform distribution. 
The generated values follow a uniform distribution in the range \texttt{{[}minval,\ maxval)}. The lower bound \texttt{minval} is included in the range, while the upper bound \texttt{maxval} is excluded. \subparagraph{Args: }\label{args-10} \begin{itemize} \tightlist \item \texttt{shape}: A 1-D integer Tensor or Python array. The shape of the output tensor. \item \texttt{minval}: A 0-D Tensor or Python value of type \texttt{dtype}. The lower bound on the range of random values to generate. \item \texttt{maxval}: A 0-D Tensor or Python value of type \texttt{dtype}. The upper bound on the range of random values to generate. \item \texttt{dtype}: The type of the output. \item \texttt{seed}: A Python integer. Used to create a random seed for the distribution. See \href{../../api_docs/python/constant_op.md\#set_random_seed}{\texttt{set\_random\_seed}} for behavior. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-10} A tensor of the specified shape filled with random uniform values. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.random\_shuffle(value,\ seed=None,\ name=None)} }{tf.random\_shuffle(value, seed=None, name=None) }}\label{tf.randomux5fshufflevalue-seednone-namenone} Randomly shuffles a tensor along its first dimension. The tensor is shuffled along dimension 0, such that each \texttt{value{[}j{]}} is mapped to one and only one \texttt{output{[}i{]}}. 
For example, a mapping that might occur for a 3x2 tensor is:

\begin{Shaded}
\begin{Highlighting}[]
\NormalTok{[[}\DecValTok{1}\NormalTok{, }\DecValTok{2}\NormalTok{],       [[}\DecValTok{5}\NormalTok{, }\DecValTok{6}\NormalTok{],}
\NormalTok{ [}\DecValTok{3}\NormalTok{, }\DecValTok{4}\NormalTok{],  }\OperatorTok{==>}\NormalTok{   [}\DecValTok{1}\NormalTok{, }\DecValTok{2}\NormalTok{],}
\NormalTok{ [}\DecValTok{5}\NormalTok{, }\DecValTok{6}\NormalTok{]]        [}\DecValTok{3}\NormalTok{, }\DecValTok{4}\NormalTok{]]}
\end{Highlighting}
\end{Shaded}

\subparagraph{Args: }\label{args-11}

\begin{itemize}
\tightlist
\item \texttt{value}: A Tensor to be shuffled.
\item \texttt{seed}: A Python integer. Used to create a random seed for the distribution. See \href{../../api_docs/python/constant_op.md\#set_random_seed}{\texttt{set\_random\_seed}} for behavior.
\item \texttt{name}: A name for the operation (optional).
\end{itemize}

\subparagraph{Returns: }\label{returns-11}

A tensor of same shape and type as \texttt{value}, shuffled along its first dimension.

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\subsubsection{\texorpdfstring{\texttt{tf.set\_random\_seed(seed)} }{tf.set\_random\_seed(seed) }}\label{tf.setux5frandomux5fseedseed}

Sets the graph-level random seed.

Operations that rely on a random seed actually derive it from two seeds: the graph-level and operation-level seeds. This sets the graph-level seed.

Its interaction with operation-level seeds is as follows:

\begin{enumerate}
\def\labelenumi{\arabic{enumi}.}
\tightlist
\item If neither the graph-level nor the operation seed is set: A random seed is used for this op.
\item If the graph-level seed is set, but the operation seed is not: The system deterministically picks an operation seed in conjunction with the graph-level seed so that it gets a unique random sequence.
\item If the graph-level seed is not set, but the operation seed is set: A default graph-level seed and the specified operation seed are used to determine the random sequence. \item If both the graph-level and the operation seed are set: Both seeds are used in conjunction to determine the random sequence. \end{enumerate} To illustrate the user-visible effects, consider these examples: To generate different sequences across sessions, set neither graph-level nor op-level seeds: \begin{Shaded} \begin{Highlighting}[] \NormalTok{a }\OperatorTok{=} \NormalTok{tf.random_uniform([}\DecValTok{1}\NormalTok{])} \NormalTok{b }\OperatorTok{=} \NormalTok{tf.random_normal([}\DecValTok{1}\NormalTok{])} \BuiltInTok{print} \StringTok{"Session 1"} \ControlFlowTok{with} \NormalTok{tf.Session() }\ImportTok{as} \NormalTok{sess1:} \BuiltInTok{print} \NormalTok{sess1.run(a) }\CommentTok{# generates 'A1'} \BuiltInTok{print} \NormalTok{sess1.run(a) }\CommentTok{# generates 'A2'} \BuiltInTok{print} \NormalTok{sess1.run(b) }\CommentTok{# generates 'B1'} \BuiltInTok{print} \NormalTok{sess1.run(b) }\CommentTok{# generates 'B2'} \BuiltInTok{print} \StringTok{"Session 2"} \ControlFlowTok{with} \NormalTok{tf.Session() }\ImportTok{as} \NormalTok{sess2:} \BuiltInTok{print} \NormalTok{sess2.run(a) }\CommentTok{# generates 'A3'} \BuiltInTok{print} \NormalTok{sess2.run(a) }\CommentTok{# generates 'A4'} \BuiltInTok{print} \NormalTok{sess2.run(b) }\CommentTok{# generates 'B3'} \BuiltInTok{print} \NormalTok{sess2.run(b) }\CommentTok{# generates 'B4'} \end{Highlighting} \end{Shaded} To generate the same repeatable sequence for an op across sessions, set the seed for the op: \begin{Shaded} \begin{Highlighting}[] \NormalTok{a }\OperatorTok{=} \NormalTok{tf.random_uniform([}\DecValTok{1}\NormalTok{], seed}\OperatorTok{=}\DecValTok{1}\NormalTok{)} \NormalTok{b }\OperatorTok{=} \NormalTok{tf.random_normal([}\DecValTok{1}\NormalTok{])} \CommentTok{# Repeatedly running this block with the same graph will generate 
the same}
\CommentTok{# sequence of values for 'a', but different sequences of values for 'b'.}
\BuiltInTok{print} \StringTok{"Session 1"}
\ControlFlowTok{with} \NormalTok{tf.Session() }\ImportTok{as} \NormalTok{sess1:}
  \BuiltInTok{print} \NormalTok{sess1.run(a) }\CommentTok{# generates 'A1'}
  \BuiltInTok{print} \NormalTok{sess1.run(a) }\CommentTok{# generates 'A2'}
  \BuiltInTok{print} \NormalTok{sess1.run(b) }\CommentTok{# generates 'B1'}
  \BuiltInTok{print} \NormalTok{sess1.run(b) }\CommentTok{# generates 'B2'}

\BuiltInTok{print} \StringTok{"Session 2"}
\ControlFlowTok{with} \NormalTok{tf.Session() }\ImportTok{as} \NormalTok{sess2:}
  \BuiltInTok{print} \NormalTok{sess2.run(a) }\CommentTok{# generates 'A1'}
  \BuiltInTok{print} \NormalTok{sess2.run(a) }\CommentTok{# generates 'A2'}
  \BuiltInTok{print} \NormalTok{sess2.run(b) }\CommentTok{# generates 'B3'}
  \BuiltInTok{print} \NormalTok{sess2.run(b) }\CommentTok{# generates 'B4'}
\end{Highlighting}
\end{Shaded}

To make the random sequences generated by all ops be repeatable across sessions, set a graph-level seed:

\begin{Shaded}
\begin{Highlighting}[]
\NormalTok{tf.set_random_seed(}\DecValTok{1234}\NormalTok{)}
\NormalTok{a }\OperatorTok{=} \NormalTok{tf.random_uniform([}\DecValTok{1}\NormalTok{])}
\NormalTok{b }\OperatorTok{=} \NormalTok{tf.random_normal([}\DecValTok{1}\NormalTok{])}

\CommentTok{# Repeatedly running this block with the same graph will generate the same}
\CommentTok{# sequences of 'a' and 'b'.}
\BuiltInTok{print} \StringTok{"Session 1"}
\ControlFlowTok{with} \NormalTok{tf.Session() }\ImportTok{as} \NormalTok{sess1:}
  \BuiltInTok{print} \NormalTok{sess1.run(a) }\CommentTok{# generates 'A1'}
  \BuiltInTok{print} \NormalTok{sess1.run(a) }\CommentTok{# generates 'A2'}
  \BuiltInTok{print} \NormalTok{sess1.run(b) }\CommentTok{# generates 'B1'}
  \BuiltInTok{print} \NormalTok{sess1.run(b) }\CommentTok{# generates 'B2'}

\BuiltInTok{print} \StringTok{"Session 2"}
\ControlFlowTok{with} \NormalTok{tf.Session() 
}\ImportTok{as} \NormalTok{sess2:} \BuiltInTok{print} \NormalTok{sess2.run(a) }\CommentTok{# generates 'A1'} \BuiltInTok{print} \NormalTok{sess2.run(a) }\CommentTok{# generates 'A2'} \BuiltInTok{print} \NormalTok{sess2.run(b) }\CommentTok{# generates 'B1'} \BuiltInTok{print} \NormalTok{sess2.run(b) }\CommentTok{# generates 'B2'} \end{Highlighting} \end{Shaded} \subparagraph{Args: }\label{args-12} \begin{itemize} \tightlist \item \texttt{seed}: integer. \end{itemize} ================================================ FILE: tex_pdf/api/python/c4s03_state_ops.tex ================================================ \section{Variables }\label{variables} Note: Functions taking \lstinline{Tensor} arguments can also take anything accepted by \href{../../api_docs/python/framework.md\#convert_to_tensor}{\lstinline{tf.convert\_to\_tensor}}. \subsection{Contents}\label{contents} \subsubsection{\texorpdfstring{\protect\hyperlink{AUTOGENERATED-variables}{Variables}}{Variables}}\label{variables-1} \begin{itemize} \tightlist \item \protect\hyperlink{AUTOGENERATED-variables}{Variables} \item \protect\hyperlink{Variable}{\lstinline{class\ tf.Variable}} \item \protect\hyperlink{AUTOGENERATED-variable-helper-functions}{Variable helper functions} \item \protect\hyperlink{allux5fvariables}{\lstinline{tf.all\_variables()}} \item \protect\hyperlink{trainableux5fvariables}{\lstinline{tf.trainable\_variables()}} \item \protect\hyperlink{initializeux5fallux5fvariables}{\lstinline{tf.initialize\_all\_variables()}} \item \protect\hyperlink{initializeux5fvariables}{\lstinline{tf.initialize\_variables(var\_list,\ name=\textquotesingle{}init\textquotesingle{})}} \item \protect\hyperlink{assertux5fvariablesux5finitialized}{\lstinline{tf.assert\_variables\_initialized(var\_list=None)}} \item \protect\hyperlink{AUTOGENERATED-saving-and-restoring-variables}{Saving and Restoring Variables} \item \protect\hyperlink{Saver}{\lstinline{class\ tf.train.Saver}} \item 
\protect\hyperlink{latestux5fcheckpoint}{\lstinline{tf.train.latest\_checkpoint(checkpoint\_dir,\ latest\_filename=None)}} \item \protect\hyperlink{getux5fcheckpointux5fstate}{\lstinline{tf.train.get\_checkpoint\_state(checkpoint\_dir,\ latest\_filename=None)}} \item \protect\hyperlink{updateux5fcheckpointux5fstate}{\lstinline{tf.train.update\_checkpoint\_state(save\_dir,\ model\_checkpoint\_path,\ all\_model\_checkpoint\_paths=None,\ latest\_filename=None)}} \item \protect\hyperlink{AUTOGENERATED-sharing-variables}{Sharing Variables} \item \protect\hyperlink{getux5fvariable}{\lstinline{tf.get\_variable(name,\ shape=None,\ dtype=tf.float32,\ initializer=None,\ trainable=True,\ collections=None)}} \item \protect\hyperlink{getux5fvariableux5fscope}{\lstinline{tf.get\_variable\_scope()}} \item \protect\hyperlink{variableux5fscope}{\lstinline{tf.variable\_scope(name\_or\_scope,\ reuse=None,\ initializer=None)}} \item \protect\hyperlink{constantux5finitializer}{\lstinline{tf.constant\_initializer(value=0.0)}} \item \protect\hyperlink{randomux5fnormalux5finitializer}{\lstinline{tf.random\_normal\_initializer(mean=0.0,\ stddev=1.0,\ seed=None)}} \item \protect\hyperlink{truncatedux5fnormalux5finitializer}{\lstinline{tf.truncated\_normal\_initializer(mean=0.0,\ stddev=1.0,\ seed=None)}} \item \protect\hyperlink{randomux5funiformux5finitializer}{\lstinline{tf.random\_uniform\_initializer(minval=0.0,\ maxval=1.0,\ seed=None)}} \item \protect\hyperlink{uniformux5funitux5fscalingux5finitializer}{\lstinline{tf.uniform\_unit\_scaling\_initializer(factor=1.0,\ seed=None)}} \item \protect\hyperlink{zerosux5finitializer}{\lstinline{tf.zeros\_initializer(shape,\ dtype=tf.float32)}} \item \protect\hyperlink{AUTOGENERATED-sparse-variable-updates}{Sparse Variable Updates} \item \protect\hyperlink{scatterux5fupdate}{\lstinline{tf.scatter\_update(ref,\ indices,\ updates,\ use\_locking=None,\ name=None)}} \item \protect\hyperlink{scatterux5fadd}{\lstinline{tf.scatter\_add(ref,\ indices,\ 
updates,\ use\_locking=None,\ name=None)}} \item \protect\hyperlink{scatterux5fsub}{\lstinline{tf.scatter\_sub(ref,\ indices,\ updates,\ use\_locking=None,\ name=None)}} \item \protect\hyperlink{sparseux5fmask}{\lstinline{tf.sparse\_mask(a,\ mask\_indices,\ name=None)}} \item \protect\hyperlink{IndexedSlices}{\lstinline{class\ tf.IndexedSlices}} \end{itemize} \subsection{Variables }\label{variables-2} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\lstinline{class\ tf.Variable} }{class tf.Variable }}\label{class-tf.variable} See the \href{../../how_tos/variables/index.md}{Variables How To} for a high level overview. A variable maintains state in the graph across calls to \lstinline{run()}. You add a variable to the graph by constructing an instance of the class \lstinline{Variable}. The \lstinline{Variable()} constructor requires an initial value for the variable, which can be a \lstinline{Tensor} of any type and shape. The initial value defines the type and shape of the variable. After construction, the type and shape of the variable are fixed. The value can be changed using one of the assign methods. If you want to change the shape of a variable later you have to use an \lstinline{assign} Op with \lstinline{validate\_shape=False}. Just like any \lstinline{Tensor}, variables created with \lstinline{Variable()} can be used as inputs for other Ops in the graph. Additionally, all the operators overloaded for the \lstinline{Tensor} class are carried over to variables, so you can also add nodes to the graph by just doing arithmetic on variables. 
\begin{Shaded} \begin{Highlighting}[] \ImportTok{import} \NormalTok{tensorflow }\ImportTok{as} \NormalTok{tf} \CommentTok{# Create a variable.} \NormalTok{w }\OperatorTok{=} \NormalTok{tf.Variable(}\OperatorTok{<}\NormalTok{initial}\OperatorTok{-}\NormalTok{value}\OperatorTok{>}\NormalTok{, name}\OperatorTok{=<}\NormalTok{optional}\OperatorTok{-}\NormalTok{name}\OperatorTok{>}\NormalTok{)} \CommentTok{# Use the variable in the graph like any Tensor.} \NormalTok{y }\OperatorTok{=} \NormalTok{tf.matmul(w, ...another variable }\OperatorTok{or} \NormalTok{tensor...)} \CommentTok{# The overloaded operators are available too.} \NormalTok{z }\OperatorTok{=} \NormalTok{tf.sigmoid(w }\OperatorTok{+} \NormalTok{b)} \CommentTok{# Assign a new value to the variable with `assign()` or a related method.} \NormalTok{w.assign(w }\OperatorTok{+} \FloatTok{1.0}\NormalTok{)} \NormalTok{w.assign_add(}\FloatTok{1.0}\NormalTok{)} \end{Highlighting} \end{Shaded} When you launch the graph, variables have to be explicitly initialized before you can run Ops that use their value. You can initialize a variable by running its \emph{initializer op}, restoring the variable from a save file, or simply running an \lstinline{assign} Op that assigns a value to the variable. In fact, the variable \emph{initializer op} is just an \lstinline{assign} Op that assigns the variable's initial value to the variable itself. \begin{Shaded} \begin{Highlighting}[] \CommentTok{# Launch the graph in a session.} \ControlFlowTok{with} \NormalTok{tf.Session() }\ImportTok{as} \NormalTok{sess:} \CommentTok{# Run the variable initializer.} \NormalTok{sess.run(w.initializer)} \CommentTok{# ...you now can run ops that use the value of 'w'...} \end{Highlighting} \end{Shaded} The most common initialization pattern is to use the convenience function \lstinline{initialize_all_variables()} to add an Op to the graph that initializes all the variables. You then run that Op after launching the graph. 
\begin{Shaded}
\begin{Highlighting}[]
\CommentTok{# Add an Op to initialize all variables.}
\NormalTok{init_op }\OperatorTok{=} \NormalTok{tf.initialize_all_variables()}

\CommentTok{# Launch the graph in a session.}
\ControlFlowTok{with} \NormalTok{tf.Session() }\ImportTok{as} \NormalTok{sess:}
    \CommentTok{# Run the Op that initializes all variables.}
    \NormalTok{sess.run(init_op)}
    \CommentTok{# ...you can now run any Op that uses variable values...}
\end{Highlighting}
\end{Shaded}

If you need to create a variable with an initial value dependent on another variable, use the other variable's \lstinline{initialized_value()}. This ensures that variables are initialized in the right order.

All variables are automatically collected in the graph where they are created. By default, the constructor adds the new variable to the graph collection \lstinline{GraphKeys.VARIABLES}. The convenience function \lstinline{all_variables()} returns the contents of that collection.

When building a machine learning model it is often convenient to distinguish between variables holding the trainable model parameters and other variables such as a \lstinline{global step} variable used to count training steps. To make this easier, the variable constructor supports a \lstinline{trainable=<bool>} parameter. If \lstinline{True}, the new variable is also added to the graph collection \lstinline{GraphKeys.TRAINABLE_VARIABLES}. The convenience function \lstinline{trainable_variables()} returns the contents of this collection. The various \lstinline{Optimizer} classes use this collection as the default list of variables to optimize.

Creating a variable.
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.Variable.__init__(initial_value, trainable=True, collections=None, validate_shape=True, name=None)} }{tf.Variable.__init__(initial_value, trainable=True, collections=None, validate_shape=True, name=None) }}\label{tf.variable.ux5fux5finitux5fux5finitialux5fvalue-trainabletrue-collectionsnone-validateux5fshapetrue-namenone} Creates a new variable with value \lstinline{initial_value}. The new variable is added to the graph collections listed in \lstinline{collections}, which defaults to \lstinline{{[}GraphKeys.VARIABLES{]}}. If \lstinline{trainable} is \lstinline{True} the variable is also added to the graph collection \lstinline{GraphKeys.TRAINABLE_VARIABLES}. This constructor creates both a \lstinline{variable} Op and an \lstinline{assign} Op to set the variable to its initial value. \subparagraph{Args: }\label{args} \begin{itemize} \tightlist \item \lstinline{initial_value}: A \lstinline{Tensor}, or Python object convertible to a \lstinline{Tensor}. The initial value for the Variable. Must have a shape specified unless \lstinline{validate_shape} is set to False. \item \lstinline{trainable}: If \lstinline{True}, the default, also adds the variable to the graph collection \lstinline{GraphKeys.TRAINABLE_VARIABLES}. This collection is used as the default list of variables to use by the \lstinline{Optimizer} classes. \item \lstinline{collections}: List of graph collections keys. The new variable is added to these collections. Defaults to \lstinline{{[}GraphKeys.VARIABLES{]}}. \item \lstinline{validate_shape}: If \lstinline{False}, allows the variable to be initialized with a value of unknown shape. If \lstinline{True}, the default, the shape of \lstinline{initial_value} must be known. \item \lstinline{name}: Optional name for the variable. Defaults to \lstinline{\textquotesingle{}Variable\textquotesingle{}} and gets uniquified automatically. 
\end{itemize} \subparagraph{Returns: }\label{returns} A Variable. \subparagraph{Raises: }\label{raises} \begin{itemize} \tightlist \item \lstinline{ValueError}: If the initial value does not have a shape and \lstinline{validate_shape} is \lstinline{True}. \end{itemize} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.Variable.initialized_value()} }{tf.Variable.initialized_value() }}\label{tf.variable.initializedux5fvalue} Returns the value of the initialized variable. You should use this instead of the variable itself to initialize another variable with a value that depends on the value of this variable. \begin{Shaded} \begin{Highlighting}[] \CommentTok{# Initialize 'v' with a random tensor.} \NormalTok{v }\OperatorTok{=} \NormalTok{tf.Variable(tf.truncated_normal([}\DecValTok{10}\NormalTok{, }\DecValTok{40}\NormalTok{]))} \CommentTok{# Use `initialized_value` to guarantee that `v` has been} \CommentTok{# initialized before its value is used to initialize `w`.} \CommentTok{# The random values are picked only once.} \NormalTok{w }\OperatorTok{=} \NormalTok{tf.Variable(v.initialized_value() }\OperatorTok{*} \FloatTok{2.0}\NormalTok{)} \end{Highlighting} \end{Shaded} \subparagraph{Returns: }\label{returns-1} A \lstinline{Tensor} holding the value of this variable after its initializer has run. Changing a variable value. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.Variable.assign(value, use_locking=False)} }{tf.Variable.assign(value, use_locking=False) }}\label{tf.variable.assignvalue-useux5flockingfalse} Assigns a new value to the variable. This is essentially a shortcut for \lstinline{assign(self,\ value)}. \subparagraph{Args: }\label{args-1} \begin{itemize} \tightlist \item \lstinline{value}: A \lstinline{Tensor}. The new value for this variable. \item \lstinline{use\_locking}: If \lstinline{True}, use locking during the assignment. 
\end{itemize} \subparagraph{Returns: }\label{returns-2} A \lstinline{Tensor} that will hold the new value of this variable after the assignment has completed. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.Variable.assign\_add(delta,\ use\_locking=False)} }{tf.Variable.assign\_add(delta, use\_locking=False) }}\label{tf.variable.assignux5fadddelta-useux5flockingfalse} Adds a value to this variable. This is essentially a shortcut for \lstinline{assign\_add(self,\ delta)}. \subparagraph{Args: }\label{args-2} \begin{itemize} \tightlist \item \lstinline{delta}: A \lstinline{Tensor}. The value to add to this variable. \item \lstinline{use\_locking}: If \lstinline{True}, use locking during the operation. \end{itemize} \subparagraph{Returns: }\label{returns-3} A \lstinline{Tensor} that will hold the new value of this variable after the addition has completed. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.Variable.assign\_sub(delta,\ use\_locking=False)} }{tf.Variable.assign\_sub(delta, use\_locking=False) }}\label{tf.variable.assignux5fsubdelta-useux5flockingfalse} Subtracts a value from this variable. This is essentially a shortcut for \lstinline{assign\_sub(self,\ delta)}. \subparagraph{Args: }\label{args-3} \begin{itemize} \tightlist \item \lstinline{delta}: A \lstinline{Tensor}. The value to subtract from this variable. \item \lstinline{use\_locking}: If \lstinline{True}, use locking during the operation. \end{itemize} \subparagraph{Returns: }\label{returns-4} A \lstinline{Tensor} that will hold the new value of this variable after the subtraction has completed. 
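Each of the three update methods above returns an Op rather than changing the variable on the spot: the value changes only when the Op is run, and it changes again every time the Op is run. The following plain-Python sketch models that behaviour with closures. It does not use TensorFlow; \lstinline{ToyVariable} and its methods are hypothetical stand-ins written for this illustration, not part of the library.

\begin{lstlisting}[language=Python]
class ToyVariable(object):
    """Hypothetical stand-in for a variable; not a TensorFlow class."""

    def __init__(self, value):
        self.value = value

    def assign(self, value):
        # Build a deferred "op": nothing happens until it is called.
        def op():
            self.value = value
            return self.value
        return op

    def assign_add(self, delta):
        def op():
            self.value += delta
            return self.value
        return op

    def assign_sub(self, delta):
        def op():
            self.value -= delta
            return self.value
        return op


v = ToyVariable(10)
add_op = v.assign_add(5)      # builds the op; the value is untouched so far
value_before_run = v.value
first = add_op()              # running the op returns the updated value
second = add_op()             # running it again adds the delta once more
third = v.assign_sub(8)()     # build and run a subtraction in one step
reset = v.assign(0)()         # overwrite the value
\end{lstlisting}

In the real library the deferred step is a graph Op that you pass to \lstinline{sess.run()}; the closure here only stands in for that.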
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.Variable.scatter\_sub(sparse\_delta,\ use\_locking=False)} }{tf.Variable.scatter\_sub(sparse\_delta, use\_locking=False) }}\label{tf.variable.scatterux5fsubsparseux5fdelta-useux5flockingfalse} Subtracts \lstinline{IndexedSlices} from this variable. This is essentially a shortcut for \lstinline{scatter\_sub(self,\ sparse\_delta.indices,\ sparse\_delta.values)}. \subparagraph{Args: }\label{args-4} \begin{itemize} \tightlist \item \lstinline{sparse\_delta}: \lstinline{IndexedSlices} to be subtracted from this variable. \item \lstinline{use\_locking}: If \lstinline{True}, use locking during the operation. \end{itemize} \subparagraph{Returns: }\label{returns-5} A \lstinline{Tensor} that will hold the new value of this variable after the scattered subtraction has completed. \subparagraph{Raises: }\label{raises-1} \begin{itemize} \tightlist \item \lstinline{ValueError}: if \lstinline{sparse\_delta} is not an \lstinline{IndexedSlices}. \end{itemize} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.Variable.count\_up\_to(limit)} }{tf.Variable.count\_up\_to(limit) }}\label{tf.variable.countux5fupux5ftolimit} Increments this variable until it reaches \lstinline{limit}. When that Op is run it tries to increment the variable by \lstinline{1}. If incrementing the variable would bring it above \lstinline{limit} then the Op raises the exception \lstinline{OutOfRangeError}. If no error is raised, the Op outputs the value of the variable before the increment. This is essentially a shortcut for \lstinline{count\_up\_to(self,\ limit)}. \subparagraph{Args: }\label{args-5} \begin{itemize} \tightlist \item \lstinline{limit}: value at which incrementing the variable raises an error. \end{itemize} \subparagraph{Returns: }\label{returns-6} A \lstinline{Tensor} that will hold the variable value before the increment. 
If no other Op modifies this variable, the values produced will all be distinct. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.Variable.eval(session=None)} }{tf.Variable.eval(session=None) }}\label{tf.variable.evalsessionnone} In a session, computes and returns the value of this variable. This is not a graph construction method, it does not add ops to the graph. This convenience method requires a session where the graph containing this variable has been launched. If no session is passed, the default session is used. See the \href{../../api_docs/python/client.md\#Session}{Session class} for more information on launching a graph and on sessions. \begin{Shaded} \begin{Highlighting}[] \NormalTok{v }\OperatorTok{=} \NormalTok{tf.Variable([}\DecValTok{1}\NormalTok{, }\DecValTok{2}\NormalTok{])} \NormalTok{init }\OperatorTok{=} \NormalTok{tf.initialize_all_variables()} \ControlFlowTok{with} \NormalTok{tf.Session() }\ImportTok{as} \NormalTok{sess:} \NormalTok{sess.run(init)} \CommentTok{# Usage passing the session explicitly.} \BuiltInTok{print} \NormalTok{v.}\BuiltInTok{eval}\NormalTok{(sess)} \CommentTok{# Usage with the default session. The 'with' block} \CommentTok{# above makes 'sess' the default session.} \BuiltInTok{print} \NormalTok{v.}\BuiltInTok{eval}\NormalTok{()} \end{Highlighting} \end{Shaded} \subparagraph{Args: }\label{args-6} \begin{itemize} \tightlist \item \lstinline{session}: The session to use to evaluate this variable. If none, the default session is used. \end{itemize} \subparagraph{Returns: }\label{returns-7} A numpy \lstinline{ndarray} with a copy of the value of this variable. Properties. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.Variable.name} }{tf.Variable.name }}\label{tf.variable.name} The name of this variable. 
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.Variable.dtype} }{tf.Variable.dtype }}\label{tf.variable.dtype} The \lstinline{DType} of this variable. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.Variable.get\_shape()} }{tf.Variable.get\_shape() }}\label{tf.variable.getux5fshape} The \lstinline{TensorShape} of this variable. \subparagraph{Returns: }\label{returns-8} A \lstinline{TensorShape}. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.Variable.device} }{tf.Variable.device }}\label{tf.variable.device} The device of this variable. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.Variable.initializer} }{tf.Variable.initializer }}\label{tf.variable.initializer} The initializer operation for this variable. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.Variable.graph} }{tf.Variable.graph }}\label{tf.variable.graph} The \lstinline{Graph} of this variable. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.Variable.op} }{tf.Variable.op }}\label{tf.variable.op} The \lstinline{Operation} of this variable. \subsection{Variable helper functions }\label{variable-helper-functions} TensorFlow provides a set of functions to help manage the set of variables collected in the graph. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\lstinline{tf.all\_variables()} }{tf.all\_variables() }}\label{tf.allux5fvariables} Returns all variables collected in the graph. The \lstinline{Variable()} constructor automatically adds new variables to the graph collection \lstinline{GraphKeys.VARIABLES}. This convenience function returns the contents of that collection. 
\subparagraph{Returns: }\label{returns-9} A list of \lstinline{Variable} objects. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\lstinline{tf.trainable\_variables()} }{tf.trainable\_variables() }}\label{tf.trainableux5fvariables} Returns all variables created with \lstinline{trainable=True}. When passed \lstinline{trainable=True}, the \lstinline{Variable()} constructor automatically adds new variables to the graph collection \lstinline{GraphKeys.TRAINABLE\_VARIABLES}. This convenience function returns the contents of that collection. \subparagraph{Returns: }\label{returns-10} A list of Variable objects. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\lstinline{tf.initialize\_all\_variables()} }{tf.initialize\_all\_variables() }}\label{tf.initializeux5fallux5fvariables} Returns an Op that initializes all variables. This is just a shortcut for \lstinline{initialize\_variables(all\_variables())} \subparagraph{Returns: }\label{returns-11} An Op that initializes all variables in the graph. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\lstinline{tf.initialize\_variables(var\_list,\ name=\textquotesingle{}init\textquotesingle{})} }{tf.initialize\_variables(var\_list, name='init') }}\label{tf.initializeux5fvariablesvarux5flist-nameinit} Returns an Op that initializes a list of variables. After you launch the graph in a session, you can run the returned Op to initialize all the variables in \lstinline{var\_list}. This Op runs all the initializers of the variables in \lstinline{var\_list} in parallel. Calling \lstinline{initialize\_variables()} is equivalent to passing the list of initializers to \lstinline{Group()}. If \lstinline{var\_list} is empty, however, the function still returns an Op that can be run. That Op just has no effect. 
\subparagraph{Args: }\label{args-7} \begin{itemize} \tightlist \item \lstinline{var\_list}: List of \lstinline{Variable} objects to initialize. \item \lstinline{name}: Optional name for the returned operation. \end{itemize} \subparagraph{Returns: }\label{returns-12} An Op that runs the initializers of all the specified variables. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\lstinline{tf.assert\_variables\_initialized(var\_list=None)} }{tf.assert\_variables\_initialized(var\_list=None) }}\label{tf.assertux5fvariablesux5finitializedvarux5flistnone} Returns an Op to check if variables are initialized. When run, the returned Op will raise the exception \lstinline{FailedPreconditionError} if any of the variables has not yet been initialized. Note: This function is implemented by trying to fetch the values of the variables. If one of the variables is not initialized a message may be logged by the C++ runtime. This is expected. \subparagraph{Args: }\label{args-8} \begin{itemize} \tightlist \item \lstinline{var\_list}: List of \lstinline{Variable} objects to check. Defaults to the value of \lstinline{all\_variables()}. \end{itemize} \subparagraph{Returns: }\label{returns-13} An Op, or None if there are no variables. \subsection{Saving and Restoring Variables }\label{saving-and-restoring-variables} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\lstinline{class\ tf.train.Saver} }{class tf.train.Saver }}\label{class-tf.train.saver} Saves and restores variables. See \href{../../how_tos/variables/index.md}{Variables} for an overview of variables, saving and restoring. The \lstinline{Saver} class adds ops to save and restore variables to and from \emph{checkpoints}. It also provides convenience methods to run these ops. Checkpoints are binary files in a proprietary format which map variable names to tensor values.
The best way to examine the contents of a checkpoint is to load it using a \lstinline{Saver}. Savers can automatically number checkpoint filenames with a provided counter. This lets you keep multiple checkpoints at different steps while training a model. For example you can number the checkpoint filenames with the training step number. To avoid filling up disks, savers manage checkpoint files automatically. For example, they can keep only the N most recent files, or one checkpoint for every N hours of training. You number checkpoint filenames by passing a value to the optional \lstinline{global\_step} argument to \lstinline{save()}: \begin{Shaded} \begin{Highlighting}[] \NormalTok{saver.save(sess, }\StringTok{'my-model'}\NormalTok{, global_step}\OperatorTok{=}\DecValTok{0}\NormalTok{) }\OperatorTok{==>} \NormalTok{filename: }\StringTok{'my-model-0'} \NormalTok{...} \NormalTok{saver.save(sess, }\StringTok{'my-model'}\NormalTok{, global_step}\OperatorTok{=}\DecValTok{1000}\NormalTok{) }\OperatorTok{==>} \NormalTok{filename: }\StringTok{'my-model-1000'} \end{Highlighting} \end{Shaded} Additionally, optional arguments to the \lstinline{Saver()} constructor let you control the proliferation of checkpoint files on disk: \begin{itemize} \item \lstinline{max\_to\_keep} indicates the maximum number of recent checkpoint files to keep. As new files are created, older files are deleted. If None or 0, all checkpoint files are kept. Defaults to 5 (that is, the 5 most recent checkpoint files are kept.) \item \lstinline{keep\_checkpoint\_every\_n\_hours}: In addition to keeping the most recent \lstinline{max\_to\_keep} checkpoint files, you might want to keep one checkpoint file for every N hours of training. This can be useful if you want to later analyze how a model progressed during a long training session. For example, passing \lstinline{keep\_checkpoint\_every\_n\_hours=2} ensures that you keep one checkpoint file for every 2 hours of training. 
The default value of 10,000 hours effectively disables the feature. \end{itemize} Note that you still have to call the \lstinline{save()} method to save the model. Passing these arguments to the constructor will not save variables automatically for you. A training program that saves regularly looks like: \begin{Shaded} \begin{Highlighting}[] \NormalTok{...} \CommentTok{# Create a saver.} \NormalTok{saver }\OperatorTok{=} \NormalTok{tf.train.Saver(...variables...)} \CommentTok{# Launch the graph and train, saving the model every 1,000 steps.} \NormalTok{sess }\OperatorTok{=} \NormalTok{tf.Session()} \ControlFlowTok{for} \NormalTok{step }\OperatorTok{in} \BuiltInTok{xrange}\NormalTok{(}\DecValTok{1000000}\NormalTok{):} \NormalTok{sess.run(..training_op..)} \ControlFlowTok{if} \NormalTok{step }\OperatorTok{%} \DecValTok{1000} \OperatorTok{==} \DecValTok{0}\NormalTok{:} \CommentTok{# Append the step number to the checkpoint name:} \NormalTok{saver.save(sess, }\StringTok{'my-model'}\NormalTok{, global_step}\OperatorTok{=}\NormalTok{step)} \end{Highlighting} \end{Shaded} In addition to checkpoint files, savers keep a protocol buffer on disk with the list of recent checkpoints. This is used to manage numbered checkpoint files and by \lstinline{latest\_checkpoint()}, which makes it easy to discover the path to the most recent checkpoint. That protocol buffer is stored in a file named `checkpoint' next to the checkpoint files. If you create several savers, you can specify a different filename for the protocol buffer file in the call to \lstinline{save()}. 
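The filename numbering and the \lstinline{max\_to\_keep} rule described above can be sketched in plain Python. This is only a model of the documented behaviour, not the \lstinline{Saver} implementation: \lstinline{checkpoint\_name} and \lstinline{retained} are hypothetical helpers introduced for this illustration, and the \lstinline{keep\_checkpoint\_every\_n\_hours} rule is deliberately left out.

\begin{lstlisting}[language=Python]
def checkpoint_name(save_path, global_step=None):
    """Hypothetical helper: append the step number to the path, if given."""
    if global_step is None:
        return save_path
    return "{}-{}".format(save_path, global_step)


def retained(names, max_to_keep=5):
    """Hypothetical helper: the names that survive, oldest first."""
    names = list(names)
    if not max_to_keep:  # None or 0 means every checkpoint is kept
        return names
    return names[-max_to_keep:]


# Save every 1,000 steps, as in the training loop above.
names = [checkpoint_name("my-model", step) for step in range(0, 7000, 1000)]
kept = retained(names, max_to_keep=5)
\end{lstlisting}

With seven saves and \lstinline{max\_to\_keep=5}, the two oldest names drop out of \lstinline{kept}; passing \lstinline{None} or \lstinline{0} keeps them all.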
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.train.Saver.\_\_init\_\_(var\_list=None,\ reshape=False,\ sharded=False,\ max\_to\_keep=5,\ keep\_checkpoint\_every\_n\_hours=10000.0,\ name=None,\ restore\_sequentially=False,\ saver\_def=None,\ builder=None)} }{tf.train.Saver.\_\_init\_\_(var\_list=None, reshape=False, sharded=False, max\_to\_keep=5, keep\_checkpoint\_every\_n\_hours=10000.0, name=None, restore\_sequentially=False, saver\_def=None, builder=None) }}\label{tf.train.saver.ux5fux5finitux5fux5fvarux5flistnone-reshapefalse-shardedfalse-maxux5ftoux5fkeep5-keepux5fcheckpointux5feveryux5fnux5fhours10000.0-namenone-restoreux5fsequentiallyfalse-saverux5fdefnone-buildernone} Creates a \lstinline{Saver}. The constructor adds ops to save and restore variables. \lstinline{var\_list} specifies the variables that will be saved and restored. It can be passed as a \lstinline{dict} or a list: \begin{itemize} \tightlist \item A \lstinline{dict} of names to variables: The keys are the names that will be used to save or restore the variables in the checkpoint files. \item A list of variables: The variables will be keyed with their op name in the checkpoint files. 
\end{itemize} For example:
\begin{Shaded}
\begin{Highlighting}[]
\NormalTok{v1 }\OperatorTok{=} \NormalTok{tf.Variable(..., name}\OperatorTok{=}\StringTok{'v1'}\NormalTok{)}
\NormalTok{v2 }\OperatorTok{=} \NormalTok{tf.Variable(..., name}\OperatorTok{=}\StringTok{'v2'}\NormalTok{)}

\CommentTok{# Pass the variables as a dict:}
\NormalTok{saver }\OperatorTok{=} \NormalTok{tf.train.Saver(\{}\StringTok{'v1'}\NormalTok{: v1, }\StringTok{'v2'}\NormalTok{: v2\})}

\CommentTok{# Or pass them as a list.}
\NormalTok{saver }\OperatorTok{=} \NormalTok{tf.train.Saver([v1, v2])}
\CommentTok{# Passing a list is equivalent to passing a dict with the variable op names}
\CommentTok{# as keys:}
\NormalTok{saver }\OperatorTok{=} \NormalTok{tf.train.Saver(\{v.op.name: v }\ControlFlowTok{for} \NormalTok{v }\OperatorTok{in} \NormalTok{[v1, v2]\})}
\end{Highlighting}
\end{Shaded}
The optional \lstinline{reshape} argument, if True, allows restoring a variable from a save file where the variable had a different shape, but the same number of elements and type. This is useful if you have reshaped a variable and want to reload it from an older checkpoint. The optional \lstinline{sharded} argument, if True, instructs the saver to shard checkpoints per device. \subparagraph{Args: }\label{args-9} \begin{itemize} \tightlist \item \lstinline{var\_list}: A list of Variables or a dictionary mapping names to Variables. If None, defaults to the list of all variables. \item \lstinline{reshape}: If True, allows restoring parameters from a checkpoint where the variables have a different shape. \item \lstinline{sharded}: If True, shard the checkpoints, one per device. \item \lstinline{max\_to\_keep}: maximum number of recent checkpoints to keep. Defaults to 5. \item \lstinline{keep\_checkpoint\_every\_n\_hours}: How often to keep checkpoints. Defaults to 10,000 hours. \item \lstinline{name}: string. Optional name to use as a prefix when adding operations.
\item \lstinline{restore\_sequentially}: A Bool, which if true, causes restore of different variables to happen sequentially within each device. This can lower memory usage when restoring very large models. \item \lstinline{saver\_def}: Optional SaverDef proto to use instead of running the builder. This is only useful for specialty code that wants to recreate a Saver object for a previously built Graph that had a Saver. The saver\_def proto should be the one returned by the as\_saver\_def() call of the Saver that was created for that Graph. \item \lstinline{builder}: Optional SaverBuilder to use if a saver\_def was not provided. Defaults to BaseSaverBuilder(). \end{itemize} \subparagraph{Raises: }\label{raises-2} \begin{itemize} \tightlist \item \lstinline{TypeError}: If \lstinline{var\_list} is invalid. \item \lstinline{ValueError}: If any of the keys or values in \lstinline{var\_list} is not unique. \end{itemize} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.train.Saver.save(sess,\ save\_path,\ global\_step=None,\ latest\_filename=None)} }{tf.train.Saver.save(sess, save\_path, global\_step=None, latest\_filename=None) }}\label{tf.train.saver.savesess-saveux5fpath-globalux5fstepnone-latestux5ffilenamenone} Saves variables. This method runs the ops added by the constructor for saving variables. It requires a session in which the graph was launched. The variables to save must also have been initialized. The method returns the path of the newly created checkpoint file. This path can be passed directly to a call to \lstinline{restore()}. \subparagraph{Args: }\label{args-10} \begin{itemize} \tightlist \item \lstinline{sess}: A Session to use to save the variables. \item \lstinline{save\_path}: string. Path to the checkpoint filename. If the saver is \lstinline{sharded}, this is the prefix of the sharded checkpoint filename. 
\item \lstinline{global\_step}: If provided, the global step number is appended to \lstinline{save\_path} to create the checkpoint filename. The optional argument can be a Tensor, a Tensor name or an integer. \item \lstinline{latest\_filename}: Optional name for the protocol buffer file that will contain the list of most recent checkpoint filenames. That file, kept in the same directory as the checkpoint files, is automatically managed by the saver to keep track of recent checkpoints. Defaults to `checkpoint'. \end{itemize} \subparagraph{Returns: }\label{returns-14} A string: path at which the variables were saved. If the saver is sharded, this string ends with: `-?????-of-nnnnn' where `nnnnn' is the number of shards created. \subparagraph{Raises: }\label{raises-3} \begin{itemize} \tightlist \item \lstinline{TypeError}: If \lstinline{sess} is not a Session. \end{itemize} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.train.Saver.restore(sess,\ save\_path)} }{tf.train.Saver.restore(sess, save\_path) }}\label{tf.train.saver.restoresess-saveux5fpath} Restores previously saved variables. This method runs the ops added by the constructor for restoring variables. It requires a session in which the graph was launched. The variables to restore do not have to have been initialized, as restoring is itself a way to initialize variables. The \lstinline{save\_path} argument is typically a value previously returned from a \lstinline{save()} call, or a call to \lstinline{latest\_checkpoint()}. \subparagraph{Args: }\label{args-11} \begin{itemize} \tightlist \item \lstinline{sess}: A Session to use to restore the parameters. \item \lstinline{save\_path}: Path where parameters were previously saved. \end{itemize} Other utility methods.
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.train.Saver.last\_checkpoints} }{tf.train.Saver.last\_checkpoints }}\label{tf.train.saver.lastux5fcheckpoints} List of not-yet-deleted checkpoint filenames. You can pass any of the returned values to \lstinline{restore()}. \subparagraph{Returns: }\label{returns-15} A list of checkpoint filenames, sorted from oldest to newest. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.train.Saver.set\_last\_checkpoints(last\_checkpoints)} }{tf.train.Saver.set\_last\_checkpoints(last\_checkpoints) }}\label{tf.train.saver.setux5flastux5fcheckpointslastux5fcheckpoints} Sets the list of not-yet-deleted checkpoint filenames. \subparagraph{Args: }\label{args-12} \begin{itemize} \tightlist \item \lstinline{last\_checkpoints}: a list of checkpoint filenames. \end{itemize} \subparagraph{Raises: }\label{raises-4} \begin{itemize} \tightlist \item \lstinline{AssertionError}: if the list of checkpoint filenames has already been set. \end{itemize} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.train.Saver.as\_saver\_def()} }{tf.train.Saver.as\_saver\_def() }}\label{tf.train.saver.asux5fsaverux5fdef} Generates a \lstinline{SaverDef} representation of this saver. \subparagraph{Returns: }\label{returns-16} A \lstinline{SaverDef} proto. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\lstinline{tf.train.latest\_checkpoint(checkpoint\_dir,\ latest\_filename=None)} }{tf.train.latest\_checkpoint(checkpoint\_dir, latest\_filename=None) }}\label{tf.train.latestux5fcheckpointcheckpointux5fdir-latestux5ffilenamenone} Finds the filename of latest saved checkpoint file. \subparagraph{Args: }\label{args-13} \begin{itemize} \tightlist \item \lstinline{checkpoint\_dir}: Directory where the variables were saved. 
\item \lstinline{latest\_filename}: Optional name for the protocol buffer file that contains the list of most recent checkpoint filenames. See the corresponding argument to \lstinline{Saver.save()}. \end{itemize} \subparagraph{Returns: }\label{returns-17} The full path to the latest checkpoint or None if no checkpoint was found. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\lstinline{tf.train.get\_checkpoint\_state(checkpoint\_dir,\ latest\_filename=None)} }{tf.train.get\_checkpoint\_state(checkpoint\_dir, latest\_filename=None) }}\label{tf.train.getux5fcheckpointux5fstatecheckpointux5fdir-latestux5ffilenamenone} Returns the CheckpointState proto from the ``checkpoint'' file. If the ``checkpoint'' file contains a valid CheckpointState proto, returns it. \subparagraph{Args: }\label{args-14} \begin{itemize} \tightlist \item \lstinline{checkpoint\_dir}: The directory of checkpoints. \item \lstinline{latest\_filename}: Optional name of the checkpoint file. Defaults to `checkpoint'. \end{itemize} \subparagraph{Returns: }\label{returns-18} A CheckpointState if the state was available, None otherwise. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\lstinline{tf.train.update\_checkpoint\_state(save\_dir,\ model\_checkpoint\_path,\ all\_model\_checkpoint\_paths=None,\ latest\_filename=None)} }{tf.train.update\_checkpoint\_state(save\_dir, model\_checkpoint\_path, all\_model\_checkpoint\_paths=None, latest\_filename=None) }}\label{tf.train.updateux5fcheckpointux5fstatesaveux5fdir-modelux5fcheckpointux5fpath-allux5fmodelux5fcheckpointux5fpathsnone-latestux5ffilenamenone} Updates the content of the `checkpoint' file. This updates the checkpoint file containing a CheckpointState proto. \subparagraph{Args: }\label{args-15} \begin{itemize} \tightlist \item \lstinline{save\_dir}: Directory where the model was saved. \item \lstinline{model\_checkpoint\_path}: The checkpoint file.
\item \lstinline{all\_model\_checkpoint\_paths}: list of strings. Paths to all not-yet-deleted checkpoints, sorted from oldest to newest. If this is a non-empty list, the last element must be equal to model\_checkpoint\_path. These paths are also saved in the CheckpointState proto. \item \lstinline{latest\_filename}: Optional name of the checkpoint file. Defaults to `checkpoint'. \end{itemize} \subparagraph{Raises: }\label{raises-5} \begin{itemize} \tightlist \item \lstinline{RuntimeError}: If the save paths conflict. \end{itemize} \subsection{Sharing Variables }\label{sharing-variables} TensorFlow provides several classes and operations that you can use to create variables contingent on certain conditions. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\lstinline{tf.get\_variable(name,\ shape=None,\ dtype=tf.float32,\ initializer=None,\ trainable=True,\ collections=None)} }{tf.get\_variable(name, shape=None, dtype=tf.float32, initializer=None, trainable=True, collections=None) }}\label{tf.getux5fvariablename-shapenone-dtypetf.float32-initializernone-trainabletrue-collectionsnone} Gets an existing variable with these parameters or creates a new one. This function prefixes the name with the current variable scope and performs reuse checks. See the \href{../../how_tos/variable_scope/index.md}{Variable Scope How To} for an extensive description of how reusing works.
Here is a basic example:
\begin{Shaded}
\begin{Highlighting}[]
\ControlFlowTok{with} \NormalTok{tf.variable_scope(}\StringTok{"foo"}\NormalTok{):}
    \NormalTok{v }\OperatorTok{=} \NormalTok{tf.get_variable(}\StringTok{"v"}\NormalTok{, [}\DecValTok{1}\NormalTok{])  }\CommentTok{# v.name == "foo/v:0"}
    \NormalTok{w }\OperatorTok{=} \NormalTok{tf.get_variable(}\StringTok{"w"}\NormalTok{, [}\DecValTok{1}\NormalTok{])  }\CommentTok{# w.name == "foo/w:0"}
\ControlFlowTok{with} \NormalTok{tf.variable_scope(}\StringTok{"foo"}\NormalTok{, reuse}\OperatorTok{=}\VariableTok{True}\NormalTok{):}
    \NormalTok{v1 }\OperatorTok{=} \NormalTok{tf.get_variable(}\StringTok{"v"}\NormalTok{)  }\CommentTok{# The same as v above.}
\end{Highlighting}
\end{Shaded}
If initializer is \lstinline{None} (the default), the default initializer passed in the constructor is used. If that one is \lstinline{None} too, a \lstinline{UniformUnitScalingInitializer} will be used. \subparagraph{Args: }\label{args-16} \begin{itemize} \tightlist \item \lstinline{name}: the name of the new or existing variable. \item \lstinline{shape}: shape of the new or existing variable. \item \lstinline{dtype}: type of the new or existing variable (defaults to \lstinline{DT\_FLOAT}). \item \lstinline{initializer}: initializer for the variable if one is created. \item \lstinline{trainable}: If \lstinline{True} also add the variable to the graph collection \lstinline{GraphKeys.TRAINABLE\_VARIABLES} (see variables.Variable). \item \lstinline{collections}: List of graph collections keys to add the Variable to. Defaults to \lstinline{{[}GraphKeys.VARIABLES{]}} (see variables.Variable). \end{itemize} \subparagraph{Returns: }\label{returns-19} The created or existing variable. \subparagraph{Raises: }\label{raises-6} \begin{itemize} \tightlist \item \lstinline{ValueError}: when creating a new variable and shape is not declared, or when violating reuse during variable creation. Reuse is set inside \lstinline{variable\_scope}.
\end{itemize} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\lstinline{tf.get\_variable\_scope()} }{tf.get\_variable\_scope() }}\label{tf.getux5fvariableux5fscope} Returns the current variable scope. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\lstinline{tf.variable\_scope(name\_or\_scope,\ reuse=None,\ initializer=None)} }{tf.variable\_scope(name\_or\_scope, reuse=None, initializer=None) }}\label{tf.variableux5fscopenameux5forux5fscope-reusenone-initializernone} Returns a context for variable scope. Variable scope lets you create new variables and share already created ones, with checks that prevent creating or sharing by accident. For details, see the \href{../../how_tos/variable_scope/index.md}{Variable Scope How To}; here we present only a few basic examples. Simple example of how to create a new variable:
\begin{Shaded}
\begin{Highlighting}[]
\ControlFlowTok{with} \NormalTok{tf.variable_scope(}\StringTok{"foo"}\NormalTok{):}
    \ControlFlowTok{with} \NormalTok{tf.variable_scope(}\StringTok{"bar"}\NormalTok{):}
        \NormalTok{v }\OperatorTok{=} \NormalTok{tf.get_variable(}\StringTok{"v"}\NormalTok{, [}\DecValTok{1}\NormalTok{])}
        \ControlFlowTok{assert} \NormalTok{v.name }\OperatorTok{==} \StringTok{"foo/bar/v:0"}
\end{Highlighting}
\end{Shaded}
Basic example of sharing a variable:
\begin{Shaded}
\begin{Highlighting}[]
\ControlFlowTok{with} \NormalTok{tf.variable_scope(}\StringTok{"foo"}\NormalTok{):}
    \NormalTok{v }\OperatorTok{=} \NormalTok{tf.get_variable(}\StringTok{"v"}\NormalTok{, [}\DecValTok{1}\NormalTok{])}
\ControlFlowTok{with} \NormalTok{tf.variable_scope(}\StringTok{"foo"}\NormalTok{, reuse}\OperatorTok{=}\VariableTok{True}\NormalTok{):}
    \NormalTok{v1 }\OperatorTok{=} \NormalTok{tf.get_variable(}\StringTok{"v"}\NormalTok{, [}\DecValTok{1}\NormalTok{])}
\ControlFlowTok{assert} \NormalTok{v1 }\OperatorTok{==} \NormalTok{v}
\end{Highlighting}
\end{Shaded}
Sharing
a variable by capturing a scope and setting reuse:

\begin{Shaded}
\begin{Highlighting}[]
\ControlFlowTok{with} \NormalTok{tf.variable_scope(}\StringTok{"foo"}\NormalTok{) }\ImportTok{as} \NormalTok{scope:}
    \NormalTok{v }\OperatorTok{=} \NormalTok{tf.get_variable(}\StringTok{"v"}\NormalTok{, [}\DecValTok{1}\NormalTok{])}
    \NormalTok{scope.reuse_variables()}
    \NormalTok{v1 }\OperatorTok{=} \NormalTok{tf.get_variable(}\StringTok{"v"}\NormalTok{, [}\DecValTok{1}\NormalTok{])}
\ControlFlowTok{assert} \NormalTok{v1 }\OperatorTok{==} \NormalTok{v}
\end{Highlighting}
\end{Shaded}

To prevent accidental sharing of variables, we raise an exception when getting an existing variable in a non-reusing scope.

\begin{Shaded}
\begin{Highlighting}[]
\ControlFlowTok{with} \NormalTok{tf.variable_scope(}\StringTok{"foo"}\NormalTok{) }\ImportTok{as} \NormalTok{scope:}
    \NormalTok{v }\OperatorTok{=} \NormalTok{tf.get_variable(}\StringTok{"v"}\NormalTok{, [}\DecValTok{1}\NormalTok{])}
    \NormalTok{v1 }\OperatorTok{=} \NormalTok{tf.get_variable(}\StringTok{"v"}\NormalTok{, [}\DecValTok{1}\NormalTok{])}
    \CommentTok{# Raises ValueError("... v already exists ...").}
\end{Highlighting}
\end{Shaded}

Similarly, we raise an exception when trying to get a variable that does not exist in reuse mode.

\begin{Shaded}
\begin{Highlighting}[]
\ControlFlowTok{with} \NormalTok{tf.variable_scope(}\StringTok{"foo"}\NormalTok{, reuse}\OperatorTok{=}\VariableTok{True}\NormalTok{):}
    \NormalTok{v }\OperatorTok{=} \NormalTok{tf.get_variable(}\StringTok{"v"}\NormalTok{, [}\DecValTok{1}\NormalTok{])}
    \CommentTok{# Raises ValueError("... v does not exist ...").}
\end{Highlighting}
\end{Shaded}

Note that the \lstinline{reuse} flag is inherited: if we open a reusing scope, then all its sub-scopes become reusing as well.

\subparagraph{Args: }\label{args-17}

\begin{itemize}
\tightlist
\item \lstinline{name\_or\_scope}: \lstinline{string} or \lstinline{VariableScope}: the scope to open.
\item \lstinline{reuse}: \lstinline{True} or \lstinline{None}; if \lstinline{True}, we go into reuse mode for this scope as well as all sub-scopes; if \lstinline{None}, we just inherit the parent scope's reuse setting.
\item \lstinline{initializer}: default initializer for variables within this scope.
\end{itemize}

\subparagraph{Yields: }\label{yields}

A scope that can be captured and reused.

\subparagraph{Raises: }\label{raises-7}

\begin{itemize}
\tightlist
\item \lstinline{ValueError}: when trying to reuse within a create scope, or create within a reuse scope, or if reuse is not \lstinline{None} or \lstinline{True}.
\item \lstinline{TypeError}: when the types of some arguments are not appropriate.
\end{itemize}

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\subsubsection{\texorpdfstring{\lstinline{tf.constant\_initializer(value=0.0)} }{tf.constant\_initializer(value=0.0) }}\label{tf.constantux5finitializervalue0.0}

Returns an initializer that generates Tensors with a single value.

\subparagraph{Args: }\label{args-18}

\begin{itemize}
\tightlist
\item \lstinline{value}: A Python scalar. All elements of the initialized variable will be set to this value.
\end{itemize}

\subparagraph{Returns: }\label{returns-20}

An initializer that generates Tensors with a single value.

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\subsubsection{\texorpdfstring{\lstinline{tf.random\_normal\_initializer(mean=0.0,\ stddev=1.0,\ seed=None)} }{tf.random\_normal\_initializer(mean=0.0, stddev=1.0, seed=None) }}\label{tf.randomux5fnormalux5finitializermean0.0-stddev1.0-seednone}

Returns an initializer that generates Tensors with a normal distribution.

\subparagraph{Args: }\label{args-19}

\begin{itemize}
\tightlist
\item \lstinline{mean}: a python scalar or a scalar tensor. Mean of the random values to generate.
\item \lstinline{stddev}: a python scalar or a scalar tensor. Standard deviation of the random values to generate.
\item \lstinline{seed}: A Python integer. Used to create random seeds. See \href{../../api_docs/python/constant_op.md\#set_random_seed}{\lstinline{set\_random\_seed}} for behavior. \end{itemize} \subparagraph{Returns: }\label{returns-21} An initializer that generates Tensors with a normal distribution. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\lstinline{tf.truncated\_normal\_initializer(mean=0.0,\ stddev=1.0,\ seed=None)} }{tf.truncated\_normal\_initializer(mean=0.0, stddev=1.0, seed=None) }}\label{tf.truncatedux5fnormalux5finitializermean0.0-stddev1.0-seednone} Returns an initializer that generates a truncated normal distribution. These values are similar to values from a random\_normal\_initializer except that values more than two standard deviations from the mean are discarded and re-drawn. This is the recommended initializer for neural network weights and filters. \subparagraph{Args: }\label{args-20} \begin{itemize} \tightlist \item \lstinline{mean}: a python scalar or a scalar tensor. Mean of the random values to generate. \item \lstinline{stddev}: a python scalar or a scalar tensor. Standard deviation of the random values to generate. \item \lstinline{seed}: A Python integer. Used to create random seeds. See \href{../../api_docs/python/constant_op.md\#set_random_seed}{\lstinline{set\_random\_seed}} for behavior. \end{itemize} \subparagraph{Returns: }\label{returns-22} An initializer that generates Tensors with a truncated normal distribution. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\lstinline{tf.random\_uniform\_initializer(minval=0.0,\ maxval=1.0,\ seed=None)} }{tf.random\_uniform\_initializer(minval=0.0, maxval=1.0, seed=None) }}\label{tf.randomux5funiformux5finitializerminval0.0-maxval1.0-seednone} Returns an initializer that generates Tensors with a uniform distribution. 
\subparagraph{Args: }\label{args-21} \begin{itemize} \tightlist \item \lstinline{minval}: a python scalar or a scalar tensor. lower bound of the range of random values to generate. \item \lstinline{maxval}: a python scalar or a scalar tensor. upper bound of the range of random values to generate. \item \lstinline{seed}: A Python integer. Used to create random seeds. See \href{../../api_docs/python/constant_op.md\#set_random_seed}{\lstinline{set\_random\_seed}} for behavior. \end{itemize} \subparagraph{Returns: }\label{returns-23} An initializer that generates Tensors with a uniform distribution. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\lstinline{tf.uniform\_unit\_scaling\_initializer(factor=1.0,\ seed=None)} }{tf.uniform\_unit\_scaling\_initializer(factor=1.0, seed=None) }}\label{tf.uniformux5funitux5fscalingux5finitializerfactor1.0-seednone} Returns an initializer that generates tensors without scaling variance. When initializing a deep network, it is in principle advantageous to keep the scale of the input variance constant, so it does not explode or diminish by reaching the final layer. If the input is \lstinline{x} and the operation \lstinline{x\ *\ W}, and we want to initialize \lstinline{W} uniformly at random, we need to pick \lstinline{W} from \begin{verbatim} [-sqrt(3) / sqrt(dim), sqrt(3) / sqrt(dim)] \end{verbatim} to keep the scale intact, where \lstinline{dim\ =\ W.shape{[}0{]}} (the size of the input). A similar calculation for convolutional networks gives an analogous result with \lstinline{dim} equal to the product of the first 3 dimensions. When nonlinearities are present, we need to multiply this by a constant \lstinline{factor}. See \url{https://arxiv.org/pdf/1412.6558v3.pdf} for deeper motivation, experiments and the calculation of constants. In section 2.3 there, the constants were numerically computed: for a linear layer it's 1.0, relu: \textasciitilde{}1.43, tanh: \textasciitilde{}1.15. 
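As a rough illustration of the range above, the following plain-Python sketch computes the bound and draws a weight matrix from it. It is not TensorFlow code: the helper names are hypothetical, and it assumes that \lstinline{factor} simply multiplies the bound \lstinline{sqrt(3)\ /\ sqrt(dim)}, as the text describes.

```python
import math
import random


def unit_scaling_bound(dim, factor=1.0):
    """Half-width b of the uniform range [-b, b], with b = factor * sqrt(3) / sqrt(dim)."""
    return factor * math.sqrt(3.0) / math.sqrt(dim)


def sample_weights(dim, num_outputs, factor=1.0, seed=None):
    """Draw a dim x num_outputs matrix (a list of rows) uniformly from [-b, b]."""
    rng = random.Random(seed)
    bound = unit_scaling_bound(dim, factor)
    return [[rng.uniform(-bound, bound) for _ in range(num_outputs)]
            for _ in range(dim)]


def output_variance(dim, factor=1.0):
    """Variance of one output of x * W for zero-mean, unit-variance, independent inputs.

    A uniform variable on [-b, b] has variance b**2 / 3, and the dim terms add up.
    """
    bound = unit_scaling_bound(dim, factor)
    return dim * bound ** 2 / 3.0
```

Under these assumptions the output variance is \lstinline{factor\ **\ 2} whatever the value of \lstinline{dim}, which is the sense in which the scale is kept intact for \lstinline{factor\ =\ 1.0}.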
\subparagraph{Args: }\label{args-22} \begin{itemize} \tightlist \item \lstinline{factor}: Float. A multiplicative factor by which the values will be scaled. \item \lstinline{seed}: A Python integer. Used to create random seeds. See \href{../../api_docs/python/constant_op.md\#set_random_seed}{\lstinline{set\_random\_seed}} for behavior. \end{itemize} \subparagraph{Returns: }\label{returns-24} An initializer that generates tensors with unit variance. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\lstinline{tf.zeros\_initializer(shape,\ dtype=tf.float32)} }{tf.zeros\_initializer(shape, dtype=tf.float32) }}\label{tf.zerosux5finitializershape-dtypetf.float32} An adaptor for zeros() to match the Initializer spec. \subsection{Sparse Variable Updates }\label{sparse-variable-updates} The sparse update ops modify a subset of the entries in a dense \lstinline{Variable}, either overwriting the entries or adding / subtracting a delta. These are useful for training embedding models and similar lookup-based networks, since only a small subset of embedding vectors change in any given step. Since a sparse update of a large tensor may be generated automatically during gradient computation (as in the gradient of \href{../../api_docs/python/array_ops.md\#gather}{\lstinline{tf.gather}}), an \protect\hyperlink{IndexedSlices}{\lstinline{IndexedSlices}} class is provided that encapsulates a set of sparse indices and values. \lstinline{IndexedSlices} objects are detected and handled automatically by the optimizers in most cases. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\lstinline{tf.scatter\_update(ref,\ indices,\ updates,\ use\_locking=None,\ name=None)} }{tf.scatter\_update(ref, indices, updates, use\_locking=None, name=None) }}\label{tf.scatterux5fupdateref-indices-updates-useux5flockingnone-namenone} Applies sparse updates to a variable reference. 
This operation computes

\begin{verbatim}
# Scalar indices
ref[indices, ...] = updates[...]

# Vector indices (for each i)
ref[indices[i], ...] = updates[i, ...]

# High rank indices (for each i, ..., j)
ref[indices[i, ..., j], ...] = updates[i, ..., j, ...]
\end{verbatim}

This operation outputs \lstinline{ref} after the update is done. This makes it easier to chain operations that need to use the updated value.

If \lstinline{indices} contains duplicate entries, lexicographically later entries override earlier entries.

Requires \lstinline{updates.shape\ =\ indices.shape\ +\ ref.shape{[}1:{]}}.

\subparagraph{Args: }\label{args-23}

\begin{itemize}
\tightlist
\item \lstinline{ref}: A mutable \lstinline{Tensor}. Should be from a \lstinline{Variable} node.
\item \lstinline{indices}: A \lstinline{Tensor}. Must be one of the following types: \lstinline{int32}, \lstinline{int64}. A tensor of indices into the first dimension of \lstinline{ref}.
\item \lstinline{updates}: A \lstinline{Tensor}. Must have the same type as \lstinline{ref}. A tensor of updated values to store in \lstinline{ref}.
\item \lstinline{use\_locking}: An optional \lstinline{bool}. Defaults to \lstinline{True}. If True, the assignment will be protected by a lock; otherwise the behavior is undefined, but may exhibit less contention.
\item \lstinline{name}: A name for the operation (optional).
\end{itemize}

\subparagraph{Returns: }\label{returns-25}

Same as \lstinline{ref}. Returned as a convenience for operations that want to use the updated values after the update is done.

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\subsubsection{\texorpdfstring{\lstinline{tf.scatter\_add(ref,\ indices,\ updates,\ use\_locking=None,\ name=None)} }{tf.scatter\_add(ref, indices, updates, use\_locking=None, name=None) }}\label{tf.scatterux5faddref-indices-updates-useux5flockingnone-namenone}

Adds sparse updates to a variable reference.
This operation computes

\begin{verbatim}
# Scalar indices
ref[indices, ...] += updates[...]

# Vector indices (for each i)
ref[indices[i], ...] += updates[i, ...]

# High rank indices (for each i, ..., j)
ref[indices[i, ..., j], ...] += updates[i, ..., j, ...]
\end{verbatim}

This operation outputs \lstinline{ref} after the update is done. This makes it easier to chain operations that need to use the updated value.

Duplicate entries are handled correctly: if multiple \lstinline{indices} reference the same location, their contributions add.

Requires \lstinline{updates.shape\ =\ indices.shape\ +\ ref.shape{[}1:{]}}.

\subparagraph{Args: }\label{args-24}

\begin{itemize}
\tightlist
\item \lstinline{ref}: A mutable \lstinline{Tensor}. Must be one of the following types: \lstinline{float32}, \lstinline{float64}, \lstinline{int64}, \lstinline{int32}, \lstinline{uint8}, \lstinline{int16}, \lstinline{int8}, \lstinline{complex64}, \lstinline{qint8}, \lstinline{quint8}, \lstinline{qint32}. Should be from a \lstinline{Variable} node.
\item \lstinline{indices}: A \lstinline{Tensor}. Must be one of the following types: \lstinline{int32}, \lstinline{int64}. A tensor of indices into the first dimension of \lstinline{ref}.
\item \lstinline{updates}: A \lstinline{Tensor}. Must have the same type as \lstinline{ref}. A tensor of updated values to add to \lstinline{ref}.
\item \lstinline{use\_locking}: An optional \lstinline{bool}. Defaults to \lstinline{False}. If True, the addition will be protected by a lock; otherwise the behavior is undefined, but may exhibit less contention.
\item \lstinline{name}: A name for the operation (optional).
\end{itemize}

\subparagraph{Returns: }\label{returns-26}

Same as \lstinline{ref}. Returned as a convenience for operations that want to use the updated values after the update is done.
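The duplicate-index rules of \lstinline{tf.scatter\_update} and \lstinline{tf.scatter\_add} can be compared side by side in a small plain-Python emulation. This is only a sketch for vector indices over a list of rows: the function names are hypothetical stand-ins rather than the TensorFlow ops, and it assumes duplicates are applied in order of position in \lstinline{indices}, so that later entries win for an update.

```python
def emulate_scatter_update(ref, indices, updates):
    """Apply ref[indices[i]] = updates[i] for each i, in place, and return ref."""
    for i, row in enumerate(indices):
        ref[row] = list(updates[i])
    return ref


def emulate_scatter_add(ref, indices, updates):
    """Apply ref[indices[i]] += updates[i] for each i, in place, and return ref."""
    for i, row in enumerate(indices):
        ref[row] = [r + u for r, u in zip(ref[row], updates[i])]
    return ref


indices = [0, 2, 0]                  # row 0 appears twice
updates = [[1, 1], [2, 2], [3, 3]]   # shape = indices.shape + ref.shape[1:]

updated = emulate_scatter_update([[0, 0], [0, 0], [0, 0]], indices, updates)
added = emulate_scatter_add([[0, 0], [0, 0], [0, 0]], indices, updates)

print(updated)  # [[3, 3], [0, 0], [2, 2]]: the later entry for row 0 overrides
print(added)    # [[4, 4], [0, 0], [2, 2]]: both entries for row 0 add
```

Both helpers return \lstinline{ref}, mirroring the convenience return value described above.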
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\subsubsection{\texorpdfstring{\lstinline{tf.scatter\_sub(ref,\ indices,\ updates,\ use\_locking=None,\ name=None)} }{tf.scatter\_sub(ref, indices, updates, use\_locking=None, name=None) }}\label{tf.scatterux5fsubref-indices-updates-useux5flockingnone-namenone}

Subtracts sparse updates from a variable reference.

This operation computes

\begin{verbatim}
# Scalar indices
ref[indices, ...] -= updates[...]

# Vector indices (for each i)
ref[indices[i], ...] -= updates[i, ...]

# High rank indices (for each i, ..., j)
ref[indices[i, ..., j], ...] -= updates[i, ..., j, ...]
\end{verbatim}

This operation outputs \lstinline{ref} after the update is done. This makes it easier to chain operations that need to use the updated value.

Duplicate entries are handled correctly: if multiple \lstinline{indices} reference the same location, their (negated) contributions add.

Requires \lstinline{updates.shape\ =\ indices.shape\ +\ ref.shape{[}1:{]}}.

\subparagraph{Args: }\label{args-25}

\begin{itemize}
\tightlist
\item \lstinline{ref}: A mutable \lstinline{Tensor}. Must be one of the following types: \lstinline{float32}, \lstinline{float64}, \lstinline{int64}, \lstinline{int32}, \lstinline{uint8}, \lstinline{int16}, \lstinline{int8}, \lstinline{complex64}, \lstinline{qint8}, \lstinline{quint8}, \lstinline{qint32}. Should be from a \lstinline{Variable} node.
\item \lstinline{indices}: A \lstinline{Tensor}. Must be one of the following types: \lstinline{int32}, \lstinline{int64}. A tensor of indices into the first dimension of \lstinline{ref}.
\item \lstinline{updates}: A \lstinline{Tensor}. Must have the same type as \lstinline{ref}. A tensor of updated values to subtract from \lstinline{ref}.
\item \lstinline{use\_locking}: An optional \lstinline{bool}. Defaults to \lstinline{False}. If True, the subtraction will be protected by a lock; otherwise the behavior is undefined, but may exhibit less contention.
\item \lstinline{name}: A name for the operation (optional).
\end{itemize}

\subparagraph{Returns: }\label{returns-27}

Same as \lstinline{ref}. Returned as a convenience for operations that want to use the updated values after the update is done.

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\subsubsection{\texorpdfstring{\lstinline{tf.sparse\_mask(a,\ mask\_indices,\ name=None)} }{tf.sparse\_mask(a, mask\_indices, name=None) }}\label{tf.sparseux5fmaska-maskux5findices-namenone}

Masks elements of \lstinline{IndexedSlices}.

Given an \lstinline{IndexedSlices} instance \lstinline{a}, returns another \lstinline{IndexedSlices} that contains a subset of the slices of \lstinline{a}. Only the slices at indices specified in \lstinline{mask\_indices} are returned.

This is useful when you need to extract a subset of slices in an \lstinline{IndexedSlices} object.

For example:

\begin{Shaded}
\begin{Highlighting}[]
\CommentTok{# `a` contains slices at indices [12, 26, 37, 45] from a large tensor}
\CommentTok{# with shape [1000, 10]}
\NormalTok{a.indices }\OperatorTok{=>} \NormalTok{[}\DecValTok{12}\NormalTok{, }\DecValTok{26}\NormalTok{, }\DecValTok{37}\NormalTok{, }\DecValTok{45}\NormalTok{]}
\NormalTok{tf.shape(a.values) }\OperatorTok{=>} \NormalTok{[}\DecValTok{4}\NormalTok{, }\DecValTok{10}\NormalTok{]}

\CommentTok{# `b` will be the subset of `a` slices at its second and third indices, so}
\CommentTok{# we want to mask out its first and last indices (which are at absolute}
\CommentTok{# indices 12, 45)}
\NormalTok{b }\OperatorTok{=} \NormalTok{tf.sparse_mask(a, [}\DecValTok{12}\NormalTok{, }\DecValTok{45}\NormalTok{])}

\NormalTok{b.indices }\OperatorTok{=>} \NormalTok{[}\DecValTok{26}\NormalTok{, }\DecValTok{37}\NormalTok{]}
\NormalTok{tf.shape(b.values) }\OperatorTok{=>} \NormalTok{[}\DecValTok{2}\NormalTok{, }\DecValTok{10}\NormalTok{]}
\end{Highlighting}
\end{Shaded}

\subparagraph{Args: }\label{args-26}

\begin{itemize}
\tightlist
\item \lstinline{a}: An
\lstinline{IndexedSlices} instance. \item \lstinline{mask\_indices}: Indices of elements to mask. \item \lstinline{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-28} The masked \lstinline{IndexedSlices} instance. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\lstinline{class\ tf.IndexedSlices} }{class tf.IndexedSlices }}\label{class-tf.indexedslices} A sparse representation of a set of tensor slices at given indices. This class is a simple wrapper for a pair of \lstinline{Tensor} objects: \begin{itemize} \tightlist \item \lstinline{values}: A \lstinline{Tensor} of any dtype with shape \lstinline{{[}D0,\ D1,\ ...,\ Dn{]}}. \item \lstinline{indices}: A 1-D integer \lstinline{Tensor} with shape \lstinline{{[}D0{]}}. \end{itemize} An \lstinline{IndexedSlices} is typically used to represent a subset of a larger tensor \lstinline{dense} of shape \lstinline{{[}LARGE0,\ D1,\ ..\ ,\ DN{]}} where \lstinline{LARGE0\ \textgreater{}\textgreater{}\ D0}. The values in \lstinline{indices} are the indices in the first dimension of the slices that have been extracted from the larger tensor. The dense tensor \lstinline{dense} represented by an \lstinline{IndexedSlices} \lstinline{slices} has \begin{Shaded} \begin{Highlighting}[] \NormalTok{dense[slices.indices[i], :, :, :, ...] }\OperatorTok{=} \NormalTok{slices.values[i, :, :, :, ...]} \end{Highlighting} \end{Shaded} The \lstinline{IndexedSlices} class is used principally in the definition of gradients for operations that have sparse gradients (e.g. \href{../../api_docs/python/array_ops.md\#gather}{\lstinline{tf.gather}}). Contrast this representation with \href{../../api_docs/python/sparse_ops.md\#SparseTensor}{\lstinline{SparseTensor}}, which uses multi-dimensional indices and scalar values. 
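A plain-Python sketch may make the relation between the \lstinline{values}/\lstinline{indices} pair and the dense tensor more concrete. The helper names are hypothetical and nothing here is TensorFlow code; the sketch also assumes that the indices are unique and that rows not listed in \lstinline{indices} are zero, neither of which is stated in the text above.

```python
def to_dense(values, indices, dense_shape):
    """Build a 2-D dense list with dense[indices[i]] = values[i]; other rows stay zero."""
    rows, width = dense_shape
    dense = [[0.0] * width for _ in range(rows)]
    for i, row in enumerate(indices):
        dense[row] = list(values[i])
    return dense


def mask_slices(values, indices, mask_indices):
    """Drop the slices whose index is listed in mask_indices; keep the rest in order."""
    masked = set(mask_indices)
    kept = [i for i, row in enumerate(indices) if row not in masked]
    return [values[i] for i in kept], [indices[i] for i in kept]


values = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]   # shape [D0, D1] = [3, 2]
indices = [4, 0, 2]                             # shape [D0] = [3]

dense = to_dense(values, indices, (6, 2))
print(dense[4], dense[0], dense[1])  # [1.0, 1.0] [2.0, 2.0] [0.0, 0.0]

kept_values, kept_indices = mask_slices(values, indices, [4])
print(kept_indices)  # [0, 2]
```

The second helper follows the \lstinline{tf.sparse\_mask} example given earlier: masking indices 12 and 45 out of \lstinline{{[}12,\ 26,\ 37,\ 45{]}} leaves \lstinline{{[}26,\ 37{]}}.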
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.IndexedSlices.\_\_init\_\_(values,\ indices,\ dense\_shape=None)} }{tf.IndexedSlices.\_\_init\_\_(values, indices, dense\_shape=None) }}\label{tf.indexedslices.ux5fux5finitux5fux5fvalues-indices-denseux5fshapenone} Creates an \lstinline{IndexedSlices}. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.IndexedSlices.values} }{tf.IndexedSlices.values }}\label{tf.indexedslices.values} A \lstinline{Tensor} containing the values of the slices. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.IndexedSlices.indices} }{tf.IndexedSlices.indices }}\label{tf.indexedslices.indices} A 1-D \lstinline{Tensor} containing the indices of the slices. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.IndexedSlices.dense\_shape} }{tf.IndexedSlices.dense\_shape }}\label{tf.indexedslices.denseux5fshape} A 1-D \lstinline{Tensor} containing the shape of the corresponding dense tensor. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.IndexedSlices.name} }{tf.IndexedSlices.name }}\label{tf.indexedslices.name} The name of this \lstinline{IndexedSlices}. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.IndexedSlices.dtype} }{tf.IndexedSlices.dtype }}\label{tf.indexedslices.dtype} The \lstinline{DType} of elements in this tensor. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.IndexedSlices.device} }{tf.IndexedSlices.device }}\label{tf.indexedslices.device} The name of the device on which \lstinline{values} will be produced, or \lstinline{None}. 
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\lstinline{tf.IndexedSlices.op} }{tf.IndexedSlices.op }}\label{tf.indexedslices.op} The \lstinline{Operation} that produces \lstinline{values} as an output. ================================================ FILE: tex_pdf/api/python/c4s04_array_ops.tex ================================================ \section{Tensor Transformations }\label{tensor-transformations} Note: Functions taking \lstinline{Tensor} arguments can also take anything accepted by \href{../../api_docs/python/framework.md\#convert_to_tensor}{\lstinline{tf.convert_to_tensor}}. \subsection{Contents}\label{contents} \subsubsection{\texorpdfstring{\protect\hyperlink{AUTOGENERATED-tensor-transformations}{Tensor Transformations}}{Tensor Transformations}}\label{tensor-transformations-1} \begin{itemize} \tightlist \item \protect\hyperlink{AUTOGENERATED-casting}{Casting} \item \protect\hyperlink{stringux5ftoux5fnumber}{\lstinline{tf.string_to_number(string_tensor, out_type=None, name=None)}} \item \protect\hyperlink{toux5fdouble}{\lstinline{tf.to_double(x, name=\textquotesingle{}ToDouble\textquotesingle{})}} \item \protect\hyperlink{toux5ffloat}{\lstinline{tf.to_float(x, name=\textquotesingle{}ToFloat\textquotesingle{})}} \item \protect\hyperlink{toux5fbfloat16}{\lstinline{tf.to_bfloat16(x, name=\textquotesingle{}ToBFloat16\textquotesingle{})}} \item \protect\hyperlink{toux5fint32}{\lstinline{tf.to_int32(x, name=\textquotesingle{}ToInt32\textquotesingle{})}} \item \protect\hyperlink{toux5fint64}{\lstinline{tf.to_int64(x, name=\textquotesingle{}ToInt64\textquotesingle{})}} \item \protect\hyperlink{cast}{\lstinline{tf.cast(x, dtype, name=None)}} \item \protect\hyperlink{AUTOGENERATED-shapes-and-shaping}{Shapes and Shaping} \item \protect\hyperlink{shape}{\lstinline{tf.shape(input, name=None)}} \item \protect\hyperlink{size}{\lstinline{tf.size(input, name=None)}} \item \protect\hyperlink{rank}{\lstinline{tf.rank(input, 
name=None)}} \item \protect\hyperlink{reshape}{\lstinline{tf.reshape(tensor, shape, name=None)}} \item \protect\hyperlink{squeeze}{\lstinline{tf.squeeze(input, squeeze_dims=None, name=None)}} \item \protect\hyperlink{expandux5fdims}{\lstinline{tf.expand_dims(input, dim, name=None)}} \item \protect\hyperlink{AUTOGENERATED-slicing-and-joining}{Slicing and Joining} \item \protect\hyperlink{slice}{\lstinline{tf.slice(input_, begin, size, name=None)}} \item \protect\hyperlink{split}{\lstinline{tf.split(split_dim, num_split, value, name=\textquotesingle{}split\textquotesingle{})}} \item \protect\hyperlink{tile}{\lstinline{tf.tile(input, multiples, name=None)}} \item \protect\hyperlink{pad}{\lstinline{tf.pad(input, paddings, name=None)}} \item \protect\hyperlink{concat}{\lstinline{tf.concat(concat_dim, values, name=\textquotesingle{}concat\textquotesingle{})}} \item \protect\hyperlink{pack}{\lstinline{tf.pack(values, name=\textquotesingle{}pack\textquotesingle{})}} \item \protect\hyperlink{unpack}{\lstinline{tf.unpack(value, num=None, name=\textquotesingle{}unpack\textquotesingle{})}} \item \protect\hyperlink{reverseux5fsequence}{\lstinline{tf.reverse_sequence(input, seq_lengths, seq_dim, name=None)}} \item \protect\hyperlink{reverse}{\lstinline{tf.reverse(tensor, dims, name=None)}} \item \protect\hyperlink{transpose}{\lstinline{tf.transpose(a, perm=None, name=\textquotesingle{}transpose\textquotesingle{})}} \item \protect\hyperlink{gather}{\lstinline{tf.gather(params, indices, name=None)}} \item \protect\hyperlink{dynamicux5fpartition}{\lstinline{tf.dynamic_partition(data, partitions, num_partitions, name=None)}} \item \protect\hyperlink{dynamicux5fstitch}{\lstinline{tf.dynamic_stitch(indices, data, name=None)}} \end{itemize} \subsection{Casting }\label{casting} TensorFlow provides several operations that you can use to cast tensor data types in your graph. 
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\subsubsection{\texorpdfstring{\lstinline{tf.string_to_number(string_tensor, out_type=None, name=None)} }{tf.string_to_number(string_tensor, out_type=None, name=None) }}\label{tf.stringux5ftoux5fnumberstringux5ftensor-outux5ftypenone-namenone}

Converts each string in the input Tensor to the specified numeric type.

(Note that int32 overflow results in an error while float overflow results in a rounded value.)

\subparagraph{Args: }\label{args}

\begin{itemize}
\tightlist
\item \lstinline{string_tensor}: A \lstinline{Tensor} of type \lstinline{string}.
\item \lstinline{out_type}: An optional \lstinline{tf.DType} from: \lstinline{tf.float32, tf.int32}. Defaults to \lstinline{tf.float32}. The numeric type to interpret each string in \lstinline{string_tensor} as.
\item \lstinline{name}: A name for the operation (optional).
\end{itemize}

\subparagraph{Returns: }\label{returns}

A \lstinline{Tensor} of type \lstinline{out_type}. A Tensor of the same shape as the input \lstinline{string_tensor}.

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\subsubsection{\texorpdfstring{\lstinline{tf.to_double(x, name=\textquotesingle{}ToDouble\textquotesingle{})} }{tf.to_double(x, name='ToDouble') }}\label{tf.toux5fdoublex-nametodouble}

Casts a tensor to type \lstinline{float64}.

\subparagraph{Args: }\label{args-1}

\begin{itemize}
\tightlist
\item \lstinline{x}: A \lstinline{Tensor} or \lstinline{SparseTensor}.
\item \lstinline{name}: A name for the operation (optional).
\end{itemize}

\subparagraph{Returns: }\label{returns-1}

A \lstinline{Tensor} or \lstinline{SparseTensor} with same shape as \lstinline{x} with type \lstinline{float64}.

\subparagraph{Raises: }\label{raises}

\begin{itemize}
\tightlist
\item \lstinline{TypeError}: If \lstinline{x} cannot be cast to the \lstinline{float64}.
\end{itemize} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\lstinline{tf.to_float(x, name=\textquotesingle{}ToFloat\textquotesingle{})} }{tf.to_float(x, name='ToFloat') }}\label{tf.toux5ffloatx-nametofloat} Casts a tensor to type \lstinline{float32}. \subparagraph{Args: }\label{args-2} \begin{itemize} \tightlist \item \lstinline{x}: A \lstinline{Tensor} or \lstinline{SparseTensor}. \item \lstinline{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-2} A \lstinline{Tensor} or \lstinline{SparseTensor} with same shape as \lstinline{x} with type \lstinline{float32}. \subparagraph{Raises: }\label{raises-1} \begin{itemize} \tightlist \item \lstinline{TypeError}: If \lstinline{x} cannot be cast to the \lstinline{float32}. \end{itemize} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\lstinline{tf.to_bfloat16(x, name=\textquotesingle{}ToBFloat16\textquotesingle{})} }{tf.to_bfloat16(x, name='ToBFloat16') }}\label{tf.toux5fbfloat16x-nametobfloat16} Casts a tensor to type \lstinline{bfloat16}. \subparagraph{Args: }\label{args-3} \begin{itemize} \tightlist \item \lstinline{x}: A \lstinline{Tensor} or \lstinline{SparseTensor}. \item \lstinline{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-3} A \lstinline{Tensor} or \lstinline{SparseTensor} with same shape as \lstinline{x} with type \lstinline{bfloat16}. \subparagraph{Raises: }\label{raises-2} \begin{itemize} \tightlist \item \lstinline{TypeError}: If \lstinline{x} cannot be cast to the \lstinline{bfloat16}. \end{itemize} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\lstinline{tf.to_int32(x, name=\textquotesingle{}ToInt32\textquotesingle{})} }{tf.to_int32(x, name='ToInt32') }}\label{tf.toux5fint32x-nametoint32} Casts a tensor to type \lstinline{int32}. 
\subparagraph{Args: }\label{args-4} \begin{itemize} \tightlist \item \lstinline{x}: A \lstinline{Tensor} or \lstinline{SparseTensor}. \item \lstinline{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-4} A \lstinline{Tensor} or \lstinline{SparseTensor} with same shape as \lstinline{x} with type \lstinline{int32}. \subparagraph{Raises: }\label{raises-3} \begin{itemize} \tightlist \item \lstinline{TypeError}: If \lstinline{x} cannot be cast to the \lstinline{int32}. \end{itemize} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\lstinline{tf.to_int64(x, name=\textquotesingle{}ToInt64\textquotesingle{})} }{tf.to_int64(x, name='ToInt64') }}\label{tf.toux5fint64x-nametoint64} Casts a tensor to type \lstinline{int64}. \subparagraph{Args: }\label{args-5} \begin{itemize} \tightlist \item \lstinline{x}: A \lstinline{Tensor} or \lstinline{SparseTensor}. \item \lstinline{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-5} A \lstinline{Tensor} or \lstinline{SparseTensor} with same shape as \lstinline{x} with type \lstinline{int64}. \subparagraph{Raises: }\label{raises-4} \begin{itemize} \tightlist \item \lstinline{TypeError}: If \lstinline{x} cannot be cast to the \lstinline{int64}. \end{itemize} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\lstinline{tf.cast(x, dtype, name=None)} }{tf.cast(x, dtype, name=None) }}\label{tf.castx-dtype-namenone} Casts a tensor to a new type. The operation casts \lstinline{x} (in case of \lstinline{Tensor}) or \lstinline{x.values} (in case of \lstinline{SparseTensor}) to \lstinline{dtype}. 
For example:

\begin{Shaded}
\begin{Highlighting}[]
\CommentTok{# tensor `a` is [1.8, 2.2], dtype=tf.float32}
\NormalTok{tf.cast(a, tf.int32) }\OperatorTok{==>} \NormalTok{[}\DecValTok{1}\NormalTok{, }\DecValTok{2}\NormalTok{] }\CommentTok{# dtype=tf.int32}
\end{Highlighting}
\end{Shaded}

\subparagraph{Args: }\label{args-6}

\begin{itemize}
\tightlist
\item \lstinline{x}: A \lstinline{Tensor} or \lstinline{SparseTensor}.
\item \lstinline{dtype}: The destination type.
\item \lstinline{name}: A name for the operation (optional).
\end{itemize}

\subparagraph{Returns: }\label{returns-6}

A \lstinline{Tensor} or \lstinline{SparseTensor} with same shape as \lstinline{x}.

\subparagraph{Raises: }\label{raises-5}

\begin{itemize}
\tightlist
\item \lstinline{TypeError}: If \lstinline{x} cannot be cast to the \lstinline{dtype}.
\end{itemize}

\subsection{Shapes and Shaping }\label{shapes-and-shaping}

TensorFlow provides several operations that you can use to determine the shape of a tensor and change the shape of a tensor.

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\subsubsection{\texorpdfstring{\lstinline{tf.shape(input, name=None)} }{tf.shape(input, name=None) }}\label{tf.shapeinput-namenone}

Returns the shape of a tensor.

This operation returns a 1-D integer tensor representing the shape of \lstinline{input}.

For example:

\begin{lstlisting}
# 't' is [[[1, 1, 1], [2, 2, 2]], [[3, 3, 3], [4, 4, 4]]]
shape(t) ==> [2, 2, 3]
\end{lstlisting}

\subparagraph{Args: }\label{args-7}

\begin{itemize}
\tightlist
\item \lstinline{input}: A \lstinline{Tensor}.
\item \lstinline{name}: A name for the operation (optional).
\end{itemize}

\subparagraph{Returns: }\label{returns-7}

A \lstinline{Tensor} of type \lstinline{int32}.

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\subsubsection{\texorpdfstring{\lstinline{tf.size(input, name=None)} }{tf.size(input, name=None) }}\label{tf.sizeinput-namenone}

Returns the size of a tensor.
This operation returns an integer representing the number of elements in \lstinline{input}. For example:

\begin{lstlisting}
# 't' is [[[1, 1, 1], [2, 2, 2]], [[3, 3, 3], [4, 4, 4]]]
size(t) ==> 12
\end{lstlisting}

\subparagraph{Args: }\label{args-8}

\begin{itemize}
\tightlist
\item \lstinline{input}: A \lstinline{Tensor}.
\item \lstinline{name}: A name for the operation (optional).
\end{itemize}

\subparagraph{Returns: }\label{returns-8}

A \lstinline{Tensor} of type \lstinline{int32}.

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\subsubsection{\texorpdfstring{\lstinline{tf.rank(input, name=None)} }{tf.rank(input, name=None) }}\label{tf.rankinput-namenone}

Returns the rank of a tensor.

This operation returns an integer representing the rank of \lstinline{input}. For example:

\begin{lstlisting}
# 't' is [[[1, 1, 1], [2, 2, 2]], [[3, 3, 3], [4, 4, 4]]]
# shape of tensor 't' is [2, 2, 3]
rank(t) ==> 3
\end{lstlisting}

\textbf{Note}: The rank of a tensor is not the same as the rank of a matrix. The rank of a tensor is the number of indices required to uniquely select each element of the tensor. Rank is also known as ``order'', ``degree'', or ``ndims.''

\subparagraph{Args: }\label{args-9}

\begin{itemize}
\tightlist
\item \lstinline{input}: A \lstinline{Tensor}.
\item \lstinline{name}: A name for the operation (optional).
\end{itemize}

\subparagraph{Returns: }\label{returns-9}

A \lstinline{Tensor} of type \lstinline{int32}.

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\subsubsection{\texorpdfstring{\lstinline{tf.reshape(tensor, shape, name=None)} }{tf.reshape(tensor, shape, name=None) }}\label{tf.reshapetensor-shape-namenone}

Reshapes a tensor.

Given \lstinline{tensor}, this operation returns a tensor that has the same values as \lstinline{tensor} with shape \lstinline{shape}.
If \lstinline{shape} is the special value \lstinline{{[}-1{]}}, then \lstinline{tensor} is flattened and the operation outputs a 1-D tensor with all elements of \lstinline{tensor}.

If \lstinline{shape} is 1-D or higher, then the operation returns a tensor with shape \lstinline{shape} filled with the values of \lstinline{tensor}. In this case, the number of elements implied by \lstinline{shape} must be the same as the number of elements in \lstinline{tensor}.

For example:

\begin{lstlisting}
# tensor 't' is [1, 2, 3, 4, 5, 6, 7, 8, 9]
# tensor 't' has shape [9]
reshape(t, [3, 3]) ==> [[1, 2, 3]
                        [4, 5, 6]
                        [7, 8, 9]]

# tensor 't' is [[[1, 1], [2, 2]]
#                [[3, 3], [4, 4]]]
# tensor 't' has shape [2, 2, 2]
reshape(t, [2, 4]) ==> [[1, 1, 2, 2]
                        [3, 3, 4, 4]]

# tensor 't' is [[[1, 1, 1],
#                 [2, 2, 2]],
#                [[3, 3, 3],
#                 [4, 4, 4]],
#                [[5, 5, 5],
#                 [6, 6, 6]]]
# tensor 't' has shape [3, 2, 3]
# pass '[-1]' to flatten 't'
reshape(t, [-1]) ==> [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 6, 6, 6]
\end{lstlisting}

\subparagraph{Args: }\label{args-10}

\begin{itemize}
\tightlist
\item \lstinline{tensor}: A \lstinline{Tensor}.
\item \lstinline{shape}: A \lstinline{Tensor} of type \lstinline{int32}. Defines the shape of the output tensor.
\item \lstinline{name}: A name for the operation (optional).
\end{itemize}

\subparagraph{Returns: }\label{returns-10}

A \lstinline{Tensor}. Has the same type as \lstinline{tensor}.

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\subsubsection{\texorpdfstring{\lstinline{tf.squeeze(input, squeeze_dims=None, name=None)} }{tf.squeeze(input, squeeze_dims=None, name=None) }}\label{tf.squeezeinput-squeezeux5fdimsnone-namenone}

Removes dimensions of size 1 from the shape of a tensor.

Given a tensor \lstinline{input}, this operation returns a tensor of the same type with all dimensions of size 1 removed. If you don't want to remove all size 1 dimensions, you can remove specific size 1 dimensions by specifying \lstinline{squeeze_dims}.
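
The effect of \lstinline{squeeze_dims} on the shape can be sketched in plain Python. This is only an illustration of the rule described above, not TensorFlow's implementation; the helper name \lstinline{squeezed_shape} is hypothetical and works on a shape list rather than on a tensor.

```python
def squeezed_shape(shape, squeeze_dims=None):
    """Return the shape that squeezing would produce (illustrative helper)."""
    squeeze_dims = list(squeeze_dims or [])
    for d in squeeze_dims:
        # Squeezing a dimension whose size is not 1 is an error.
        if shape[d] != 1:
            raise ValueError("cannot squeeze dimension %d of size %d" % (d, shape[d]))
    if squeeze_dims:
        # Drop only the listed dimensions.
        return [s for i, s in enumerate(shape) if i not in squeeze_dims]
    # Default: drop every dimension of size 1.
    return [s for s in shape if s != 1]


print(squeezed_shape([1, 2, 1, 3, 1, 1]))          # [2, 3]
print(squeezed_shape([1, 2, 1, 3, 1, 1], [2, 4]))  # [1, 2, 3, 1]
```
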
For example: \begin{lstlisting} # 't' is a tensor of shape [1, 2, 1, 3, 1, 1] shape(squeeze(t)) ==> [2, 3] \end{lstlisting} Or, to remove specific size 1 dimensions: \begin{lstlisting} # 't' is a tensor of shape [1, 2, 1, 3, 1, 1] shape(squeeze(t, [2, 4])) ==> [1, 2, 3, 1] \end{lstlisting} \subparagraph{Args: }\label{args-11} \begin{itemize} \tightlist \item \lstinline{input}: A \lstinline{Tensor}. The \lstinline{input} to squeeze. \item \lstinline{squeeze_dims}: An optional list of \lstinline{ints}. Defaults to \lstinline{{[}{]}}. If specified, only squeezes the dimensions listed. The dimension index starts at 0. It is an error to squeeze a dimension that is not 1. \item \lstinline{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-11} A \lstinline{Tensor}. Has the same type as \lstinline{input}. Contains the same data as \lstinline{input}, but has one or more dimensions of size 1 removed. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\lstinline{tf.expand_dims(input, dim, name=None)} }{tf.expand_dims(input, dim, name=None) }}\label{tf.expandux5fdimsinput-dim-namenone} Inserts a dimension of 1 into a tensor's shape. Given a tensor \lstinline{input}, this operation inserts a dimension of 1 at the dimension index \lstinline{dim} of \lstinline{input}'s shape. The dimension index \lstinline{dim} starts at zero; if you specify a negative number for \lstinline{dim} it is counted backward from the end. This operation is useful if you want to add a batch dimension to a single element. For example, if you have a single image of shape \lstinline{{[}height, width, channels{]}}, you can make it a batch of 1 image with \lstinline{expand_dims(image, 0)}, which will make the shape \lstinline{{[}1, height, width, channels{]}}. 
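
As a rough sketch of the index arithmetic, the following plain-Python helper computes the resulting shape. The name \lstinline{expanded_shape} and the sample image size are assumptions made for illustration; the range check follows the constraint this section gives for \lstinline{dim}, and the code is not TensorFlow's implementation.

```python
def expanded_shape(shape, dim):
    """Return the shape after inserting a size-1 dimension at index dim."""
    rank = len(shape)
    # Documented constraint: -1 - rank <= dim <= rank.
    if not -1 - rank <= dim <= rank:
        raise ValueError("dim %d out of range for rank %d" % (dim, rank))
    if dim < 0:
        # Negative indices count backward from the end, so -1 appends.
        dim += rank + 1
    return shape[:dim] + [1] + shape[dim:]


# A hypothetical image of shape [height, width, channels] = [28, 28, 3].
print(expanded_shape([28, 28, 3], 0))   # [1, 28, 28, 3]
print(expanded_shape([2], -1))          # [2, 1]
```
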
Other examples: \begin{lstlisting} # 't' is a tensor of shape [2] shape(expand_dims(t, 0)) ==> [1, 2] shape(expand_dims(t, 1)) ==> [2, 1] shape(expand_dims(t, -1)) ==> [2, 1] # 't2' is a tensor of shape [2, 3, 5] shape(expand_dims(t2, 0)) ==> [1, 2, 3, 5] shape(expand_dims(t2, 2)) ==> [2, 3, 1, 5] shape(expand_dims(t2, 3)) ==> [2, 3, 5, 1] \end{lstlisting} This operation requires that: \lstinline{-1-input.dims() \textless{}= dim \textless{}= input.dims()} This operation is related to \lstinline{squeeze()}, which removes dimensions of size 1. \subparagraph{Args: }\label{args-12} \begin{itemize} \tightlist \item \lstinline{input}: A \lstinline{Tensor}. \item \lstinline{dim}: A \lstinline{Tensor} of type \lstinline{int32}. 0-D (scalar). Specifies the dimension index at which to expand the shape of \lstinline{input}. \item \lstinline{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-12} A \lstinline{Tensor}. Has the same type as \lstinline{input}. Contains the same data as \lstinline{input}, but its shape has an additional dimension of size 1 added. \subsection{Slicing and Joining }\label{slicing-and-joining} TensorFlow provides several operations to slice or extract parts of a tensor, or join multiple tensors together. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\lstinline{tf.slice(input_, begin, size, name=None)} }{tf.slice(input_, begin, size, name=None) }}\label{tf.sliceinputux5f-begin-size-namenone} Extracts a slice from a tensor. This operation extracts a slice of size \lstinline{size} from a tensor \lstinline{input} starting at the location specified by \lstinline{begin}. The slice \lstinline{size} is represented as a tensor shape, where \lstinline{size{[}i{]}} is the number of elements of the 'i'th dimension of \lstinline{input} that you want to slice. 
The starting location (\lstinline{begin}) for the slice is represented as an offset in each dimension of \lstinline{input}. In other words, \lstinline{begin{[}i{]}} is the offset into the 'i'th dimension of \lstinline{input} that you want to slice from. \lstinline{begin} is zero-based; \lstinline{size} is one-based. If \lstinline{size{[}i{]}} is -1, all remaining elements in dimension i are included in the slice. In other words, this is equivalent to setting: \lstinline{size{[}i{]} = input.dim_size(i) - begin{[}i{]}} This operation requires that: \lstinline{0 \textless{}= begin{[}i{]} \textless{}= begin{[}i{]} + size{[}i{]} \textless{}= Di for i in {[}0, n{]}} For example: \begin{lstlisting} # 'input' is [[[1, 1, 1], [2, 2, 2]], # [[3, 3, 3], [4, 4, 4]], # [[5, 5, 5], [6, 6, 6]]] tf.slice(input, [1, 0, 0], [1, 1, 3]) ==> [[[3, 3, 3]]] tf.slice(input, [1, 0, 0], [1, 2, 3]) ==> [[[3, 3, 3], [4, 4, 4]]] tf.slice(input, [1, 0, 0], [2, 1, 3]) ==> [[[3, 3, 3]], [[5, 5, 5]]] \end{lstlisting} \subparagraph{Args: }\label{args-13} \begin{itemize} \tightlist \item \lstinline{input_}: A \lstinline{Tensor}. \item \lstinline{begin}: An \lstinline{int32} or \lstinline{int64} \lstinline{Tensor}. \item \lstinline{size}: An \lstinline{int32} or \lstinline{int64} \lstinline{Tensor}. \item \lstinline{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-13} A \lstinline{Tensor} the same type as \lstinline{input}. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\lstinline{tf.split(split_dim, num_split, value, name=\textquotesingle{}split\textquotesingle{})} }{tf.split(split_dim, num_split, value, name='split') }}\label{tf.splitsplitux5fdim-numux5fsplit-value-namesplit} Splits a tensor into \lstinline{num_split} tensors along one dimension. Splits \lstinline{value} along dimension \lstinline{split_dim} into \lstinline{num_split} smaller tensors. 
Requires that \lstinline{num_split} evenly divide \lstinline{value.shape{[}split_dim{]}}. For example: \begin{Shaded} \begin{Highlighting}[] \CommentTok{# 'value' is a tensor with shape [5, 30]} \CommentTok{# Split 'value' into 3 tensors along dimension 1} \NormalTok{split0, split1, split2 }\OperatorTok{=} \NormalTok{tf.split(}\DecValTok{1}\NormalTok{, }\DecValTok{3}\NormalTok{, value)} \NormalTok{tf.shape(split0) }\OperatorTok{==>} \NormalTok{[}\DecValTok{5}\NormalTok{, }\DecValTok{10}\NormalTok{]} \end{Highlighting} \end{Shaded} \subparagraph{Args: }\label{args-14} \begin{itemize} \tightlist \item \lstinline{split_dim}: A 0-D \lstinline{int32} \lstinline{Tensor}. The dimension along which to split. Must be in the range \lstinline{{[}0, rank(value))}. \item \lstinline{num_split}: A 0-D \lstinline{int32} \lstinline{Tensor}. The number of ways to split. \item \lstinline{value}: The \lstinline{Tensor} to split. \item \lstinline{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-14} \lstinline{num_split} \lstinline{Tensor} objects resulting from splitting \lstinline{value}. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\lstinline{tf.tile(input, multiples, name=None)} }{tf.tile(input, multiples, name=None) }}\label{tf.tileinput-multiples-namenone} Constructs a tensor by tiling a given tensor. This operation creates a new tensor by replicating \lstinline{input} \lstinline{multiples} times. The output tensor's i'th dimension has \lstinline{input.dims(i) * multiples{[}i{]}} elements, and the values of \lstinline{input} are replicated \lstinline{multiples{[}i{]}} times along the 'i'th dimension. For example, tiling \lstinline{{[}a b c d{]}} by \lstinline{{[}2{]}} produces \lstinline{{[}a b c d a b c d{]}}. \subparagraph{Args: }\label{args-15} \begin{itemize} \tightlist \item \lstinline{input}: A \lstinline{Tensor}. 1-D or higher. 
\item \lstinline{multiples}: A \lstinline{Tensor} of type \lstinline{int32}. 1-D. Length must be the same as the number of dimensions in \lstinline{input}.
\item \lstinline{name}: A name for the operation (optional).
\end{itemize}

\subparagraph{Returns: }\label{returns-15}

A \lstinline{Tensor}. Has the same type as \lstinline{input}.

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\subsubsection{\texorpdfstring{\lstinline{tf.pad(input, paddings, name=None)} }{tf.pad(input, paddings, name=None) }}\label{tf.padinput-paddings-namenone}

Pads a tensor with zeros.

This operation pads \lstinline{input} with zeros according to the \lstinline{paddings} you specify. \lstinline{paddings} is an integer tensor with shape \lstinline{{[}n, 2{]}}, where n is the rank of \lstinline{input}. For each dimension D of \lstinline{input}, \lstinline{paddings{[}D, 0{]}} indicates how many zeros to add before the contents of \lstinline{input} in that dimension, and \lstinline{paddings{[}D, 1{]}} indicates how many zeros to add after the contents of \lstinline{input} in that dimension.

The padded size of each dimension D of the output is:

\lstinline{paddings(D, 0) + input.dim_size(D) + paddings(D, 1)}

For example:

\begin{lstlisting}
# 't' is [[1, 1], [2, 2]]
# 'paddings' is [[1, 1], [2, 2]]
# rank of 't' is 2
pad(t, paddings) ==> [[0, 0, 0, 0, 0, 0]
                      [0, 0, 1, 1, 0, 0]
                      [0, 0, 2, 2, 0, 0]
                      [0, 0, 0, 0, 0, 0]]
\end{lstlisting}

\subparagraph{Args: }\label{args-16}

\begin{itemize}
\tightlist
\item \lstinline{input}: A \lstinline{Tensor}.
\item \lstinline{paddings}: A \lstinline{Tensor} of type \lstinline{int32}.
\item \lstinline{name}: A name for the operation (optional).
\end{itemize}

\subparagraph{Returns: }\label{returns-16}

A \lstinline{Tensor}. Has the same type as \lstinline{input}.
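
The size formula \lstinline{paddings(D, 0) + input.dim_size(D) + paddings(D, 1)} can be checked with a small plain-Python sketch for the 2-D case. The helper \lstinline{pad_2d} is hypothetical and operates on nested lists; it only mirrors the documented semantics and is not how TensorFlow implements the op.

```python
def pad_2d(t, paddings):
    """Zero-pad a 2-D nested list; paddings is [[before0, after0], [before1, after1]]."""
    (top, bottom), (left, right) = paddings
    width = left + len(t[0]) + right
    rows = [[0] * width for _ in range(top)]                       # rows added before
    rows += [[0] * left + list(row) + [0] * right for row in t]    # padded original rows
    rows += [[0] * width for _ in range(bottom)]                   # rows added after
    return rows


t = [[1, 1], [2, 2]]
padded = pad_2d(t, [[1, 1], [2, 2]])
for row in padded:
    print(row)
# [0, 0, 0, 0, 0, 0]
# [0, 0, 1, 1, 0, 0]
# [0, 0, 2, 2, 0, 0]
# [0, 0, 0, 0, 0, 0]
```

With \lstinline{paddings = {[}{[}1, 1{]}, {[}2, 2{]}{]}} the output has 1 + 2 + 1 = 4 rows and 2 + 2 + 2 = 6 columns, which matches the formula.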
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\lstinline{tf.concat(concat_dim, values, name=\textquotesingle{}concat\textquotesingle{})} }{tf.concat(concat_dim, values, name='concat') }}\label{tf.concatconcatux5fdim-values-nameconcat} Concatenates tensors along one dimension. Concatenates the list of tensors \lstinline{values} along dimension \lstinline{concat_dim}. If \lstinline{values{[}i{]}.shape = {[}D0, D1, ... Dconcat_dim(i), ...Dn{]}}, the concatenated result has shape \begin{lstlisting} [D0, D1, ... Rconcat_dim, ...Dn] \end{lstlisting} where \begin{lstlisting} Rconcat_dim = sum(Dconcat_dim(i)) \end{lstlisting} That is, the data from the input tensors is joined along the \lstinline{concat_dim} dimension. The number of dimensions of the input tensors must match, and all dimensions except \lstinline{concat_dim} must be equal. For example: \begin{Shaded} \begin{Highlighting}[] \NormalTok{t1 }\OperatorTok{=} \NormalTok{[[}\DecValTok{1}\NormalTok{, }\DecValTok{2}\NormalTok{, }\DecValTok{3}\NormalTok{], [}\DecValTok{4}\NormalTok{, }\DecValTok{5}\NormalTok{, }\DecValTok{6}\NormalTok{]]} \NormalTok{t2 }\OperatorTok{=} \NormalTok{[[}\DecValTok{7}\NormalTok{, }\DecValTok{8}\NormalTok{, }\DecValTok{9}\NormalTok{], [}\DecValTok{10}\NormalTok{, }\DecValTok{11}\NormalTok{, }\DecValTok{12}\NormalTok{]]} \NormalTok{tf.concat(}\DecValTok{0}\NormalTok{, [t1, t2]) }\OperatorTok{==>} \NormalTok{[[}\DecValTok{1}\NormalTok{, }\DecValTok{2}\NormalTok{, }\DecValTok{3}\NormalTok{], [}\DecValTok{4}\NormalTok{, }\DecValTok{5}\NormalTok{, }\DecValTok{6}\NormalTok{], [}\DecValTok{7}\NormalTok{, }\DecValTok{8}\NormalTok{, }\DecValTok{9}\NormalTok{], [}\DecValTok{10}\NormalTok{, }\DecValTok{11}\NormalTok{, }\DecValTok{12}\NormalTok{]]} \NormalTok{tf.concat(}\DecValTok{1}\NormalTok{, [t1, t2]) }\OperatorTok{==>} \NormalTok{[[}\DecValTok{1}\NormalTok{, }\DecValTok{2}\NormalTok{, }\DecValTok{3}\NormalTok{, }\DecValTok{7}\NormalTok{, 
}\DecValTok{8}\NormalTok{, }\DecValTok{9}\NormalTok{], [}\DecValTok{4}\NormalTok{, }\DecValTok{5}\NormalTok{, }\DecValTok{6}\NormalTok{, }\DecValTok{10}\NormalTok{, }\DecValTok{11}\NormalTok{, }\DecValTok{12}\NormalTok{]]} \CommentTok{# tensor t3 with shape [2, 3]} \CommentTok{# tensor t4 with shape [2, 3]} \NormalTok{tf.shape(tf.concat(}\DecValTok{0}\NormalTok{, [t3, t4])) }\OperatorTok{==>} \NormalTok{[}\DecValTok{4}\NormalTok{, }\DecValTok{3}\NormalTok{]} \NormalTok{tf.shape(tf.concat(}\DecValTok{1}\NormalTok{, [t3, t4])) }\OperatorTok{==>} \NormalTok{[}\DecValTok{2}\NormalTok{, }\DecValTok{6}\NormalTok{]} \end{Highlighting} \end{Shaded} \subparagraph{Args: }\label{args-17} \begin{itemize} \tightlist \item \lstinline{concat_dim}: 0-D \lstinline{int32} \lstinline{Tensor}. Dimension along which to concatenate. \item \lstinline{values}: A list of \lstinline{Tensor} objects or a single \lstinline{Tensor}. \item \lstinline{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-17} A \lstinline{Tensor} resulting from concatenation of the input tensors. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\lstinline{tf.pack(values, name=\textquotesingle{}pack\textquotesingle{})} }{tf.pack(values, name='pack') }}\label{tf.packvalues-namepack} Packs a list of rank-\lstinline{R} tensors into one rank-\lstinline{(R+1)} tensor. Packs tensors in \lstinline{values} into a tensor with rank one higher than each tensor in \lstinline{values} and shape \lstinline{{[}len(values){]} + values{[}0{]}.shape}. The output satisfies \lstinline{output{[}i, ...{]} = values{[}i{]}{[}...{]}}. This is the opposite of unpack. The numpy equivalent is \begin{lstlisting} tf.pack([x, y, z]) = np.asarray([x, y, z]) \end{lstlisting} \subparagraph{Args: }\label{args-18} \begin{itemize} \tightlist \item \lstinline{values}: A list of \lstinline{Tensor} objects with the same shape and type. 
\item \lstinline{name}: A name for this operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-18} \begin{itemize} \tightlist \item \lstinline{output}: A packed \lstinline{Tensor} with the same type as \lstinline{values}. \end{itemize} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\lstinline{tf.unpack(value, num=None, name=\textquotesingle{}unpack\textquotesingle{})} }{tf.unpack(value, num=None, name='unpack') }}\label{tf.unpackvalue-numnone-nameunpack} Unpacks the outer dimension of a rank-\lstinline{R} tensor into rank-\lstinline{(R-1)} tensors. Unpacks \lstinline{num} tensors from \lstinline{value} along the first dimension. If \lstinline{num} is not specified (the default), it is inferred from \lstinline{value}'s shape. If \lstinline{value.shape{[}0{]}} is not known, \lstinline{ValueError} is raised. The ith tensor in \lstinline{output} is the slice \lstinline{value{[}i, ...{]}}. Each tensor in \lstinline{output} has shape \lstinline{value.shape{[}1:{]}}. This is the opposite of pack. The numpy equivalent is \begin{lstlisting} tf.unpack(x, n) = list(x) \end{lstlisting} \subparagraph{Args: }\label{args-19} \begin{itemize} \tightlist \item \lstinline{value}: A rank \lstinline{R \textgreater{} 0} \lstinline{Tensor} to be unpacked. \item \lstinline{num}: An \lstinline{int}. The first dimension of value. Automatically inferred if \lstinline{None} (the default). \item \lstinline{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-19} The list of \lstinline{Tensor} objects unpacked from \lstinline{value}. \subparagraph{Raises: }\label{raises-6} \begin{itemize} \tightlist \item \lstinline{ValueError}: If \lstinline{num} is unspecified and cannot be inferred. 
\end{itemize}

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\subsubsection{\texorpdfstring{\lstinline{tf.reverse_sequence(input, seq_lengths, seq_dim, name=None)} }{tf.reverse_sequence(input, seq_lengths, seq_dim, name=None) }}\label{tf.reverseux5fsequenceinput-sequx5flengths-sequx5fdim-namenone}

Reverses variable length slices in dimension \lstinline{seq_dim}.

This op first slices \lstinline{input} along the first dimension, and for each slice \lstinline{i}, reverses the first \lstinline{seq_lengths{[}i{]}} elements along the dimension \lstinline{seq_dim}.

The elements of \lstinline{seq_lengths} must obey \lstinline{seq_lengths{[}i{]} \textless{} input.dims{[}seq_dim{]}}, and \lstinline{seq_lengths} must be a vector of length \lstinline{input.dims(0)}.

The output slice \lstinline{i} along dimension 0 is then given by input slice \lstinline{i}, with the first \lstinline{seq_lengths{[}i{]}} slices along dimension \lstinline{seq_dim} reversed.

For example:

\begin{lstlisting}
# Given this:
seq_dim = 1
input.dims = (4, ...)
seq_lengths = [7, 2, 3, 5]

# then slices of input are reversed on seq_dim, but only up to seq_lengths:
output[0, 0:7, :, ...] = input[0, 7:0:-1, :, ...]
output[1, 0:2, :, ...] = input[1, 2:0:-1, :, ...]
output[2, 0:3, :, ...] = input[2, 3:0:-1, :, ...]
output[3, 0:5, :, ...] = input[3, 5:0:-1, :, ...]

# while entries past seq_lengths are copied through:
output[0, 7:, :, ...] = input[0, 7:, :, ...]
output[1, 2:, :, ...] = input[1, 2:, :, ...]
output[2, 3:, :, ...] = input[2, 3:, :, ...]
output[3, 5:, :, ...] = input[3, 5:, :, ...]
\end{lstlisting}

\subparagraph{Args: }\label{args-20}

\begin{itemize}
\tightlist
\item \lstinline{input}: A \lstinline{Tensor}. The input to reverse.
\item \lstinline{seq_lengths}: A \lstinline{Tensor} of type \lstinline{int64}. 1-D with length \lstinline{input.dims(0)} and \lstinline{max(seq_lengths) \textless{} input.dims(seq_dim)}
\item \lstinline{seq_dim}: An \lstinline{int}.
The dimension which is partially reversed.
\item \lstinline{name}: A name for the operation (optional).
\end{itemize}

\subparagraph{Returns: }\label{returns-20}

A \lstinline{Tensor}. Has the same type as \lstinline{input}. The partially reversed input. It has the same shape as \lstinline{input}.

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\subsubsection{\texorpdfstring{\lstinline{tf.reverse(tensor, dims, name=None)} }{tf.reverse(tensor, dims, name=None) }}\label{tf.reversetensor-dims-namenone}

Reverses specific dimensions of a tensor.

Given a \lstinline{tensor}, and a \lstinline{bool} tensor \lstinline{dims} representing the dimensions of \lstinline{tensor}, this operation reverses each dimension i of \lstinline{tensor} where \lstinline{dims{[}i{]}} is \lstinline{True}.

\lstinline{tensor} can have up to 8 dimensions. The number of dimensions of \lstinline{tensor} must equal the number of elements in \lstinline{dims}. In other words:

\lstinline{rank(tensor) = size(dims)}

For example:

\begin{lstlisting}
# tensor 't' is [[[[ 0,  1,  2,  3],
#                  [ 4,  5,  6,  7],
#                  [ 8,  9, 10, 11]],
#                 [[12, 13, 14, 15],
#                  [16, 17, 18, 19],
#                  [20, 21, 22, 23]]]]
# tensor 't' shape is [1, 2, 3, 4]

# 'dims' is [False, False, False, True]
reverse(t, dims) ==> [[[[ 3,  2,  1,  0],
                        [ 7,  6,  5,  4],
                        [11, 10,  9,  8]],
                       [[15, 14, 13, 12],
                        [19, 18, 17, 16],
                        [23, 22, 21, 20]]]]

# 'dims' is [False, True, False, False]
reverse(t, dims) ==> [[[[12, 13, 14, 15],
                        [16, 17, 18, 19],
                        [20, 21, 22, 23]],
                       [[ 0,  1,  2,  3],
                        [ 4,  5,  6,  7],
                        [ 8,  9, 10, 11]]]]

# 'dims' is [False, False, True, False]
reverse(t, dims) ==> [[[[ 8,  9, 10, 11],
                        [ 4,  5,  6,  7],
                        [ 0,  1,  2,  3]],
                       [[20, 21, 22, 23],
                        [16, 17, 18, 19],
                        [12, 13, 14, 15]]]]
\end{lstlisting}

\subparagraph{Args: }\label{args-21}

\begin{itemize}
\tightlist
\item \lstinline{tensor}: A \lstinline{Tensor}. Must be one of the following types: \lstinline{uint8}, \lstinline{int8}, \lstinline{int32}, \lstinline{bool}, \lstinline{float32}, \lstinline{float64}. Up to 8-D.
\item \lstinline{dims}: A \lstinline{Tensor} of type \lstinline{bool}. 1-D. The dimensions to reverse.
\item \lstinline{name}: A name for the operation (optional).
\end{itemize}

\subparagraph{Returns: }\label{returns-21}

A \lstinline{Tensor}. Has the same type as \lstinline{tensor}. The same shape as \lstinline{tensor}.

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\subsubsection{\texorpdfstring{\lstinline{tf.transpose(a, perm=None, name=\textquotesingle{}transpose\textquotesingle{})} }{tf.transpose(a, perm=None, name='transpose') }}\label{tf.transposea-permnone-nametranspose}

Transposes \lstinline{a}. Permutes the dimensions according to \lstinline{perm}.

The returned tensor's dimension i will correspond to the input dimension \lstinline{perm{[}i{]}}. If \lstinline{perm} is not given, it is set to (n-1\ldots{}0), where n is the rank of the input tensor. Hence by default, this operation performs a regular matrix transpose on 2-D input Tensors.

For example:

\begin{Shaded}
\begin{Highlighting}[]
\CommentTok{# 'x' is [[1 2 3]}
\CommentTok{#         [4 5 6]]}
\NormalTok{tf.transpose(x) }\OperatorTok{==>} \NormalTok{[[}\DecValTok{1} \DecValTok{4}\NormalTok{]}
                     \NormalTok{[}\DecValTok{2} \DecValTok{5}\NormalTok{]}
                     \NormalTok{[}\DecValTok{3} \DecValTok{6}\NormalTok{]]}

\CommentTok{# Equivalently}
\NormalTok{tf.transpose(x, perm}\OperatorTok{=}\NormalTok{[}\DecValTok{1}\NormalTok{, }\DecValTok{0}\NormalTok{]) }\OperatorTok{==>} \NormalTok{[[}\DecValTok{1} \DecValTok{4}\NormalTok{]}
                                   \NormalTok{[}\DecValTok{2} \DecValTok{5}\NormalTok{]}
                                   \NormalTok{[}\DecValTok{3} \DecValTok{6}\NormalTok{]]}

\CommentTok{# 'perm' is more useful for n-dimensional tensors, for n > 2}
\CommentTok{# 'x' is   [[[1  2  3]}
\CommentTok{#            [4  5  6]]}
\CommentTok{#           [[7  8  9]}
\CommentTok{#            [10 11 12]]]}
\CommentTok{# Take the transpose of the matrices in dimension-0}
\NormalTok{tf.transpose(x, perm}\OperatorTok{=}\NormalTok{[}\DecValTok{0}\NormalTok{, }\DecValTok{2}\NormalTok{, 
}\DecValTok{1}\NormalTok{]) }\OperatorTok{==>} \NormalTok{[[[}\DecValTok{1} \DecValTok{4}\NormalTok{]} \NormalTok{[}\DecValTok{2} \DecValTok{5}\NormalTok{]} \NormalTok{[}\DecValTok{3} \DecValTok{6}\NormalTok{]]} \NormalTok{[[}\DecValTok{7} \DecValTok{10}\NormalTok{]} \NormalTok{[}\DecValTok{8} \DecValTok{11}\NormalTok{]} \NormalTok{[}\DecValTok{9} \DecValTok{12}\NormalTok{]]]} \end{Highlighting} \end{Shaded} \subparagraph{Args: }\label{args-22} \begin{itemize} \tightlist \item \lstinline{a}: A \lstinline{Tensor}. \item \lstinline{perm}: A permutation of the dimensions of \lstinline{a}. \item \lstinline{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-22} A transposed \lstinline{Tensor}. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\lstinline{tf.gather(params, indices, name=None)} }{tf.gather(params, indices, name=None) }}\label{tf.gatherparams-indices-namenone} Gather slices from \lstinline{params} according to \lstinline{indices}. \lstinline{indices} must be an integer tensor of any dimension (usually 0-D or 1-D). Produces an output tensor with shape \lstinline{indices.shape + params.shape{[}1:{]}} where: \begin{lstlisting} # Scalar indices output[:, ..., :] = params[indices, :, ... :] # Vector indices output[i, :, ..., :] = params[indices[i], :, ... :] # Higher rank indices output[i, ..., j, :, ... :] = params[indices[i, ..., j], :, ..., :] \end{lstlisting} If \lstinline{indices} is a permutation and \lstinline{len(indices) == params.shape{[}0{]}} then this operation will permute \lstinline{params} accordingly. \subparagraph{Args: }\label{args-23} \begin{itemize} \tightlist \item \lstinline{params}: A \lstinline{Tensor}. \item \lstinline{indices}: A \lstinline{Tensor}. Must be one of the following types: \lstinline{int32}, \lstinline{int64}. \item \lstinline{name}: A name for the operation (optional). 
\end{itemize} \subparagraph{Returns: }\label{returns-23} A \lstinline{Tensor}. Has the same type as \lstinline{params}. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\lstinline{tf.dynamic_partition(data, partitions, num_partitions, name=None)} }{tf.dynamic_partition(data, partitions, num_partitions, name=None) }}\label{tf.dynamicux5fpartitiondata-partitions-numux5fpartitions-namenone} Partitions \lstinline{data} into \lstinline{num_partitions} tensors using indices from \lstinline{partitions}. For each index tuple \lstinline{js} of size \lstinline{partitions.ndim}, the slice \lstinline{data{[}js, ...{]}} becomes part of \lstinline{outputs{[}partitions{[}js{]}{]}}. The slices with \lstinline{partitions{[}js{]} = i} are placed in \lstinline{outputs{[}i{]}} in lexicographic order of \lstinline{js}, and the first dimension of \lstinline{outputs{[}i{]}} is the number of entries in \lstinline{partitions} equal to \lstinline{i}. In detail, \begin{lstlisting} outputs[i].shape = [sum(partitions == i)] + data.shape[partitions.ndim:] outputs[i] = pack([data[js, ...] for js if partitions[js] == i]) \end{lstlisting} \lstinline{data.shape} must start with \lstinline{partitions.shape}. For example: \begin{lstlisting} # Scalar partitions partitions = 1 num_partitions = 2 data = [10, 20] outputs[0] = [] # Empty with shape [0, 2] outputs[1] = [[10, 20]] # Vector partitions partitions = [0, 0, 1, 1, 0] num_partitions = 2 data = [10, 20, 30, 40, 50] outputs[0] = [10, 20, 50] outputs[1] = [30, 40] \end{lstlisting} \subparagraph{Args: }\label{args-24} \begin{itemize} \tightlist \item \lstinline{data}: A \lstinline{Tensor}. \item \lstinline{partitions}: A \lstinline{Tensor} of type \lstinline{int32}. Any shape. Indices in the range \lstinline{{[}0, num_partitions)}. \item \lstinline{num_partitions}: An \lstinline{int} that is \lstinline{\textgreater{}= 1}. The number of partitions to output. 
\item \lstinline{name}: A name for the operation (optional).
\end{itemize}

\subparagraph{Returns: }\label{returns-24}

A list of \lstinline{num_partitions} \lstinline{Tensor} objects of the same type as \lstinline{data}.

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\subsubsection{\texorpdfstring{\lstinline{tf.dynamic_stitch(indices, data, name=None)} }{tf.dynamic_stitch(indices, data, name=None) }}\label{tf.dynamicux5fstitchindices-data-namenone}

Interleave the values from the \lstinline{data} tensors into a single tensor.

Builds a merged tensor such that

\begin{lstlisting}
merged[indices[m][i, ..., j], ...] = data[m][i, ..., j, ...]
\end{lstlisting}

For example, if each \lstinline{indices{[}m{]}} is scalar or vector, we have

\begin{lstlisting}
# Scalar indices
merged[indices[m], ...] = data[m][...]

# Vector indices
merged[indices[m][i], ...] = data[m][i, ...]
\end{lstlisting}

Each \lstinline{data{[}i{]}.shape} must start with the corresponding \lstinline{indices{[}i{]}.shape}, and the rest of \lstinline{data{[}i{]}.shape} must be constant w.r.t. \lstinline{i}. That is, we must have \lstinline{data{[}i{]}.shape = indices{[}i{]}.shape + constant}. In terms of this \lstinline{constant}, the output shape is

\begin{lstlisting}
merged.shape = [max(indices) + 1] + constant
\end{lstlisting}

Values are merged in order, so if an index appears in both \lstinline{indices{[}m{]}{[}i{]}} and \lstinline{indices{[}n{]}{[}j{]}} for \lstinline{(m,i) \textless{} (n,j)} the slice \lstinline{data{[}n{]}{[}j{]}} will appear in the merged result.
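
The merge rule, including the way later entries overwrite earlier ones, can be sketched in plain Python for the simplest case of vector indices and scalar slices. The helper \lstinline{dynamic_stitch_1d} and its sample inputs are hypothetical; this is an illustration of the documented semantics, not TensorFlow's implementation.

```python
def dynamic_stitch_1d(indices, data):
    """Merge data[m][i] into position indices[m][i] (vector indices only)."""
    # One output slot per index from 0 through the largest index.
    size = max(max(idx) for idx in indices) + 1
    merged = [None] * size
    # Lists are processed in order, so a repeated index keeps the later value.
    for idx, values in zip(indices, data):
        for i, v in zip(idx, values):
            merged[i] = v
    return merged


indices = [[0, 2, 4], [1, 3, 4]]
data = [[10, 30, 50], [20, 40, 99]]
print(dynamic_stitch_1d(indices, data))  # [10, 20, 30, 40, 99]
```

Index 4 appears in both lists, so the value from the second list (99) replaces the value from the first (50).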
For example: \begin{lstlisting} indices[0] = 6 indices[1] = [4, 1] indices[2] = [[5, 2], [0, 3]] data[0] = [61, 62] data[1] = [[41, 42], [11, 12]] data[2] = [[[51, 52], [21, 22]], [[1, 2], [31, 32]]] merged = [[1, 2], [11, 12], [21, 22], [31, 32], [41, 42], [51, 52], [61, 62]] \end{lstlisting} \subparagraph{Args: }\label{args-25} \begin{itemize} \tightlist \item \lstinline{indices}: A list of at least 2 \lstinline{Tensor} objects of type \lstinline{int32}. \item \lstinline{data}: A list with the same number of \lstinline{Tensor} objects as \lstinline{indices} of \lstinline{Tensor} objects of the same type. \item \lstinline{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-25} A \lstinline{Tensor}. Has the same type as \lstinline{data}. ================================================ FILE: tex_pdf/api/python/c4s05_math_ops.tex ================================================ \section{Math }\label{math} Note: Functions taking \texttt{Tensor} arguments can also take anything accepted by \href{../../api_docs/python/framework.md\#convert_to_tensor}{\texttt{tf.convert\_to\_tensor}}. 
\subsection{Contents}\label{contents} \subsubsection{\texorpdfstring{\protect\hyperlink{AUTOGENERATED-math}{Math}}{Math}}\label{math-1} \begin{itemize} \tightlist \item \protect\hyperlink{AUTOGENERATED-arithmetic-operators}{Arithmetic Operators} \item \protect\hyperlink{add}{\texttt{tf.add(x,\ y,\ name=None)}} \item \protect\hyperlink{sub}{\texttt{tf.sub(x,\ y,\ name=None)}} \item \protect\hyperlink{mul}{\texttt{tf.mul(x,\ y,\ name=None)}} \item \protect\hyperlink{div}{\texttt{tf.div(x,\ y,\ name=None)}} \item \protect\hyperlink{mod}{\texttt{tf.mod(x,\ y,\ name=None)}} \item \protect\hyperlink{AUTOGENERATED-basic-math-functions}{Basic Math Functions} \item \protect\hyperlink{addux5fn}{\texttt{tf.add\_n(inputs,\ name=None)}} \item \protect\hyperlink{abs}{\texttt{tf.abs(x,\ name=None)}} \item \protect\hyperlink{neg}{\texttt{tf.neg(x,\ name=None)}} \item \protect\hyperlink{sign}{\texttt{tf.sign(x,\ name=None)}} \item \protect\hyperlink{inv}{\texttt{tf.inv(x,\ name=None)}} \item \protect\hyperlink{square}{\texttt{tf.square(x,\ name=None)}} \item \protect\hyperlink{round}{\texttt{tf.round(x,\ name=None)}} \item \protect\hyperlink{sqrt}{\texttt{tf.sqrt(x,\ name=None)}} \item \protect\hyperlink{rsqrt}{\texttt{tf.rsqrt(x,\ name=None)}} \item \protect\hyperlink{pow}{\texttt{tf.pow(x,\ y,\ name=None)}} \item \protect\hyperlink{exp}{\texttt{tf.exp(x,\ name=None)}} \item \protect\hyperlink{log}{\texttt{tf.log(x,\ name=None)}} \item \protect\hyperlink{ceil}{\texttt{tf.ceil(x,\ name=None)}} \item \protect\hyperlink{floor}{\texttt{tf.floor(x,\ name=None)}} \item \protect\hyperlink{maximum}{\texttt{tf.maximum(x,\ y,\ name=None)}} \item \protect\hyperlink{minimum}{\texttt{tf.minimum(x,\ y,\ name=None)}} \item \protect\hyperlink{cos}{\texttt{tf.cos(x,\ name=None)}} \item \protect\hyperlink{sin}{\texttt{tf.sin(x,\ name=None)}} \item \protect\hyperlink{AUTOGENERATED-matrix-math-functions}{Matrix Math Functions} \item \protect\hyperlink{diag}{\texttt{tf.diag(diagonal,\ name=None)}} 
\item \protect\hyperlink{transpose}{\texttt{tf.transpose(a,\ perm=None,\ name=\textquotesingle{}transpose\textquotesingle{})}} \item \protect\hyperlink{matmul}{\texttt{tf.matmul(a,\ b,\ transpose\_a=False,\ transpose\_b=False,\ a\_is\_sparse=False,\ b\_is\_sparse=False,\ name=None)}} \item \protect\hyperlink{batchux5fmatmul}{\texttt{tf.batch\_matmul(x,\ y,\ adj\_x=None,\ adj\_y=None,\ name=None)}} \item \protect\hyperlink{matrixux5fdeterminant}{\texttt{tf.matrix\_determinant(input,\ name=None)}} \item \protect\hyperlink{batchux5fmatrixux5fdeterminant}{\texttt{tf.batch\_matrix\_determinant(input,\ name=None)}} \item \protect\hyperlink{matrixux5finverse}{\texttt{tf.matrix\_inverse(input,\ name=None)}} \item \protect\hyperlink{batchux5fmatrixux5finverse}{\texttt{tf.batch\_matrix\_inverse(input,\ name=None)}} \item \protect\hyperlink{cholesky}{\texttt{tf.cholesky(input,\ name=None)}} \item \protect\hyperlink{batchux5fcholesky}{\texttt{tf.batch\_cholesky(input,\ name=None)}} \item \protect\hyperlink{AUTOGENERATED-complex-number-functions}{Complex Number Functions} \item \protect\hyperlink{complex}{\texttt{tf.complex(real,\ imag,\ name=None)}} \item \protect\hyperlink{complexux5fabs}{\texttt{tf.complex\_abs(x,\ name=None)}} \item \protect\hyperlink{conj}{\texttt{tf.conj(in\_,\ name=None)}} \item \protect\hyperlink{imag}{\texttt{tf.imag(in\_,\ name=None)}} \item \protect\hyperlink{real}{\texttt{tf.real(in\_,\ name=None)}} \item \protect\hyperlink{AUTOGENERATED-reduction}{Reduction} \item \protect\hyperlink{reduceux5fsum}{\texttt{tf.reduce\_sum(input\_tensor,\ reduction\_indices=None,\ keep\_dims=False,\ name=None)}} \item \protect\hyperlink{reduceux5fprod}{\texttt{tf.reduce\_prod(input\_tensor,\ reduction\_indices=None,\ keep\_dims=False,\ name=None)}} \item \protect\hyperlink{reduceux5fmin}{\texttt{tf.reduce\_min(input\_tensor,\ reduction\_indices=None,\ keep\_dims=False,\ name=None)}} \item \protect\hyperlink{reduceux5fmax}{\texttt{tf.reduce\_max(input\_tensor,\ 
reduction\_indices=None,\ keep\_dims=False,\ name=None)}} \item \protect\hyperlink{reduceux5fmean}{\texttt{tf.reduce\_mean(input\_tensor,\ reduction\_indices=None,\ keep\_dims=False,\ name=None)}} \item \protect\hyperlink{reduceux5fall}{\texttt{tf.reduce\_all(input\_tensor,\ reduction\_indices=None,\ keep\_dims=False,\ name=None)}} \item \protect\hyperlink{reduceux5fany}{\texttt{tf.reduce\_any(input\_tensor,\ reduction\_indices=None,\ keep\_dims=False,\ name=None)}} \item \protect\hyperlink{accumulateux5fn}{\texttt{tf.accumulate\_n(inputs,\ shape=None,\ tensor\_dtype=None,\ name=None)}} \item \protect\hyperlink{AUTOGENERATED-segmentation}{Segmentation} \item \protect\hyperlink{segmentux5fsum}{\texttt{tf.segment\_sum(data,\ segment\_ids,\ name=None)}} \item \protect\hyperlink{segmentux5fprod}{\texttt{tf.segment\_prod(data,\ segment\_ids,\ name=None)}} \item \protect\hyperlink{segmentux5fmin}{\texttt{tf.segment\_min(data,\ segment\_ids,\ name=None)}} \item \protect\hyperlink{segmentux5fmax}{\texttt{tf.segment\_max(data,\ segment\_ids,\ name=None)}} \item \protect\hyperlink{segmentux5fmean}{\texttt{tf.segment\_mean(data,\ segment\_ids,\ name=None)}} \item \protect\hyperlink{unsortedux5fsegmentux5fsum}{\texttt{tf.unsorted\_segment\_sum(data,\ segment\_ids,\ num\_segments,\ name=None)}} \item \protect\hyperlink{sparseux5fsegmentux5fsum}{\texttt{tf.sparse\_segment\_sum(data,\ indices,\ segment\_ids,\ name=None)}} \item \protect\hyperlink{sparseux5fsegmentux5fmean}{\texttt{tf.sparse\_segment\_mean(data,\ indices,\ segment\_ids,\ name=None)}} \item \protect\hyperlink{AUTOGENERATED-sequence-comparison-and-indexing}{Sequence Comparison and Indexing} \item \protect\hyperlink{argmin}{\texttt{tf.argmin(input,\ dimension,\ name=None)}} \item \protect\hyperlink{argmax}{\texttt{tf.argmax(input,\ dimension,\ name=None)}} \item \protect\hyperlink{listdiff}{\texttt{tf.listdiff(x,\ y,\ name=None)}} \item \protect\hyperlink{where}{\texttt{tf.where(input,\ name=None)}} \item 
\protect\hyperlink{unique}{\texttt{tf.unique(x,\ name=None)}} \item \protect\hyperlink{editux5fdistance}{\texttt{tf.edit\_distance(hypothesis,\ truth,\ normalize=True,\ name=\textquotesingle{}edit\_distance\textquotesingle{})}} \item \protect\hyperlink{invertux5fpermutation}{\texttt{tf.invert\_permutation(x,\ name=None)}} \end{itemize} \subsection{Arithmetic Operators }\label{arithmetic-operators} TensorFlow provides several operations that you can use to add basic arithmetic operators to your graph. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.add(x,\ y,\ name=None)} }{tf.add(x, y, name=None) }}\label{tf.addx-y-namenone} Returns x + y element-wise. \emph{NOTE}: Add supports broadcasting. AddN does not. \subparagraph{Args: }\label{args} \begin{itemize} \tightlist \item \texttt{x}: A \texttt{Tensor}. Must be one of the following types: \texttt{float32}, \texttt{float64}, \texttt{int8}, \texttt{int16}, \texttt{int32}, \texttt{complex64}, \texttt{int64}. \item \texttt{y}: A \texttt{Tensor}. Must have the same type as \texttt{x}. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns} A \texttt{Tensor}. Has the same type as \texttt{x}. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.sub(x,\ y,\ name=None)} }{tf.sub(x, y, name=None) }}\label{tf.subx-y-namenone} Returns x - y element-wise. \subparagraph{Args: }\label{args-1} \begin{itemize} \tightlist \item \texttt{x}: A \texttt{Tensor}. Must be one of the following types: \texttt{float32}, \texttt{float64}, \texttt{int32}, \texttt{complex64}, \texttt{int64}. \item \texttt{y}: A \texttt{Tensor}. Must have the same type as \texttt{x}. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-1} A \texttt{Tensor}. Has the same type as \texttt{x}. 
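The element-wise behaviour of \texttt{tf.add} and \texttt{tf.sub}, and the simplest case of the broadcasting mentioned in the note above, can be sketched in plain Python. This is an illustration only, not TensorFlow code: the helper \texttt{elementwise} is a hypothetical name, it works on nested lists, and it is assumed here to broadcast only a scalar operand rather than implement the full broadcasting rules.

```python
def elementwise(op, x, y):
    """Apply op element-wise to nested lists, broadcasting a scalar operand."""
    if isinstance(x, list) and isinstance(y, list):
        if len(x) != len(y):
            raise ValueError("shapes do not match")
        return [elementwise(op, a, b) for a, b in zip(x, y)]
    if isinstance(x, list):
        # y is a scalar: pair it with every element of x
        return [elementwise(op, a, y) for a in x]
    if isinstance(y, list):
        # x is a scalar: pair it with every element of y
        return [elementwise(op, x, b) for b in y]
    return op(x, y)


x = [[1, 2], [3, 4]]

# Same-shape operands: x + y, element by element.
added = elementwise(lambda a, b: a + b, x, [[10, 20], [30, 40]])

# Scalar operand: the scalar is broadcast across every element of x.
shifted = elementwise(lambda a, b: a - b, x, 1)

print(added)    # [[11, 22], [33, 44]]
print(shifted)  # [[0, 1], [2, 3]]
```

The same helper covers \texttt{tf.mul} and \texttt{tf.div} by passing a different binary function.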
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.mul(x,\ y,\ name=None)} }{tf.mul(x, y, name=None) }}\label{tf.mulx-y-namenone} Returns x * y element-wise. \subparagraph{Args: }\label{args-2} \begin{itemize} \tightlist \item \texttt{x}: A \texttt{Tensor}. Must be one of the following types: \texttt{float32}, \texttt{float64}, \texttt{int8}, \texttt{int16}, \texttt{int32}, \texttt{complex64}, \texttt{int64}. \item \texttt{y}: A \texttt{Tensor}. Must have the same type as \texttt{x}. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-2} A \texttt{Tensor}. Has the same type as \texttt{x}. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.div(x,\ y,\ name=None)} }{tf.div(x, y, name=None) }}\label{tf.divx-y-namenone} Returns x / y element-wise. \subparagraph{Args: }\label{args-3} \begin{itemize} \tightlist \item \texttt{x}: A \texttt{Tensor}. Must be one of the following types: \texttt{float32}, \texttt{float64}, \texttt{int32}, \texttt{complex64}, \texttt{int64}. \item \texttt{y}: A \texttt{Tensor}. Must have the same type as \texttt{x}. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-3} A \texttt{Tensor}. Has the same type as \texttt{x}. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.mod(x,\ y,\ name=None)} }{tf.mod(x, y, name=None) }}\label{tf.modx-y-namenone} Returns element-wise remainder of division. \subparagraph{Args: }\label{args-4} \begin{itemize} \tightlist \item \texttt{x}: A \texttt{Tensor}. Must be one of the following types: \texttt{int32}, \texttt{int64}, \texttt{float32}, \texttt{float64}. \item \texttt{y}: A \texttt{Tensor}. Must have the same type as \texttt{x}. \item \texttt{name}: A name for the operation (optional). 
\end{itemize} \subparagraph{Returns: }\label{returns-4} A \texttt{Tensor}. Has the same type as \texttt{x}. \subsection{Basic Math Functions }\label{basic-math-functions} TensorFlow provides several operations that you can use to add basic mathematical functions to your graph. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.add\_n(inputs,\ name=None)} }{tf.add\_n(inputs, name=None) }}\label{tf.addux5fninputs-namenone} Adds all input tensors element-wise. \subparagraph{Args: }\label{args-5} \begin{itemize} \tightlist \item \texttt{inputs}: A list of at least 1 \texttt{Tensor} objects of the same type in: \texttt{float32}, \texttt{float64}, \texttt{int64}, \texttt{int32}, \texttt{uint8}, \texttt{int16}, \texttt{int8}, \texttt{complex64}, \texttt{qint8}, \texttt{quint8}, \texttt{qint32}. Must all be the same size and shape. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-5} A \texttt{Tensor}. Has the same type as \texttt{inputs}. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.abs(x,\ name=None)} }{tf.abs(x, name=None) }}\label{tf.absx-namenone} Computes the absolute value of a tensor. Given a tensor of real numbers \texttt{x}, this operation returns a tensor containing the absolute value of each element in \texttt{x}. For example, if x is an input element and y is an output element, this operation computes \(y = |x|\). See \protect\hyperlink{tfux5fcomplexux5fabs}{\texttt{tf.complex\_abs()}} to compute the absolute value of a complex number. \subparagraph{Args: }\label{args-6} \begin{itemize} \tightlist \item \texttt{x}: A \texttt{Tensor} of type \texttt{float}, \texttt{double}, \texttt{int32}, or \texttt{int64}. \item \texttt{name}: A name for the operation (optional).
\end{itemize} \subparagraph{Returns: }\label{returns-6} A \texttt{Tensor} the same size and type as \texttt{x} with absolute values. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.neg(x,\ name=None)} }{tf.neg(x, name=None) }}\label{tf.negx-namenone} Computes numerical negative value element-wise. I.e., \(y = -x\). \subparagraph{Args: }\label{args-7} \begin{itemize} \tightlist \item \texttt{x}: A \texttt{Tensor}. Must be one of the following types: \texttt{float32}, \texttt{float64}, \texttt{int32}, \texttt{complex64}, \texttt{int64}. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-7} A \texttt{Tensor}. Has the same type as \texttt{x}. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.sign(x,\ name=None)} }{tf.sign(x, name=None) }}\label{tf.signx-namenone} Returns an element-wise indication of the sign of a number. y = sign(x) = -1 if x \textless{} 0; 0 if x == 0; 1 if x \textgreater{} 0. \subparagraph{Args: }\label{args-8} \begin{itemize} \tightlist \item \texttt{x}: A \texttt{Tensor}. Must be one of the following types: \texttt{float32}, \texttt{float64}, \texttt{int32}, \texttt{int64}. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-8} A \texttt{Tensor}. Has the same type as \texttt{x}. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.inv(x,\ name=None)} }{tf.inv(x, name=None) }}\label{tf.invx-namenone} Computes the reciprocal of x element-wise. I.e., \(y = 1 / x\). \subparagraph{Args: }\label{args-9} \begin{itemize} \tightlist \item \texttt{x}: A \texttt{Tensor}. Must be one of the following types: \texttt{float32}, \texttt{float64}, \texttt{int32}, \texttt{complex64}, \texttt{int64}.
\item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-9} A \texttt{Tensor}. Has the same type as \texttt{x}. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.square(x,\ name=None)} }{tf.square(x, name=None) }}\label{tf.squarex-namenone} Computes square of x element-wise. I.e., \(y = x * x = x^2\). \subparagraph{Args: }\label{args-10} \begin{itemize} \tightlist \item \texttt{x}: A \texttt{Tensor}. Must be one of the following types: \texttt{float32}, \texttt{float64}, \texttt{int32}, \texttt{complex64}, \texttt{int64}. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-10} A \texttt{Tensor}. Has the same type as \texttt{x}. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.round(x,\ name=None)} }{tf.round(x, name=None) }}\label{tf.roundx-namenone} Rounds the values of a tensor to the nearest integer, element-wise. For example: \begin{Shaded}
\begin{Highlighting}[]
\CommentTok{# 'a' is [0.9, 2.5, 2.3, -4.4]}
\NormalTok{tf.}\BuiltInTok{round}\NormalTok{(a) }\OperatorTok{==>} \NormalTok{[ }\FloatTok{1.0}\NormalTok{, }\FloatTok{3.0}\NormalTok{, }\FloatTok{2.0}\NormalTok{, }\OperatorTok{-}\FloatTok{4.0} \NormalTok{]}
\end{Highlighting}
\end{Shaded} \subparagraph{Args: }\label{args-11} \begin{itemize} \tightlist \item \texttt{x}: A \texttt{Tensor} of type \texttt{float} or \texttt{double}. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-11} A \texttt{Tensor} of same shape and type as \texttt{x}. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.sqrt(x,\ name=None)} }{tf.sqrt(x, name=None) }}\label{tf.sqrtx-namenone} Computes square root of x element-wise.
I.e., \(y = \sqrt{x} = x^{1/2}\). \subparagraph{Args: }\label{args-12} \begin{itemize} \tightlist \item \texttt{x}: A \texttt{Tensor}. Must be one of the following types: \texttt{float32}, \texttt{float64}, \texttt{int32}, \texttt{complex64}, \texttt{int64}. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-12} A \texttt{Tensor}. Has the same type as \texttt{x}. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.rsqrt(x,\ name=None)} }{tf.rsqrt(x, name=None) }}\label{tf.rsqrtx-namenone} Computes reciprocal of square root of x element-wise. I.e., \(y = 1 / \sqrt{x}\). \subparagraph{Args: }\label{args-13} \begin{itemize} \tightlist \item \texttt{x}: A \texttt{Tensor}. Must be one of the following types: \texttt{float32}, \texttt{float64}, \texttt{int32}, \texttt{complex64}, \texttt{int64}. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-13} A \texttt{Tensor}. Has the same type as \texttt{x}. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.pow(x,\ y,\ name=None)} }{tf.pow(x, y, name=None) }}\label{tf.powx-y-namenone} Computes the power of one value to another. Given a tensor \texttt{x} and a tensor \texttt{y}, this operation computes \(x^y\) for corresponding elements in \texttt{x} and \texttt{y}. For example:
\begin{verbatim}
# tensor 'x' is [[2, 2], [3, 3]]
# tensor 'y' is [[8, 16], [2, 3]]
tf.pow(x, y) ==> [[256, 65536], [9, 27]]
\end{verbatim}
\subparagraph{Args: }\label{args-14} \begin{itemize} \tightlist \item \texttt{x}: A \texttt{Tensor} of type \texttt{float}, \texttt{double}, \texttt{int32}, \texttt{complex64}, or \texttt{int64}.
\item \texttt{y}: A \texttt{Tensor} of type \texttt{float}, \texttt{double}, \texttt{int32}, \texttt{complex64}, or \texttt{int64}. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-14} A \texttt{Tensor}. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.exp(x,\ name=None)} }{tf.exp(x, name=None) }}\label{tf.expx-namenone} Computes exponential of x element-wise. I.e., \(y = e^x\). \subparagraph{Args: }\label{args-15} \begin{itemize} \tightlist \item \texttt{x}: A \texttt{Tensor}. Must be one of the following types: \texttt{float32}, \texttt{float64}, \texttt{int32}, \texttt{complex64}, \texttt{int64}. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-15} A \texttt{Tensor}. Has the same type as \texttt{x}. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.log(x,\ name=None)} }{tf.log(x, name=None) }}\label{tf.logx-namenone} Computes natural logarithm of x element-wise. I.e., \(y = \log_e x\). \subparagraph{Args: }\label{args-16} \begin{itemize} \tightlist \item \texttt{x}: A \texttt{Tensor}. Must be one of the following types: \texttt{float32}, \texttt{float64}, \texttt{int32}, \texttt{complex64}, \texttt{int64}. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-16} A \texttt{Tensor}. Has the same type as \texttt{x}. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.ceil(x,\ name=None)} }{tf.ceil(x, name=None) }}\label{tf.ceilx-namenone} Returns element-wise smallest integer not less than x. \subparagraph{Args: }\label{args-17} \begin{itemize} \tightlist \item \texttt{x}: A \texttt{Tensor}. Must be one of the following types: \texttt{float32}, \texttt{float64}.
\item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-17} A \texttt{Tensor}. Has the same type as \texttt{x}. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.floor(x,\ name=None)} }{tf.floor(x, name=None) }}\label{tf.floorx-namenone} Returns element-wise largest integer not greater than x. \subparagraph{Args: }\label{args-18} \begin{itemize} \tightlist \item \texttt{x}: A \texttt{Tensor}. Must be one of the following types: \texttt{float32}, \texttt{float64}. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-18} A \texttt{Tensor}. Has the same type as \texttt{x}. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.maximum(x,\ y,\ name=None)} }{tf.maximum(x, y, name=None) }}\label{tf.maximumx-y-namenone} Returns the max of x and y (i.e.~x \textgreater{} y ? x : y) element-wise, broadcasts. \subparagraph{Args: }\label{args-19} \begin{itemize} \tightlist \item \texttt{x}: A \texttt{Tensor}. Must be one of the following types: \texttt{float32}, \texttt{float64}, \texttt{int32}, \texttt{int64}. \item \texttt{y}: A \texttt{Tensor}. Must have the same type as \texttt{x}. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-19} A \texttt{Tensor}. Has the same type as \texttt{x}. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.minimum(x,\ y,\ name=None)} }{tf.minimum(x, y, name=None) }}\label{tf.minimumx-y-namenone} Returns the min of x and y (i.e.~x \textless{} y ? x : y) element-wise, broadcasts. \subparagraph{Args: }\label{args-20} \begin{itemize} \tightlist \item \texttt{x}: A \texttt{Tensor}. Must be one of the following types: \texttt{float32}, \texttt{float64}, \texttt{int32}, \texttt{int64}. \item \texttt{y}: A \texttt{Tensor}. 
Must have the same type as \texttt{x}. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-20} A \texttt{Tensor}. Has the same type as \texttt{x}. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.cos(x,\ name=None)} }{tf.cos(x, name=None) }}\label{tf.cosx-namenone} Computes the cosine of x element-wise. \subparagraph{Args: }\label{args-21} \begin{itemize} \tightlist \item \texttt{x}: A \texttt{Tensor}. Must be one of the following types: \texttt{float32}, \texttt{float64}, \texttt{int32}, \texttt{complex64}, \texttt{int64}. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-21} A \texttt{Tensor}. Has the same type as \texttt{x}. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.sin(x,\ name=None)} }{tf.sin(x, name=None) }}\label{tf.sinx-namenone} Computes the sine of x element-wise. \subparagraph{Args: }\label{args-22} \begin{itemize} \tightlist \item \texttt{x}: A \texttt{Tensor}. Must be one of the following types: \texttt{float32}, \texttt{float64}, \texttt{int32}, \texttt{complex64}, \texttt{int64}. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-22} A \texttt{Tensor}. Has the same type as \texttt{x}. \subsection{Matrix Math Functions }\label{matrix-math-functions} TensorFlow provides several operations that you can use to add basic mathematical functions for matrices to your graph. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.diag(diagonal,\ name=None)} }{tf.diag(diagonal, name=None) }}\label{tf.diagdiagonal-namenone} Returns a diagonal tensor with the given diagonal values. Given a \texttt{diagonal}, this operation returns a tensor with the \texttt{diagonal} and everything else padded with zeros.
The diagonal is computed as follows: Assume \texttt{diagonal} has dimensions {[}D1,\ldots{}, Dk{]}, then the output is a tensor of rank 2k with dimensions {[}D1,\ldots{}, Dk, D1,\ldots{}, Dk{]} where: \texttt{output{[}i1,...,\ ik,\ i1,...,\ ik{]}\ =\ diagonal{[}i1,\ ...,\ ik{]}} and 0 everywhere else. For example:
\begin{verbatim}
# 'diagonal' is [1, 2, 3, 4]
tf.diag(diagonal) ==> [[1, 0, 0, 0]
                       [0, 2, 0, 0]
                       [0, 0, 3, 0]
                       [0, 0, 0, 4]]
\end{verbatim}
\subparagraph{Args: }\label{args-23} \begin{itemize} \tightlist \item \texttt{diagonal}: A \texttt{Tensor}. Must be one of the following types: \texttt{float32}, \texttt{float64}, \texttt{int32}, \texttt{int64}. Rank k tensor where k is at most 3. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-23} A \texttt{Tensor}. Has the same type as \texttt{diagonal}. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.transpose(a,\ perm=None,\ name=\textquotesingle{}transpose\textquotesingle{})} }{tf.transpose(a, perm=None, name='transpose') }}\label{tf.transposea-permnone-nametranspose} Transposes \texttt{a}. Permutes the dimensions according to \texttt{perm}. The returned tensor's dimension i will correspond to the input dimension \texttt{perm{[}i{]}}. If \texttt{perm} is not given, it is set to (n-1\ldots{}0), where n is the rank of the input tensor. Hence by default, this operation performs a regular matrix transpose on 2-D input Tensors.
For example: \begin{Shaded}
\begin{Highlighting}[]
\CommentTok{# 'x' is [[1 2 3]}
\CommentTok{#         [4 5 6]]}
\NormalTok{tf.transpose(x) }\OperatorTok{==>} \NormalTok{[[}\DecValTok{1} \DecValTok{4}\NormalTok{]}
\NormalTok{                     [}\DecValTok{2} \DecValTok{5}\NormalTok{]}
\NormalTok{                     [}\DecValTok{3} \DecValTok{6}\NormalTok{]]}
\CommentTok{# Equivalently}
\NormalTok{tf.transpose(x, perm}\OperatorTok{=}\NormalTok{[}\DecValTok{1}\NormalTok{, }\DecValTok{0}\NormalTok{]) }\OperatorTok{==>} \NormalTok{[[}\DecValTok{1} \DecValTok{4}\NormalTok{]}
\NormalTok{                                  [}\DecValTok{2} \DecValTok{5}\NormalTok{]}
\NormalTok{                                  [}\DecValTok{3} \DecValTok{6}\NormalTok{]]}
\CommentTok{# 'perm' is more useful for n-dimensional tensors, for n > 2}
\CommentTok{# 'x' is [[[1 2 3]}
\CommentTok{#          [4 5 6]]}
\CommentTok{#         [[7 8 9]}
\CommentTok{#          [10 11 12]]]}
\CommentTok{# Take the transpose of the matrices in dimension-0}
\NormalTok{tf.transpose(x, perm}\OperatorTok{=}\NormalTok{[}\DecValTok{0}\NormalTok{, }\DecValTok{2}\NormalTok{, }\DecValTok{1}\NormalTok{]) }\OperatorTok{==>} \NormalTok{[[[}\DecValTok{1} \DecValTok{4}\NormalTok{]}
\NormalTok{                                      [}\DecValTok{2} \DecValTok{5}\NormalTok{]}
\NormalTok{                                      [}\DecValTok{3} \DecValTok{6}\NormalTok{]]}
\NormalTok{                                     [[}\DecValTok{7} \DecValTok{10}\NormalTok{]}
\NormalTok{                                      [}\DecValTok{8} \DecValTok{11}\NormalTok{]}
\NormalTok{                                      [}\DecValTok{9} \DecValTok{12}\NormalTok{]]]}
\end{Highlighting}
\end{Shaded} \subparagraph{Args: }\label{args-24} \begin{itemize} \tightlist \item \texttt{a}: A \texttt{Tensor}. \item \texttt{perm}: A permutation of the dimensions of \texttt{a}. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-24} A transposed \texttt{Tensor}.
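The rule that output dimension i corresponds to input dimension \texttt{perm{[}i{]}} can be checked with a small pure-Python sketch. This is an illustration on nested lists, not the TensorFlow implementation; the helpers \texttt{shape\_of}, \texttt{get} and \texttt{transpose} are hypothetical names introduced here.

```python
def shape_of(t):
    """Return the shape of a rectangular nested list."""
    shape = []
    while isinstance(t, list):
        shape.append(len(t))
        t = t[0]
    return shape


def get(t, index):
    """Read the element of nested list t at a multi-dimensional index."""
    for i in index:
        t = t[i]
    return t


def transpose(t, perm=None):
    """Permute the dimensions of t so that output dim i is input dim perm[i]."""
    in_shape = shape_of(t)
    n = len(in_shape)
    if perm is None:
        perm = list(range(n - 1, -1, -1))  # default: reverse all dimensions
    out_shape = [in_shape[p] for p in perm]

    def build(prefix):
        if len(prefix) == n:
            # Map the output index back to the input index.
            in_index = [0] * n
            for i, p in enumerate(perm):
                in_index[p] = prefix[i]
            return get(t, in_index)
        return [build(prefix + [k]) for k in range(out_shape[len(prefix)])]

    return build([])


x2 = [[1, 2, 3], [4, 5, 6]]
x3 = [[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]]

print(transpose(x2))                   # [[1, 4], [2, 5], [3, 6]]
print(transpose(x3, perm=[0, 2, 1]))   # transposes each 2x3 matrix in the batch
```

For a 2-D input the default permutation is \texttt{{[}1,\ 0{]}}, which is why the call without \texttt{perm} gives an ordinary matrix transpose.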
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.matmul(a,\ b,\ transpose\_a=False,\ transpose\_b=False,\ a\_is\_sparse=False,\ b\_is\_sparse=False,\ name=None)} }{tf.matmul(a, b, transpose\_a=False, transpose\_b=False, a\_is\_sparse=False, b\_is\_sparse=False, name=None) }}\label{tf.matmula-b-transposeux5fafalse-transposeux5fbfalse-aux5fisux5fsparsefalse-bux5fisux5fsparsefalse-namenone} Multiplies matrix \texttt{a} by matrix \texttt{b}, producing \texttt{a} * \texttt{b}. The inputs must be two-dimensional matrices, with matching inner dimensions, possibly after transposition. Both matrices must be of the same type. The supported types are: \texttt{float}, \texttt{double}, \texttt{int32}, \texttt{complex64}. Either matrix can be transposed on the fly by setting the corresponding flag to \texttt{True}. This is \texttt{False} by default. If one or both of the matrices contain a lot of zeros, a more efficient multiplication algorithm can be used by setting the corresponding \texttt{a\_is\_sparse} or \texttt{b\_is\_sparse} flag to \texttt{True}. These are \texttt{False} by default. For example: \begin{Shaded} \begin{Highlighting}[] \CommentTok{# 2-D tensor `a`} \NormalTok{a }\OperatorTok{=} \NormalTok{tf.constant([}\DecValTok{1}\NormalTok{, }\DecValTok{2}\NormalTok{, }\DecValTok{3}\NormalTok{, }\DecValTok{4}\NormalTok{, }\DecValTok{5}\NormalTok{, }\DecValTok{6}\NormalTok{], shape}\OperatorTok{=}\NormalTok{[}\DecValTok{2}\NormalTok{, }\DecValTok{3}\NormalTok{]) }\OperatorTok{=>} \NormalTok{[[}\DecValTok{1}\NormalTok{. }\DecValTok{2}\NormalTok{. }\DecValTok{3}\NormalTok{.]} \NormalTok{[}\DecValTok{4}\NormalTok{. }\DecValTok{5}\NormalTok{. 
}\DecValTok{6}\NormalTok{.]]} \CommentTok{# 2-D tensor `b`} \NormalTok{b }\OperatorTok{=} \NormalTok{tf.constant([}\DecValTok{7}\NormalTok{, }\DecValTok{8}\NormalTok{, }\DecValTok{9}\NormalTok{, }\DecValTok{10}\NormalTok{, }\DecValTok{11}\NormalTok{, }\DecValTok{12}\NormalTok{], shape}\OperatorTok{=}\NormalTok{[}\DecValTok{3}\NormalTok{, }\DecValTok{2}\NormalTok{]) }\OperatorTok{=>} \NormalTok{[[}\DecValTok{7}\NormalTok{. }\DecValTok{8}\NormalTok{.]} \NormalTok{[}\DecValTok{9}\NormalTok{. }\DecValTok{10}\NormalTok{.]} \NormalTok{[}\DecValTok{11}\NormalTok{. }\DecValTok{12}\NormalTok{.]]} \NormalTok{c }\OperatorTok{=} \NormalTok{tf.matmul(a, b) }\OperatorTok{=>} \NormalTok{[[}\DecValTok{58} \DecValTok{64}\NormalTok{]} \NormalTok{[}\DecValTok{139} \DecValTok{154}\NormalTok{]]} \end{Highlighting} \end{Shaded} \subparagraph{Args: }\label{args-25} \begin{itemize} \tightlist \item \texttt{a}: \texttt{Tensor} of type \texttt{float}, \texttt{double}, \texttt{int32} or \texttt{complex64}. \item \texttt{b}: \texttt{Tensor} with same type as \texttt{a}. \item \texttt{transpose\_a}: If \texttt{True}, \texttt{a} is transposed before multiplication. \item \texttt{transpose\_b}: If \texttt{True}, \texttt{b} is transposed before multiplication. \item \texttt{a\_is\_sparse}: If \texttt{True}, \texttt{a} is treated as a sparse matrix. \item \texttt{b\_is\_sparse}: If \texttt{True}, \texttt{b} is treated as a sparse matrix. \item \texttt{name}: Name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-25} A \texttt{Tensor} of the same type as \texttt{a}. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.batch\_matmul(x,\ y,\ adj\_x=None,\ adj\_y=None,\ name=None)} }{tf.batch\_matmul(x, y, adj\_x=None, adj\_y=None, name=None) }}\label{tf.batchux5fmatmulx-y-adjux5fxnone-adjux5fynone-namenone} Multiplies slices of two tensors in batches. 
Multiplies all slices of \texttt{Tensor} \texttt{x} and \texttt{y} (each slice can be viewed as an element of a batch), and arranges the individual results in a single output tensor of the same batch size. Each of the individual slices can optionally be adjointed (to adjoint a matrix means to transpose and conjugate it) before multiplication by setting the \texttt{adj\_x} or \texttt{adj\_y} flag to \texttt{True}; both flags are \texttt{False} by default. The input tensors \texttt{x} and \texttt{y} are 3-D or higher with shape \texttt{{[}...,\ r\_x,\ c\_x{]}} and \texttt{{[}...,\ r\_y,\ c\_y{]}}. The output tensor is 3-D or higher with shape \texttt{{[}...,\ r\_o,\ c\_o{]}}, where:
\begin{verbatim}
r_o = c_x if adj_x else r_x
c_o = r_y if adj_y else c_y
\end{verbatim}
It is computed as:
\begin{verbatim}
out[..., :, :] = matrix(x[..., :, :]) * matrix(y[..., :, :])
\end{verbatim}
\subparagraph{Args: }\label{args-26} \begin{itemize} \tightlist \item \texttt{x}: A \texttt{Tensor}. Must be one of the following types: \texttt{float32}, \texttt{float64}, \texttt{int32}, \texttt{complex64}. 3-D or higher with shape \texttt{{[}...,\ r\_x,\ c\_x{]}}. \item \texttt{y}: A \texttt{Tensor}. Must have the same type as \texttt{x}. 3-D or higher with shape \texttt{{[}...,\ r\_y,\ c\_y{]}}. \item \texttt{adj\_x}: An optional \texttt{bool}. Defaults to \texttt{False}. If \texttt{True}, adjoint the slices of \texttt{x}. \item \texttt{adj\_y}: An optional \texttt{bool}. Defaults to \texttt{False}. If \texttt{True}, adjoint the slices of \texttt{y}. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-26} A \texttt{Tensor}. Has the same type as \texttt{x}.
3-D or higher with shape \texttt{{[}...,\ r\_o,\ c\_o{]}} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.matrix\_determinant(input,\ name=None)} }{tf.matrix\_determinant(input, name=None) }}\label{tf.matrixux5fdeterminantinput-namenone} Calculates the determinant of a square matrix. \subparagraph{Args: }\label{args-27} \begin{itemize} \tightlist \item \texttt{input}: A \texttt{Tensor}. Must be one of the following types: \texttt{float32}, \texttt{float64}. A tensor of shape \texttt{{[}M,\ M{]}}. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-27} A \texttt{Tensor}. Has the same type as \texttt{input}. A scalar, equal to the determinant of the input. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.batch\_matrix\_determinant(input,\ name=None)} }{tf.batch\_matrix\_determinant(input, name=None) }}\label{tf.batchux5fmatrixux5fdeterminantinput-namenone} Calculates the determinants for a batch of square matrices. The input is a tensor of shape \texttt{{[}...,\ M,\ M{]}} whose inner-most 2 dimensions form square matrices. The output is a 1-D tensor containing the determinants for all input submatrices \texttt{{[}...,\ :,\ :{]}}. \subparagraph{Args: }\label{args-28} \begin{itemize} \tightlist \item \texttt{input}: A \texttt{Tensor}. Must be one of the following types: \texttt{float32}, \texttt{float64}. Shape is \texttt{{[}...,\ M,\ M{]}}. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-28} A \texttt{Tensor}. Has the same type as \texttt{input}. Shape is \texttt{{[}...{]}}. 
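To make the batch semantics concrete, the following pure-Python sketch computes one determinant per innermost square matrix, so an input of shape \texttt{{[}...,\ M,\ M{]}} yields an output of shape \texttt{{[}...{]}}. It is an illustration under stated assumptions, not the TensorFlow kernel: \texttt{det} and \texttt{batch\_det} are hypothetical helper names, and the Laplace expansion used here is chosen for brevity, not efficiency.

```python
def det(m):
    """Determinant of a square matrix (list of rows) by Laplace expansion."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for j in range(n):
        # Minor: drop the first row and column j.
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total


def batch_det(t):
    """Apply det to every innermost matrix of a nested list."""
    if not isinstance(t[0][0], list):
        return det(t)  # t is a single matrix
    return [batch_det(sub) for sub in t]


batch = [[[1, 2], [3, 4]],
         [[2, 0], [0, 2]]]   # shape [2, 2, 2]

dets = batch_det(batch)      # shape [2]
print(dets)                  # [-2, 4]
```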
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.matrix\_inverse(input,\ name=None)} }{tf.matrix\_inverse(input, name=None) }}\label{tf.matrixux5finverseinput-namenone} Calculates the inverse of a square invertible matrix. Checks for invertibility. \subparagraph{Args: }\label{args-29} \begin{itemize} \tightlist \item \texttt{input}: A \texttt{Tensor}. Must be one of the following types: \texttt{float32}, \texttt{float64}. Shape is \texttt{{[}M,\ M{]}}. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-29} A \texttt{Tensor}. Has the same type as \texttt{input}. Shape is \texttt{{[}M,\ M{]}} containing the matrix inverse of the input. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.batch\_matrix\_inverse(input,\ name=None)} }{tf.batch\_matrix\_inverse(input, name=None) }}\label{tf.batchux5fmatrixux5finverseinput-namenone} Calculates the inverse of square invertible matrices. Checks for invertibility. The input is a tensor of shape \texttt{{[}...,\ M,\ M{]}} whose inner-most 2 dimensions form square matrices. The output is a tensor of the same shape as the input containing the inverse for all input submatrices \texttt{{[}...,\ :,\ :{]}}. \subparagraph{Args: }\label{args-30} \begin{itemize} \tightlist \item \texttt{input}: A \texttt{Tensor}. Must be one of the following types: \texttt{float32}, \texttt{float64}. Shape is \texttt{{[}...,\ M,\ M{]}}. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-30} A \texttt{Tensor}. Has the same type as \texttt{input}. Shape is \texttt{{[}...,\ M,\ M{]}}. 
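As a rough illustration of what inverting a matrix while checking for invertibility involves, here is a minimal Gauss-Jordan sketch in plain Python. It is not how TensorFlow computes the inverse (the source does not say which algorithm is used); the function name \texttt{matrix\_inverse} and the use of exact \texttt{Fraction} arithmetic are assumptions made for this example.

```python
from fractions import Fraction


def matrix_inverse(m):
    """Invert a square matrix with Gauss-Jordan elimination.

    Raises ValueError when no non-zero pivot exists, i.e. the matrix
    is singular.
    """
    n = len(m)
    # Augment the matrix with the identity: [m | I].
    aug = [[Fraction(v) for v in row] +
           [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is not invertible")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]          # scale pivot row to 1
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    # The right half now holds the inverse.
    return [row[n:] for row in aug]


inv = matrix_inverse([[4, 7], [2, 6]])
print(inv)  # entries are the Fractions 3/5, -7/10, -1/5 and 2/5
```

A batch version would apply the same function to every innermost \texttt{{[}M,\ M{]}} slice and keep the outer dimensions unchanged.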
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.cholesky(input,\ name=None)} }{tf.cholesky(input, name=None) }}\label{tf.choleskyinput-namenone} Calculates the Cholesky decomposition of a square matrix. The input has to be symmetric and positive definite. Only the lower-triangular part of the input will be used for this operation. The upper-triangular part will not be read. The result is the lower-triangular matrix of the Cholesky decomposition of the input. \subparagraph{Args: }\label{args-31} \begin{itemize} \tightlist \item \texttt{input}: A \texttt{Tensor}. Must be one of the following types: \texttt{float64}, \texttt{float32}. Shape is \texttt{{[}M,\ M{]}}. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-31} A \texttt{Tensor}. Has the same type as \texttt{input}. Shape is \texttt{{[}M,\ M{]}}. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.batch\_cholesky(input,\ name=None)} }{tf.batch\_cholesky(input, name=None) }}\label{tf.batchux5fcholeskyinput-namenone} Calculates the Cholesky decomposition of a batch of square matrices. The input is a tensor of shape \texttt{{[}...,\ M,\ M{]}} whose inner-most 2 dimensions form square matrices, with the same constraints as the single matrix Cholesky decomposition above. The output is a tensor of the same shape as the input containing the Cholesky decompositions for all input submatrices \texttt{{[}...,\ :,\ :{]}}. \subparagraph{Args: }\label{args-32} \begin{itemize} \tightlist \item \texttt{input}: A \texttt{Tensor}. Must be one of the following types: \texttt{float64}, \texttt{float32}. Shape is \texttt{{[}...,\ M,\ M{]}}. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-32} A \texttt{Tensor}. Has the same type as \texttt{input}. Shape is \texttt{{[}...,\ M,\ M{]}}. 
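
The statement that only the lower-triangular part of the input is read can be made concrete with a short pure-Python sketch of the Cholesky-Banachiewicz recurrence. The function name \texttt{cholesky} and the error raised for non-positive-definite input are assumptions made for illustration; they do not describe TensorFlow's behaviour.

```python
import math


def cholesky(matrix):
    """Lower-triangular L with L * L^T == matrix (symmetric positive definite)."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            # Only matrix[i][j] with j <= i (the lower triangle) is read.
            s = matrix[i][j] - sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                lower[i][j] = math.sqrt(s)
            else:
                lower[i][j] = s / lower[j][j]
    return lower


factor = cholesky([[4.0, 12.0, -16.0],
                   [12.0, 37.0, -43.0],
                   [-16.0, -43.0, 98.0]])
print(factor)
```

Because the upper triangle is never touched, replacing the entries above the diagonal with arbitrary values would not change the result of this sketch.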
\subsection{Complex Number Functions }\label{complex-number-functions}

TensorFlow provides several operations that you can use to add complex number functions to your graph.

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\subsubsection{\texorpdfstring{\texttt{tf.complex(real,\ imag,\ name=None)} }{tf.complex(real, imag, name=None) }}\label{tf.complexreal-imag-namenone}

Converts two real numbers to a complex number.

Given a tensor \texttt{real} representing the real part of a complex number, and a tensor \texttt{imag} representing the imaginary part of a complex number, this operation computes complex numbers elementwise of the form \(a + bj\), where \emph{a} represents the \texttt{real} part and \emph{b} represents the \texttt{imag} part.

The input tensors \texttt{real} and \texttt{imag} must be the same shape.

For example:

\begin{verbatim}
# tensor 'real' is [2.25, 3.25]
# tensor 'imag' is [4.75, 5.75]
tf.complex(real, imag) ==> [2.25 + 4.75j, 3.25 + 5.75j]
\end{verbatim}

\subparagraph{Args: }\label{args-33}

\begin{itemize}
\tightlist
\item \texttt{real}: A \texttt{Tensor} of type \texttt{float}.
\item \texttt{imag}: A \texttt{Tensor} of type \texttt{float}.
\item \texttt{name}: A name for the operation (optional).
\end{itemize}

\subparagraph{Returns: }\label{returns-33}

A \texttt{Tensor} of type \texttt{complex64}.

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\subsubsection{\texorpdfstring{\texttt{tf.complex\_abs(x,\ name=None)} }{tf.complex\_abs(x, name=None) }}\label{tf.complexux5fabsx-namenone}

Computes the complex absolute value of a tensor.

Given a tensor \texttt{x} of complex numbers, this operation returns a tensor of type \texttt{float} that is the absolute value of each element in \texttt{x}. All elements in \texttt{x} must be complex numbers of the form \(a + bj\). The absolute value is computed as \(\sqrt{a^2 + b^2}\).
For example:

\begin{verbatim}
# tensor 'x' is [[-2.25 + 4.75j], [-3.25 + 5.75j]]
tf.complex_abs(x) ==> [5.25594902, 6.60492229]
\end{verbatim}

\subparagraph{Args: }\label{args-34}

\begin{itemize}
\tightlist
\item \texttt{x}: A \texttt{Tensor} of type \texttt{complex64}.
\item \texttt{name}: A name for the operation (optional).
\end{itemize}

\subparagraph{Returns: }\label{returns-34}

A \texttt{Tensor} of type \texttt{float32}.

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\subsubsection{\texorpdfstring{\texttt{tf.conj(in\_,\ name=None)} }{tf.conj(in\_, name=None) }}\label{tf.conjinux5f-namenone}

Returns the complex conjugate of a complex number.

Given a tensor \texttt{in} of complex numbers, this operation returns a tensor of complex numbers that are the complex conjugate of each element in \texttt{in}. The complex numbers in \texttt{in} must be of the form \(a + bj\), where \emph{a} is the real part and \emph{b} is the imaginary part.

The complex conjugate returned by this operation is of the form \(a - bj\).

For example:

\begin{verbatim}
# tensor 'in' is [-2.25 + 4.75j, 3.25 + 5.75j]
tf.conj(in) ==> [-2.25 - 4.75j, 3.25 - 5.75j]
\end{verbatim}

\subparagraph{Args: }\label{args-35}

\begin{itemize}
\tightlist
\item \texttt{in\_}: A \texttt{Tensor} of type \texttt{complex64}.
\item \texttt{name}: A name for the operation (optional).
\end{itemize}

\subparagraph{Returns: }\label{returns-35}

A \texttt{Tensor} of type \texttt{complex64}.

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\subsubsection{\texorpdfstring{\texttt{tf.imag(in\_,\ name=None)} }{tf.imag(in\_, name=None) }}\label{tf.imaginux5f-namenone}

Returns the imaginary part of a complex number.

Given a tensor \texttt{in} of complex numbers, this operation returns a tensor of type \texttt{float} that is the imaginary part of each element in \texttt{in}.
All elements in \texttt{in} must be complex numbers of the form \(a + bj\), where \emph{a} is the real part and \emph{b} is the imaginary part returned by this operation.

For example:

\begin{verbatim}
# tensor 'in' is [-2.25 + 4.75j, 3.25 + 5.75j]
tf.imag(in) ==> [4.75, 5.75]
\end{verbatim}

\subparagraph{Args: }\label{args-36}

\begin{itemize}
\tightlist
\item \texttt{in\_}: A \texttt{Tensor} of type \texttt{complex64}.
\item \texttt{name}: A name for the operation (optional).
\end{itemize}

\subparagraph{Returns: }\label{returns-36}

A \texttt{Tensor} of type \texttt{float32}.

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\subsubsection{\texorpdfstring{\texttt{tf.real(in\_,\ name=None)} }{tf.real(in\_, name=None) }}\label{tf.realinux5f-namenone}

Returns the real part of a complex number.

Given a tensor \texttt{in} of complex numbers, this operation returns a tensor of type \texttt{float} that is the real part of each element in \texttt{in}. All elements in \texttt{in} must be complex numbers of the form \(a + bj\), where \emph{a} is the real part returned by this operation and \emph{b} is the imaginary part.

For example:

\begin{verbatim}
# tensor 'in' is [-2.25 + 4.75j, 3.25 + 5.75j]
tf.real(in) ==> [-2.25, 3.25]
\end{verbatim}

\subparagraph{Args: }\label{args-37}

\begin{itemize}
\tightlist
\item \texttt{in\_}: A \texttt{Tensor} of type \texttt{complex64}.
\item \texttt{name}: A name for the operation (optional).
\end{itemize}

\subparagraph{Returns: }\label{returns-37}

A \texttt{Tensor} of type \texttt{float32}.

\subsection{Reduction }\label{reduction}

TensorFlow provides several operations that you can use to perform common math computations that reduce various dimensions of a tensor.
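
To make ``reducing a dimension'' concrete, here is a minimal pure-Python sketch of a sum reduction over a 2-D nested list. It only mimics the semantics of a reduction with an optional index and a keep-dimensions flag; the function \texttt{reduce\_sum\_2d} and its parameter names are hypothetical and are not part of TensorFlow.

```python
def reduce_sum_2d(x, reduction_index=None, keep_dims=False):
    """Sum a 2-D nested list over all dimensions, over rows (0), or columns (1)."""
    if reduction_index is None:
        total = sum(sum(row) for row in x)
        return [[total]] if keep_dims else total
    if reduction_index == 0:
        # Collapse dimension 0: one sum per column.
        sums = [sum(col) for col in zip(*x)]
        return [sums] if keep_dims else sums
    if reduction_index == 1:
        # Collapse dimension 1: one sum per row.
        sums = [sum(row) for row in x]
        return [[s] for s in sums] if keep_dims else sums
    raise ValueError("reduction_index must be None, 0, or 1")


x = [[1, 1, 1],
     [1, 1, 1]]
print(reduce_sum_2d(x))                     # 6
print(reduce_sum_2d(x, 0))                  # [2, 2, 2]
print(reduce_sum_2d(x, 1))                  # [3, 3]
print(reduce_sum_2d(x, 1, keep_dims=True))  # [[3], [3]]
```

With \texttt{keep\_dims=True} the reduced dimension is kept with length 1, so the result still has two dimensions.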
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\subsubsection{\texorpdfstring{\texttt{tf.reduce\_sum(input\_tensor,\ reduction\_indices=None,\ keep\_dims=False,\ name=None)} }{tf.reduce\_sum(input\_tensor, reduction\_indices=None, keep\_dims=False, name=None) }}\label{tf.reduceux5fsuminputux5ftensor-reductionux5findicesnone-keepux5fdimsfalse-namenone}

Computes the sum of elements across dimensions of a tensor.

Reduces \texttt{input\_tensor} along the dimensions given in \texttt{reduction\_indices}. Unless \texttt{keep\_dims} is true, the rank of the tensor is reduced by 1 for each entry in \texttt{reduction\_indices}. If \texttt{keep\_dims} is true, the reduced dimensions are retained with length 1.

If \texttt{reduction\_indices} has no entries, all dimensions are reduced, and a tensor with a single element is returned.

For example:

\begin{Shaded}
\begin{Highlighting}[]
\CommentTok{# 'x' is [[1, 1, 1],}
\CommentTok{#         [1, 1, 1]]}
\NormalTok{tf.reduce_sum(x) }\OperatorTok{==>} \DecValTok{6}
\NormalTok{tf.reduce_sum(x, }\DecValTok{0}\NormalTok{) }\OperatorTok{==>} \NormalTok{[}\DecValTok{2}\NormalTok{, }\DecValTok{2}\NormalTok{, }\DecValTok{2}\NormalTok{]}
\NormalTok{tf.reduce_sum(x, }\DecValTok{1}\NormalTok{) }\OperatorTok{==>} \NormalTok{[}\DecValTok{3}\NormalTok{, }\DecValTok{3}\NormalTok{]}
\NormalTok{tf.reduce_sum(x, }\DecValTok{1}\NormalTok{, keep_dims}\OperatorTok{=}\VariableTok{True}\NormalTok{) }\OperatorTok{==>} \NormalTok{[[}\DecValTok{3}\NormalTok{], [}\DecValTok{3}\NormalTok{]]}
\NormalTok{tf.reduce_sum(x, [}\DecValTok{0}\NormalTok{, }\DecValTok{1}\NormalTok{]) }\OperatorTok{==>} \DecValTok{6}
\end{Highlighting}
\end{Shaded}

\subparagraph{Args: }\label{args-38}

\begin{itemize}
\tightlist
\item \texttt{input\_tensor}: The tensor to reduce. Should have numeric type.
\item \texttt{reduction\_indices}: The dimensions to reduce. If \texttt{None} (the default), reduces all dimensions.
\item \texttt{keep\_dims}: If true, retains reduced dimensions with length 1.
\item \texttt{name}: A name for the operation (optional).
\end{itemize}

\subparagraph{Returns: }\label{returns-38}

The reduced tensor.

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\subsubsection{\texorpdfstring{\texttt{tf.reduce\_prod(input\_tensor,\ reduction\_indices=None,\ keep\_dims=False,\ name=None)} }{tf.reduce\_prod(input\_tensor, reduction\_indices=None, keep\_dims=False, name=None) }}\label{tf.reduceux5fprodinputux5ftensor-reductionux5findicesnone-keepux5fdimsfalse-namenone}

Computes the product of elements across dimensions of a tensor.

Reduces \texttt{input\_tensor} along the dimensions given in \texttt{reduction\_indices}. Unless \texttt{keep\_dims} is true, the rank of the tensor is reduced by 1 for each entry in \texttt{reduction\_indices}. If \texttt{keep\_dims} is true, the reduced dimensions are retained with length 1.

If \texttt{reduction\_indices} has no entries, all dimensions are reduced, and a tensor with a single element is returned.

\subparagraph{Args: }\label{args-39}

\begin{itemize}
\tightlist
\item \texttt{input\_tensor}: The tensor to reduce. Should have numeric type.
\item \texttt{reduction\_indices}: The dimensions to reduce. If \texttt{None} (the default), reduces all dimensions.
\item \texttt{keep\_dims}: If true, retains reduced dimensions with length 1.
\item \texttt{name}: A name for the operation (optional).
\end{itemize}

\subparagraph{Returns: }\label{returns-39}

The reduced tensor.

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\subsubsection{\texorpdfstring{\texttt{tf.reduce\_min(input\_tensor,\ reduction\_indices=None,\ keep\_dims=False,\ name=None)} }{tf.reduce\_min(input\_tensor, reduction\_indices=None, keep\_dims=False, name=None) }}\label{tf.reduceux5fmininputux5ftensor-reductionux5findicesnone-keepux5fdimsfalse-namenone}

Computes the minimum of elements across dimensions of a tensor.
Reduces \texttt{input\_tensor} along the dimensions given in \texttt{reduction\_indices}. Unless \texttt{keep\_dims} is true, the rank of the tensor is reduced by 1 for each entry in \texttt{reduction\_indices}. If \texttt{keep\_dims} is true, the reduced dimensions are retained with length 1.

If \texttt{reduction\_indices} has no entries, all dimensions are reduced, and a tensor with a single element is returned.

\subparagraph{Args: }\label{args-40}

\begin{itemize}
\tightlist
\item \texttt{input\_tensor}: The tensor to reduce. Should have numeric type.
\item \texttt{reduction\_indices}: The dimensions to reduce. If \texttt{None} (the default), reduces all dimensions.
\item \texttt{keep\_dims}: If true, retains reduced dimensions with length 1.
\item \texttt{name}: A name for the operation (optional).
\end{itemize}

\subparagraph{Returns: }\label{returns-40}

The reduced tensor.

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\subsubsection{\texorpdfstring{\texttt{tf.reduce\_max(input\_tensor,\ reduction\_indices=None,\ keep\_dims=False,\ name=None)} }{tf.reduce\_max(input\_tensor, reduction\_indices=None, keep\_dims=False, name=None) }}\label{tf.reduceux5fmaxinputux5ftensor-reductionux5findicesnone-keepux5fdimsfalse-namenone}

Computes the maximum of elements across dimensions of a tensor.

Reduces \texttt{input\_tensor} along the dimensions given in \texttt{reduction\_indices}. Unless \texttt{keep\_dims} is true, the rank of the tensor is reduced by 1 for each entry in \texttt{reduction\_indices}. If \texttt{keep\_dims} is true, the reduced dimensions are retained with length 1.

If \texttt{reduction\_indices} has no entries, all dimensions are reduced, and a tensor with a single element is returned.

\subparagraph{Args: }\label{args-41}

\begin{itemize}
\tightlist
\item \texttt{input\_tensor}: The tensor to reduce. Should have numeric type.
\item \texttt{reduction\_indices}: The dimensions to reduce. If \texttt{None} (the default), reduces all dimensions.
\item \texttt{keep\_dims}: If true, retains reduced dimensions with length 1.
\item \texttt{name}: A name for the operation (optional).
\end{itemize}

\subparagraph{Returns: }\label{returns-41}

The reduced tensor.

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\subsubsection{\texorpdfstring{\texttt{tf.reduce\_mean(input\_tensor,\ reduction\_indices=None,\ keep\_dims=False,\ name=None)} }{tf.reduce\_mean(input\_tensor, reduction\_indices=None, keep\_dims=False, name=None) }}\label{tf.reduceux5fmeaninputux5ftensor-reductionux5findicesnone-keepux5fdimsfalse-namenone}

Computes the mean of elements across dimensions of a tensor.

Reduces \texttt{input\_tensor} along the dimensions given in \texttt{reduction\_indices}. Unless \texttt{keep\_dims} is true, the rank of the tensor is reduced by 1 for each entry in \texttt{reduction\_indices}. If \texttt{keep\_dims} is true, the reduced dimensions are retained with length 1.

If \texttt{reduction\_indices} has no entries, all dimensions are reduced, and a tensor with a single element is returned.

For example:

\begin{Shaded}
\begin{Highlighting}[]
\CommentTok{# 'x' is [[1., 1.],}
\CommentTok{#         [2., 2.]]}
\NormalTok{tf.reduce_mean(x) }\OperatorTok{==>} \FloatTok{1.5}
\NormalTok{tf.reduce_mean(x, }\DecValTok{0}\NormalTok{) }\OperatorTok{==>} \NormalTok{[}\FloatTok{1.5}\NormalTok{, }\FloatTok{1.5}\NormalTok{]}
\NormalTok{tf.reduce_mean(x, }\DecValTok{1}\NormalTok{) }\OperatorTok{==>} \NormalTok{[}\DecValTok{1}\NormalTok{., }\DecValTok{2}\NormalTok{.]}
\end{Highlighting}
\end{Shaded}

\subparagraph{Args: }\label{args-42}

\begin{itemize}
\tightlist
\item \texttt{input\_tensor}: The tensor to reduce. Should have numeric type.
\item \texttt{reduction\_indices}: The dimensions to reduce. If \texttt{None} (the default), reduces all dimensions.
\item \texttt{keep\_dims}: If true, retains reduced dimensions with length 1.
\item \texttt{name}: A name for the operation (optional).
\end{itemize}

\subparagraph{Returns: }\label{returns-42}

The reduced tensor.

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\subsubsection{\texorpdfstring{\texttt{tf.reduce\_all(input\_tensor,\ reduction\_indices=None,\ keep\_dims=False,\ name=None)} }{tf.reduce\_all(input\_tensor, reduction\_indices=None, keep\_dims=False, name=None) }}\label{tf.reduceux5fallinputux5ftensor-reductionux5findicesnone-keepux5fdimsfalse-namenone}

Computes the ``logical and'' of elements across dimensions of a tensor.

Reduces \texttt{input\_tensor} along the dimensions given in \texttt{reduction\_indices}. Unless \texttt{keep\_dims} is true, the rank of the tensor is reduced by 1 for each entry in \texttt{reduction\_indices}. If \texttt{keep\_dims} is true, the reduced dimensions are retained with length 1.

If \texttt{reduction\_indices} has no entries, all dimensions are reduced, and a tensor with a single element is returned.

For example:

\begin{Shaded}
\begin{Highlighting}[]
\CommentTok{# 'x' is [[True,  True],}
\CommentTok{#         [False, False]]}
\NormalTok{tf.reduce_all(x) }\OperatorTok{==>} \VariableTok{False}
\NormalTok{tf.reduce_all(x, }\DecValTok{0}\NormalTok{) }\OperatorTok{==>} \NormalTok{[}\VariableTok{False}\NormalTok{, }\VariableTok{False}\NormalTok{]}
\NormalTok{tf.reduce_all(x, }\DecValTok{1}\NormalTok{) }\OperatorTok{==>} \NormalTok{[}\VariableTok{True}\NormalTok{, }\VariableTok{False}\NormalTok{]}
\end{Highlighting}
\end{Shaded}

\subparagraph{Args: }\label{args-43}

\begin{itemize}
\tightlist
\item \texttt{input\_tensor}: The boolean tensor to reduce.
\item \texttt{reduction\_indices}: The dimensions to reduce. If \texttt{None} (the default), reduces all dimensions.
\item \texttt{keep\_dims}: If true, retains reduced dimensions with length 1.
\item \texttt{name}: A name for the operation (optional).
\end{itemize}

\subparagraph{Returns: }\label{returns-43}

The reduced tensor.
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\subsubsection{\texorpdfstring{\texttt{tf.reduce\_any(input\_tensor,\ reduction\_indices=None,\ keep\_dims=False,\ name=None)} }{tf.reduce\_any(input\_tensor, reduction\_indices=None, keep\_dims=False, name=None) }}\label{tf.reduceux5fanyinputux5ftensor-reductionux5findicesnone-keepux5fdimsfalse-namenone}

Computes the ``logical or'' of elements across dimensions of a tensor.

Reduces \texttt{input\_tensor} along the dimensions given in \texttt{reduction\_indices}. Unless \texttt{keep\_dims} is true, the rank of the tensor is reduced by 1 for each entry in \texttt{reduction\_indices}. If \texttt{keep\_dims} is true, the reduced dimensions are retained with length 1.

If \texttt{reduction\_indices} has no entries, all dimensions are reduced, and a tensor with a single element is returned.

For example:

\begin{Shaded}
\begin{Highlighting}[]
\CommentTok{# 'x' is [[True,  True],}
\CommentTok{#         [False, False]]}
\NormalTok{tf.reduce_any(x) }\OperatorTok{==>} \VariableTok{True}
\NormalTok{tf.reduce_any(x, }\DecValTok{0}\NormalTok{) }\OperatorTok{==>} \NormalTok{[}\VariableTok{True}\NormalTok{, }\VariableTok{True}\NormalTok{]}
\NormalTok{tf.reduce_any(x, }\DecValTok{1}\NormalTok{) }\OperatorTok{==>} \NormalTok{[}\VariableTok{True}\NormalTok{, }\VariableTok{False}\NormalTok{]}
\end{Highlighting}
\end{Shaded}

\subparagraph{Args: }\label{args-44}

\begin{itemize}
\tightlist
\item \texttt{input\_tensor}: The boolean tensor to reduce.
\item \texttt{reduction\_indices}: The dimensions to reduce. If \texttt{None} (the default), reduces all dimensions.
\item \texttt{keep\_dims}: If true, retains reduced dimensions with length 1.
\item \texttt{name}: A name for the operation (optional).
\end{itemize}

\subparagraph{Returns: }\label{returns-44}

The reduced tensor.
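
The two logical reductions differ only in the combining operator. The sketch below, which uses Python's built-in \texttt{all} and \texttt{any} on a 2-D nested list, is an assumption-laden illustration of that idea; \texttt{reduce\_logical\_2d} is a hypothetical name and not a TensorFlow function.

```python
def reduce_logical_2d(x, op, reduction_index=None):
    """Apply `op` (the built-in all or any) across a 2-D nested list of bools."""
    if reduction_index is None:
        # Reduce every dimension to a single boolean.
        return op(op(row) for row in x)
    if reduction_index == 0:
        return [op(col) for col in zip(*x)]
    if reduction_index == 1:
        return [op(row) for row in x]
    raise ValueError("reduction_index must be None, 0, or 1")


x = [[True, True],
     [False, False]]
print(reduce_logical_2d(x, all), reduce_logical_2d(x, any))
print(reduce_logical_2d(x, all, 0), reduce_logical_2d(x, any, 0))
print(reduce_logical_2d(x, all, 1), reduce_logical_2d(x, any, 1))
```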
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\subsubsection{\texorpdfstring{\texttt{tf.accumulate\_n(inputs,\ shape=None,\ tensor\_dtype=None,\ name=None)} }{tf.accumulate\_n(inputs, shape=None, tensor\_dtype=None, name=None) }}\label{tf.accumulateux5fninputs-shapenone-tensorux5fdtypenone-namenone}

Returns the element-wise sum of a list of tensors.

Optionally, pass \texttt{shape} and \texttt{tensor\_dtype} for shape and type checking; otherwise, these are inferred.

For example:

\begin{Shaded}
\begin{Highlighting}[]
\CommentTok{# tensor 'a' is [[1, 2], [3, 4]]}
\CommentTok{# tensor 'b' is [[5, 0], [0, 6]]}
\NormalTok{tf.accumulate_n([a, b, a]) }\OperatorTok{==>} \NormalTok{[[}\DecValTok{7}\NormalTok{, }\DecValTok{4}\NormalTok{], [}\DecValTok{6}\NormalTok{, }\DecValTok{14}\NormalTok{]]}

\CommentTok{# Explicitly pass shape and type}
\NormalTok{tf.accumulate_n([a, b, a], shape}\OperatorTok{=}\NormalTok{[}\DecValTok{2}\NormalTok{, }\DecValTok{2}\NormalTok{], tensor_dtype}\OperatorTok{=}\NormalTok{tf.int32)}
  \OperatorTok{==>} \NormalTok{[[}\DecValTok{7}\NormalTok{, }\DecValTok{4}\NormalTok{], [}\DecValTok{6}\NormalTok{, }\DecValTok{14}\NormalTok{]]}
\end{Highlighting}
\end{Shaded}

\subparagraph{Args: }\label{args-45}

\begin{itemize}
\tightlist
\item \texttt{inputs}: A list of \texttt{Tensor} objects, each with the same shape and type.
\item \texttt{shape}: Shape of elements of \texttt{inputs}.
\item \texttt{tensor\_dtype}: The type of \texttt{inputs}.
\item \texttt{name}: A name for the operation (optional).
\end{itemize}

\subparagraph{Returns: }\label{returns-45}

A \texttt{Tensor} of the same shape and type as the elements of \texttt{inputs}.

\subparagraph{Raises: }\label{raises}

\begin{itemize}
\tightlist
\item \texttt{ValueError}: If \texttt{inputs} don't all have the same shape and dtype or the shape cannot be inferred.
\end{itemize}

\subsection{Segmentation }\label{segmentation}

TensorFlow provides several operations that you can use to perform common math computations on tensor segments. Here a segmentation is a partitioning of a tensor along the first dimension, i.e.~it defines a mapping from the first dimension onto \texttt{segment\_ids}. The \texttt{segment\_ids} tensor should be the size of the first dimension, \texttt{d0}, with consecutive IDs in the range \texttt{0} to \texttt{k}, where \texttt{k\textless{}d0}. In particular, a segmentation of a matrix tensor is a mapping of rows to segments.

For example:

\begin{Shaded}
\begin{Highlighting}[]
\NormalTok{c }\OperatorTok{=} \NormalTok{tf.constant([[}\DecValTok{1}\NormalTok{,}\DecValTok{2}\NormalTok{,}\DecValTok{3}\NormalTok{,}\DecValTok{4}\NormalTok{], [}\OperatorTok{-}\DecValTok{1}\NormalTok{,}\OperatorTok{-}\DecValTok{2}\NormalTok{,}\OperatorTok{-}\DecValTok{3}\NormalTok{,}\OperatorTok{-}\DecValTok{4}\NormalTok{], [}\DecValTok{5}\NormalTok{,}\DecValTok{6}\NormalTok{,}\DecValTok{7}\NormalTok{,}\DecValTok{8}\NormalTok{]])}
\NormalTok{tf.segment_sum(c, tf.constant([}\DecValTok{0}\NormalTok{, }\DecValTok{0}\NormalTok{, }\DecValTok{1}\NormalTok{]))}
  \OperatorTok{==>} \NormalTok{[[}\DecValTok{0} \DecValTok{0} \DecValTok{0} \DecValTok{0}\NormalTok{]}
       \NormalTok{[}\DecValTok{5} \DecValTok{6} \DecValTok{7} \DecValTok{8}\NormalTok{]]}
\end{Highlighting}
\end{Shaded}

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\subsubsection{\texorpdfstring{\texttt{tf.segment\_sum(data,\ segment\_ids,\ name=None)} }{tf.segment\_sum(data, segment\_ids, name=None) }}\label{tf.segmentux5fsumdata-segmentux5fids-namenone}

Computes the sum along segments of a tensor.

Read \href{../../api_docs/python/math_ops.md\#segmentation}{the section on Segmentation} for an explanation of segments.
Computes a tensor such that \(output_i = \sum_j data_j\) where the sum is over \texttt{j} such that \texttt{segment\_ids{[}j{]}\ ==\ i}.

\subparagraph{Args: }\label{args-46}

\begin{itemize}
\tightlist
\item \texttt{data}: A \texttt{Tensor}. Must be one of the following types: \texttt{float32}, \texttt{float64}, \texttt{int32}, \texttt{int64}, \texttt{uint8}, \texttt{int16}, \texttt{int8}.
\item \texttt{segment\_ids}: A \texttt{Tensor}. Must be one of the following types: \texttt{int32}, \texttt{int64}. A 1-D tensor whose size is equal to the size of \texttt{data}'s first dimension. Values should be sorted and can be repeated.
\item \texttt{name}: A name for the operation (optional).
\end{itemize}

\subparagraph{Returns: }\label{returns-46}

A \texttt{Tensor}. Has the same type as \texttt{data}. Has same shape as data, except for dimension\_0 which has size \texttt{k}, the number of segments.

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\subsubsection{\texorpdfstring{\texttt{tf.segment\_prod(data,\ segment\_ids,\ name=None)} }{tf.segment\_prod(data, segment\_ids, name=None) }}\label{tf.segmentux5fproddata-segmentux5fids-namenone}

Computes the product along segments of a tensor.

Read \href{../../api_docs/python/math_ops.md\#segmentation}{the section on Segmentation} for an explanation of segments.

Computes a tensor such that \(output_i = \prod_j data_j\) where the product is over \texttt{j} such that \texttt{segment\_ids{[}j{]}\ ==\ i}.

\subparagraph{Args: }\label{args-47}

\begin{itemize}
\tightlist
\item \texttt{data}: A \texttt{Tensor}. Must be one of the following types: \texttt{float32}, \texttt{float64}, \texttt{int32}, \texttt{int64}, \texttt{uint8}, \texttt{int16}, \texttt{int8}.
\item \texttt{segment\_ids}: A \texttt{Tensor}. Must be one of the following types: \texttt{int32}, \texttt{int64}. A 1-D tensor whose size is equal to the size of \texttt{data}'s first dimension.
Values should be sorted and can be repeated.
\item \texttt{name}: A name for the operation (optional).
\end{itemize}

\subparagraph{Returns: }\label{returns-47}

A \texttt{Tensor}. Has the same type as \texttt{data}. Has same shape as data, except for dimension\_0 which has size \texttt{k}, the number of segments.

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\subsubsection{\texorpdfstring{\texttt{tf.segment\_min(data,\ segment\_ids,\ name=None)} }{tf.segment\_min(data, segment\_ids, name=None) }}\label{tf.segmentux5fmindata-segmentux5fids-namenone}

Computes the minimum along segments of a tensor.

Read \href{../../api_docs/python/math_ops.md\#segmentation}{the section on Segmentation} for an explanation of segments.

Computes a tensor such that \(output_i = \min_j(data_j)\) where \texttt{min} is over \texttt{j} such that \texttt{segment\_ids{[}j{]}\ ==\ i}.

\subparagraph{Args: }\label{args-48}

\begin{itemize}
\tightlist
\item \texttt{data}: A \texttt{Tensor}. Must be one of the following types: \texttt{float32}, \texttt{float64}, \texttt{int32}, \texttt{int64}, \texttt{uint8}, \texttt{int16}, \texttt{int8}.
\item \texttt{segment\_ids}: A \texttt{Tensor}. Must be one of the following types: \texttt{int32}, \texttt{int64}. A 1-D tensor whose size is equal to the size of \texttt{data}'s first dimension. Values should be sorted and can be repeated.
\item \texttt{name}: A name for the operation (optional).
\end{itemize}

\subparagraph{Returns: }\label{returns-48}

A \texttt{Tensor}. Has the same type as \texttt{data}. Has same shape as data, except for dimension\_0 which has size \texttt{k}, the number of segments.

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\subsubsection{\texorpdfstring{\texttt{tf.segment\_max(data,\ segment\_ids,\ name=None)} }{tf.segment\_max(data, segment\_ids, name=None) }}\label{tf.segmentux5fmaxdata-segmentux5fids-namenone}

Computes the maximum along segments of a tensor.
Read \href{../../api_docs/python/math_ops.md\#segmentation}{the section on Segmentation} for an explanation of segments.

Computes a tensor such that \(output_i = \max_j(data_j)\) where \texttt{max} is over \texttt{j} such that \texttt{segment\_ids{[}j{]}\ ==\ i}.

\subparagraph{Args: }\label{args-49}

\begin{itemize}
\tightlist
\item \texttt{data}: A \texttt{Tensor}. Must be one of the following types: \texttt{float32}, \texttt{float64}, \texttt{int32}, \texttt{int64}, \texttt{uint8}, \texttt{int16}, \texttt{int8}.
\item \texttt{segment\_ids}: A \texttt{Tensor}. Must be one of the following types: \texttt{int32}, \texttt{int64}. A 1-D tensor whose size is equal to the size of \texttt{data}'s first dimension. Values should be sorted and can be repeated.
\item \texttt{name}: A name for the operation (optional).
\end{itemize}

\subparagraph{Returns: }\label{returns-49}

A \texttt{Tensor}. Has the same type as \texttt{data}. Has same shape as data, except for dimension\_0 which has size \texttt{k}, the number of segments.

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\subsubsection{\texorpdfstring{\texttt{tf.segment\_mean(data,\ segment\_ids,\ name=None)} }{tf.segment\_mean(data, segment\_ids, name=None) }}\label{tf.segmentux5fmeandata-segmentux5fids-namenone}

Computes the mean along segments of a tensor.

Read \href{../../api_docs/python/math_ops.md\#segmentation}{the section on Segmentation} for an explanation of segments.

Computes a tensor such that \(output_i = \frac{\sum_j data_j}{N}\) where \texttt{mean} is over \texttt{j} such that \texttt{segment\_ids{[}j{]}\ ==\ i} and \texttt{N} is the total number of values summed.

\subparagraph{Args: }\label{args-50}

\begin{itemize}
\tightlist
\item \texttt{data}: A \texttt{Tensor}. Must be one of the following types: \texttt{float32}, \texttt{float64}, \texttt{int32}, \texttt{int64}, \texttt{uint8}, \texttt{int16}, \texttt{int8}.
\item \texttt{segment\_ids}: A \texttt{Tensor}. Must be one of the following types: \texttt{int32}, \texttt{int64}. A 1-D tensor whose size is equal to the size of \texttt{data}'s first dimension. Values should be sorted and can be repeated.
\item \texttt{name}: A name for the operation (optional).
\end{itemize}

\subparagraph{Returns: }\label{returns-50}

A \texttt{Tensor}. Has the same type as \texttt{data}. Has same shape as data, except for dimension\_0 which has size \texttt{k}, the number of segments.

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\subsubsection{\texorpdfstring{\texttt{tf.unsorted\_segment\_sum(data,\ segment\_ids,\ num\_segments,\ name=None)} }{tf.unsorted\_segment\_sum(data, segment\_ids, num\_segments, name=None) }}\label{tf.unsortedux5fsegmentux5fsumdata-segmentux5fids-numux5fsegments-namenone}

Computes the sum along segments of a tensor.

Read \href{../../api_docs/python/math_ops.md\#segmentation}{the section on Segmentation} for an explanation of segments.

Computes a tensor such that \(output_i = \sum_j data_j\) where the sum is over \texttt{j} such that \texttt{segment\_ids{[}j{]}\ ==\ i}. Unlike \texttt{SegmentSum}, \texttt{segment\_ids} need not be sorted and need not cover all values in the full range of valid values.

If the sum is empty for a given segment ID \texttt{i}, \texttt{output{[}i{]}\ =\ 0}.

\texttt{num\_segments} should equal the number of distinct segment IDs.

\subparagraph{Args: }\label{args-51}

\begin{itemize}
\tightlist
\item \texttt{data}: A \texttt{Tensor}. Must be one of the following types: \texttt{float32}, \texttt{float64}, \texttt{int32}, \texttt{int64}, \texttt{uint8}, \texttt{int16}, \texttt{int8}.
\item \texttt{segment\_ids}: A \texttt{Tensor}. Must be one of the following types: \texttt{int32}, \texttt{int64}. A 1-D tensor whose size is equal to the size of \texttt{data}'s first dimension.
\item \texttt{num\_segments}: A \texttt{Tensor} of type \texttt{int32}.
\item \texttt{name}: A name for the operation (optional).
\end{itemize}

\subparagraph{Returns: }\label{returns-51}

A \texttt{Tensor}. Has the same type as \texttt{data}. Has same shape as data, except for dimension\_0 which has size \texttt{num\_segments}.

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\subsubsection{\texorpdfstring{\texttt{tf.sparse\_segment\_sum(data,\ indices,\ segment\_ids,\ name=None)} }{tf.sparse\_segment\_sum(data, indices, segment\_ids, name=None) }}\label{tf.sparseux5fsegmentux5fsumdata-indices-segmentux5fids-namenone}

Computes the sum along sparse segments of a tensor.

Read \href{../../api_docs/python/math_ops.md\#segmentation}{the section on Segmentation} for an explanation of segments.

Like \texttt{SegmentSum}, but \texttt{segment\_ids} can have a size smaller than the size of \texttt{data}'s first dimension, selecting a subset of dimension\_0, specified by \texttt{indices}.

For example:

\begin{verbatim}
c = tf.constant([[1,2,3,4], [-1,-2,-3,-4], [5,6,7,8]])

# Select two rows, one segment.
tf.sparse_segment_sum(c, tf.constant([0, 1]), tf.constant([0, 0]))
  ==> [[0 0 0 0]]

# Select two rows, two segments.
tf.sparse_segment_sum(c, tf.constant([0, 1]), tf.constant([0, 1]))
  ==> [[ 1  2  3  4]
       [-1 -2 -3 -4]]

# Select all rows, two segments.
tf.sparse_segment_sum(c, tf.constant([0, 1, 2]), tf.constant([0, 0, 1]))
  ==> [[0 0 0 0]
       [5 6 7 8]]

# Which is equivalent to:
tf.segment_sum(c, tf.constant([0, 0, 1]))
\end{verbatim}

\subparagraph{Args: }\label{args-52}

\begin{itemize}
\tightlist
\item \texttt{data}: A \texttt{Tensor}. Must be one of the following types: \texttt{float32}, \texttt{float64}, \texttt{int32}, \texttt{int64}, \texttt{uint8}, \texttt{int16}, \texttt{int8}.
\item \texttt{indices}: A \texttt{Tensor} of type \texttt{int32}. A 1-D tensor. Has same rank as \texttt{segment\_ids}.
\item \texttt{segment\_ids}: A \texttt{Tensor} of type \texttt{int32}. A 1-D tensor. Values should be sorted and can be repeated.
\item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-52} A \texttt{Tensor}. Has the same type as \texttt{data}. Has the same shape as \texttt{data}, except for dimension\_0 which has size \texttt{k}, the number of segments. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.sparse\_segment\_mean(data,\ indices,\ segment\_ids,\ name=None)} }{tf.sparse\_segment\_mean(data, indices, segment\_ids, name=None) }}\label{tf.sparseux5fsegmentux5fmeandata-indices-segmentux5fids-namenone} Computes the mean along sparse segments of a tensor. Read \href{../../api_docs/python/math_ops.md\#segmentation}{the section on Segmentation} for an explanation of segments. Like \texttt{SegmentMean}, but \texttt{segment\_ids} can have fewer entries than \texttt{data}'s first dimension, selecting a subset of dimension\_0, specified by \texttt{indices}. \subparagraph{Args: }\label{args-53} \begin{itemize} \tightlist \item \texttt{data}: A \texttt{Tensor}. Must be one of the following types: \texttt{float32}, \texttt{float64}. \item \texttt{indices}: A \texttt{Tensor} of type \texttt{int32}. A 1-D tensor. Has the same size as \texttt{segment\_ids}. \item \texttt{segment\_ids}: A \texttt{Tensor} of type \texttt{int32}. A 1-D tensor. Values should be sorted and can be repeated. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-53} A \texttt{Tensor}. Has the same type as \texttt{data}. Has the same shape as \texttt{data}, except for dimension\_0 which has size \texttt{k}, the number of segments. \subsection{Sequence Comparison and Indexing }\label{sequence-comparison-and-indexing} TensorFlow provides several operations that you can use to add sequence comparison and index extraction to your graph. You can use these operations to determine sequence differences and determine the indexes of specific values in a tensor.
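To make the idea of index extraction concrete, the following is a minimal pure-Python sketch of what an arg-max or arg-min reduction over one dimension of a 2-D nested list computes. It does not call TensorFlow; the helper name \texttt{arg\_reduce} and its \texttt{pick} parameter are hypothetical, and returning the first index on ties is an assumption of this sketch rather than a documented guarantee.

```python
def arg_reduce(rows, dimension, pick=max):
    """Index of the largest (pick=max) or smallest (pick=min) value
    along `dimension` of a 2-D nested list (hypothetical helper)."""
    if dimension == 0:
        # Reduce across rows: one result per column.
        lanes = [list(col) for col in zip(*rows)]
    elif dimension == 1:
        # Reduce across columns: one result per row.
        lanes = [list(row) for row in rows]
    else:
        raise ValueError("dimension must satisfy 0 <= dimension < 2")
    # Python's max/min return the first best index on ties (sketch assumption).
    return [pick(range(len(lane)), key=lane.__getitem__) for lane in lanes]


matrix = [[1, 5, 2],
          [7, 0, 3]]
argmax_dim0 = arg_reduce(matrix, 0)            # per-column index of the maximum
argmax_dim1 = arg_reduce(matrix, 1)            # per-row index of the maximum
argmin_dim1 = arg_reduce(matrix, 1, pick=min)  # per-row index of the minimum
```

For a plain vector, the same reduction is applied with dimension 0 to a one-column layout; the sketch keeps to the 2-D case for brevity.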
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.argmin(input,\ dimension,\ name=None)} }{tf.argmin(input, dimension, name=None) }}\label{tf.argmininput-dimension-namenone} Returns the index with the smallest value across dimensions of a tensor. \subparagraph{Args: }\label{args-54} \begin{itemize} \tightlist \item \texttt{input}: A \texttt{Tensor}. Must be one of the following types: \texttt{float32}, \texttt{float64}, \texttt{int64}, \texttt{int32}, \texttt{uint8}, \texttt{int16}, \texttt{int8}, \texttt{complex64}, \texttt{qint8}, \texttt{quint8}, \texttt{qint32}. \item \texttt{dimension}: A \texttt{Tensor} of type \texttt{int32}. int32, 0 \textless{}= dimension \textless{} rank(input). Describes which dimension of the input Tensor to reduce across. For vectors, use dimension = 0. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-54} A \texttt{Tensor} of type \texttt{int64}. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.argmax(input,\ dimension,\ name=None)} }{tf.argmax(input, dimension, name=None) }}\label{tf.argmaxinput-dimension-namenone} Returns the index with the largest value across dimensions of a tensor. \subparagraph{Args: }\label{args-55} \begin{itemize} \tightlist \item \texttt{input}: A \texttt{Tensor}. Must be one of the following types: \texttt{float32}, \texttt{float64}, \texttt{int64}, \texttt{int32}, \texttt{uint8}, \texttt{int16}, \texttt{int8}, \texttt{complex64}, \texttt{qint8}, \texttt{quint8}, \texttt{qint32}. \item \texttt{dimension}: A \texttt{Tensor} of type \texttt{int32}. int32, 0 \textless{}= dimension \textless{} rank(input). Describes which dimension of the input Tensor to reduce across. For vectors, use dimension = 0. \item \texttt{name}: A name for the operation (optional). 
\end{itemize}

\subparagraph{Returns: }\label{returns-55}

A \texttt{Tensor} of type \texttt{int64}.

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\subsubsection{\texorpdfstring{\texttt{tf.listdiff(x,\ y,\ name=None)} }{tf.listdiff(x, y, name=None) }}\label{tf.listdiffx-y-namenone}

Computes the difference between two lists of numbers.

Given a list \texttt{x} and a list \texttt{y}, this operation returns a list \texttt{out} that represents all numbers that are in \texttt{x} but not in \texttt{y}. The returned list \texttt{out} is sorted in the same order that the numbers appear in \texttt{x} (duplicates are preserved). This operation also returns a list \texttt{idx} that represents the position of each \texttt{out} element in \texttt{x}. In other words:

\texttt{out{[}i{]}\ =\ x{[}idx{[}i{]}{]}\ for\ i\ in\ {[}0,\ 1,\ ...,\ len(out)\ -\ 1{]}}

For example, given this input:

\begin{verbatim}
x = [1, 2, 3, 4, 5, 6]
y = [1, 3, 5]
\end{verbatim}

This operation would return:

\begin{verbatim}
out ==> [2, 4, 6]
idx ==> [1, 3, 5]
\end{verbatim}

\subparagraph{Args: }\label{args-56}

\begin{itemize}
\tightlist
\item \texttt{x}: A \texttt{Tensor}. 1-D. Values to keep.
\item \texttt{y}: A \texttt{Tensor}. Must have the same type as \texttt{x}. 1-D. Values to remove.
\item \texttt{name}: A name for the operation (optional).
\end{itemize}

\subparagraph{Returns: }\label{returns-56}

A tuple of \texttt{Tensor} objects (out, idx).

\begin{itemize}
\tightlist
\item \texttt{out}: A \texttt{Tensor}. Has the same type as \texttt{x}. 1-D. Values present in \texttt{x} but not in \texttt{y}.
\item \texttt{idx}: A \texttt{Tensor} of type \texttt{int32}. 1-D. Positions of \texttt{x} values preserved in \texttt{out}.
\end{itemize}

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\subsubsection{\texorpdfstring{\texttt{tf.where(input,\ name=None)} }{tf.where(input, name=None) }}\label{tf.whereinput-namenone}

Returns locations of true values in a boolean tensor.
This operation returns the coordinates of true elements in \texttt{input}. The coordinates are returned in a 2-D tensor where the first dimension (rows) represents the number of true elements, and the second dimension (columns) represents the coordinates of the true elements. Keep in mind, the shape of the output tensor can vary depending on how many true values there are in \texttt{input}. Indices are output in row-major order.

For example:

\begin{verbatim}
# 'input' tensor is [[True, False]
#                    [True, False]]
# 'input' has two true values, so output has two coordinates.
# 'input' has rank of 2, so coordinates have two indices.
where(input) ==> [[0, 0],
                  [1, 0]]

# `input` tensor is [[[True, False]
#                     [True, False]]
#                    [[False, True]
#                     [False, True]]
#                    [[False, False]
#                     [False, True]]]
# 'input' has 5 true values, so output has 5 coordinates.
# 'input' has rank of 3, so coordinates have three indices.
where(input) ==> [[0, 0, 0],
                  [0, 1, 0],
                  [1, 0, 1],
                  [1, 1, 1],
                  [2, 1, 1]]
\end{verbatim}

\subparagraph{Args: }\label{args-57}

\begin{itemize}
\tightlist
\item \texttt{input}: A \texttt{Tensor} of type \texttt{bool}.
\item \texttt{name}: A name for the operation (optional).
\end{itemize}

\subparagraph{Returns: }\label{returns-57}

A \texttt{Tensor} of type \texttt{int64}.

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\subsubsection{\texorpdfstring{\texttt{tf.unique(x,\ name=None)} }{tf.unique(x, name=None) }}\label{tf.uniquex-namenone}

Finds unique elements in a 1-D tensor.

This operation returns a tensor \texttt{y} containing all of the unique elements of \texttt{x} sorted in the same order that they occur in \texttt{x}. This operation also returns a tensor \texttt{idx} the same size as \texttt{x} that contains the index of each value of \texttt{x} in the unique output \texttt{y}.
In other words:

\texttt{y{[}idx{[}i{]}{]}\ =\ x{[}i{]}\ for\ i\ in\ {[}0,\ 1,\ ...,\ len(x)\ -\ 1{]}}

For example:

\begin{verbatim}
# tensor 'x' is [1, 1, 2, 4, 4, 4, 7, 8, 8]
y, idx = unique(x)
y ==> [1, 2, 4, 7, 8]
idx ==> [0, 0, 1, 2, 2, 2, 3, 4, 4]
\end{verbatim}

\subparagraph{Args: }\label{args-58}

\begin{itemize}
\tightlist
\item \texttt{x}: A \texttt{Tensor}. 1-D.
\item \texttt{name}: A name for the operation (optional).
\end{itemize}

\subparagraph{Returns: }\label{returns-58}

A tuple of \texttt{Tensor} objects (y, idx).

\begin{itemize}
\tightlist
\item \texttt{y}: A \texttt{Tensor}. Has the same type as \texttt{x}. 1-D.
\item \texttt{idx}: A \texttt{Tensor} of type \texttt{int32}. 1-D.
\end{itemize}

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\subsubsection{\texorpdfstring{\texttt{tf.edit\_distance(hypothesis,\ truth,\ normalize=True,\ name=\textquotesingle{}edit\_distance\textquotesingle{})} }{tf.edit\_distance(hypothesis, truth, normalize=True, name='edit\_distance') }}\label{tf.editux5fdistancehypothesis-truth-normalizetrue-nameeditux5fdistance}

Computes the Levenshtein distance between sequences.

This operation takes variable-length sequences (\texttt{hypothesis} and \texttt{truth}), each provided as a \texttt{SparseTensor}, and computes the Levenshtein distance. You can normalize the edit distance by the length of \texttt{truth} by setting \texttt{normalize} to \texttt{True}.
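As a rough illustration of the quantity being computed for each pair of sequences, here is a pure-Python sketch of the Levenshtein distance with optional normalization by the length of \texttt{truth}. It is not TensorFlow's implementation and works on ordinary Python sequences rather than \texttt{SparseTensor} inputs; the function name \texttt{levenshtein} is hypothetical, and returning \texttt{0.0} when both sequences are empty is an assumption of this sketch.

```python
def levenshtein(hypothesis, truth, normalize=True):
    """Edit distance between two sequences (hypothetical helper).

    Uses the classic dynamic program with a rolling row: previous[j] is
    the distance between the processed hypothesis prefix and truth[:j].
    """
    previous = list(range(len(truth) + 1))
    for i, h in enumerate(hypothesis, start=1):
        current = [i]
        for j, t in enumerate(truth, start=1):
            substitution = previous[j - 1] + (0 if h == t else 1)
            deletion = previous[j] + 1      # drop h from the hypothesis
            insertion = current[j - 1] + 1  # add t to the hypothesis
            current.append(min(substitution, deletion, insertion))
        previous = current
    distance = previous[-1]
    if not normalize:
        return float(distance)
    if len(truth) == 0:
        # Division by a zero-length truth: inf if anything had to change.
        # Returning 0.0 for two empty sequences is a sketch assumption.
        return float("inf") if distance else 0.0
    return distance / len(truth)


one_missing = levenshtein(["b"], ["b", "c"])   # one addition over length 2
no_truth = levenshtein(["a"], [])              # nothing to normalize by
no_hypothesis = levenshtein([], ["a"])         # everything must be added
```

The three calls correspond to the "addition", "no truth", and "no hypothesis" situations that a normalized edit distance has to handle.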
For example, given the following input:

\begin{Shaded}
\begin{Highlighting}[]
\CommentTok{# 'hypothesis' is a tensor of shape `[2, 1]` with variable-length values:}
\CommentTok{#   (0,0) = ["a"]}
\CommentTok{#   (1,0) = ["b"]}
\NormalTok{hypothesis }\OperatorTok{=} \NormalTok{tf.SparseTensor(}
\NormalTok{    [[}\DecValTok{0}\NormalTok{, }\DecValTok{0}\NormalTok{, }\DecValTok{0}\NormalTok{],}
\NormalTok{     [}\DecValTok{1}\NormalTok{, }\DecValTok{0}\NormalTok{, }\DecValTok{0}\NormalTok{]],}
\NormalTok{    [}\StringTok{"a"}\NormalTok{, }\StringTok{"b"}\NormalTok{],}
\NormalTok{    (}\DecValTok{2}\NormalTok{, }\DecValTok{1}\NormalTok{, }\DecValTok{1}\NormalTok{))}

\CommentTok{# 'truth' is a tensor of shape `[2, 2]` with variable-length values:}
\CommentTok{#   (0,0) = []}
\CommentTok{#   (0,1) = ["a"]}
\CommentTok{#   (1,0) = ["b", "c"]}
\CommentTok{#   (1,1) = ["a"]}
\NormalTok{truth }\OperatorTok{=} \NormalTok{tf.SparseTensor(}
\NormalTok{    [[}\DecValTok{0}\NormalTok{, }\DecValTok{1}\NormalTok{, }\DecValTok{0}\NormalTok{],}
\NormalTok{     [}\DecValTok{1}\NormalTok{, }\DecValTok{0}\NormalTok{, }\DecValTok{0}\NormalTok{],}
\NormalTok{     [}\DecValTok{1}\NormalTok{, }\DecValTok{0}\NormalTok{, }\DecValTok{1}\NormalTok{],}
\NormalTok{     [}\DecValTok{1}\NormalTok{, }\DecValTok{1}\NormalTok{, }\DecValTok{0}\NormalTok{]],}
\NormalTok{    [}\StringTok{"a"}\NormalTok{, }\StringTok{"b"}\NormalTok{, }\StringTok{"c"}\NormalTok{, }\StringTok{"a"}\NormalTok{],}
\NormalTok{    (}\DecValTok{2}\NormalTok{, }\DecValTok{2}\NormalTok{, }\DecValTok{2}\NormalTok{))}

\NormalTok{normalize }\OperatorTok{=} \VariableTok{True}
\end{Highlighting}
\end{Shaded}

This operation would return the following:

\begin{Shaded}
\begin{Highlighting}[]
\CommentTok{# 'output' is a tensor of shape `[2, 2]` with edit distances normalized}
\CommentTok{# by 'truth' lengths.}
\NormalTok{output }\OperatorTok{==>} \NormalTok{[[inf, }\FloatTok{1.0}\NormalTok{], }\CommentTok{# (0,0): no truth, (0,1): no hypothesis}
\NormalTok{           [}\FloatTok{0.5}\NormalTok{, }\FloatTok{1.0}\NormalTok{]] }\CommentTok{# (1,0): addition, (1,1): no hypothesis}
\end{Highlighting}
\end{Shaded}

\subparagraph{Args: }\label{args-59}

\begin{itemize}
\tightlist
\item \texttt{hypothesis}: A \texttt{SparseTensor} containing hypothesis sequences.
\item \texttt{truth}: A \texttt{SparseTensor} containing truth sequences.
\item \texttt{normalize}: A \texttt{bool}. If \texttt{True}, normalizes the Levenshtein distance by the length of \texttt{truth}.
\item \texttt{name}: A name for the operation (optional).
\end{itemize}

\subparagraph{Returns: }\label{returns-59}

A dense \texttt{Tensor} with rank \texttt{R\ -\ 1}, where R is the rank of the \texttt{SparseTensor} inputs \texttt{hypothesis} and \texttt{truth}.

\subparagraph{Raises: }\label{raises-1}

\begin{itemize}
\tightlist
\item \texttt{TypeError}: If either \texttt{hypothesis} or \texttt{truth} is not a \texttt{SparseTensor}.
\end{itemize}

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\subsubsection{\texorpdfstring{\texttt{tf.invert\_permutation(x,\ name=None)} }{tf.invert\_permutation(x, name=None) }}\label{tf.invertux5fpermutationx-namenone}

Computes the inverse permutation of a tensor.

This operation computes the inverse of an index permutation. It takes a 1-D integer tensor \texttt{x}, which represents the indices of a zero-based array, and swaps each value with its index position. In other words, for an output tensor \texttt{y} and an input tensor \texttt{x}, this operation computes the following:

\texttt{y{[}x{[}i{]}{]}\ =\ i\ for\ i\ in\ {[}0,\ 1,\ ...,\ len(x)\ -\ 1{]}}

The values must include 0. There can be no duplicate values or negative values.

For example:

\begin{verbatim}
# tensor `x` is [3, 4, 0, 2, 1]
invert_permutation(x) ==> [2, 4, 3, 0, 1]
\end{verbatim}

\subparagraph{Args: }\label{args-60}

\begin{itemize}
\tightlist
\item \texttt{x}: A \texttt{Tensor} of type \texttt{int32}. 1-D.
\item \texttt{name}: A name for the operation (optional).
\end{itemize} \subparagraph{Returns: }\label{returns-60} A \texttt{Tensor} of type \texttt{int32}. 1-D. ================================================ FILE: tex_pdf/api/python/c4s06_control_flow_ops.tex ================================================ \section{Control Flow }\label{control-flow} Note: Functions taking \texttt{Tensor} arguments can also take anything accepted by \href{../../api_docs/python/framework.md\#convert_to_tensor}{\texttt{tf.convert\_to\_tensor}}. \subsection{Contents}\label{contents} \subsubsection{\texorpdfstring{\protect\hyperlink{AUTOGENERATED-control-flow}{Control Flow}}{Control Flow}}\label{control-flow-1} \begin{itemize} \tightlist \item \protect\hyperlink{AUTOGENERATED-control-flow-operations}{Control Flow Operations} \item \protect\hyperlink{identity}{\texttt{tf.identity(input,\ name=None)}} \item \protect\hyperlink{tuple}{\texttt{tf.tuple(tensors,\ name=None,\ control\_inputs=None)}} \item \protect\hyperlink{group}{\texttt{tf.group(*inputs,\ **kwargs)}} \item \protect\hyperlink{noux5fop}{\texttt{tf.no\_op(name=None)}} \item \protect\hyperlink{countux5fupux5fto}{\texttt{tf.count\_up\_to(ref,\ limit,\ name=None)}} \item \protect\hyperlink{AUTOGENERATED-logical-operators}{Logical Operators} \item \protect\hyperlink{logicalux5fand}{\texttt{tf.logical\_and(x,\ y,\ name=None)}} \item \protect\hyperlink{logicalux5fnot}{\texttt{tf.logical\_not(x,\ name=None)}} \item \protect\hyperlink{logicalux5for}{\texttt{tf.logical\_or(x,\ y,\ name=None)}} \item \protect\hyperlink{logicalux5fxor}{\texttt{tf.logical\_xor(x,\ y,\ name=\textquotesingle{}LogicalXor\textquotesingle{})}} \item \protect\hyperlink{AUTOGENERATED-comparison-operators}{Comparison Operators} \item \protect\hyperlink{equal}{\texttt{tf.equal(x,\ y,\ name=None)}} \item \protect\hyperlink{notux5fequal}{\texttt{tf.not\_equal(x,\ y,\ name=None)}} \item \protect\hyperlink{less}{\texttt{tf.less(x,\ y,\ name=None)}} \item \protect\hyperlink{lessux5fequal}{\texttt{tf.less\_equal(x,\ y,\ 
name=None)}} \item \protect\hyperlink{greater}{\texttt{tf.greater(x,\ y,\ name=None)}} \item \protect\hyperlink{greaterux5fequal}{\texttt{tf.greater\_equal(x,\ y,\ name=None)}} \item \protect\hyperlink{select}{\texttt{tf.select(condition,\ t,\ e,\ name=None)}} \item \protect\hyperlink{where}{\texttt{tf.where(input,\ name=None)}} \item \protect\hyperlink{AUTOGENERATED-debugging-operations}{Debugging Operations} \item \protect\hyperlink{isux5ffinite}{\texttt{tf.is\_finite(x,\ name=None)}} \item \protect\hyperlink{isux5finf}{\texttt{tf.is\_inf(x,\ name=None)}} \item \protect\hyperlink{isux5fnan}{\texttt{tf.is\_nan(x,\ name=None)}} \item \protect\hyperlink{verifyux5ftensorux5fallux5ffinite}{\texttt{tf.verify\_tensor\_all\_finite(t,\ msg,\ name=None)}} \item \protect\hyperlink{checkux5fnumerics}{\texttt{tf.check\_numerics(tensor,\ message,\ name=None)}} \item \protect\hyperlink{addux5fcheckux5fnumericsux5fops}{\texttt{tf.add\_check\_numerics\_ops()}} \item \protect\hyperlink{Assert}{\texttt{tf.Assert(condition,\ data,\ summarize=None,\ name=None)}} \item \protect\hyperlink{Print}{\texttt{tf.Print(input\_,\ data,\ message=None,\ first\_n=None,\ summarize=None,\ name=None)}} \end{itemize} \subsection{Control Flow Operations }\label{control-flow-operations} TensorFlow provides several operations and classes that you can use to control the execution of operations and add conditional dependencies to your graph. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.identity(input,\ name=None)} }{tf.identity(input, name=None) }}\label{tf.identityinput-namenone} Return a tensor with the same shape and contents as the input tensor or value. \subparagraph{Args: }\label{args} \begin{itemize} \tightlist \item \texttt{input}: A \texttt{Tensor}. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns} A \texttt{Tensor}. Has the same type as \texttt{input}. 
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.tuple(tensors,\ name=None,\ control\_inputs=None)} }{tf.tuple(tensors, name=None, control\_inputs=None) }}\label{tf.tupletensors-namenone-controlux5finputsnone} Group tensors together. This creates a tuple of tensors with the same values as the \texttt{tensors} argument, except that the value of each tensor is only returned after the values of all tensors have been computed. \texttt{control\_inputs} contains additional ops that have to finish before this op finishes, but whose outputs are not returned. This can be used as a ``join'' mechanism for parallel computations: all the argument tensors can be computed in parallel, but the values of any tensor returned by \texttt{tuple} are only available after all the parallel computations are done. See also \texttt{group} and \texttt{with\_dependencies}. \subparagraph{Args: }\label{args-1} \begin{itemize} \tightlist \item \texttt{tensors}: A list of \texttt{Tensor}s or \texttt{IndexedSlices}; some entries can be \texttt{None}. \item \texttt{name}: (optional) A name to use as a \texttt{name\_scope} for the operation. \item \texttt{control\_inputs}: List of additional ops to finish before returning. \end{itemize} \subparagraph{Returns: }\label{returns-1} Same as \texttt{tensors}. \subparagraph{Raises: }\label{raises} \begin{itemize} \tightlist \item \texttt{ValueError}: If \texttt{tensors} does not contain any \texttt{Tensor} or \texttt{IndexedSlices}. \end{itemize} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.group(*inputs,\ **kwargs)} }{tf.group(*inputs, **kwargs) }}\label{tf.groupinputs-kwargs} Create an op that groups multiple operations. When this op finishes, all ops in \texttt{inputs} have finished. This op has no output. See also \texttt{tuple} and \texttt{with\_dependencies}.
\subparagraph{Args: }\label{args-2} \begin{itemize} \tightlist \item \texttt{*inputs}: One or more tensors to group. \item \texttt{**kwargs}: Optional parameters to pass when constructing the NodeDef. \item \texttt{name}: A name for this operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-2} An Operation that executes all its inputs. \subparagraph{Raises: }\label{raises-1} \begin{itemize} \tightlist \item \texttt{ValueError}: If an unknown keyword argument is provided, or if there are no inputs. \end{itemize} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.no\_op(name=None)} }{tf.no\_op(name=None) }}\label{tf.noux5fopnamenone} Does nothing. Only useful as a placeholder for control edges. \subparagraph{Args: }\label{args-3} \begin{itemize} \tightlist \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-3} The created Operation. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.count\_up\_to(ref,\ limit,\ name=None)} }{tf.count\_up\_to(ref, limit, name=None) }}\label{tf.countux5fupux5ftoref-limit-namenone} Increments `ref' until it reaches `limit'. This operation outputs ``ref'' after the update is done. This makes it easier to chain operations that need to use the updated value. \subparagraph{Args: }\label{args-4} \begin{itemize} \tightlist \item \texttt{ref}: A mutable \texttt{Tensor}. Must be one of the following types: \texttt{int32}, \texttt{int64}. Should be from a scalar \texttt{Variable} node. \item \texttt{limit}: An \texttt{int}. If incrementing ref would bring it above limit, instead generates an `OutOfRange' error. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-4} A \texttt{Tensor}. Has the same type as \texttt{ref}. A copy of the input before increment. 
If nothing else modifies the input, the values produced will all be distinct. \subsection{Logical Operators }\label{logical-operators} TensorFlow provides several operations that you can use to add logical operators to your graph. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.logical\_and(x,\ y,\ name=None)} }{tf.logical\_and(x, y, name=None) }}\label{tf.logicalux5fandx-y-namenone} Returns the truth value of x AND y element-wise. \subparagraph{Args: }\label{args-5} \begin{itemize} \tightlist \item \texttt{x}: A \texttt{Tensor} of type \texttt{bool}. \item \texttt{y}: A \texttt{Tensor} of type \texttt{bool}. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-5} A \texttt{Tensor} of type \texttt{bool}. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.logical\_not(x,\ name=None)} }{tf.logical\_not(x, name=None) }}\label{tf.logicalux5fnotx-namenone} Returns the truth value of NOT x element-wise. \subparagraph{Args: }\label{args-6} \begin{itemize} \tightlist \item \texttt{x}: A \texttt{Tensor} of type \texttt{bool}. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-6} A \texttt{Tensor} of type \texttt{bool}. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.logical\_or(x,\ y,\ name=None)} }{tf.logical\_or(x, y, name=None) }}\label{tf.logicalux5forx-y-namenone} Returns the truth value of x OR y element-wise. \subparagraph{Args: }\label{args-7} \begin{itemize} \tightlist \item \texttt{x}: A \texttt{Tensor} of type \texttt{bool}. \item \texttt{y}: A \texttt{Tensor} of type \texttt{bool}. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-7} A \texttt{Tensor} of type \texttt{bool}. 
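The element-wise behaviour of the logical operators in this section can be mimicked on flat Python lists of booleans. The sketch below is pure Python and does not call TensorFlow; the function names simply mirror the operator names for readability, and the exclusive-or is written with the identity \texttt{(x | y) \& \textasciitilde{}(x \& y)}.

```python
def logical_and(x, y):
    """Element-wise AND of two equal-length boolean lists."""
    return [a and b for a, b in zip(x, y)]


def logical_or(x, y):
    """Element-wise OR of two equal-length boolean lists."""
    return [a or b for a, b in zip(x, y)]


def logical_not(x):
    """Element-wise NOT of a boolean list."""
    return [not a for a in x]


def logical_xor(x, y):
    """Element-wise XOR, built as (x OR y) AND NOT (x AND y)."""
    return logical_and(logical_or(x, y), logical_not(logical_and(x, y)))


x = [True, True, False, False]
y = [True, False, True, False]
and_result = logical_and(x, y)
or_result = logical_or(x, y)
not_result = logical_not(x)
xor_result = logical_xor(x, y)
```

The two input lists enumerate all four combinations of truth values, so the results double as truth tables for each operator.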
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.logical\_xor(x,\ y,\ name=\textquotesingle{}LogicalXor\textquotesingle{})} }{tf.logical\_xor(x, y, name='LogicalXor') }}\label{tf.logicalux5fxorx-y-namelogicalxor} x \^{} y = (x \textbar{} y) \& \textasciitilde{}(x \& y). \subsection{Comparison Operators }\label{comparison-operators} TensorFlow provides several operations that you can use to add comparison operators to your graph. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.equal(x,\ y,\ name=None)} }{tf.equal(x, y, name=None) }}\label{tf.equalx-y-namenone} Returns the truth value of (x == y) element-wise. \subparagraph{Args: }\label{args-8} \begin{itemize} \tightlist \item \texttt{x}: A \texttt{Tensor}. Must be one of the following types: \texttt{float32}, \texttt{float64}, \texttt{int32}, \texttt{int64}, \texttt{complex64}, \texttt{quint8}, \texttt{qint8}, \texttt{qint32}. \item \texttt{y}: A \texttt{Tensor}. Must have the same type as \texttt{x}. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-8} A \texttt{Tensor} of type \texttt{bool}. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.not\_equal(x,\ y,\ name=None)} }{tf.not\_equal(x, y, name=None) }}\label{tf.notux5fequalx-y-namenone} Returns the truth value of (x != y) element-wise. \subparagraph{Args: }\label{args-9} \begin{itemize} \tightlist \item \texttt{x}: A \texttt{Tensor}. Must be one of the following types: \texttt{float32}, \texttt{float64}, \texttt{int32}, \texttt{int64}, \texttt{complex64}, \texttt{quint8}, \texttt{qint8}, \texttt{qint32}. \item \texttt{y}: A \texttt{Tensor}. Must have the same type as \texttt{x}. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-9} A \texttt{Tensor} of type \texttt{bool}. 
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.less(x,\ y,\ name=None)} }{tf.less(x, y, name=None) }}\label{tf.lessx-y-namenone} Returns the truth value of (x \textless{} y) element-wise. \subparagraph{Args: }\label{args-10} \begin{itemize} \tightlist \item \texttt{x}: A \texttt{Tensor}. Must be one of the following types: \texttt{float32}, \texttt{float64}, \texttt{int32}, \texttt{int64}. \item \texttt{y}: A \texttt{Tensor}. Must have the same type as \texttt{x}. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-10} A \texttt{Tensor} of type \texttt{bool}. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.less\_equal(x,\ y,\ name=None)} }{tf.less\_equal(x, y, name=None) }}\label{tf.lessux5fequalx-y-namenone} Returns the truth value of (x \textless{}= y) element-wise. \subparagraph{Args: }\label{args-11} \begin{itemize} \tightlist \item \texttt{x}: A \texttt{Tensor}. Must be one of the following types: \texttt{float32}, \texttt{float64}, \texttt{int32}, \texttt{int64}. \item \texttt{y}: A \texttt{Tensor}. Must have the same type as \texttt{x}. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-11} A \texttt{Tensor} of type \texttt{bool}. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.greater(x,\ y,\ name=None)} }{tf.greater(x, y, name=None) }}\label{tf.greaterx-y-namenone} Returns the truth value of (x \textgreater{} y) element-wise. \subparagraph{Args: }\label{args-12} \begin{itemize} \tightlist \item \texttt{x}: A \texttt{Tensor}. Must be one of the following types: \texttt{float32}, \texttt{float64}, \texttt{int32}, \texttt{int64}. \item \texttt{y}: A \texttt{Tensor}. Must have the same type as \texttt{x}. \item \texttt{name}: A name for the operation (optional). 
\end{itemize}

\subparagraph{Returns: }\label{returns-12}

A \texttt{Tensor} of type \texttt{bool}.

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\subsubsection{\texorpdfstring{\texttt{tf.greater\_equal(x,\ y,\ name=None)} }{tf.greater\_equal(x, y, name=None) }}\label{tf.greaterux5fequalx-y-namenone}

Returns the truth value of (x \textgreater{}= y) element-wise.

\subparagraph{Args: }\label{args-13}

\begin{itemize}
\tightlist
\item \texttt{x}: A \texttt{Tensor}. Must be one of the following types: \texttt{float32}, \texttt{float64}, \texttt{int32}, \texttt{int64}.
\item \texttt{y}: A \texttt{Tensor}. Must have the same type as \texttt{x}.
\item \texttt{name}: A name for the operation (optional).
\end{itemize}

\subparagraph{Returns: }\label{returns-13}

A \texttt{Tensor} of type \texttt{bool}.

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\subsubsection{\texorpdfstring{\texttt{tf.select(condition,\ t,\ e,\ name=None)} }{tf.select(condition, t, e, name=None) }}\label{tf.selectcondition-t-e-namenone}

Selects elements from \texttt{t} or \texttt{e}, depending on \texttt{condition}.

The \texttt{condition}, \texttt{t}, and \texttt{e} tensors must all have the same shape, and the output will also have that shape. The \texttt{condition} tensor acts as an element-wise mask that chooses, based on the value at each element, whether the corresponding element in the output should be taken from \texttt{t} (if true) or \texttt{e} (if false).

For example:

\begin{verbatim}
# 'condition' tensor is [[True, False]
#                        [True, False]]
# 't' is [[1, 1],
#         [1, 1]]
# 'e' is [[2, 2],
#         [2, 2]]
select(condition, t, e) ==> [[1, 2],
                             [1, 2]]
\end{verbatim}

\subparagraph{Args: }\label{args-14}

\begin{itemize}
\tightlist
\item \texttt{condition}: A \texttt{Tensor} of type \texttt{bool}.
\item \texttt{t}: A \texttt{Tensor} with the same shape as \texttt{condition}.
\item \texttt{e}: A \texttt{Tensor} with the same type and shape as \texttt{t}.
\item \texttt{name}: A name for the operation (optional).
\end{itemize}

\subparagraph{Returns: }\label{returns-14}

A \texttt{Tensor} with the same type and shape as \texttt{t} and \texttt{e}.

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\subsubsection{\texorpdfstring{\texttt{tf.where(input,\ name=None)} }{tf.where(input, name=None) }}\label{tf.whereinput-namenone}

Returns locations of true values in a boolean tensor.

This operation returns the coordinates of true elements in \texttt{input}. The coordinates are returned in a 2-D tensor where the first dimension (rows) represents the number of true elements, and the second dimension (columns) represents the coordinates of the true elements. Keep in mind, the shape of the output tensor can vary depending on how many true values there are in \texttt{input}. Indices are output in row-major order.

For example:

\begin{verbatim}
# 'input' tensor is [[True, False]
#                    [True, False]]
# 'input' has two true values, so output has two coordinates.
# 'input' has rank of 2, so coordinates have two indices.
where(input) ==> [[0, 0],
                  [1, 0]]

# `input` tensor is [[[True, False]
#                     [True, False]]
#                    [[False, True]
#                     [False, True]]
#                    [[False, False]
#                     [False, True]]]
# 'input' has 5 true values, so output has 5 coordinates.
# 'input' has rank of 3, so coordinates have three indices.
where(input) ==> [[0, 0, 0],
                  [0, 1, 0],
                  [1, 0, 1],
                  [1, 1, 1],
                  [2, 1, 1]]
\end{verbatim}

\subparagraph{Args: }\label{args-15}

\begin{itemize}
\tightlist
\item \texttt{input}: A \texttt{Tensor} of type \texttt{bool}.
\item \texttt{name}: A name for the operation (optional).
\end{itemize}

\subparagraph{Returns: }\label{returns-15}

A \texttt{Tensor} of type \texttt{int64}.

\subsection{Debugging Operations }\label{debugging-operations}

TensorFlow provides several operations that you can use to validate values and debug your graph.
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.is\_finite(x,\ name=None)} }{tf.is\_finite(x, name=None) }}\label{tf.isux5ffinitex-namenone} Returns which elements of x are finite. \subparagraph{Args: }\label{args-16} \begin{itemize} \tightlist \item \texttt{x}: A \texttt{Tensor}. Must be one of the following types: \texttt{float32}, \texttt{float64}. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-16} A \texttt{Tensor} of type \texttt{bool}. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.is\_inf(x,\ name=None)} }{tf.is\_inf(x, name=None) }}\label{tf.isux5finfx-namenone} Returns which elements of x are Inf. \subparagraph{Args: }\label{args-17} \begin{itemize} \tightlist \item \texttt{x}: A \texttt{Tensor}. Must be one of the following types: \texttt{float32}, \texttt{float64}. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-17} A \texttt{Tensor} of type \texttt{bool}. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.is\_nan(x,\ name=None)} }{tf.is\_nan(x, name=None) }}\label{tf.isux5fnanx-namenone} Returns which elements of x are NaN. \subparagraph{Args: }\label{args-18} \begin{itemize} \tightlist \item \texttt{x}: A \texttt{Tensor}. Must be one of the following types: \texttt{float32}, \texttt{float64}. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-18} A \texttt{Tensor} of type \texttt{bool}. 
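The three predicates above partition floating-point values into finite numbers, infinities, and NaNs. A minimal pure-Python sketch of the same element-wise classification, using only the standard \texttt{math} module, is shown below; it does not call TensorFlow, and the function names are chosen here only to mirror the operation names.

```python
import math


def is_finite(values):
    """True where the element is neither infinite nor NaN."""
    return [math.isfinite(v) for v in values]


def is_inf(values):
    """True where the element is +inf or -inf."""
    return [math.isinf(v) for v in values]


def is_nan(values):
    """True where the element is NaN."""
    return [math.isnan(v) for v in values]


values = [1.0, float("inf"), float("-inf"), float("nan")]
finite_mask = is_finite(values)
inf_mask = is_inf(values)
nan_mask = is_nan(values)

# Every element falls into exactly one of the three classes.
exactly_one = [
    (f + i + n) == 1 for f, i, n in zip(finite_mask, inf_mask, nan_mask)
]
```

Because the classes are mutually exclusive, a value that is not finite is always either an infinity or a NaN, which is the property the numeric checks in this section rely on.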
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.verify\_tensor\_all\_finite(t,\ msg,\ name=None)} }{tf.verify\_tensor\_all\_finite(t, msg, name=None) }}\label{tf.verifyux5ftensorux5fallux5ffinitet-msg-namenone} Assert that the tensor does not contain any NaN's or Inf's. \subparagraph{Args: }\label{args-19} \begin{itemize} \tightlist \item \texttt{t}: Tensor to check. \item \texttt{msg}: Message to log on failure. \item \texttt{name}: A name for this operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-19} Same tensor as \texttt{t}. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.check\_numerics(tensor,\ message,\ name=None)} }{tf.check\_numerics(tensor, message, name=None) }}\label{tf.checkux5fnumericstensor-message-namenone} Checks a tensor for NaN and Inf values. When run, reports an \texttt{InvalidArgument} error if \texttt{tensor} has any values that are not a number (NaN) or infinity (Inf). Otherwise, passes \texttt{tensor} as-is. \subparagraph{Args: }\label{args-20} \begin{itemize} \tightlist \item \texttt{tensor}: A \texttt{Tensor}. Must be one of the following types: \texttt{float32}, \texttt{float64}. \item \texttt{message}: A \texttt{string}. Prefix of the error message. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-20} A \texttt{Tensor}. Has the same type as \texttt{tensor}. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.add\_check\_numerics\_ops()} }{tf.add\_check\_numerics\_ops() }}\label{tf.addux5fcheckux5fnumericsux5fops} Connect a check\_numerics to every floating point tensor. \texttt{check\_numerics} operations themselves are added for each \texttt{float} or \texttt{double} tensor in the graph. 
For all ops in the graph, the \texttt{check\_numerics} op for all of its (\texttt{float} or \texttt{double}) inputs is guaranteed to run before the \texttt{check\_numerics} op on any of its outputs. \subparagraph{Returns: }\label{returns-21} A \texttt{group} op depending on all \texttt{check\_numerics} ops added. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.Assert(condition,\ data,\ summarize=None,\ name=None)} }{tf.Assert(condition, data, summarize=None, name=None) }}\label{tf.assertcondition-data-summarizenone-namenone} Asserts that the given condition is true. If \texttt{condition} evaluates to false, print the list of tensors in \texttt{data}. \texttt{summarize} determines how many entries of the tensors to print. \subparagraph{Args: }\label{args-21} \begin{itemize} \tightlist \item \texttt{condition}: The condition to evaluate. \item \texttt{data}: The tensors to print out when condition is false. \item \texttt{summarize}: Print this many entries of each tensor. \item \texttt{name}: A name for this operation (optional). \end{itemize} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.Print(input\_,\ data,\ message=None,\ first\_n=None,\ summarize=None,\ name=None)} }{tf.Print(input\_, data, message=None, first\_n=None, summarize=None, name=None) }}\label{tf.printinputux5f-data-messagenone-firstux5fnnone-summarizenone-namenone} Prints a list of tensors. This is an identity op with the side effect of printing \texttt{data} when evaluating. \subparagraph{Args: }\label{args-22} \begin{itemize} \tightlist \item \texttt{input\_}: A tensor passed through this op. \item \texttt{data}: A list of tensors to print out when op is evaluated. \item \texttt{message}: A string, prefix of the error message. \item \texttt{first\_n}: Only log \texttt{first\_n} number of times. Negative numbers log always; this is the default. 
\item \texttt{summarize}: Only print this many entries of each tensor. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-22} Same tensor as \texttt{input\_}. ================================================ FILE: tex_pdf/api/python/c4s07_image.tex ================================================ \section{Images }\label{images} Note: Functions taking \texttt{Tensor} arguments can also take anything accepted by \href{../../api_docs/python/framework.md\#convert_to_tensor}{\texttt{tf.convert\_to\_tensor}}. \subsection{Contents}\label{contents} \subsubsection{\texorpdfstring{\protect\hyperlink{AUTOGENERATED-images}{Images}}{Images}}\label{images-1} \begin{itemize} \tightlist \item \protect\hyperlink{AUTOGENERATED-encoding-and-decoding}{Encoding and Decoding} \item \protect\hyperlink{decodeux5fjpeg}{\texttt{tf.image.decode\_jpeg(contents,\ channels=None,\ ratio=None,\ fancy\_upscaling=None,\ try\_recover\_truncated=None,\ acceptable\_fraction=None,\ name=None)}} \item \protect\hyperlink{encodeux5fjpeg}{\texttt{tf.image.encode\_jpeg(image,\ format=None,\ quality=None,\ progressive=None,\ optimize\_size=None,\ chroma\_downsampling=None,\ density\_unit=None,\ x\_density=None,\ y\_density=None,\ xmp\_metadata=None,\ name=None)}} \item \protect\hyperlink{decodeux5fpng}{\texttt{tf.image.decode\_png(contents,\ channels=None,\ name=None)}} \item \protect\hyperlink{encodeux5fpng}{\texttt{tf.image.encode\_png(image,\ compression=None,\ name=None)}} \item \protect\hyperlink{AUTOGENERATED-resizing}{Resizing} \item \protect\hyperlink{resizeux5fimages}{\texttt{tf.image.resize\_images(images,\ new\_height,\ new\_width,\ method=0)}} \item \protect\hyperlink{resizeux5farea}{\texttt{tf.image.resize\_area(images,\ size,\ name=None)}} \item \protect\hyperlink{resizeux5fbicubic}{\texttt{tf.image.resize\_bicubic(images,\ size,\ name=None)}} \item \protect\hyperlink{resizeux5fbilinear}{\texttt{tf.image.resize\_bilinear(images,\ 
size,\ name=None)}} \item \protect\hyperlink{resizeux5fnearestux5fneighbor}{\texttt{tf.image.resize\_nearest\_neighbor(images,\ size,\ name=None)}} \item \protect\hyperlink{AUTOGENERATED-cropping}{Cropping} \item \protect\hyperlink{resizeux5fimageux5fwithux5fcropux5forux5fpad}{\texttt{tf.image.resize\_image\_with\_crop\_or\_pad(image,\ target\_height,\ target\_width)}} \item \protect\hyperlink{padux5ftoux5fboundingux5fbox}{\texttt{tf.image.pad\_to\_bounding\_box(image,\ offset\_height,\ offset\_width,\ target\_height,\ target\_width)}} \item \protect\hyperlink{cropux5ftoux5fboundingux5fbox}{\texttt{tf.image.crop\_to\_bounding\_box(image,\ offset\_height,\ offset\_width,\ target\_height,\ target\_width)}} \item \protect\hyperlink{randomux5fcrop}{\texttt{tf.image.random\_crop(image,\ size,\ seed=None,\ name=None)}} \item \protect\hyperlink{extractux5fglimpse}{\texttt{tf.image.extract\_glimpse(input,\ size,\ offsets,\ centered=None,\ normalized=None,\ uniform\_noise=None,\ name=None)}} \item \protect\hyperlink{AUTOGENERATED-flipping-and-transposing}{Flipping and Transposing} \item \protect\hyperlink{flipux5fupux5fdown}{\texttt{tf.image.flip\_up\_down(image)}} \item \protect\hyperlink{randomux5fflipux5fupux5fdown}{\texttt{tf.image.random\_flip\_up\_down(image,\ seed=None)}} \item \protect\hyperlink{flipux5fleftux5fright}{\texttt{tf.image.flip\_left\_right(image)}} \item \protect\hyperlink{randomux5fflipux5fleftux5fright}{\texttt{tf.image.random\_flip\_left\_right(image,\ seed=None)}} \item \protect\hyperlink{transposeux5fimage}{\texttt{tf.image.transpose\_image(image)}} \item \protect\hyperlink{AUTOGENERATED-image-adjustments}{Image Adjustments} \item \protect\hyperlink{adjustux5fbrightness}{\texttt{tf.image.adjust\_brightness(image,\ delta,\ min\_value=None,\ max\_value=None)}} \item \protect\hyperlink{randomux5fbrightness}{\texttt{tf.image.random\_brightness(image,\ max\_delta,\ seed=None)}} \item 
\protect\hyperlink{adjustux5fcontrast}{\texttt{tf.image.adjust\_contrast(images,\ contrast\_factor,\ min\_value=None,\ max\_value=None)}} \item \protect\hyperlink{randomux5fcontrast}{\texttt{tf.image.random\_contrast(image,\ lower,\ upper,\ seed=None)}} \item \protect\hyperlink{perux5fimageux5fwhitening}{\texttt{tf.image.per\_image\_whitening(image)}} \end{itemize} \subsection{Encoding and Decoding }\label{encoding-and-decoding} TensorFlow provides Ops to decode and encode JPEG and PNG formats. Encoded images are represented by scalar string Tensors, decoded images by 3-D uint8 tensors of shape \texttt{{[}height,\ width,\ channels{]}}. The encode and decode Ops apply to one image at a time. Their input and output are all of variable size. If you need fixed size images, pass the output of the decode Ops to one of the cropping and resizing Ops. Note: The PNG encode and decode Ops support RGBA, but the conversions Ops presently only support RGB, HSV, and GrayScale. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.image.decode\_jpeg(contents,\ channels=None,\ ratio=None,\ fancy\_upscaling=None,\ try\_recover\_truncated=None,\ acceptable\_fraction=None,\ name=None)} }{tf.image.decode\_jpeg(contents, channels=None, ratio=None, fancy\_upscaling=None, try\_recover\_truncated=None, acceptable\_fraction=None, name=None) }}\label{tf.image.decodeux5fjpegcontents-channelsnone-rationone-fancyux5fupscalingnone-tryux5frecoverux5ftruncatednone-acceptableux5ffractionnone-namenone} Decode a JPEG-encoded image to a uint8 tensor. The attr \texttt{channels} indicates the desired number of color channels for the decoded image. Accepted values are: \begin{itemize} \tightlist \item 0: Use the number of channels in the JPEG-encoded image. \item 1: output a grayscale image. \item 3: output an RGB image. \end{itemize} If needed, the JPEG-encoded image is transformed to match the requested number of color channels. 
The attr \texttt{ratio} allows downscaling the image by an integer factor during decoding. Allowed values are: 1, 2, 4, and 8. This is much faster than downscaling the image later. \subparagraph{Args: }\label{args} \begin{itemize} \tightlist \item \texttt{contents}: A \texttt{Tensor} of type \texttt{string}. 0-D. The JPEG-encoded image. \item \texttt{channels}: An optional \texttt{int}. Defaults to \texttt{0}. Number of color channels for the decoded image. \item \texttt{ratio}: An optional \texttt{int}. Defaults to \texttt{1}. Downscaling ratio. \item \texttt{fancy\_upscaling}: An optional \texttt{bool}. Defaults to \texttt{True}. If true, use a slower but nicer upscaling of the chroma planes (yuv420/422 only). \item \texttt{try\_recover\_truncated}: An optional \texttt{bool}. Defaults to \texttt{False}. If true, try to recover an image from truncated input. \item \texttt{acceptable\_fraction}: An optional \texttt{float}. Defaults to \texttt{1}. The minimum required fraction of lines before a truncated input is accepted. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns} A \texttt{Tensor} of type \texttt{uint8}. 3-D with shape \texttt{{[}height,\ width,\ channels{]}}.
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.image.encode\_jpeg(image,\ format=None,\ quality=None,\ progressive=None,\ optimize\_size=None,\ chroma\_downsampling=None,\ density\_unit=None,\ x\_density=None,\ y\_density=None,\ xmp\_metadata=None,\ name=None)} }{tf.image.encode\_jpeg(image, format=None, quality=None, progressive=None, optimize\_size=None, chroma\_downsampling=None, density\_unit=None, x\_density=None, y\_density=None, xmp\_metadata=None, name=None) }}\label{tf.image.encodeux5fjpegimage-formatnone-qualitynone-progressivenone-optimizeux5fsizenone-chromaux5fdownsamplingnone-densityux5funitnone-xux5fdensitynone-yux5fdensitynone-xmpux5fmetadatanone-namenone} JPEG-encode an image. \texttt{image} is a 3-D uint8 Tensor of shape \texttt{{[}height,\ width,\ channels{]}}. The attr \texttt{format} can be used to override the color format of the encoded output. Values can be: \begin{itemize} \tightlist \item \texttt{\textquotesingle{}\textquotesingle{}}: Use a default format based on the number of channels in the image. \item \texttt{grayscale}: Output a grayscale JPEG image. The \texttt{channels} dimension of \texttt{image} must be 1. \item \texttt{rgb}: Output an RGB JPEG image. The \texttt{channels} dimension of \texttt{image} must be 3. \end{itemize} If \texttt{format} is not specified or is the empty string, a default format is picked based on the number of channels in \texttt{image}: \begin{itemize} \tightlist \item 1: Output a grayscale image. \item 3: Output an RGB image. \end{itemize} \subparagraph{Args: }\label{args-1} \begin{itemize} \tightlist \item \texttt{image}: A \texttt{Tensor} of type \texttt{uint8}. 3-D with shape \texttt{{[}height,\ width,\ channels{]}}. \item \texttt{format}: An optional \texttt{string} from: \texttt{"",\ "grayscale",\ "rgb"}. Defaults to \texttt{""}. Per pixel image format. \item \texttt{quality}: An optional \texttt{int}. Defaults to \texttt{95}.
Quality of the compression from 0 to 100 (higher is better and slower). \item \texttt{progressive}: An optional \texttt{bool}. Defaults to \texttt{False}. If True, create a JPEG that loads progressively (coarse to fine). \item \texttt{optimize\_size}: An optional \texttt{bool}. Defaults to \texttt{False}. If True, spend CPU/RAM to reduce size with no quality change. \item \texttt{chroma\_downsampling}: An optional \texttt{bool}. Defaults to \texttt{True}. See http://en.wikipedia.org/wiki/Chroma\_subsampling. \item \texttt{density\_unit}: An optional \texttt{string} from: \texttt{"in",\ "cm"}. Defaults to \texttt{"in"}. Unit used to specify \texttt{x\_density} and \texttt{y\_density}: pixels per inch (\texttt{\textquotesingle{}in\textquotesingle{}}) or centimeter (\texttt{\textquotesingle{}cm\textquotesingle{}}). \item \texttt{x\_density}: An optional \texttt{int}. Defaults to \texttt{300}. Horizontal pixels per density unit. \item \texttt{y\_density}: An optional \texttt{int}. Defaults to \texttt{300}. Vertical pixels per density unit. \item \texttt{xmp\_metadata}: An optional \texttt{string}. Defaults to \texttt{""}. If not empty, embed this XMP metadata in the image header. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-1} A \texttt{Tensor} of type \texttt{string}. 0-D. JPEG-encoded image. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.image.decode\_png(contents,\ channels=None,\ name=None)} }{tf.image.decode\_png(contents, channels=None, name=None) }}\label{tf.image.decodeux5fpngcontents-channelsnone-namenone} Decode a PNG-encoded image to a uint8 tensor. The attr \texttt{channels} indicates the desired number of color channels for the decoded image. Accepted values are: \begin{itemize} \tightlist \item 0: Use the number of channels in the PNG-encoded image. \item 1: output a grayscale image. \item 3: output an RGB image. 
\item 4: output an RGBA image. \end{itemize} If needed, the PNG-encoded image is transformed to match the requested number of color channels. \subparagraph{Args: }\label{args-2} \begin{itemize} \tightlist \item \texttt{contents}: A \texttt{Tensor} of type \texttt{string}. 0-D. The PNG-encoded image. \item \texttt{channels}: An optional \texttt{int}. Defaults to \texttt{0}. Number of color channels for the decoded image. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-2} A \texttt{Tensor} of type \texttt{uint8}. 3-D with shape \texttt{{[}height,\ width,\ channels{]}}. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.image.encode\_png(image,\ compression=None,\ name=None)} }{tf.image.encode\_png(image, compression=None, name=None) }}\label{tf.image.encodeux5fpngimage-compressionnone-namenone} PNG-encode an image. \texttt{image} is a 3-D uint8 Tensor of shape \texttt{{[}height,\ width,\ channels{]}} where \texttt{channels} is: \begin{itemize} \tightlist \item 1: for grayscale. \item 3: for RGB. \item 4: for RGBA. \end{itemize} The ZLIB compression level, \texttt{compression}, can be -1 for the PNG-encoder default or a value from 0 to 9. 9 is the highest compression level, generating the smallest output, but is slower. \subparagraph{Args: }\label{args-3} \begin{itemize} \tightlist \item \texttt{image}: A \texttt{Tensor} of type \texttt{uint8}. 3-D with shape \texttt{{[}height,\ width,\ channels{]}}. \item \texttt{compression}: An optional \texttt{int}. Defaults to \texttt{-1}. Compression level. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-3} A \texttt{Tensor} of type \texttt{string}. 0-D. PNG-encoded image. \subsection{Resizing }\label{resizing} The resizing Ops accept input images as tensors of several types. They always output resized images as float32 tensors. 
The convenience function \protect\hyperlink{resizeux5fimages}{resize\_images()} supports both 4-D and 3-D tensors as input and output. 4-D tensors are for batches of images, 3-D tensors for individual images. Other resizing Ops only support 4-D batches of images as input: \protect\hyperlink{resizeux5farea}{resize\_area}, \protect\hyperlink{resizeux5fbicubic}{resize\_bicubic}, \protect\hyperlink{resizeux5fbilinear}{resize\_bilinear}, \protect\hyperlink{resizeux5fnearestux5fneighbor}{resize\_nearest\_neighbor}. Example: \begin{Shaded} \begin{Highlighting}[] \CommentTok{# Decode a JPEG image and resize it to 299 by 299.} \CommentTok{# resize_images() accepts the 3-D image returned by decode_jpeg().} \NormalTok{image }\OperatorTok{=} \NormalTok{tf.image.decode_jpeg(...)} \NormalTok{resized_image }\OperatorTok{=} \NormalTok{tf.image.resize_images(image, }\DecValTok{299}\NormalTok{, }\DecValTok{299}\NormalTok{)} \end{Highlighting} \end{Shaded} See the Queue examples for how to add images to a Queue after resizing them to a fixed size, and how to dequeue batches of resized images from the Queue. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.image.resize\_images(images,\ new\_height,\ new\_width,\ method=0)} }{tf.image.resize\_images(images, new\_height, new\_width, method=0) }}\label{tf.image.resizeux5fimagesimages-newux5fheight-newux5fwidth-method0} Resize \texttt{images} to \texttt{new\_width}, \texttt{new\_height} using the specified \texttt{method}. Resized images will be distorted if their original aspect ratio is not the same as \texttt{new\_width}, \texttt{new\_height}. To avoid distortions see \protect\hyperlink{resizeux5fimageux5fwithux5fcropux5forux5fpad}{resize\_image\_with\_crop\_or\_pad}.
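Whether a resize will distort can be checked ahead of time by comparing aspect ratios. A minimal plain-Python sketch follows; the helper \texttt{preserves\_aspect\_ratio} is a hypothetical name used for illustration and is not part of TensorFlow.

\begin{verbatim}
```python
def preserves_aspect_ratio(height, width, new_height, new_width):
    """Return True if (new_height, new_width) keeps the original aspect ratio.

    Hypothetical helper. Cross-multiplying integer sizes avoids the rounding
    error a floating-point division would introduce.
    """
    return height * new_width == width * new_height


# A 480x640 image halved in both dimensions keeps its shape.
halved_ok = preserves_aspect_ratio(480, 640, 240, 320)
# Forcing the same image into a 299x299 square distorts it.
square_ok = preserves_aspect_ratio(480, 640, 299, 299)
```
\end{verbatim}

When the check fails, cropping or padding to the target aspect ratio first is one way to avoid the distortion.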
\texttt{method} can be one of: \begin{itemize} \tightlist \item ResizeMethod.BILINEAR: \href{https://en.wikipedia.org/wiki/Bilinear_interpolation}{Bilinear interpolation.} \item ResizeMethod.NEAREST\_NEIGHBOR: \href{https://en.wikipedia.org/wiki/Nearest-neighbor_interpolation}{Nearest neighbor interpolation.} \item ResizeMethod.BICUBIC: \href{https://en.wikipedia.org/wiki/Bicubic_interpolation}{Bicubic interpolation.} \item ResizeMethod.AREA: Area interpolation. \end{itemize} \subparagraph{Args: }\label{args-4} \begin{itemize} \tightlist \item \texttt{images}: 4-D Tensor of shape \texttt{{[}batch,\ height,\ width,\ channels{]}} or 3-D Tensor of shape \texttt{{[}height,\ width,\ channels{]}}. \item \texttt{new\_height}: integer. \item \texttt{new\_width}: integer. \item \texttt{method}: ResizeMethod. Defaults to \texttt{ResizeMethod.BILINEAR}. \end{itemize} \subparagraph{Raises: }\label{raises} \begin{itemize} \tightlist \item \texttt{ValueError}: if the shape of \texttt{images} is incompatible with the shape arguments to this function. \item \texttt{ValueError}: if an unsupported resize method is specified. \end{itemize} \subparagraph{Returns: }\label{returns-4} If \texttt{images} was 4-D, a 4-D float Tensor of shape \texttt{{[}batch,\ new\_height,\ new\_width,\ channels{]}}. If \texttt{images} was 3-D, a 3-D float Tensor of shape \texttt{{[}new\_height,\ new\_width,\ channels{]}}. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.image.resize\_area(images,\ size,\ name=None)} }{tf.image.resize\_area(images, size, name=None) }}\label{tf.image.resizeux5fareaimages-size-namenone} Resize \texttt{images} to \texttt{size} using area interpolation. Input images can be of different types but output images are always float. \subparagraph{Args: }\label{args-5} \begin{itemize} \tightlist \item \texttt{images}: A \texttt{Tensor}.
Must be one of the following types: \texttt{uint8}, \texttt{int8}, \texttt{int32}, \texttt{float32}, \texttt{float64}. 4-D with shape \texttt{{[}batch,\ height,\ width,\ channels{]}}. \item \texttt{size}: A 1-D int32 Tensor of 2 elements: \texttt{new\_height,\ new\_width}. The new size for the images. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-5} A \texttt{Tensor} of type \texttt{float32}. 4-D with shape \texttt{{[}batch,\ new\_height,\ new\_width,\ channels{]}}. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.image.resize\_bicubic(images,\ size,\ name=None)} }{tf.image.resize\_bicubic(images, size, name=None) }}\label{tf.image.resizeux5fbicubicimages-size-namenone} Resize \texttt{images} to \texttt{size} using bicubic interpolation. Input images can be of different types but output images are always float. \subparagraph{Args: }\label{args-6} \begin{itemize} \tightlist \item \texttt{images}: A \texttt{Tensor}. Must be one of the following types: \texttt{uint8}, \texttt{int8}, \texttt{int32}, \texttt{float32}, \texttt{float64}. 4-D with shape \texttt{{[}batch,\ height,\ width,\ channels{]}}. \item \texttt{size}: A 1-D int32 Tensor of 2 elements: \texttt{new\_height,\ new\_width}. The new size for the images. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-6} A \texttt{Tensor} of type \texttt{float32}. 4-D with shape \texttt{{[}batch,\ new\_height,\ new\_width,\ channels{]}}. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.image.resize\_bilinear(images,\ size,\ name=None)} }{tf.image.resize\_bilinear(images, size, name=None) }}\label{tf.image.resizeux5fbilinearimages-size-namenone} Resize \texttt{images} to \texttt{size} using bilinear interpolation. Input images can be of different types but output images are always float. 
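To make the idea of bilinear interpolation concrete, here is a plain-Python sketch on a single-channel image stored as a list of rows. It is a generic textbook version, not the TensorFlow kernel: the function names are hypothetical, and the mapping from output to input coordinates (output index times \texttt{in\_size / out\_size}) is an assumption rather than something stated in this document.

\begin{verbatim}
```python
import math


def bilinear_sample(image, y, x):
    """Sample a 2-D grid (list of rows) at fractional coordinates (y, x)."""
    height, width = len(image), len(image[0])
    y0, x0 = math.floor(y), math.floor(x)
    # Clamp the lower/right neighbours so samples on the border stay in range.
    y1, x1 = min(y0 + 1, height - 1), min(x0 + 1, width - 1)
    dy, dx = y - y0, x - x0
    top = image[y0][x0] * (1 - dx) + image[y0][x1] * dx
    bottom = image[y1][x0] * (1 - dx) + image[y1][x1] * dx
    return top * (1 - dy) + bottom * dy


def resize_bilinear_2d(image, new_height, new_width):
    """Resize a 2-D grid; the coordinate mapping used here is an assumption."""
    y_scale = len(image) / new_height
    x_scale = len(image[0]) / new_width
    return [
        [bilinear_sample(image, r * y_scale, c * x_scale)
         for c in range(new_width)]
        for r in range(new_height)
    ]


resized = resize_bilinear_2d([[0, 10], [20, 30]], 4, 4)
```
\end{verbatim}

Note that the result holds floats even though the input holds integers, which matches the statement above that output images are always float.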
\subparagraph{Args: }\label{args-7} \begin{itemize} \tightlist \item \texttt{images}: A \texttt{Tensor}. Must be one of the following types: \texttt{uint8}, \texttt{int8}, \texttt{int32}, \texttt{float32}, \texttt{float64}. 4-D with shape \texttt{{[}batch,\ height,\ width,\ channels{]}}. \item \texttt{size}: A 1-D int32 Tensor of 2 elements: \texttt{new\_height,\ new\_width}. The new size for the images. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-7} A \texttt{Tensor} of type \texttt{float32}. 4-D with shape \texttt{{[}batch,\ new\_height,\ new\_width,\ channels{]}}. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.image.resize\_nearest\_neighbor(images,\ size,\ name=None)} }{tf.image.resize\_nearest\_neighbor(images, size, name=None) }}\label{tf.image.resizeux5fnearestux5fneighborimages-size-namenone} Resize \texttt{images} to \texttt{size} using nearest neighbor interpolation. Input images can be of different types; unlike the other resizing Ops, the output has the same type as the input. \subparagraph{Args: }\label{args-8} \begin{itemize} \tightlist \item \texttt{images}: A \texttt{Tensor}. Must be one of the following types: \texttt{uint8}, \texttt{int8}, \texttt{int32}, \texttt{float32}, \texttt{float64}. 4-D with shape \texttt{{[}batch,\ height,\ width,\ channels{]}}. \item \texttt{size}: A 1-D int32 Tensor of 2 elements: \texttt{new\_height,\ new\_width}. The new size for the images. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-8} A \texttt{Tensor}. Has the same type as \texttt{images}. 4-D with shape \texttt{{[}batch,\ new\_height,\ new\_width,\ channels{]}}.
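Nearest neighbor resizing only copies existing pixels, which is why the output can keep the input type. The plain-Python sketch below shows this on a single-channel image stored as a list of rows. The function name is hypothetical, and the index mapping (output index scaled by \texttt{in\_size / out\_size}, rounded down) is an assumption, not a description of the TensorFlow kernel.

\begin{verbatim}
```python
def resize_nearest_neighbor_2d(image, new_height, new_width):
    """Resize a 2-D grid by copying the nearest source pixel (assumed mapping)."""
    height, width = len(image), len(image[0])
    return [
        [
            # Integer arithmetic only: no new pixel values are ever created.
            image[min(r * height // new_height, height - 1)]
                 [min(c * width // new_width, width - 1)]
            for c in range(new_width)
        ]
        for r in range(new_height)
    ]


upscaled = resize_nearest_neighbor_2d([[1, 2], [3, 4]], 4, 4)
downscaled = resize_nearest_neighbor_2d([[1, 2, 3, 4]], 1, 2)
```
\end{verbatim}

Every value in the result already occurs in the input, so integer images stay integer.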
\subsection{Cropping }\label{cropping} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.image.resize\_image\_with\_crop\_or\_pad(image,\ target\_height,\ target\_width)} }{tf.image.resize\_image\_with\_crop\_or\_pad(image, target\_height, target\_width) }}\label{tf.image.resizeux5fimageux5fwithux5fcropux5forux5fpadimage-targetux5fheight-targetux5fwidth} Crops and/or pads an image to a target width and height. Resizes an image to a target width and height by either centrally cropping the image or padding it evenly with zeros. If \texttt{width} or \texttt{height} is greater than the specified \texttt{target\_width} or \texttt{target\_height} respectively, this op centrally crops along that dimension. If \texttt{width} or \texttt{height} is smaller than the specified \texttt{target\_width} or \texttt{target\_height} respectively, this op centrally pads with 0 along that dimension. \subparagraph{Args: }\label{args-9} \begin{itemize} \tightlist \item \texttt{image}: 3-D tensor of shape {[}height, width, channels{]} \item \texttt{target\_height}: Target height. \item \texttt{target\_width}: Target width. \end{itemize} \subparagraph{Raises: }\label{raises-1} \begin{itemize} \tightlist \item \texttt{ValueError}: if \texttt{target\_height} or \texttt{target\_width} are zero or negative. 
\end{itemize} \subparagraph{Returns: }\label{returns-9} Cropped and/or padded image of shape \texttt{{[}target\_height,\ target\_width,\ channels{]}} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.image.pad\_to\_bounding\_box(image,\ offset\_height,\ offset\_width,\ target\_height,\ target\_width)} }{tf.image.pad\_to\_bounding\_box(image, offset\_height, offset\_width, target\_height, target\_width) }}\label{tf.image.padux5ftoux5fboundingux5fboximage-offsetux5fheight-offsetux5fwidth-targetux5fheight-targetux5fwidth} Pad \texttt{image} with zeros to the specified \texttt{height} and \texttt{width}. Adds \texttt{offset\_height} rows of zeros on top, \texttt{offset\_width} columns of zeros on the left, and then pads the image on the bottom and right with zeros until it has dimensions \texttt{target\_height}, \texttt{target\_width}. This op does nothing if \texttt{offset\_*} is zero and the image already has size \texttt{target\_height} by \texttt{target\_width}. \subparagraph{Args: }\label{args-10} \begin{itemize} \tightlist \item \texttt{image}: 3-D tensor with shape \texttt{{[}height,\ width,\ channels{]}} \item \texttt{offset\_height}: Number of rows of zeros to add on top. \item \texttt{offset\_width}: Number of columns of zeros to add on the left. \item \texttt{target\_height}: Height of output image. \item \texttt{target\_width}: Width of output image. 
\end{itemize} \subparagraph{Returns: }\label{returns-10} 3-D tensor of shape \texttt{{[}target\_height,\ target\_width,\ channels{]}} \subparagraph{Raises: }\label{raises-2} \begin{itemize} \tightlist \item \texttt{ValueError}: If the shape of \texttt{image} is incompatible with the \texttt{offset\_*} or \texttt{target\_*} arguments \end{itemize} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.image.crop\_to\_bounding\_box(image,\ offset\_height,\ offset\_width,\ target\_height,\ target\_width)} }{tf.image.crop\_to\_bounding\_box(image, offset\_height, offset\_width, target\_height, target\_width) }}\label{tf.image.cropux5ftoux5fboundingux5fboximage-offsetux5fheight-offsetux5fwidth-targetux5fheight-targetux5fwidth} Crops an image to a specified bounding box. This op cuts a rectangular part out of \texttt{image}. The top-left corner of the returned image is at \texttt{offset\_height,\ offset\_width} in \texttt{image}, and its lower-right corner is at \texttt{offset\_height\ +\ target\_height,\ offset\_width\ +\ target\_width}. \subparagraph{Args: }\label{args-11} \begin{itemize} \tightlist \item \texttt{image}: 3-D tensor with shape \texttt{{[}height,\ width,\ channels{]}} \item \texttt{offset\_height}: Vertical coordinate of the top-left corner of the result in the input. \item \texttt{offset\_width}: Horizontal coordinate of the top-left corner of the result in the input. \item \texttt{target\_height}: Height of the result. \item \texttt{target\_width}: Width of the result.
\end{itemize} \subparagraph{Returns: }\label{returns-11} 3-D tensor of image with shape \texttt{{[}target\_height,\ target\_width,\ channels{]}} \subparagraph{Raises: }\label{raises-3} \begin{itemize} \tightlist \item \texttt{ValueError}: If the shape of \texttt{image} is incompatible with the \texttt{offset\_*} or \texttt{target\_*} arguments \end{itemize} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.image.random\_crop(image,\ size,\ seed=None,\ name=None)} }{tf.image.random\_crop(image, size, seed=None, name=None) }}\label{tf.image.randomux5fcropimage-size-seednone-namenone} Randomly crops \texttt{image} to size \texttt{{[}target\_height,\ target\_width{]}}. The offset of the output within \texttt{image} is uniformly random. \texttt{image} always fully contains the result. \subparagraph{Args: }\label{args-12} \begin{itemize} \tightlist \item \texttt{image}: 3-D tensor of shape \texttt{{[}height,\ width,\ channels{]}} \item \texttt{size}: 1-D tensor with two elements, specifying target \texttt{{[}height,\ width{]}} \item \texttt{seed}: A Python integer. Used to create a random seed. See \href{../../api_docs/python/constant_op.md\#set_random_seed}{\texttt{set\_random\_seed}} for behavior. \item \texttt{name}: A name for this operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-12} A cropped 3-D tensor of shape \texttt{{[}target\_height,\ target\_width,\ channels{]}}. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.image.extract\_glimpse(input,\ size,\ offsets,\ centered=None,\ normalized=None,\ uniform\_noise=None,\ name=None)} }{tf.image.extract\_glimpse(input, size, offsets, centered=None, normalized=None, uniform\_noise=None, name=None) }}\label{tf.image.extractux5fglimpseinput-size-offsets-centerednone-normalizednone-uniformux5fnoisenone-namenone} Extracts a glimpse from the input tensor. 
Returns a set of windows called glimpses extracted at location \texttt{offsets} from the input tensor. If a window only partially overlaps the input, the non-overlapping areas are filled with random noise. The result is a 4-D tensor of shape \texttt{{[}batch\_size,\ glimpse\_height,\ glimpse\_width,\ channels{]}}. The channels and batch dimensions are the same as those of the input tensor. The height and width of the output windows are specified in the \texttt{size} parameter. The arguments \texttt{normalized} and \texttt{centered} control how the windows are built: \begin{itemize} \tightlist \item If the coordinates are normalized but not centered, 0.0 and 1.0 correspond to the minimum and maximum of each height and width dimension. \item If the coordinates are both normalized and centered, they range from -1.0 to 1.0. The coordinates (-1.0, -1.0) correspond to the upper left corner, the lower right corner is located at (1.0, 1.0) and the center is at (0, 0). \item If the coordinates are not normalized, they are interpreted as numbers of pixels. \end{itemize} \subparagraph{Args: }\label{args-13} \begin{itemize} \tightlist \item \texttt{input}: A \texttt{Tensor} of type \texttt{float32}. A 4-D float tensor of shape \texttt{{[}batch\_size,\ height,\ width,\ channels{]}}. \item \texttt{size}: A \texttt{Tensor} of type \texttt{int32}. A 1-D tensor of 2 elements containing the size of the glimpses to extract. The glimpse height must be specified first, followed by the glimpse width. \item \texttt{offsets}: A \texttt{Tensor} of type \texttt{float32}. A 2-D tensor of shape \texttt{{[}batch\_size,\ 2{]}} containing the x, y locations of the center of each window. \item \texttt{centered}: An optional \texttt{bool}. Defaults to \texttt{True}. Indicates whether the offset coordinates are centered relative to the image, in which case the (0, 0) offset is relative to the center of the input images. If false, the (0, 0) offset corresponds to the upper left corner of the input images.
\item \texttt{normalized}: An optional \texttt{bool}. Defaults to \texttt{True}. Indicates whether the offset coordinates are normalized.
\item \texttt{uniform\_noise}: An optional \texttt{bool}. Defaults to \texttt{True}. Indicates whether the noise should be generated using a uniform distribution rather than a Gaussian distribution.
\item \texttt{name}: A name for the operation (optional). \end{itemize}
\subparagraph{Returns: }\label{returns-13} A \texttt{Tensor} of type \texttt{float32}. A tensor representing the glimpses \texttt{{[}batch\_size,\ glimpse\_height,\ glimpse\_width,\ channels{]}}.
\subsection{Flipping and Transposing }\label{flipping-and-transposing} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.image.flip\_up\_down(image)} }{tf.image.flip\_up\_down(image) }}\label{tf.image.flipux5fupux5fdownimage} Flips an image vertically (upside down). Outputs the contents of \texttt{image} flipped along the first dimension, which is \texttt{height}. See also \texttt{reverse()}.
\subparagraph{Args: }\label{args-14} \begin{itemize} \tightlist \item \texttt{image}: A 3-D tensor of shape \texttt{{[}height,\ width,\ channels{]}}. \end{itemize}
\subparagraph{Returns: }\label{returns-14} A 3-D tensor of the same type and shape as \texttt{image}.
\subparagraph{Raises: }\label{raises-4} \begin{itemize} \tightlist \item \texttt{ValueError}: If the shape of \texttt{image} is not supported. \end{itemize}
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.image.random\_flip\_up\_down(image,\ seed=None)} }{tf.image.random\_flip\_up\_down(image, seed=None) }}\label{tf.image.randomux5fflipux5fupux5fdownimage-seednone} Randomly flips an image vertically (upside down). With a 1 in 2 chance, outputs the contents of \texttt{image} flipped along the first dimension, which is \texttt{height}. Otherwise, outputs the image as-is.
\subparagraph{Args: }\label{args-15} \begin{itemize} \tightlist \item \texttt{image}: A 3-D tensor of shape \texttt{{[}height,\ width,\ channels{]}}. \item \texttt{seed}: A Python integer. Used to create a random seed. See \href{../../api_docs/python/constant_op.md\#set_random_seed}{\texttt{set\_random\_seed}} for behavior. \end{itemize}
\subparagraph{Returns: }\label{returns-15} A 3-D tensor of the same type and shape as \texttt{image}.
\subparagraph{Raises: }\label{raises-5} \begin{itemize} \tightlist \item \texttt{ValueError}: If the shape of \texttt{image} is not supported. \end{itemize}
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.image.flip\_left\_right(image)} }{tf.image.flip\_left\_right(image) }}\label{tf.image.flipux5fleftux5frightimage} Flips an image horizontally (left to right). Outputs the contents of \texttt{image} flipped along the second dimension, which is \texttt{width}. See also \texttt{reverse()}.
\subparagraph{Args: }\label{args-16} \begin{itemize} \tightlist \item \texttt{image}: A 3-D tensor of shape \texttt{{[}height,\ width,\ channels{]}}. \end{itemize}
\subparagraph{Returns: }\label{returns-16} A 3-D tensor of the same type and shape as \texttt{image}.
\subparagraph{Raises: }\label{raises-6} \begin{itemize} \tightlist \item \texttt{ValueError}: If the shape of \texttt{image} is not supported. \end{itemize}
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.image.random\_flip\_left\_right(image,\ seed=None)} }{tf.image.random\_flip\_left\_right(image, seed=None) }}\label{tf.image.randomux5fflipux5fleftux5frightimage-seednone} Randomly flips an image horizontally (left to right). With a 1 in 2 chance, outputs the contents of \texttt{image} flipped along the second dimension, which is \texttt{width}. Otherwise, outputs the image as-is.
\subparagraph{Args: }\label{args-17} \begin{itemize} \tightlist \item \texttt{image}: A 3-D tensor of shape \texttt{{[}height,\ width,\ channels{]}}. \item \texttt{seed}: A Python integer. Used to create a random seed. See \href{../../api_docs/python/constant_op.md\#set_random_seed}{\texttt{set\_random\_seed}} for behavior. \end{itemize}
\subparagraph{Returns: }\label{returns-17} A 3-D tensor of the same type and shape as \texttt{image}.
\subparagraph{Raises: }\label{raises-7} \begin{itemize} \tightlist \item \texttt{ValueError}: If the shape of \texttt{image} is not supported. \end{itemize}
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.image.transpose\_image(image)} }{tf.image.transpose\_image(image) }}\label{tf.image.transposeux5fimageimage} Transposes an image by swapping its first and second dimensions. See also \texttt{transpose()}.
\subparagraph{Args: }\label{args-18} \begin{itemize} \tightlist \item \texttt{image}: A 3-D tensor of shape \texttt{{[}height,\ width,\ channels{]}}. \end{itemize}
\subparagraph{Returns: }\label{returns-18} A 3-D tensor of shape \texttt{{[}width,\ height,\ channels{]}}.
\subparagraph{Raises: }\label{raises-8} \begin{itemize} \tightlist \item \texttt{ValueError}: If the shape of \texttt{image} is not supported. \end{itemize}
\subsection{Image Adjustments }\label{image-adjustments} TensorFlow provides functions to adjust images in various ways: brightness, contrast, hue, and saturation. Each adjustment can be done with predefined parameters or with random parameters picked from predefined intervals. Random adjustments are often useful to expand a training set and reduce overfitting.
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.image.adjust\_brightness(image,\ delta,\ min\_value=None,\ max\_value=None)} }{tf.image.adjust\_brightness(image, delta, min\_value=None, max\_value=None) }}\label{tf.image.adjustux5fbrightnessimage-delta-minux5fvaluenone-maxux5fvaluenone} Adjusts the brightness of RGB or grayscale images. The value \texttt{delta} is added to all components of the tensor \texttt{image}. \texttt{image} and \texttt{delta} are cast to \texttt{float} before adding, and the resulting values are clamped to \texttt{{[}min\_value,\ max\_value{]}}. Finally, the result is cast back to \texttt{image.dtype}. If \texttt{min\_value} or \texttt{max\_value} is not given, it is set to the minimum or maximum allowed value for \texttt{image.dtype}, respectively.
\subparagraph{Args: }\label{args-19} \begin{itemize} \tightlist \item \texttt{image}: A tensor. \item \texttt{delta}: A scalar. Amount to add to the pixel values. \item \texttt{min\_value}: Minimum value for output. \item \texttt{max\_value}: Maximum value for output. \end{itemize}
\subparagraph{Returns: }\label{returns-19} A tensor of the same shape and type as \texttt{image}.
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.image.random\_brightness(image,\ max\_delta,\ seed=None)} }{tf.image.random\_brightness(image, max\_delta, seed=None) }}\label{tf.image.randomux5fbrightnessimage-maxux5fdelta-seednone} Adjusts the brightness of images by a random factor. Equivalent to \texttt{adjust\_brightness()} using a \texttt{delta} randomly picked in the interval \texttt{{[}-max\_delta,\ max\_delta)}. Note that \texttt{delta} is picked as a float. For integer-type images, the brightness-adjusted result is rounded before casting, so integer images may be modified by amounts in the closed range \texttt{{[}-max\_delta,\ max\_delta{]}}.
\subparagraph{Args: }\label{args-20} \begin{itemize} \tightlist \item \texttt{image}: 3-D tensor of shape \texttt{{[}height,\ width,\ channels{]}}. \item \texttt{max\_delta}: float, must be non-negative. \item \texttt{seed}: A Python integer. Used to create a random seed. See \href{../../api_docs/python/constant_op.md\#set_random_seed}{\texttt{set\_random\_seed}} for behavior. \end{itemize} \subparagraph{Returns: }\label{returns-20} 3-D tensor of images of shape \texttt{{[}height,\ width,\ channels{]}} \subparagraph{Raises: }\label{raises-9} \begin{itemize} \tightlist \item \texttt{ValueError}: if max\_delta is negative. \end{itemize} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.image.adjust\_contrast(images,\ contrast\_factor,\ min\_value=None,\ max\_value=None)} }{tf.image.adjust\_contrast(images, contrast\_factor, min\_value=None, max\_value=None) }}\label{tf.image.adjustux5fcontrastimages-contrastux5ffactor-minux5fvaluenone-maxux5fvaluenone} Adjust contrast of RGB or grayscale images. \texttt{images} is a tensor of at least 3 dimensions. The last 3 dimensions are interpreted as \texttt{{[}height,\ width,\ channels{]}}. The other dimensions only represent a collection of images, such as \texttt{{[}batch,\ height,\ width,\ channels{]}.} Contrast is adjusted independently for each channel of each image. For each channel, this Op first computes the mean of the image pixels in the channel and then adjusts each component \texttt{x} of each pixel to \texttt{(x\ -\ mean)\ *\ contrast\_factor\ +\ mean}. The adjusted values are then clipped to fit in the \texttt{{[}min\_value,\ max\_value{]}} interval. If \texttt{min\_value} or \texttt{max\_value} is not given, it is replaced with the minimum and maximum values for the data type of \texttt{images} respectively. The contrast-adjusted image is always computed as \texttt{float}, and it is cast back to its original type after clipping. 
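The per-channel arithmetic described above can be illustrated in plain Python. This is only a sketch under stated assumptions, not the TensorFlow kernel: \texttt{adjust\_contrast\_channel} is a hypothetical helper, a single channel is modeled as a flat list of floats, and the clipping bounds are passed explicitly instead of being derived from a data type.

```python
def adjust_contrast_channel(pixels, contrast_factor, min_value, max_value):
    """Apply (x - mean) * contrast_factor + mean to one channel, then clip.

    Hypothetical helper for illustration; `pixels` is a flat list of floats.
    """
    mean = sum(pixels) / len(pixels)
    adjusted = [(x - mean) * contrast_factor + mean for x in pixels]
    # Clip every adjusted value into [min_value, max_value].
    return [min(max(v, min_value), max_value) for v in adjusted]


channel = [0.0, 100.0, 200.0]
result = adjust_contrast_channel(channel, 2.0, 0.0, 255.0)
```

With a factor of 2.0, values away from the channel mean of 100.0 are pushed beyond the bounds and clipped, while a value equal to the mean is unchanged.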
\subparagraph{Args: }\label{args-21} \begin{itemize} \tightlist \item \texttt{images}: Images to adjust. At least 3-D. \item \texttt{contrast\_factor}: A float multiplier for adjusting contrast. \item \texttt{min\_value}: Minimum value for clipping the adjusted pixels. \item \texttt{max\_value}: Maximum value for clipping the adjusted pixels. \end{itemize}
\subparagraph{Returns: }\label{returns-21} The contrast-adjusted image or images.
\subparagraph{Raises: }\label{raises-10} \begin{itemize} \tightlist \item \texttt{ValueError}: If the arguments are invalid. \end{itemize}
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.image.random\_contrast(image,\ lower,\ upper,\ seed=None)} }{tf.image.random\_contrast(image, lower, upper, seed=None) }}\label{tf.image.randomux5fcontrastimage-lower-upper-seednone} Adjusts the contrast of an image by a random factor. Equivalent to \texttt{adjust\_contrast()} but uses a \texttt{contrast\_factor} randomly picked in the interval \texttt{{[}lower,\ upper{]}}.
\subparagraph{Args: }\label{args-22} \begin{itemize} \tightlist \item \texttt{image}: 3-D tensor of shape \texttt{{[}height,\ width,\ channels{]}}. \item \texttt{lower}: float. Lower bound for the random contrast factor. \item \texttt{upper}: float. Upper bound for the random contrast factor. \item \texttt{seed}: A Python integer. Used to create a random seed. See \href{../../api_docs/python/constant_op.md\#set_random_seed}{\texttt{set\_random\_seed}} for behavior. \end{itemize}
\subparagraph{Returns: }\label{returns-22} 3-D tensor of shape \texttt{{[}height,\ width,\ channels{]}}.
\subparagraph{Raises: }\label{raises-11} \begin{itemize} \tightlist \item \texttt{ValueError}: If \texttt{upper\ \textless{}=\ lower} or if \texttt{lower\ \textless{}\ 0}.
\end{itemize} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.image.per\_image\_whitening(image)} }{tf.image.per\_image\_whitening(image) }}\label{tf.image.perux5fimageux5fwhiteningimage} Linearly scales \texttt{image} to have zero mean and unit variance. This op computes \texttt{(x\ -\ mean)\ /\ adjusted\_stddev}, where \texttt{mean} is the average of all values in \texttt{image}, and \texttt{adjusted\_stddev\ =\ max(stddev,\ 1.0/sqrt(image.NumElements()))}. \texttt{stddev} is the standard deviation of all values in \texttt{image}. It is capped away from zero to protect against division by 0 when handling uniform images. Note that this implementation is limited:
\begin{itemize} \tightlist
\item It only whitens based on the statistics of an individual image.
\item It does not take into account the covariance structure.
\end{itemize}
\subparagraph{Args: }\label{args-23} \begin{itemize} \tightlist \item \texttt{image}: 3-D tensor of shape \texttt{{[}height,\ width,\ channels{]}}. \end{itemize}
\subparagraph{Returns: }\label{returns-23} The whitened image with the same shape as \texttt{image}.
\subparagraph{Raises: }\label{raises-12} \begin{itemize} \tightlist \item \texttt{ValueError}: If the shape of \texttt{image} is incompatible with this function. \end{itemize}
================================================ FILE: tex_pdf/api/python/c4s08_sparse_ops.tex ================================================
\section{Sparse Tensors }\label{sparse-tensors} Note: Functions taking \texttt{Tensor} arguments can also take anything accepted by \href{../../api_docs/python/framework.md\#convert_to_tensor}{\texttt{tf.convert\_to\_tensor}}.
\subsection{Contents}\label{contents} \subsubsection{\texorpdfstring{\protect\hyperlink{AUTOGENERATED-sparse-tensors}{Sparse Tensors}}{Sparse Tensors}}\label{sparse-tensors-1} \begin{itemize} \tightlist
\item \protect\hyperlink{AUTOGENERATED-sparse-tensor-representation}{Sparse Tensor Representation}
\item \protect\hyperlink{SparseTensor}{\texttt{class\ tf.SparseTensor}}
\item \protect\hyperlink{SparseTensorValue}{\texttt{class\ tf.SparseTensorValue}}
\item \protect\hyperlink{AUTOGENERATED-sparse-to-dense-conversion}{Sparse to Dense Conversion}
\item \protect\hyperlink{sparseux5ftoux5fdense}{\texttt{tf.sparse\_to\_dense(sparse\_indices,\ output\_shape,\ sparse\_values,\ default\_value,\ name=None)}}
\item \protect\hyperlink{sparseux5ftensorux5ftoux5fdense}{\texttt{tf.sparse\_tensor\_to\_dense(sp\_input,\ default\_value,\ name=None)}}
\item \protect\hyperlink{sparseux5ftoux5findicator}{\texttt{tf.sparse\_to\_indicator(sp\_input,\ vocab\_size,\ name=None)}}
\item \protect\hyperlink{AUTOGENERATED-manipulation}{Manipulation}
\item \protect\hyperlink{sparseux5fconcat}{\texttt{tf.sparse\_concat(concat\_dim,\ sp\_inputs,\ name=None)}}
\item \protect\hyperlink{sparseux5freorder}{\texttt{tf.sparse\_reorder(sp\_input,\ name=None)}}
\item \protect\hyperlink{sparseux5fretain}{\texttt{tf.sparse\_retain(sp\_input,\ to\_retain)}}
\item \protect\hyperlink{sparseux5ffillux5femptyux5frows}{\texttt{tf.sparse\_fill\_empty\_rows(sp\_input,\ default\_value,\ name=None)}}
\end{itemize}
\subsection{Sparse Tensor Representation }\label{sparse-tensor-representation} TensorFlow supports a \texttt{SparseTensor} representation for data that is sparse in multiple dimensions. Contrast this representation with \texttt{IndexedSlices}, which is efficient for representing tensors that are sparse in their first dimension and dense along all other dimensions.
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{class\ tf.SparseTensor} }{class tf.SparseTensor }}\label{class-tf.sparsetensor} Represents a sparse tensor. TensorFlow represents a sparse tensor as three separate dense tensors: \texttt{indices}, \texttt{values}, and \texttt{dense\_shape}. In Python, the three tensors are collected into a \texttt{SparseTensor} class for ease of use. If you have separate \texttt{indices}, \texttt{values}, and \texttt{dense\_shape} tensors, wrap them in a \texttt{SparseTensor} object before passing them to the Ops below. Concretely, the sparse tensor \texttt{SparseTensor(indices,\ values,\ dense\_shape)} consists of
\begin{itemize} \tightlist \item \texttt{indices}: A 2-D int64 tensor of shape \texttt{{[}N,\ ndims{]}}. \item \texttt{values}: A 1-D tensor of any type and shape \texttt{{[}N{]}}. \item \texttt{dense\_shape}: A 1-D int64 tensor of shape \texttt{{[}ndims{]}}. \end{itemize}
where \texttt{N} and \texttt{ndims} are the number of values and the number of dimensions in the \texttt{SparseTensor}, respectively. The corresponding dense tensor satisfies
\begin{Shaded}
\begin{Highlighting}[]
\NormalTok{dense.shape }\OperatorTok{=} \NormalTok{dense_shape}
\NormalTok{dense[}\BuiltInTok{tuple}\NormalTok{(indices[i])] }\OperatorTok{=} \NormalTok{values[i]}
\end{Highlighting}
\end{Shaded}
By convention, \texttt{indices} should be sorted in row-major order (or, equivalently, lexicographic order on the tuples \texttt{indices{[}i{]}}). This is not enforced when \texttt{SparseTensor} objects are constructed, but most Ops assume correct ordering. If the ordering is wrong, it can be fixed by calling \texttt{sparse\_reorder} on the misordered \texttt{SparseTensor}.
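The relationship between the three components and the dense tensor can be sketched in plain Python for the 2-D case. This is an illustration only and does not use TensorFlow: nested lists stand in for tensors, zero is assumed as the fill value, and \texttt{is\_row\_major} is a hypothetical helper that checks the ordering convention.

```python
# Components of a 2-D sparse tensor, modeled with plain lists.
indices = [[0, 0], [1, 2]]   # shape [N, ndims]
values = [1, 2]              # shape [N]
dense_shape = [3, 4]         # shape [ndims]

# Start from an all-zero dense tensor and write each value at its index.
dense = [[0] * dense_shape[1] for _ in range(dense_shape[0])]
for (row, col), value in zip(indices, values):
    dense[row][col] = value


def is_row_major(index_rows):
    """Return True if the index tuples are in lexicographic (row-major) order."""
    tuples = [tuple(ix) for ix in index_rows]
    return tuples == sorted(tuples)
```

Here \texttt{is\_row\_major(indices)} holds for the indices as given, and it fails if the two rows are swapped.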
Example: The sparse tensor
\begin{Shaded}
\begin{Highlighting}[]
\NormalTok{SparseTensor(values}\OperatorTok{=}\NormalTok{[}\DecValTok{1}\NormalTok{, }\DecValTok{2}\NormalTok{], indices}\OperatorTok{=}\NormalTok{[[}\DecValTok{0}\NormalTok{, }\DecValTok{0}\NormalTok{], [}\DecValTok{1}\NormalTok{, }\DecValTok{2}\NormalTok{]], shape}\OperatorTok{=}\NormalTok{[}\DecValTok{3}\NormalTok{, }\DecValTok{4}\NormalTok{])}
\end{Highlighting}
\end{Shaded}
represents the dense tensor
\begin{Shaded}
\begin{Highlighting}[]
\NormalTok{[[}\DecValTok{1}\NormalTok{, }\DecValTok{0}\NormalTok{, }\DecValTok{0}\NormalTok{, }\DecValTok{0}\NormalTok{]}
\NormalTok{ [}\DecValTok{0}\NormalTok{, }\DecValTok{0}\NormalTok{, }\DecValTok{2}\NormalTok{, }\DecValTok{0}\NormalTok{]}
\NormalTok{ [}\DecValTok{0}\NormalTok{, }\DecValTok{0}\NormalTok{, }\DecValTok{0}\NormalTok{, }\DecValTok{0}\NormalTok{]]}
\end{Highlighting}
\end{Shaded}
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.SparseTensor.\_\_init\_\_(indices,\ values,\ shape)} }{tf.SparseTensor.\_\_init\_\_(indices, values, shape) }}\label{tf.sparsetensor.ux5fux5finitux5fux5findices-values-shape} Creates a \texttt{SparseTensor}.
\subparagraph{Args: }\label{args} \begin{itemize} \tightlist \item \texttt{indices}: A 2-D int64 tensor of shape \texttt{{[}N,\ ndims{]}}. \item \texttt{values}: A 1-D tensor of any type and shape \texttt{{[}N{]}}. \item \texttt{shape}: A 1-D int64 tensor of shape \texttt{{[}ndims{]}}. \end{itemize}
\subparagraph{Returns: }\label{returns} A \texttt{SparseTensor}.
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.SparseTensor.indices} }{tf.SparseTensor.indices }}\label{tf.sparsetensor.indices} The indices of non-zero values in the represented dense tensor.
\subparagraph{Returns: }\label{returns-1} A 2-D Tensor of int64 with shape \texttt{{[}N,\ ndims{]}}, where \texttt{N} is the number of non-zero values in the tensor, and \texttt{ndims} is the rank. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.SparseTensor.values} }{tf.SparseTensor.values }}\label{tf.sparsetensor.values} The non-zero values in the represented dense tensor. \subparagraph{Returns: }\label{returns-2} A 1-D Tensor of any data type. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.SparseTensor.dtype} }{tf.SparseTensor.dtype }}\label{tf.sparsetensor.dtype} The \texttt{DType} of elements in this tensor. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.SparseTensor.shape} }{tf.SparseTensor.shape }}\label{tf.sparsetensor.shape} A 1-D Tensor of int64 representing the shape of the dense tensor. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.SparseTensor.graph} }{tf.SparseTensor.graph }}\label{tf.sparsetensor.graph} The \texttt{Graph} that contains the index, value, and shape tensors. 
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{class\ tf.SparseTensorValue} }{class tf.SparseTensorValue }}\label{class-tf.sparsetensorvalue} \texttt{SparseTensorValue(indices,\ values,\ shape)}
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.SparseTensorValue.indices} }{tf.SparseTensorValue.indices }}\label{tf.sparsetensorvalue.indices} Alias for field number 0
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.SparseTensorValue.shape} }{tf.SparseTensorValue.shape }}\label{tf.sparsetensorvalue.shape} Alias for field number 2
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.SparseTensorValue.values} }{tf.SparseTensorValue.values }}\label{tf.sparsetensorvalue.values} Alias for field number 1
\subsection{Sparse to Dense Conversion }\label{sparse-to-dense-conversion} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.sparse\_to\_dense(sparse\_indices,\ output\_shape,\ sparse\_values,\ default\_value,\ name=None)} }{tf.sparse\_to\_dense(sparse\_indices, output\_shape, sparse\_values, default\_value, name=None) }}\label{tf.sparseux5ftoux5fdensesparseux5findices-outputux5fshape-sparseux5fvalues-defaultux5fvalue-namenone} Converts a sparse representation into a dense tensor. Builds an array \texttt{dense} with shape \texttt{output\_shape} such that
\begin{verbatim}
# If sparse_indices is scalar
dense[i] = (i == sparse_indices ? sparse_values : default_value)

# If sparse_indices is a vector, then for each i
dense[sparse_indices[i]] = sparse_values[i]

# If sparse_indices is an n by d matrix, then for each i in [0, n)
dense[sparse_indices[i][0], ..., sparse_indices[i][d-1]] = sparse_values[i]
\end{verbatim}
All other values in \texttt{dense} are set to \texttt{default\_value}. If \texttt{sparse\_values} is a scalar, all sparse indices are set to this single value.
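The scalar and vector cases above, including the scalar broadcast of \texttt{sparse\_values}, can be sketched in plain Python for a 1-D output. This is a minimal illustration and not the TensorFlow op: \texttt{sparse\_to\_dense\_1d} is a hypothetical helper, and lists stand in for tensors.

```python
def sparse_to_dense_1d(sparse_indices, output_size, sparse_values, default_value):
    """Sketch of the scalar and vector index cases for a 1-D dense output.

    Hypothetical helper; lists stand in for tensors.
    """
    # Treat a scalar index as a one-element vector.
    if isinstance(sparse_indices, int):
        sparse_indices = [sparse_indices]
    # A scalar value is used for every listed index.
    if not isinstance(sparse_values, list):
        sparse_values = [sparse_values] * len(sparse_indices)
    dense = [default_value] * output_size
    for index, value in zip(sparse_indices, sparse_values):
        dense[index] = value
    return dense


scalar_case = sparse_to_dense_1d(2, 5, 9, 0)
vector_case = sparse_to_dense_1d([0, 3], 5, [7, 8], -1)
broadcast_case = sparse_to_dense_1d([1, 2], 4, 1, 0)
```

The matrix case follows the same pattern, with each row of \texttt{sparse\_indices} addressing one cell of a multi-dimensional output.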
\subparagraph{Args: }\label{args-1} \begin{itemize} \tightlist
\item \texttt{sparse\_indices}: A \texttt{Tensor}. Must be one of the following types: \texttt{int32}, \texttt{int64}. 0-D, 1-D, or 2-D. \texttt{sparse\_indices{[}i{]}} contains the complete index where \texttt{sparse\_values{[}i{]}} will be placed.
\item \texttt{output\_shape}: A \texttt{Tensor}. Must have the same type as \texttt{sparse\_indices}. 1-D. Shape of the dense output tensor.
\item \texttt{sparse\_values}: A \texttt{Tensor}. 1-D. Values corresponding to each row of \texttt{sparse\_indices}, or a scalar value to be used for all sparse indices.
\item \texttt{default\_value}: A \texttt{Tensor}. Must have the same type as \texttt{sparse\_values}. Scalar value to set for indices not specified in \texttt{sparse\_indices}.
\item \texttt{name}: A name for the operation (optional). \end{itemize}
\subparagraph{Returns: }\label{returns-3} A \texttt{Tensor}. Has the same type as \texttt{sparse\_values}. Dense output tensor of shape \texttt{output\_shape}.
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.sparse\_tensor\_to\_dense(sp\_input,\ default\_value,\ name=None)} }{tf.sparse\_tensor\_to\_dense(sp\_input, default\_value, name=None) }}\label{tf.sparseux5ftensorux5ftoux5fdensespux5finput-defaultux5fvalue-namenone} Converts a \texttt{SparseTensor} into a dense tensor. This op is a convenience wrapper around \texttt{sparse\_to\_dense} for \texttt{SparseTensor}s. For example, if \texttt{sp\_input} has shape \texttt{{[}3,\ 5{]}} and non-empty string values:
\begin{verbatim}
[0, 1]: a
[0, 3]: b
[2, 0]: c
\end{verbatim}
and \texttt{default\_value} is \texttt{x}, then the output will be a dense \texttt{{[}3,\ 5{]}} string tensor with values:
\begin{verbatim}
[[x a x b x]
 [x x x x x]
 [c x x x x]]
\end{verbatim}
\subparagraph{Args: }\label{args-2} \begin{itemize} \tightlist \item \texttt{sp\_input}: The input \texttt{SparseTensor}.
\item \texttt{default\_value}: Scalar value to set for indices not specified in \texttt{sp\_input}. \item \texttt{name}: A name prefix for the returned tensors (optional). \end{itemize}
\subparagraph{Returns: }\label{returns-4} A dense tensor with shape \texttt{sp\_input.shape} and values specified by the non-empty values in \texttt{sp\_input}. Indices not in \texttt{sp\_input} are assigned \texttt{default\_value}.
\subparagraph{Raises: }\label{raises} \begin{itemize} \tightlist \item \texttt{TypeError}: If \texttt{sp\_input} is not a \texttt{SparseTensor}. \end{itemize}
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.sparse\_to\_indicator(sp\_input,\ vocab\_size,\ name=None)} }{tf.sparse\_to\_indicator(sp\_input, vocab\_size, name=None) }}\label{tf.sparseux5ftoux5findicatorspux5finput-vocabux5fsize-namenone} Converts a \texttt{SparseTensor} of ids into a dense bool indicator tensor. The last dimension of \texttt{sp\_input} is discarded and replaced with the values of \texttt{sp\_input}. If \texttt{sp\_input.shape\ =\ {[}D0,\ D1,\ ...,\ Dn,\ K{]}}, then \texttt{output.shape\ =\ {[}D0,\ D1,\ ...,\ Dn,\ vocab\_size{]}}, where
\begin{verbatim}
output[d_0, d_1, ..., d_n, sp_input[d_0, d_1, ..., d_n, k]] = True
\end{verbatim}
and False elsewhere in \texttt{output}. For example, if \texttt{sp\_input.shape\ =\ {[}2,\ 3,\ 4{]}} with non-empty values:
\begin{verbatim}
[0, 0, 0]: 0
[0, 1, 0]: 10
[1, 0, 3]: 103
[1, 1, 2]: 112
[1, 1, 3]: 113
[1, 2, 1]: 121
\end{verbatim}
and \texttt{vocab\_size\ =\ 200}, then the output will be a \texttt{{[}2,\ 3,\ 200{]}} dense bool tensor with False everywhere except at positions
\begin{verbatim}
(0, 0, 0), (0, 1, 10), (1, 0, 103), (1, 1, 112), (1, 1, 113), (1, 2, 121).
\end{verbatim}
This op is useful for converting \texttt{SparseTensor}s into dense formats for compatibility with ops that expect dense tensors. The input \texttt{SparseTensor} must be in row-major order.
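The index mapping described above can be sketched in plain Python. This is an illustration under stated assumptions, not the TensorFlow op: \texttt{sparse\_to\_indicator\_sketch} is a hypothetical helper, and instead of materializing the dense bool tensor it returns the output shape together with the set of positions that would be True.

```python
def sparse_to_indicator_sketch(indices, values, shape, vocab_size):
    """Return (output_shape, true_positions) for the indicator tensor.

    Hypothetical helper: the last index component is dropped and replaced
    by the stored id, which must lie in [0, vocab_size).
    """
    output_shape = list(shape[:-1]) + [vocab_size]
    true_positions = set()
    for index, value in zip(indices, values):
        if not 0 <= value < vocab_size:
            raise ValueError("id %r is outside [0, %d)" % (value, vocab_size))
        true_positions.add(tuple(index[:-1]) + (value,))
    return output_shape, true_positions


# Same data as the example in this section.
example_indices = [[0, 0, 0], [0, 1, 0], [1, 0, 3], [1, 1, 2], [1, 1, 3], [1, 2, 1]]
example_values = [0, 10, 103, 112, 113, 121]
out_shape, positions = sparse_to_indicator_sketch(
    example_indices, example_values, [2, 3, 4], 200)
```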
\subparagraph{Args: }\label{args-3} \begin{itemize} \tightlist \item \texttt{sp\_input}: A \texttt{SparseTensor} of type \texttt{int32} or \texttt{int64}. \item \texttt{vocab\_size}: The new size of the last dimension, with \texttt{all(0\ \textless{}=\ sp\_input.values\ \textless{}\ vocab\_size)}. \item \texttt{name}: A name prefix for the returned tensors (optional). \end{itemize}
\subparagraph{Returns: }\label{returns-5} A dense bool indicator tensor representing the indices with specified value.
\subparagraph{Raises: }\label{raises-1} \begin{itemize} \tightlist \item \texttt{TypeError}: If \texttt{sp\_input} is not a \texttt{SparseTensor}. \end{itemize}
\subsection{Manipulation }\label{manipulation} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.sparse\_concat(concat\_dim,\ sp\_inputs,\ name=None)} }{tf.sparse\_concat(concat\_dim, sp\_inputs, name=None) }}\label{tf.sparseux5fconcatconcatux5fdim-spux5finputs-namenone} Concatenates a list of \texttt{SparseTensor}s along the specified dimension. Concatenation is with respect to the dense versions of each sparse input. It is assumed that each input is a \texttt{SparseTensor} whose elements are ordered along increasing dimension number. All inputs' shapes must match, except for the concat dimension. The \texttt{indices}, \texttt{values}, and \texttt{shapes} lists must have the same length. The output shape is identical to the inputs', except along the concat dimension, where it is the sum of the inputs' sizes along that dimension. The output elements will be re-sorted to preserve the sort order along increasing dimension number. This op runs in \texttt{O(M\ log\ M)} time, where \texttt{M} is the total number of non-empty values across all inputs. This is due to the need for an internal sort in order to concatenate efficiently across an arbitrary dimension.
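The shift-then-sort behavior described above can be sketched in plain Python. This is a minimal illustration, not the TensorFlow implementation: \texttt{sparse\_concat\_sketch} is a hypothetical helper, and each input is modeled as an \texttt{(indices,\ values,\ shape)} triple of lists.

```python
def sparse_concat_sketch(concat_dim, inputs):
    """Concatenate (indices, values, shape) triples along concat_dim.

    Hypothetical helper; assumes all shapes match except along concat_dim.
    """
    entries = []
    offset = 0
    for indices, values, shape in inputs:
        for index, value in zip(indices, values):
            shifted = list(index)
            # Shift by the total size of the inputs already placed.
            shifted[concat_dim] += offset
            entries.append((tuple(shifted), value))
        offset += shape[concat_dim]
    # Re-sort into row-major order; this sort is the O(M log M) step.
    entries.sort(key=lambda entry: entry[0])
    out_shape = list(inputs[0][2])
    out_shape[concat_dim] = offset
    out_indices = [list(index) for index, _ in entries]
    out_values = [value for _, value in entries]
    return out_indices, out_values, out_shape


first = ([[0, 2], [1, 0], [1, 1]], ["a", "b", "c"], [2, 3])
second = ([[0, 1], [0, 2]], ["d", "e"], [2, 4])
cat_indices, cat_values, cat_shape = sparse_concat_sketch(1, [first, second])
```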
For example, if \texttt{concat\_dim\ =\ 1} and the inputs are
\begin{verbatim}
sp_inputs[0]: shape = [2, 3]
[0, 2]: "a"
[1, 0]: "b"
[1, 1]: "c"

sp_inputs[1]: shape = [2, 4]
[0, 1]: "d"
[0, 2]: "e"
\end{verbatim}
then the output will be
\begin{verbatim}
shape = [2, 7]
[0, 2]: "a"
[0, 4]: "d"
[0, 5]: "e"
[1, 0]: "b"
[1, 1]: "c"
\end{verbatim}
Graphically this is equivalent to doing
\begin{verbatim}
[    a] concat [  d e  ] = [    a   d e  ]
[b c  ]        [       ]   [b c          ]
\end{verbatim}
\subparagraph{Args: }\label{args-4} \begin{itemize} \tightlist \item \texttt{concat\_dim}: Dimension to concatenate along. \item \texttt{sp\_inputs}: List of \texttt{SparseTensor} to concatenate. \item \texttt{name}: A name prefix for the returned tensors (optional). \end{itemize}
\subparagraph{Returns: }\label{returns-6} A \texttt{SparseTensor} with the concatenated output.
\subparagraph{Raises: }\label{raises-2} \begin{itemize} \tightlist \item \texttt{TypeError}: If \texttt{sp\_inputs} is not a list of \texttt{SparseTensor}. \end{itemize}
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.sparse\_reorder(sp\_input,\ name=None)} }{tf.sparse\_reorder(sp\_input, name=None) }}\label{tf.sparseux5freorderspux5finput-namenone} Reorders a \texttt{SparseTensor} into the canonical, row-major ordering. Note that by convention, all sparse ops preserve the canonical ordering along increasing dimension number. The only time ordering can be violated is during manual manipulation of the indices and values to add entries. Reordering does not affect the shape of the \texttt{SparseTensor}.
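Reordering into canonical form amounts to sorting the entries lexicographically by index while keeping each value attached to its index. The following plain-Python sketch illustrates this; \texttt{sparse\_reorder\_sketch} is a hypothetical helper and lists stand in for the \texttt{indices} and \texttt{values} tensors.

```python
def sparse_reorder_sketch(indices, values):
    """Return (indices, values) sorted into row-major order.

    Hypothetical helper; the dense shape is unaffected and therefore omitted.
    """
    # Sort positions by their index tuple so values move with their indices.
    order = sorted(range(len(indices)), key=lambda i: tuple(indices[i]))
    return [indices[i] for i in order], [values[i] for i in order]


unordered_indices = [[0, 3], [0, 1], [3, 1], [2, 0]]
unordered_values = ["b", "a", "d", "c"]
ordered_indices, ordered_values = sparse_reorder_sketch(
    unordered_indices, unordered_values)
```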
For example, if \texttt{sp\_input} has shape \texttt{{[}4,\ 5{]}} and \texttt{indices} / \texttt{values}:
\begin{verbatim}
[0, 3]: b
[0, 1]: a
[3, 1]: d
[2, 0]: c
\end{verbatim}
then the output will be a \texttt{SparseTensor} of shape \texttt{{[}4,\ 5{]}} and \texttt{indices} / \texttt{values}:
\begin{verbatim}
[0, 1]: a
[0, 3]: b
[2, 0]: c
[3, 1]: d
\end{verbatim}
\subparagraph{Args: }\label{args-5} \begin{itemize} \tightlist \item \texttt{sp\_input}: The input \texttt{SparseTensor}. \item \texttt{name}: A name prefix for the returned tensors (optional). \end{itemize}
\subparagraph{Returns: }\label{returns-7} A \texttt{SparseTensor} with the same shape and non-empty values, but in canonical ordering.
\subparagraph{Raises: }\label{raises-3} \begin{itemize} \tightlist \item \texttt{TypeError}: If \texttt{sp\_input} is not a \texttt{SparseTensor}. \end{itemize}
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.sparse\_retain(sp\_input,\ to\_retain)} }{tf.sparse\_retain(sp\_input, to\_retain) }}\label{tf.sparseux5fretainspux5finput-toux5fretain} Retains specified non-empty values within a \texttt{SparseTensor}. For example, if \texttt{sp\_input} has shape \texttt{{[}4,\ 5{]}} and 4 non-empty string values:
\begin{verbatim}
[0, 1]: a
[0, 3]: b
[2, 0]: c
[3, 1]: d
\end{verbatim}
and \texttt{to\_retain\ =\ {[}True,\ False,\ False,\ True{]}}, then the output will be a \texttt{SparseTensor} of shape \texttt{{[}4,\ 5{]}} with 2 non-empty values:
\begin{verbatim}
[0, 1]: a
[3, 1]: d
\end{verbatim}
\subparagraph{Args: }\label{args-6} \begin{itemize} \tightlist \item \texttt{sp\_input}: The input \texttt{SparseTensor} with \texttt{N} non-empty elements. \item \texttt{to\_retain}: A bool vector of length \texttt{N} with \texttt{M} true values.
\end{itemize} \subparagraph{Returns: }\label{returns-8} A \texttt{SparseTensor} with the same shape as the input and \texttt{M} non-empty elements corresponding to the true positions in \texttt{to\_retain}.
\subparagraph{Raises: }\label{raises-4} \begin{itemize} \tightlist \item \texttt{TypeError}: If \texttt{sp\_input} is not a \texttt{SparseTensor}. \end{itemize}
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.sparse\_fill\_empty\_rows(sp\_input,\ default\_value,\ name=None)} }{tf.sparse\_fill\_empty\_rows(sp\_input, default\_value, name=None) }}\label{tf.sparseux5ffillux5femptyux5frowsspux5finput-defaultux5fvalue-namenone} Fills empty rows in the input 2-D \texttt{SparseTensor} with a default value. This op adds entries with the specified \texttt{default\_value} at index \texttt{{[}row,\ 0{]}} for any row in the input that does not already have a value. For example, suppose \texttt{sp\_input} has shape \texttt{{[}5,\ 6{]}} and non-empty values:
\begin{verbatim}
[0, 1]: a
[0, 3]: b
[2, 0]: c
[3, 1]: d
\end{verbatim}
Rows 1 and 4 are empty, so the output will be of shape \texttt{{[}5,\ 6{]}} with values:
\begin{verbatim}
[0, 1]: a
[0, 3]: b
[1, 0]: default_value
[2, 0]: c
[3, 1]: d
[4, 0]: default_value
\end{verbatim}
Note that the input may have empty columns at the end, with no effect on this op. The output \texttt{SparseTensor} will be in row-major order and will have the same shape as the input. This op also returns an indicator vector such that
\begin{verbatim}
empty_row_indicator[i] = True iff row i was an empty row.
\end{verbatim}
\subparagraph{Args: }\label{args-7} \begin{itemize} \tightlist \item \texttt{sp\_input}: A \texttt{SparseTensor} with shape \texttt{{[}N,\ M{]}}.
\item \texttt{default\_value}: The value to fill for empty rows, with the same type as \texttt{sp\_input}. \item \texttt{name}: A name prefix for the returned tensors (optional) \end{itemize} \subparagraph{Returns: }\label{returns-9} \begin{itemize} \tightlist \item \texttt{sp\_ordered\_output}: A \texttt{SparseTensor} with shape \texttt{{[}N,\ M{]}}, and with all empty rows filled in with \texttt{default\_value}. \item \texttt{empty\_row\_indicator}: A bool vector of length \texttt{N} indicating whether each input row was empty. \end{itemize} \subparagraph{Raises: }\label{raises-5} \begin{itemize} \tightlist \item \texttt{TypeError}: If \texttt{sp\_input} is not a \texttt{SparseTensor}. \end{itemize} ================================================ FILE: tex_pdf/api/python/c4s09_io_ops.tex ================================================ \section{Inputs and Readers }\label{inputs-and-readers} Note: Functions taking \texttt{Tensor} arguments can also take anything accepted by \href{../../api_docs/python/framework.md\#convert_to_tensor}{\texttt{tf.convert\_to\_tensor}}.
\subsection{Contents}\label{contents} \subsubsection{\texorpdfstring{\protect\hyperlink{AUTOGENERATED-inputs-and-readers}{Inputs and Readers}}{Inputs and Readers}}\label{inputs-and-readers-1} \begin{itemize} \tightlist \item \protect\hyperlink{AUTOGENERATED-placeholders}{Placeholders} \item \protect\hyperlink{placeholder}{\texttt{tf.placeholder(dtype,\ shape=None,\ name=None)}} \item \protect\hyperlink{AUTOGENERATED-readers}{Readers} \item \protect\hyperlink{ReaderBase}{\texttt{class\ tf.ReaderBase}} \item \protect\hyperlink{TextLineReader}{\texttt{class\ tf.TextLineReader}} \item \protect\hyperlink{WholeFileReader}{\texttt{class\ tf.WholeFileReader}} \item \protect\hyperlink{IdentityReader}{\texttt{class\ tf.IdentityReader}} \item \protect\hyperlink{TFRecordReader}{\texttt{class\ tf.TFRecordReader}} \item \protect\hyperlink{FixedLengthRecordReader}{\texttt{class\ tf.FixedLengthRecordReader}} \item \protect\hyperlink{AUTOGENERATED-converting}{Converting} \item \protect\hyperlink{decodeux5fcsv}{\texttt{tf.decode\_csv(records,\ record\_defaults,\ field\_delim=None,\ name=None)}} \item \protect\hyperlink{decodeux5fraw}{\texttt{tf.decode\_raw(bytes,\ out\_type,\ little\_endian=None,\ name=None)}} \item \protect\hyperlink{AUTOGENERATED-example-protocol-buffer}{Example protocol buffer} \item \protect\hyperlink{parseux5fexample}{\texttt{tf.parse\_example(serialized,\ names=None,\ sparse\_keys=None,\ sparse\_types=None,\ dense\_keys=None,\ dense\_types=None,\ dense\_defaults=None,\ dense\_shapes=None,\ name=\textquotesingle{}ParseExample\textquotesingle{})}} \item \protect\hyperlink{parseux5fsingleux5fexample}{\texttt{tf.parse\_single\_example(serialized,\ names=None,\ sparse\_keys=None,\ sparse\_types=None,\ dense\_keys=None,\ dense\_types=None,\ dense\_defaults=None,\ dense\_shapes=None,\ name=\textquotesingle{}ParseSingleExample\textquotesingle{})}} \item \protect\hyperlink{AUTOGENERATED-queues}{Queues} \item \protect\hyperlink{QueueBase}{\texttt{class\ tf.QueueBase}} 
\item \protect\hyperlink{FIFOQueue}{\texttt{class\ tf.FIFOQueue}} \item \protect\hyperlink{RandomShuffleQueue}{\texttt{class\ tf.RandomShuffleQueue}} \item \protect\hyperlink{AUTOGENERATED-dealing-with-the-filesystem}{Dealing with the filesystem} \item \protect\hyperlink{matchingux5ffiles}{\texttt{tf.matching\_files(pattern,\ name=None)}} \item \protect\hyperlink{readux5ffile}{\texttt{tf.read\_file(filename,\ name=None)}} \item \protect\hyperlink{AUTOGENERATED-input-pipeline}{Input pipeline} \item \protect\hyperlink{AUTOGENERATED-beginning-of-an-input-pipeline}{Beginning of an input pipeline} \item \protect\hyperlink{matchux5ffilenamesux5fonce}{\texttt{tf.train.match\_filenames\_once(pattern,\ name=None)}} \item \protect\hyperlink{limitux5fepochs}{\texttt{tf.train.limit\_epochs(tensor,\ num\_epochs=None,\ name=None)}} \item \protect\hyperlink{rangeux5finputux5fproducer}{\texttt{tf.train.range\_input\_producer(limit,\ num\_epochs=None,\ shuffle=True,\ seed=None,\ capacity=32,\ name=None)}} \item \protect\hyperlink{sliceux5finputux5fproducer}{\texttt{tf.train.slice\_input\_producer(tensor\_list,\ num\_epochs=None,\ shuffle=True,\ seed=None,\ capacity=32,\ name=None)}} \item \protect\hyperlink{stringux5finputux5fproducer}{\texttt{tf.train.string\_input\_producer(string\_tensor,\ num\_epochs=None,\ shuffle=True,\ seed=None,\ capacity=32,\ name=None)}} \item \protect\hyperlink{AUTOGENERATED-batching-at-the-end-of-an-input-pipeline}{Batching at the end of an input pipeline} \item \protect\hyperlink{batch}{\texttt{tf.train.batch(tensor\_list,\ batch\_size,\ num\_threads=1,\ capacity=32,\ enqueue\_many=False,\ shapes=None,\ name=None)}} \item \protect\hyperlink{batchux5fjoin}{\texttt{tf.train.batch\_join(tensor\_list\_list,\ batch\_size,\ capacity=32,\ enqueue\_many=False,\ shapes=None,\ name=None)}} \item \protect\hyperlink{shuffleux5fbatch}{\texttt{tf.train.shuffle\_batch(tensor\_list,\ batch\_size,\ capacity,\ min\_after\_dequeue,\ num\_threads=1,\ seed=None,\ 
enqueue\_many=False,\ shapes=None,\ name=None)}} \item \protect\hyperlink{shuffleux5fbatchux5fjoin}{\texttt{tf.train.shuffle\_batch\_join(tensor\_list\_list,\ batch\_size,\ capacity,\ min\_after\_dequeue,\ seed=None,\ enqueue\_many=False,\ shapes=None,\ name=None)}} \end{itemize} \subsection{Placeholders }\label{placeholders} TensorFlow provides a placeholder operation that must be fed with data on execution. For more info, see the section on \href{../../how_tos/reading_data/index.md\#feeding}{Feeding data}. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.placeholder(dtype,\ shape=None,\ name=None)} }{tf.placeholder(dtype, shape=None, name=None) }}\label{tf.placeholderdtype-shapenone-namenone} Inserts a placeholder for a tensor that must always be fed.

\textbf{Important}: This tensor will produce an error if evaluated. Its value must be fed using the \texttt{feed\_dict} optional argument to \texttt{Session.run()}, \texttt{Tensor.eval()}, or \texttt{Operation.run()}.

For example:

\begin{Shaded}
\begin{Highlighting}[]
\ImportTok{import} \NormalTok{numpy }\ImportTok{as} \NormalTok{np}
\ImportTok{import} \NormalTok{tensorflow }\ImportTok{as} \NormalTok{tf}

\NormalTok{x }\OperatorTok{=} \NormalTok{tf.placeholder(tf.float32, shape}\OperatorTok{=}\NormalTok{(}\DecValTok{1024}\NormalTok{, }\DecValTok{1024}\NormalTok{))}
\NormalTok{y }\OperatorTok{=} \NormalTok{tf.matmul(x, x)}

\ControlFlowTok{with} \NormalTok{tf.Session() }\ImportTok{as} \NormalTok{sess:}
  \BuiltInTok{print}\NormalTok{(sess.run(y))  }\CommentTok{# ERROR: will fail because x was not fed.}

  \NormalTok{rand_array }\OperatorTok{=} \NormalTok{np.random.rand(}\DecValTok{1024}\NormalTok{, }\DecValTok{1024}\NormalTok{)}
  \BuiltInTok{print}\NormalTok{(sess.run(y, feed_dict}\OperatorTok{=}\NormalTok{\{x: rand_array\}))  }\CommentTok{# Will succeed.}
\end{Highlighting}
\end{Shaded}

\subparagraph{Args: }\label{args} \begin{itemize} \tightlist \item \texttt{dtype}: The type of elements in the tensor to be fed. \item \texttt{shape}: The shape of the tensor to be fed (optional).
If the shape is not specified, you can feed a tensor of any shape. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns} A \texttt{Tensor} that may be used as a handle for feeding a value, but not evaluated directly. \subsection{Readers }\label{readers} TensorFlow provides a set of Reader classes for reading data formats. For more information on inputs and readers, see \href{../../how_tos/reading_data/index.md}{Reading data}. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{class\ tf.ReaderBase} }{class tf.ReaderBase }}\label{class-tf.readerbase} Base class for the different Reader types, each of which produces a record every step.

Conceptually, Readers convert string `work units' into records (key, value pairs). Typically the `work units' are filenames and the records are extracted from the contents of those files. We want a single record produced per step, but a work unit can correspond to many records.

Therefore we introduce some decoupling using a queue. The queue holds the work units, and the Reader dequeues from it when it is asked to produce a record (via Read()) but has already finished its last work unit.

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.ReaderBase.\_\_init\_\_(reader\_ref,\ supports\_serialize=False)} }{tf.ReaderBase.\_\_init\_\_(reader\_ref, supports\_serialize=False) }}\label{tf.readerbase.ux5fux5finitux5fux5freaderux5fref-supportsux5fserializefalse} Creates a new ReaderBase. \subparagraph{Args: }\label{args-1} \begin{itemize} \tightlist \item \texttt{reader\_ref}: The operation that implements the reader. \item \texttt{supports\_serialize}: True if the reader implementation can serialize its state.
\end{itemize} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.ReaderBase.num\_records\_produced(name=None)} }{tf.ReaderBase.num\_records\_produced(name=None) }}\label{tf.readerbase.numux5frecordsux5fproducednamenone} Returns the number of records this reader has produced. This is the same as the number of Read executions that have succeeded. \subparagraph{Args: }\label{args-2} \begin{itemize} \tightlist \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-1} An int64 Tensor. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.ReaderBase.num\_work\_units\_completed(name=None)} }{tf.ReaderBase.num\_work\_units\_completed(name=None) }}\label{tf.readerbase.numux5fworkux5funitsux5fcompletednamenone} Returns the number of work units this reader has finished processing. \subparagraph{Args: }\label{args-3} \begin{itemize} \tightlist \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-2} An int64 Tensor. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.ReaderBase.read(queue,\ name=None)} }{tf.ReaderBase.read(queue, name=None) }}\label{tf.readerbase.readqueue-namenone} Returns the next record (key, value pair) produced by a reader. Will dequeue a work unit from queue if necessary (e.g.~when the Reader needs to start reading from a new file since it has finished with the previous file). \subparagraph{Args: }\label{args-4} \begin{itemize} \tightlist \item \texttt{queue}: A Queue or a mutable string Tensor representing a handle to a Queue, with string work items. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-3} A tuple of Tensors (key, value). \begin{itemize} \tightlist \item \texttt{key}: A string scalar Tensor. 
\item \texttt{value}: A string scalar Tensor. \end{itemize} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.ReaderBase.reader\_ref} }{tf.ReaderBase.reader\_ref }}\label{tf.readerbase.readerux5fref} Op that implements the reader. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.ReaderBase.reset(name=None)} }{tf.ReaderBase.reset(name=None) }}\label{tf.readerbase.resetnamenone} Restore a reader to its initial clean state. \subparagraph{Args: }\label{args-5} \begin{itemize} \tightlist \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-4} The created Operation. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.ReaderBase.restore\_state(state,\ name=None)} }{tf.ReaderBase.restore\_state(state, name=None) }}\label{tf.readerbase.restoreux5fstatestate-namenone} Restore a reader to a previously saved state. Not all Readers support being restored, so this can produce an Unimplemented error. \subparagraph{Args: }\label{args-6} \begin{itemize} \tightlist \item \texttt{state}: A string Tensor. Result of a SerializeState of a Reader with matching type. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-5} The created Operation. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.ReaderBase.serialize\_state(name=None)} }{tf.ReaderBase.serialize\_state(name=None) }}\label{tf.readerbase.serializeux5fstatenamenone} Produce a string tensor that encodes the state of a reader. Not all Readers support being serialized, so this can produce an Unimplemented error. \subparagraph{Args: }\label{args-7} \begin{itemize} \tightlist \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-6} A string Tensor. 
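The \texttt{ReaderBase} contract described above (work units are dequeued on demand, each read yields one key/value record, and counters track records and completed work units) can be pictured with a small pure-Python model. This is only a sketch of the semantics, not TensorFlow code: \texttt{ToyReader}, the in-memory \texttt{files} dictionary standing in for files on disk, and the \texttt{name:index} key format are all assumptions made for illustration.

```python
from collections import deque


class ToyReader:
    """Pure-Python model of the Reader contract; hypothetical, not TensorFlow."""

    def __init__(self, files):
        # `files` maps a work unit (a filename) to the list of records it holds.
        self._files = files
        self._pending = deque()   # records left in the current work unit
        self._current = None      # name of the work unit being read
        self.num_records_produced = 0
        self.num_work_units_completed = 0

    def read(self, queue):
        """Return the next (key, value) pair, dequeuing work units as needed."""
        while not self._pending:
            # The previous work unit is exhausted, so take a new one.
            self._current = queue.popleft()  # raises IndexError if queue is empty
            self._pending.extend(enumerate(self._files[self._current]))
            if not self._pending:
                self.num_work_units_completed += 1  # work unit had no records
        index, value = self._pending.popleft()
        key = "%s:%d" % (self._current, index)  # assumed key format
        self.num_records_produced += 1
        if not self._pending:
            self.num_work_units_completed += 1
        return key, value

    def reset(self):
        """Restore the reader to its initial clean state."""
        self._pending.clear()
        self._current = None
        self.num_records_produced = 0
        self.num_work_units_completed = 0


files = {"a.txt": ["x", "y"], "b.txt": ["z"]}
queue = deque(["a.txt", "b.txt"])
reader = ToyReader(files)
records = [reader.read(queue) for _ in range(3)]
```

Three reads drain both work units: the first two records come from \texttt{a.txt}, and the third read dequeues \texttt{b.txt} because the previous work unit is exhausted.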
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.ReaderBase.supports\_serialize} }{tf.ReaderBase.supports\_serialize }}\label{tf.readerbase.supportsux5fserialize} Whether the Reader implementation can serialize its state. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{class\ tf.TextLineReader} }{class tf.TextLineReader }}\label{class-tf.textlinereader} A Reader that outputs the lines of a file delimited by newlines. Newlines are stripped from the output. See ReaderBase for supported methods. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.TextLineReader.\_\_init\_\_(skip\_header\_lines=None,\ name=None)} }{tf.TextLineReader.\_\_init\_\_(skip\_header\_lines=None, name=None) }}\label{tf.textlinereader.ux5fux5finitux5fux5fskipux5fheaderux5flinesnone-namenone} Create a TextLineReader. \subparagraph{Args: }\label{args-8} \begin{itemize} \tightlist \item \texttt{skip\_header\_lines}: An optional int. Defaults to 0. Number of lines to skip from the beginning of every file. \item \texttt{name}: A name for the operation (optional). \end{itemize} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.TextLineReader.num\_records\_produced(name=None)} }{tf.TextLineReader.num\_records\_produced(name=None) }}\label{tf.textlinereader.numux5frecordsux5fproducednamenone} Returns the number of records this reader has produced. This is the same as the number of Read executions that have succeeded. \subparagraph{Args: }\label{args-9} \begin{itemize} \tightlist \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-7} An int64 Tensor.
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.TextLineReader.num\_work\_units\_completed(name=None)} }{tf.TextLineReader.num\_work\_units\_completed(name=None) }}\label{tf.textlinereader.numux5fworkux5funitsux5fcompletednamenone} Returns the number of work units this reader has finished processing. \subparagraph{Args: }\label{args-10} \begin{itemize} \tightlist \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-8} An int64 Tensor. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.TextLineReader.read(queue,\ name=None)} }{tf.TextLineReader.read(queue, name=None) }}\label{tf.textlinereader.readqueue-namenone} Returns the next record (key, value pair) produced by a reader. Will dequeue a work unit from queue if necessary (e.g.~when the Reader needs to start reading from a new file since it has finished with the previous file). \subparagraph{Args: }\label{args-11} \begin{itemize} \tightlist \item \texttt{queue}: A Queue or a mutable string Tensor representing a handle to a Queue, with string work items. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-9} A tuple of Tensors (key, value). \begin{itemize} \tightlist \item \texttt{key}: A string scalar Tensor. \item \texttt{value}: A string scalar Tensor. \end{itemize} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.TextLineReader.reader\_ref} }{tf.TextLineReader.reader\_ref }}\label{tf.textlinereader.readerux5fref} Op that implements the reader. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.TextLineReader.reset(name=None)} }{tf.TextLineReader.reset(name=None) }}\label{tf.textlinereader.resetnamenone} Restore a reader to its initial clean state. 
\subparagraph{Args: }\label{args-12} \begin{itemize} \tightlist \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-10} The created Operation. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.TextLineReader.restore\_state(state,\ name=None)} }{tf.TextLineReader.restore\_state(state, name=None) }}\label{tf.textlinereader.restoreux5fstatestate-namenone} Restore a reader to a previously saved state. Not all Readers support being restored, so this can produce an Unimplemented error. \subparagraph{Args: }\label{args-13} \begin{itemize} \tightlist \item \texttt{state}: A string Tensor. Result of a SerializeState of a Reader with matching type. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-11} The created Operation. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.TextLineReader.serialize\_state(name=None)} }{tf.TextLineReader.serialize\_state(name=None) }}\label{tf.textlinereader.serializeux5fstatenamenone} Produce a string tensor that encodes the state of a reader. Not all Readers support being serialized, so this can produce an Unimplemented error. \subparagraph{Args: }\label{args-14} \begin{itemize} \tightlist \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-12} A string Tensor. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.TextLineReader.supports\_serialize} }{tf.TextLineReader.supports\_serialize }}\label{tf.textlinereader.supportsux5fserialize} Whether the Reader implementation can serialize its state. 
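The behaviour documented for \texttt{TextLineReader} (one record per line, trailing newlines stripped, \texttt{skip\_header\_lines} applied at the start of every file) can be mimicked in a few lines of plain Python. The sketch below does not call TensorFlow; the helper name \texttt{text\_line\_records} and the \texttt{filename:line} key format are assumptions made for illustration, since this section only states that keys are string scalars.

```python
import io


def text_line_records(filename, stream, skip_header_lines=0):
    """Yield (key, value) pairs for one file; hypothetical helper, not TensorFlow.

    The key format "<filename>:<line number>" is an assumption of this sketch.
    """
    for line_number, line in enumerate(stream, start=1):
        if line_number <= skip_header_lines:
            continue  # header lines are skipped at the start of every file
        # Newlines are stripped from the value, as the class description says.
        yield "%s:%d" % (filename, line_number), line.rstrip("\n")


# An in-memory stand-in for a CSV file with one header line.
csv_text = io.StringIO("id,label\n1,cat\n2,dog\n")
records = list(text_line_records("pets.csv", csv_text, skip_header_lines=1))
```

Each value is a whole line of text, so a CSV line would still need to be split into fields afterwards, for example with \texttt{tf.decode\_csv} from the Converting section of this chapter.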
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{class\ tf.WholeFileReader} }{class tf.WholeFileReader }}\label{class-tf.wholefilereader} A Reader that outputs the entire contents of a file as a value. To use, enqueue filenames in a Queue. The output of Read will be a filename (key) and the contents of that file (value). See ReaderBase for supported methods. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.WholeFileReader.\_\_init\_\_(name=None)} }{tf.WholeFileReader.\_\_init\_\_(name=None) }}\label{tf.wholefilereader.ux5fux5finitux5fux5fnamenone} Create a WholeFileReader. \subparagraph{Args: }\label{args-15} \begin{itemize} \tightlist \item \texttt{name}: A name for the operation (optional). \end{itemize} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.WholeFileReader.num\_records\_produced(name=None)} }{tf.WholeFileReader.num\_records\_produced(name=None) }}\label{tf.wholefilereader.numux5frecordsux5fproducednamenone} Returns the number of records this reader has produced. This is the same as the number of Read executions that have succeeded. \subparagraph{Args: }\label{args-16} \begin{itemize} \tightlist \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-13} An int64 Tensor. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.WholeFileReader.num\_work\_units\_completed(name=None)} }{tf.WholeFileReader.num\_work\_units\_completed(name=None) }}\label{tf.wholefilereader.numux5fworkux5funitsux5fcompletednamenone} Returns the number of work units this reader has finished processing. \subparagraph{Args: }\label{args-17} \begin{itemize} \tightlist \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-14} An int64 Tensor.
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.WholeFileReader.read(queue,\ name=None)} }{tf.WholeFileReader.read(queue, name=None) }}\label{tf.wholefilereader.readqueue-namenone} Returns the next record (key, value pair) produced by a reader. Will dequeue a work unit from queue if necessary (e.g.~when the Reader needs to start reading from a new file since it has finished with the previous file). \subparagraph{Args: }\label{args-18} \begin{itemize} \tightlist \item \texttt{queue}: A Queue or a mutable string Tensor representing a handle to a Queue, with string work items. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-15} A tuple of Tensors (key, value). \begin{itemize} \tightlist \item \texttt{key}: A string scalar Tensor. \item \texttt{value}: A string scalar Tensor. \end{itemize} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.WholeFileReader.reader\_ref} }{tf.WholeFileReader.reader\_ref }}\label{tf.wholefilereader.readerux5fref} Op that implements the reader. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.WholeFileReader.reset(name=None)} }{tf.WholeFileReader.reset(name=None) }}\label{tf.wholefilereader.resetnamenone} Restore a reader to its initial clean state. \subparagraph{Args: }\label{args-19} \begin{itemize} \tightlist \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-16} The created Operation. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.WholeFileReader.restore\_state(state,\ name=None)} }{tf.WholeFileReader.restore\_state(state, name=None) }}\label{tf.wholefilereader.restoreux5fstatestate-namenone} Restore a reader to a previously saved state. 
Not all Readers support being restored, so this can produce an Unimplemented error. \subparagraph{Args: }\label{args-20} \begin{itemize} \tightlist \item \texttt{state}: A string Tensor. Result of a SerializeState of a Reader with matching type. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-17} The created Operation. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.WholeFileReader.serialize\_state(name=None)} }{tf.WholeFileReader.serialize\_state(name=None) }}\label{tf.wholefilereader.serializeux5fstatenamenone} Produce a string tensor that encodes the state of a reader. Not all Readers support being serialized, so this can produce an Unimplemented error. \subparagraph{Args: }\label{args-21} \begin{itemize} \tightlist \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-18} A string Tensor. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.WholeFileReader.supports\_serialize} }{tf.WholeFileReader.supports\_serialize }}\label{tf.wholefilereader.supportsux5fserialize} Whether the Reader implementation can serialize its state. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{class\ tf.IdentityReader} }{class tf.IdentityReader }}\label{class-tf.identityreader} A Reader that outputs the queued work as both the key and value. To use, enqueue strings in a Queue. Read will take the front work string and output (work, work). See ReaderBase for supported methods. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.IdentityReader.\_\_init\_\_(name=None)} }{tf.IdentityReader.\_\_init\_\_(name=None) }}\label{tf.identityreader.ux5fux5finitux5fux5fnamenone} Create an IdentityReader. \subparagraph{Args: }\label{args-22} \begin{itemize} \tightlist \item \texttt{name}: A name for the operation (optional).
\end{itemize} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.IdentityReader.num\_records\_produced(name=None)} }{tf.IdentityReader.num\_records\_produced(name=None) }}\label{tf.identityreader.numux5frecordsux5fproducednamenone} Returns the number of records this reader has produced. This is the same as the number of Read executions that have succeeded. \subparagraph{Args: }\label{args-23} \begin{itemize} \tightlist \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-19} An int64 Tensor. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.IdentityReader.num\_work\_units\_completed(name=None)} }{tf.IdentityReader.num\_work\_units\_completed(name=None) }}\label{tf.identityreader.numux5fworkux5funitsux5fcompletednamenone} Returns the number of work units this reader has finished processing. \subparagraph{Args: }\label{args-24} \begin{itemize} \tightlist \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-20} An int64 Tensor. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.IdentityReader.read(queue,\ name=None)} }{tf.IdentityReader.read(queue, name=None) }}\label{tf.identityreader.readqueue-namenone} Returns the next record (key, value pair) produced by a reader. Will dequeue a work unit from queue if necessary (e.g.~when the Reader needs to start reading from a new file since it has finished with the previous file). \subparagraph{Args: }\label{args-25} \begin{itemize} \tightlist \item \texttt{queue}: A Queue or a mutable string Tensor representing a handle to a Queue, with string work items. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-21} A tuple of Tensors (key, value). 
\begin{itemize} \tightlist \item \texttt{key}: A string scalar Tensor. \item \texttt{value}: A string scalar Tensor. \end{itemize} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.IdentityReader.reader\_ref} }{tf.IdentityReader.reader\_ref }}\label{tf.identityreader.readerux5fref} Op that implements the reader. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.IdentityReader.reset(name=None)} }{tf.IdentityReader.reset(name=None) }}\label{tf.identityreader.resetnamenone} Restore a reader to its initial clean state. \subparagraph{Args: }\label{args-26} \begin{itemize} \tightlist \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-22} The created Operation. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.IdentityReader.restore\_state(state,\ name=None)} }{tf.IdentityReader.restore\_state(state, name=None) }}\label{tf.identityreader.restoreux5fstatestate-namenone} Restore a reader to a previously saved state. Not all Readers support being restored, so this can produce an Unimplemented error. \subparagraph{Args: }\label{args-27} \begin{itemize} \tightlist \item \texttt{state}: A string Tensor. Result of a SerializeState of a Reader with matching type. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-23} The created Operation. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.IdentityReader.serialize\_state(name=None)} }{tf.IdentityReader.serialize\_state(name=None) }}\label{tf.identityreader.serializeux5fstatenamenone} Produce a string tensor that encodes the state of a reader. Not all Readers support being serialized, so this can produce an Unimplemented error. 
\subparagraph{Args: }\label{args-28} \begin{itemize} \tightlist \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-24} A string Tensor. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.IdentityReader.supports\_serialize} }{tf.IdentityReader.supports\_serialize }}\label{tf.identityreader.supportsux5fserialize} Whether the Reader implementation can serialize its state. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{class\ tf.TFRecordReader} }{class tf.TFRecordReader }}\label{class-tf.tfrecordreader} A Reader that outputs the records from a TFRecords file. See ReaderBase for supported methods. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.TFRecordReader.\_\_init\_\_(name=None)} }{tf.TFRecordReader.\_\_init\_\_(name=None) }}\label{tf.tfrecordreader.ux5fux5finitux5fux5fnamenone} Create a TFRecordReader. \subparagraph{Args: }\label{args-29} \begin{itemize} \tightlist \item \texttt{name}: A name for the operation (optional). \end{itemize} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.TFRecordReader.num\_records\_produced(name=None)} }{tf.TFRecordReader.num\_records\_produced(name=None) }}\label{tf.tfrecordreader.numux5frecordsux5fproducednamenone} Returns the number of records this reader has produced. This is the same as the number of Read executions that have succeeded. \subparagraph{Args: }\label{args-30} \begin{itemize} \tightlist \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-25} An int64 Tensor.
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.TFRecordReader.num\_work\_units\_completed(name=None)} }{tf.TFRecordReader.num\_work\_units\_completed(name=None) }}\label{tf.tfrecordreader.numux5fworkux5funitsux5fcompletednamenone} Returns the number of work units this reader has finished processing. \subparagraph{Args: }\label{args-31} \begin{itemize} \tightlist \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-26} An int64 Tensor. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.TFRecordReader.read(queue,\ name=None)} }{tf.TFRecordReader.read(queue, name=None) }}\label{tf.tfrecordreader.readqueue-namenone} Returns the next record (key, value pair) produced by a reader. Will dequeue a work unit from queue if necessary (e.g.~when the Reader needs to start reading from a new file since it has finished with the previous file). \subparagraph{Args: }\label{args-32} \begin{itemize} \tightlist \item \texttt{queue}: A Queue or a mutable string Tensor representing a handle to a Queue, with string work items. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-27} A tuple of Tensors (key, value). \begin{itemize} \tightlist \item \texttt{key}: A string scalar Tensor. \item \texttt{value}: A string scalar Tensor. \end{itemize} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.TFRecordReader.reader\_ref} }{tf.TFRecordReader.reader\_ref }}\label{tf.tfrecordreader.readerux5fref} Op that implements the reader. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.TFRecordReader.reset(name=None)} }{tf.TFRecordReader.reset(name=None) }}\label{tf.tfrecordreader.resetnamenone} Restore a reader to its initial clean state. 
\subparagraph{Args: }\label{args-33} \begin{itemize} \tightlist \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-28} The created Operation. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.TFRecordReader.restore\_state(state,\ name=None)} }{tf.TFRecordReader.restore\_state(state, name=None) }}\label{tf.tfrecordreader.restoreux5fstatestate-namenone} Restore a reader to a previously saved state. Not all Readers support being restored, so this can produce an Unimplemented error. \subparagraph{Args: }\label{args-34} \begin{itemize} \tightlist \item \texttt{state}: A string Tensor. Result of a SerializeState of a Reader with matching type. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-29} The created Operation. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.TFRecordReader.serialize\_state(name=None)} }{tf.TFRecordReader.serialize\_state(name=None) }}\label{tf.tfrecordreader.serializeux5fstatenamenone} Produce a string tensor that encodes the state of a reader. Not all Readers support being serialized, so this can produce an Unimplemented error. \subparagraph{Args: }\label{args-35} \begin{itemize} \tightlist \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-30} A string Tensor. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.TFRecordReader.supports\_serialize} }{tf.TFRecordReader.supports\_serialize }}\label{tf.tfrecordreader.supportsux5fserialize} Whether the Reader implementation can serialize its state. 
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{class\ tf.FixedLengthRecordReader} }{class tf.FixedLengthRecordReader }}\label{class-tf.fixedlengthrecordreader} A Reader that outputs fixed-length records from a file. See ReaderBase for supported methods. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.FixedLengthRecordReader.\_\_init\_\_(record\_bytes,\ header\_bytes=None,\ footer\_bytes=None,\ name=None)} }{tf.FixedLengthRecordReader.\_\_init\_\_(record\_bytes, header\_bytes=None, footer\_bytes=None, name=None) }}\label{tf.fixedlengthrecordreader.ux5fux5finitux5fux5frecordux5fbytes-headerux5fbytesnone-footerux5fbytesnone-namenone} Create a FixedLengthRecordReader. \subparagraph{Args: }\label{args-36} \begin{itemize} \tightlist \item \texttt{record\_bytes}: An int. \item \texttt{header\_bytes}: An optional int. Defaults to 0. \item \texttt{footer\_bytes}: An optional int. Defaults to 0. \item \texttt{name}: A name for the operation (optional). \end{itemize} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.FixedLengthRecordReader.num\_records\_produced(name=None)} }{tf.FixedLengthRecordReader.num\_records\_produced(name=None) }}\label{tf.fixedlengthrecordreader.numux5frecordsux5fproducednamenone} Returns the number of records this reader has produced. This is the same as the number of Read executions that have succeeded. \subparagraph{Args: }\label{args-37} \begin{itemize} \tightlist \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-31} An int64 Tensor.
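To make the roles of \texttt{record\_bytes}, \texttt{header\_bytes} and \texttt{footer\_bytes} concrete, the sketch below slices an in-memory byte string the way a fixed-length reader is described as doing: skip the header, emit consecutive chunks of \texttt{record\_bytes} bytes, and ignore the footer. It is a pure-Python illustration, not the TensorFlow implementation. The function name is hypothetical, and raising an error on a trailing partial record is an assumption of this sketch rather than documented behaviour.

```python
def split_fixed_length(data, record_bytes, header_bytes=0, footer_bytes=0):
    """Return the list of fixed-length records contained in `data` (bytes)."""
    # Drop the header at the front and the footer at the back.
    body = data[header_bytes:len(data) - footer_bytes]
    if len(body) % record_bytes != 0:
        # Assumption of this sketch: a partial trailing record is an error.
        raise ValueError("body length is not a multiple of record_bytes")
    return [body[i:i + record_bytes] for i in range(0, len(body), record_bytes)]


# A 2-byte header, two 3-byte records, and a 2-byte footer.
records = split_fixed_length(b"HDabcdefFT", record_bytes=3,
                             header_bytes=2, footer_bytes=2)
```

In this model, \texttt{len(records)} plays the role of the value that \texttt{num\_records\_produced} would report once every record has been read.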
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.FixedLengthRecordReader.num\_work\_units\_completed(name=None)} }{tf.FixedLengthRecordReader.num\_work\_units\_completed(name=None) }}\label{tf.fixedlengthrecordreader.numux5fworkux5funitsux5fcompletednamenone} Returns the number of work units this reader has finished processing. \subparagraph{Args: }\label{args-38} \begin{itemize} \tightlist \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-32} An int64 Tensor. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.FixedLengthRecordReader.read(queue,\ name=None)} }{tf.FixedLengthRecordReader.read(queue, name=None) }}\label{tf.fixedlengthrecordreader.readqueue-namenone} Returns the next record (key, value pair) produced by a reader. Will dequeue a work unit from queue if necessary (e.g.~when the Reader needs to start reading from a new file since it has finished with the previous file). \subparagraph{Args: }\label{args-39} \begin{itemize} \tightlist \item \texttt{queue}: A Queue or a mutable string Tensor representing a handle to a Queue, with string work items. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-33} A tuple of Tensors (key, value). \begin{itemize} \tightlist \item \texttt{key}: A string scalar Tensor. \item \texttt{value}: A string scalar Tensor. \end{itemize} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.FixedLengthRecordReader.reader\_ref} }{tf.FixedLengthRecordReader.reader\_ref }}\label{tf.fixedlengthrecordreader.readerux5fref} Op that implements the reader. 
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.FixedLengthRecordReader.reset(name=None)} }{tf.FixedLengthRecordReader.reset(name=None) }}\label{tf.fixedlengthrecordreader.resetnamenone} Restore a reader to its initial clean state. \subparagraph{Args: }\label{args-40} \begin{itemize} \tightlist \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-34} The created Operation. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.FixedLengthRecordReader.restore\_state(state,\ name=None)} }{tf.FixedLengthRecordReader.restore\_state(state, name=None) }}\label{tf.fixedlengthrecordreader.restoreux5fstatestate-namenone} Restore a reader to a previously saved state. Not all Readers support being restored, so this can produce an Unimplemented error. \subparagraph{Args: }\label{args-41} \begin{itemize} \tightlist \item \texttt{state}: A string Tensor. Result of a SerializeState of a Reader with matching type. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-35} The created Operation. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.FixedLengthRecordReader.serialize\_state(name=None)} }{tf.FixedLengthRecordReader.serialize\_state(name=None) }}\label{tf.fixedlengthrecordreader.serializeux5fstatenamenone} Produce a string tensor that encodes the state of a reader. Not all Readers support being serialized, so this can produce an Unimplemented error. \subparagraph{Args: }\label{args-42} \begin{itemize} \tightlist \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-36} A string Tensor. 
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.FixedLengthRecordReader.supports\_serialize} }{tf.FixedLengthRecordReader.supports\_serialize }}\label{tf.fixedlengthrecordreader.supportsux5fserialize} Whether the Reader implementation can serialize its state. \subsection{Converting }\label{converting} TensorFlow provides several operations that you can use to convert various data formats into tensors. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.decode\_csv(records,\ record\_defaults,\ field\_delim=None,\ name=None)} }{tf.decode\_csv(records, record\_defaults, field\_delim=None, name=None) }}\label{tf.decodeux5fcsvrecords-recordux5fdefaults-fieldux5fdelimnone-namenone} Convert CSV records to tensors. Each column maps to one tensor. The CSV records are expected to be in \href{https://tools.ietf.org/html/rfc4180}{RFC 4180} format. Note that leading and trailing spaces are allowed in int and float fields. \subparagraph{Args: }\label{args-43} \begin{itemize} \tightlist \item \texttt{records}: A \texttt{Tensor} of type \texttt{string}. Each string is a record/row in the csv and all records should have the same format. \item \texttt{record\_defaults}: A list of \texttt{Tensor} objects with types from: \texttt{float32}, \texttt{int32}, \texttt{int64}, \texttt{string}. One tensor per column of the input record, with either a scalar default value for that column or empty if the column is required. \item \texttt{field\_delim}: An optional \texttt{string}. Defaults to \texttt{","}. The delimiter used to separate fields in a record. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-37} A list of \texttt{Tensor} objects. Has the same type as \texttt{record\_defaults}. Each tensor will have the same shape as records.
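The column-wise behaviour of \texttt{tf.decode\_csv} can be modelled in plain Python. The sketch below is not the TensorFlow op: it uses the standard \texttt{csv} module, returns one Python list per column instead of one tensor, and represents each entry of \texttt{record\_defaults} as a hypothetical \texttt{(type, default)} pair in which a default of \texttt{None} marks the column as required.

```python
import csv


def decode_csv_rows(records, record_defaults, field_delim=","):
    """Pure-Python model: return one list per column, typed like its default."""
    columns = [[] for _ in record_defaults]
    for row in csv.reader(records, delimiter=field_delim):
        if len(row) != len(record_defaults):
            raise ValueError("expected %d fields, got %d"
                             % (len(record_defaults), len(row)))
        for column, field, (dtype, default) in zip(columns, row, record_defaults):
            if field == "":
                # Empty field: fall back to the default, if the column has one.
                if default is None:
                    raise ValueError("a required field is missing")
                column.append(default)
            elif dtype is str:
                column.append(field)
            else:
                # Leading and trailing spaces are tolerated in numeric fields.
                column.append(dtype(field.strip()))
    return columns


ids, scores, labels = decode_csv_rows(
    ["1, 2.5,cat", "2,,dog"],
    [(int, None), (float, 0.0), (str, "")])
```

Each returned list has one entry per input record, mirroring the statement that every output tensor has the same shape as \texttt{records}.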
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.decode\_raw(bytes,\ out\_type,\ little\_endian=None,\ name=None)} }{tf.decode\_raw(bytes, out\_type, little\_endian=None, name=None) }}\label{tf.decodeux5frawbytes-outux5ftype-littleux5fendiannone-namenone} Reinterpret the bytes of a string as a vector of numbers. \subparagraph{Args: }\label{args-44} \begin{itemize} \tightlist \item \texttt{bytes}: A \texttt{Tensor} of type \texttt{string}. All the elements must have the same length. \item \texttt{out\_type}: A \texttt{tf.DType} from: \texttt{tf.float32,\ tf.float64,\ tf.int32,\ tf.uint8,\ tf.int16,\ tf.int8,\ tf.int64}. \item \texttt{little\_endian}: An optional \texttt{bool}. Defaults to \texttt{True}. Whether the input bytes are in little-endian order. Ignored for out\_types that are stored in a single byte like uint8. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-38} A \texttt{Tensor} of type \texttt{out\_type}. A Tensor with one more dimension than the input bytes. The added dimension will have size equal to the length of the elements of bytes divided by the number of bytes to represent out\_type. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{Example protocol buffer }\label{example-protocol-buffer} TensorFlow's \href{../../how_tos/reading_data/index.md\#standard-tensorflow-format}{recommended format for training examples} is serialized \texttt{Example} protocol buffers, \href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/example/example.proto}{described here}. They contain \texttt{Features}, \href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/example/feature.proto}{described here}. 
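Returning briefly to \texttt{tf.decode\_raw}: the size of the added dimension and the effect of \texttt{little\_endian} are easiest to see on a small input. The sketch below reproduces the described reinterpretation for a single byte string using the standard \texttt{struct} module. It is a pure-Python model, not the TensorFlow op; \texttt{decode\_raw\_py} and the use of plain strings for \texttt{out\_type} are assumptions made for illustration.

```python
import struct

# struct format characters for the out_type names used in this sketch.
_FORMATS = {"float32": "f", "float64": "d", "int32": "i",
            "uint8": "B", "int16": "h", "int8": "b", "int64": "q"}


def decode_raw_py(data, out_type, little_endian=True):
    """Reinterpret `data` (bytes) as a list of numbers of type `out_type`."""
    fmt = _FORMATS[out_type]
    size = struct.calcsize("<" + fmt)        # bytes per output element
    if len(data) % size != 0:
        raise ValueError("input length is not a multiple of the element size")
    order = "<" if little_endian else ">"
    count = len(data) // size                # size of the added dimension
    return list(struct.unpack(order + str(count) + fmt, data))


raw = b"\x01\x00\x02\x00"
values = decode_raw_py(raw, "int16")                          # 4 bytes / 2 = 2 elements
swapped = decode_raw_py(raw, "int16", little_endian=False)    # same bytes, big-endian
as_bytes = decode_raw_py(raw, "uint8")                        # byte order is irrelevant
```

Four input bytes yield two \texttt{int16} values or four \texttt{uint8} values, matching the rule that the added dimension equals the byte length divided by the size of \texttt{out\_type}.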
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.parse\_example(serialized,\ names=None,\ sparse\_keys=None,\ sparse\_types=None,\ dense\_keys=None,\ dense\_types=None,\ dense\_defaults=None,\ dense\_shapes=None,\ name=\textquotesingle{}ParseExample\textquotesingle{})} }{tf.parse\_example(serialized, names=None, sparse\_keys=None, sparse\_types=None, dense\_keys=None, dense\_types=None, dense\_defaults=None, dense\_shapes=None, name='ParseExample') }}\label{tf.parseux5fexampleserialized-namesnone-sparseux5fkeysnone-sparseux5ftypesnone-denseux5fkeysnone-denseux5ftypesnone-denseux5fdefaultsnone-denseux5fshapesnone-nameparseexample} Parses \texttt{Example} protos. Parses a number of serialized \href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/example/example.proto}{\texttt{Example}} protos given in \texttt{serialized}. \texttt{names} may contain descriptive names for the corresponding serialized protos. These may be useful for debugging purposes, but they have no effect on the output. If not \texttt{None}, \texttt{names} must be the same length as \texttt{serialized}. This op parses serialized examples into a dictionary mapping keys to \texttt{Tensor} and \texttt{SparseTensor} objects, depending on whether the keys appear in \texttt{dense\_keys} or \texttt{sparse\_keys} respectively. The key \texttt{dense\_keys{[}j{]}} is mapped to a \texttt{Tensor} of type \texttt{dense\_types{[}j{]}} and of shape \texttt{(serialized.size(),)\ +\ dense\_shapes{[}j{]}}. \texttt{dense\_defaults} provides defaults for values referenced using \texttt{dense\_keys}. If a key is not present in this dictionary, the corresponding dense \texttt{Feature} is required in all elements of \texttt{serialized}. \texttt{dense\_shapes{[}j{]}} provides the shape of each \texttt{Feature} entry referenced by \texttt{dense\_keys{[}j{]}}.
The \texttt{Feature} corresponding to \texttt{dense\_keys{[}j{]}} must always have exactly \texttt{np.prod(dense\_shapes{[}j{]})} entries. The returned \texttt{Tensor} for \texttt{dense\_keys{[}j{]}} has shape \texttt{{[}N{]}\ +\ dense\_shapes{[}j{]}}, where \texttt{N} is the number of \texttt{Example}s in \texttt{serialized}. The key \texttt{sparse\_keys{[}j{]}} is mapped to a \texttt{SparseTensor} of type \texttt{sparse\_types{[}j{]}}. The \texttt{SparseTensor} represents a ragged matrix. Its indices are \texttt{{[}batch,\ index{]}} where \texttt{batch} is the batch entry the value is from, and \texttt{index} is the value's index in the list of values associated with that feature and example.

Examples:

If one expects a \texttt{tf.float32} sparse feature \texttt{ft} and three serialized \texttt{Example}s are provided:
\begin{verbatim}
serialized = [
  features:
    { feature: [ key: { "ft" value: float_list: { value: [1.0, 2.0] } } ] },
  features:
    { feature: [] },
  features:
    { feature: [ key: { "ft" value: float_list: { value: [3.0] } } ] }
]
\end{verbatim}
then the output will look like:
\begin{verbatim}
{"ft": SparseTensor(indices=[[0, 0], [0, 1], [2, 0]],
                    values=[1.0, 2.0, 3.0],
                    shape=(3, 2)) }
\end{verbatim}
Given two \texttt{Example} input protos in \texttt{serialized}:
\begin{verbatim}
[
  features: {
    feature: { key: "kw" value: { bytes_list: { value: [ "knit", "big" ] } } }
    feature: { key: "gps" value: { float_list: { value: [] } } }
  },
  features: {
    feature: { key: "kw" value: { bytes_list: { value: [ "emmy" ] } } }
    feature: { key: "dank" value: { int64_list: { value: [ 42 ] } } }
    feature: { key: "gps" value: { } }
  }
]
\end{verbatim}
and the arguments
\begin{verbatim}
names: ["input0", "input1"],
sparse_keys: ["kw", "dank", "gps"]
sparse_types: [DT_STRING, DT_INT64, DT_FLOAT]
\end{verbatim}
the output is a dictionary: \begin{Shaded} \begin{Highlighting}[] \NormalTok{\{} \StringTok{"kw"}\NormalTok{: SparseTensor(}
\NormalTok{indices}\OperatorTok{=}\NormalTok{[[}\DecValTok{0}\NormalTok{, }\DecValTok{0}\NormalTok{], [}\DecValTok{0}\NormalTok{, }\DecValTok{1}\NormalTok{], [}\DecValTok{1}\NormalTok{, }\DecValTok{0}\NormalTok{]],} \NormalTok{values}\OperatorTok{=}\NormalTok{[}\StringTok{"knit"}\NormalTok{, }\StringTok{"big"}\NormalTok{, }\StringTok{"emmy"}\NormalTok{]} \NormalTok{shape}\OperatorTok{=}\NormalTok{[}\DecValTok{2}\NormalTok{, }\DecValTok{2}\NormalTok{]),} \CommentTok{"dank"}\NormalTok{: SparseTensor(} \NormalTok{indices}\OperatorTok{=}\NormalTok{[[}\DecValTok{1}\NormalTok{, }\DecValTok{0}\NormalTok{]],} \NormalTok{values}\OperatorTok{=}\NormalTok{[}\DecValTok{42}\NormalTok{],} \NormalTok{shape}\OperatorTok{=}\NormalTok{[}\DecValTok{2}\NormalTok{, }\DecValTok{1}\NormalTok{]),} \CommentTok{"gps"}\NormalTok{: SparseTensor(} \NormalTok{indices}\OperatorTok{=}\NormalTok{[],} \NormalTok{values}\OperatorTok{=}\NormalTok{[],} \NormalTok{shape}\OperatorTok{=}\NormalTok{[}\DecValTok{2}\NormalTok{, }\DecValTok{0}\NormalTok{]),} \NormalTok{\}} \end{Highlighting} \end{Shaded} For dense results in two serialized \texttt{Example}s: \begin{verbatim} [ features: { feature: { key: "age" value: { int64_list: { value: [ 0 ] } } } feature: { key: "gender" value: { bytes_list: { value: [ "f" ] } } } }, features: { feature: { key: "age" value: { int64_list: { value: [] } } } feature: { key: "gender" value: { bytes_list: { value: [ "f" ] } } } } ] \end{verbatim} We can use arguments: \begin{verbatim} names: ["input0", "input1"], dense_keys: np.array(["age", "gender"]), dense_types: [tf.int64, tf.string], dense_defaults: { "age": -1 # "age" defaults to -1 if missing # "gender" has no specified default so it's required } dense_shapes: [(1,), (1,)], # age, gender, label, weight \end{verbatim} And the expected output is: \begin{Shaded} \begin{Highlighting}[] \NormalTok{\{} \StringTok{"age"}\NormalTok{: [[}\DecValTok{0}\NormalTok{], [}\OperatorTok{-}\DecValTok{1}\NormalTok{]],} 
\CommentTok{"gender"}\NormalTok{: [[}\StringTok{"f"}\NormalTok{], [}\StringTok{"f"}\NormalTok{]],} \NormalTok{\}} \end{Highlighting} \end{Shaded} \subparagraph{Args: }\label{args-45} \begin{itemize} \tightlist \item \texttt{serialized}: A list of strings, a batch of binary serialized \texttt{Example} protos. \item \texttt{names}: A list of strings, the names of the serialized protos. \item \texttt{sparse\_keys}: A list of string keys in the examples' features. The results for these keys will be returned as \texttt{SparseTensor} objects. \item \texttt{sparse\_types}: A list of \texttt{DTypes} of the same length as \texttt{sparse\_keys}. Only \texttt{tf.float32} (\texttt{FloatList}), \texttt{tf.int64} (\texttt{Int64List}), and \texttt{tf.string} (\texttt{BytesList}) are supported. \item \texttt{dense\_keys}: A list of string keys in the examples' features. The results for these keys will be returned as \texttt{Tensor}s \item \texttt{dense\_types}: A list of DTypes of the same length as \texttt{dense\_keys}. Only \texttt{tf.float32} (\texttt{FloatList}), \texttt{tf.int64} (\texttt{Int64List}), and \texttt{tf.string} (\texttt{BytesList}) are supported. \item \texttt{dense\_defaults}: A dict mapping string keys to \texttt{Tensor}s. The keys of the dict must match the dense\_keys of the feature. \item \texttt{dense\_shapes}: A list of tuples with the same length as \texttt{dense\_keys}. The shape of the data for each dense feature referenced by \texttt{dense\_keys}. \item \texttt{name}: A name for this operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-39} A \texttt{dict} mapping keys to \texttt{Tensor}s and \texttt{SparseTensor}s. \subparagraph{Raises: }\label{raises} \begin{itemize} \tightlist \item \texttt{ValueError}: If sparse and dense key sets intersect, or input lengths do not match up. 
\end{itemize} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.parse\_single\_example(serialized,\ names=None,\ sparse\_keys=None,\ sparse\_types=None,\ dense\_keys=None,\ dense\_types=None,\ dense\_defaults=None,\ dense\_shapes=None,\ name=\textquotesingle{}ParseSingleExample\textquotesingle{})} }{tf.parse\_single\_example(serialized, names=None, sparse\_keys=None, sparse\_types=None, dense\_keys=None, dense\_types=None, dense\_defaults=None, dense\_shapes=None, name='ParseSingleExample') }}\label{tf.parseux5fsingleux5fexampleserialized-namesnone-sparseux5fkeysnone-sparseux5ftypesnone-denseux5fkeysnone-denseux5ftypesnone-denseux5fdefaultsnone-denseux5fshapesnone-nameparsesingleexample} Parses a single \texttt{Example} proto. Similar to \texttt{parse\_example}, except: For dense tensors, the returned \texttt{Tensor} is identical to the output of \texttt{parse\_example}, except there is no batch dimension, the output shape is the same as the shape given in \texttt{dense\_shape}. For \texttt{SparseTensor}s, the first (batch) column of the indices matrix is removed (the indices matrix is a column vector), the values vector is unchanged, and the first (batch\_size) entry of the shape vector is removed (it is now a single element vector). See also \texttt{parse\_example}. \subparagraph{Args: }\label{args-46} \begin{itemize} \tightlist \item \texttt{serialized}: A scalar string, a single serialized Example. See parse\_example documentation for more details. \item \texttt{names}: (Optional) A scalar string, the associated name. See parse\_example documentation for more details. \item \texttt{sparse\_keys}: See parse\_example documentation for more details. \item \texttt{sparse\_types}: See parse\_example documentation for more details. \item \texttt{dense\_keys}: See parse\_example documentation for more details. \item \texttt{dense\_types}: See parse\_example documentation for more details. 
\item \texttt{dense\_defaults}: See parse\_example documentation for more details. \item \texttt{dense\_shapes}: See parse\_example documentation for more details. \item \texttt{name}: A name for this operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-40} A dictionary mapping keys to Tensors and SparseTensors. \subparagraph{Raises: }\label{raises-1} \begin{itemize} \tightlist \item \texttt{ValueError}: if \texttt{serialized} or \texttt{names} have known shapes, and are not scalars. \end{itemize} \subsection{Queues }\label{queues} TensorFlow provides several implementations of `Queues', which are structures within the TensorFlow computation graph to stage pipelines of tensors together. The following sections describe the basic Queue interface and some implementations. For an example of their use, see \href{../../how_tos/threading_and_queues/index.md}{Threading and Queues}. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{class\ tf.QueueBase} }{class tf.QueueBase }}\label{class-tf.queuebase} Base class for queue implementations. A queue is a TensorFlow data structure that stores tensors across multiple steps, and exposes operations that enqueue and dequeue tensors. Each queue element is a tuple of one or more tensors, where each tuple component has a static dtype, and may have a static shape. The queue implementations support versions of enqueue and dequeue that handle single elements, and versions that enqueue and dequeue a batch of elements at once. See \protect\hyperlink{FIFOQueue}{\texttt{tf.FIFOQueue}} and \protect\hyperlink{RandomShuffleQueue}{\texttt{tf.RandomShuffleQueue}} for concrete implementations of this class, and instructions on how to create them.
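As a rough mental model of this interface, the sketch below implements a single-threaded, pure-Python queue whose elements are tuples and whose batch operations slice and concatenate along the first dimension. It is not TensorFlow code: \texttt{ToyQueue} is a hypothetical name, component tensors are modelled as Python lists, and wherever a real queue operation would block, the sketch raises \texttt{RuntimeError} instead.

```python
from collections import deque


class ToyQueue:
    """Hypothetical single-threaded model of a bounded queue of tuples."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = deque()
        self._closed = False

    def enqueue(self, vals):
        if self._closed:
            raise RuntimeError("queue is closed")
        if len(self._items) >= self.capacity:
            raise RuntimeError("queue is full (a real queue would block)")
        self._items.append(tuple(vals))

    def enqueue_many(self, vals):
        # Slice every component along its 0th dimension: row i of each
        # component together forms queue element i.
        for element in zip(*vals):
            self.enqueue(element)

    def dequeue(self):
        if not self._items:
            raise RuntimeError("queue is empty (a real queue would block)")
        return self._items.popleft()

    def dequeue_many(self, n):
        if len(self._items) < n:
            raise RuntimeError("fewer than n elements are available")
        rows = [self._items.popleft() for _ in range(n)]
        # Concatenate along the 0th dimension: one list of length n per component.
        return tuple(list(component) for component in zip(*rows))

    def size(self):
        return len(self._items)

    def close(self):
        # No further enqueues are accepted; remaining elements can be dequeued.
        self._closed = True


q = ToyQueue(capacity=4)
q.enqueue_many(([1, 2, 3], ["a", "b", "c"]))   # three two-component elements
first = q.dequeue()
q.close()
batch = q.dequeue_many(2)                      # still succeeds after close
```

Here \texttt{first} is a single element, while \texttt{batch} holds one list per component, each of length 2.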
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.QueueBase.enqueue(vals,\ name=None)} }{tf.QueueBase.enqueue(vals, name=None) }}\label{tf.queuebase.enqueuevals-namenone} Enqueues one element to this queue. If the queue is full when this operation executes, it will block until the element has been enqueued. \subparagraph{Args: }\label{args-47} \begin{itemize} \tightlist \item \texttt{vals}: The tuple of \texttt{Tensor} objects to be enqueued. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-41} The operation that enqueues a new tuple of tensors to the queue. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.QueueBase.enqueue\_many(vals,\ name=None)} }{tf.QueueBase.enqueue\_many(vals, name=None) }}\label{tf.queuebase.enqueueux5fmanyvals-namenone} Enqueues zero or more elements to this queue. This operation slices each component tensor along the 0th dimension to make multiple queue elements. All of the tensors in \texttt{vals} must have the same size in the 0th dimension. If the queue is full when this operation executes, it will block until all of the elements have been enqueued. \subparagraph{Args: }\label{args-48} \begin{itemize} \tightlist \item \texttt{vals}: The tensor or tuple of tensors from which the queue elements are taken. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-42} The operation that enqueues a batch of tuples of tensors to the queue. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.QueueBase.dequeue(name=None)} }{tf.QueueBase.dequeue(name=None) }}\label{tf.queuebase.dequeuenamenone} Dequeues one element from this queue. If the queue is empty when this operation executes, it will block until there is an element to dequeue.
\subparagraph{Args: }\label{args-49} \begin{itemize} \tightlist \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-43} The tuple of tensors that was dequeued. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.QueueBase.dequeue\_many(n,\ name=None)} }{tf.QueueBase.dequeue\_many(n, name=None) }}\label{tf.queuebase.dequeueux5fmanyn-namenone} Dequeues and concatenates \texttt{n} elements from this queue. This operation concatenates queue-element component tensors along the 0th dimension to make a single component tensor. All of the components in the dequeued tuple will have size \texttt{n} in the 0th dimension. If the queue contains fewer than \texttt{n} elements when this operation executes, it will block until \texttt{n} elements have been dequeued. \subparagraph{Args: }\label{args-50} \begin{itemize} \tightlist \item \texttt{n}: A scalar \texttt{Tensor} containing the number of elements to dequeue. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-44} The tuple of concatenated tensors that was dequeued. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.QueueBase.size(name=None)} }{tf.QueueBase.size(name=None) }}\label{tf.queuebase.sizenamenone} Compute the number of elements in this queue. \subparagraph{Args: }\label{args-51} \begin{itemize} \tightlist \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-45} A scalar tensor containing the number of elements in this queue. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.QueueBase.close(cancel\_pending\_enqueues=False,\ name=None)} }{tf.QueueBase.close(cancel\_pending\_enqueues=False, name=None) }}\label{tf.queuebase.closecancelux5fpendingux5fenqueuesfalse-namenone} Closes this queue. 
This operation signals that no more elements will be enqueued in the given queue. Subsequent \texttt{enqueue} and \texttt{enqueue\_many} operations will fail. Subsequent \texttt{dequeue} and \texttt{dequeue\_many} operations will continue to succeed if sufficient elements remain in the queue. Subsequent \texttt{dequeue} and \texttt{dequeue\_many} operations that would block will fail immediately. If \texttt{cancel\_pending\_enqueues} is \texttt{True}, all pending requests will also be cancelled. \subparagraph{Args: }\label{args-52} \begin{itemize} \tightlist \item \texttt{cancel\_pending\_enqueues}: (Optional.) A boolean, defaulting to \texttt{False} (described above). \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-46} The operation that closes the queue. \paragraph{Other Methods }\label{other-methods} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.QueueBase.\_\_init\_\_(dtypes,\ shapes,\ queue\_ref)} }{tf.QueueBase.\_\_init\_\_(dtypes, shapes, queue\_ref) }}\label{tf.queuebase.ux5fux5finitux5fux5fdtypes-shapes-queueux5fref} Constructs a queue object from a queue reference. \subparagraph{Args: }\label{args-53} \begin{itemize} \tightlist \item \texttt{dtypes}: A list of types. The length of dtypes must equal the number of tensors in each element. \item \texttt{shapes}: Constraints on the shapes of tensors in an element: A list of shape tuples or None. This list is the same length as dtypes. If the shape of any tensors in the element are constrained, all must be; shapes can be None if the shapes should not be constrained. \item \texttt{queue\_ref}: The queue reference, i.e.~the output of the queue op. \end{itemize} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.QueueBase.dtypes} }{tf.QueueBase.dtypes }}\label{tf.queuebase.dtypes} The list of dtypes for each component of a queue element. 
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.QueueBase.name} }{tf.QueueBase.name }}\label{tf.queuebase.name} The name of the underlying queue. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.QueueBase.queue\_ref} }{tf.QueueBase.queue\_ref }}\label{tf.queuebase.queueux5fref} The underlying queue reference. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{class\ tf.FIFOQueue} }{class tf.FIFOQueue }}\label{class-tf.fifoqueue} A queue implementation that dequeues elements in first-in-first out order. See \protect\hyperlink{QueueBase}{\texttt{tf.QueueBase}} for a description of the methods on this class. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.FIFOQueue.\_\_init\_\_(capacity,\ dtypes,\ shapes=None,\ shared\_name=None,\ name=\textquotesingle{}fifo\_queue\textquotesingle{})} }{tf.FIFOQueue.\_\_init\_\_(capacity, dtypes, shapes=None, shared\_name=None, name='fifo\_queue') }}\label{tf.fifoqueue.ux5fux5finitux5fux5fcapacity-dtypes-shapesnone-sharedux5fnamenone-namefifoux5fqueue} Creates a queue that dequeues elements in a first-in first-out order. A \texttt{FIFOQueue} has bounded capacity; supports multiple concurrent producers and consumers; and provides exactly-once delivery. A \texttt{FIFOQueue} holds a list of up to \texttt{capacity} elements. Each element is a fixed-length tuple of tensors whose dtypes are described by \texttt{dtypes}, and whose shapes are optionally described by the \texttt{shapes} argument. If the \texttt{shapes} argument is specified, each component of a queue element must have the respective fixed shape. If it is unspecified, different queue elements may have different shapes, but the use of \texttt{dequeue\_many} is disallowed. \subparagraph{Args: }\label{args-54} \begin{itemize} \tightlist \item \texttt{capacity}: An integer. 
The upper bound on the number of elements that may be stored in this queue. \item \texttt{dtypes}: A list of \texttt{DType} objects. The length of \texttt{dtypes} must equal the number of tensors in each queue element. \item \texttt{shapes}: (Optional.) A list of fully-defined \texttt{TensorShape} objects, with the same length as \texttt{dtypes} or \texttt{None}. \item \texttt{shared\_name}: (Optional.) If non-empty, this queue will be shared under the given name across multiple sessions. \item \texttt{name}: Optional name for the queue operation. \end{itemize} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{class\ tf.RandomShuffleQueue} }{class tf.RandomShuffleQueue }}\label{class-tf.randomshufflequeue} A queue implementation that dequeues elements in a random order. See \protect\hyperlink{QueueBase}{\texttt{tf.QueueBase}} for a description of the methods on this class. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.RandomShuffleQueue.\_\_init\_\_(capacity,\ min\_after\_dequeue,\ dtypes,\ shapes=None,\ seed=None,\ shared\_name=None,\ name=\textquotesingle{}random\_shuffle\_queue\textquotesingle{})} }{tf.RandomShuffleQueue.\_\_init\_\_(capacity, min\_after\_dequeue, dtypes, shapes=None, seed=None, shared\_name=None, name='random\_shuffle\_queue') }}\label{tf.randomshufflequeue.ux5fux5finitux5fux5fcapacity-minux5fafterux5fdequeue-dtypes-shapesnone-seednone-sharedux5fnamenone-namerandomux5fshuffleux5fqueue} Create a queue that dequeues elements in a random order. A \texttt{RandomShuffleQueue} has bounded capacity; supports multiple concurrent producers and consumers; and provides exactly-once delivery. A \texttt{RandomShuffleQueue} holds a list of up to \texttt{capacity} elements. Each element is a fixed-length tuple of tensors whose dtypes are described by \texttt{dtypes}, and whose shapes are optionally described by the \texttt{shapes} argument. 
If the \texttt{shapes} argument is specified, each component of a queue element must have the respective fixed shape. If it is unspecified, different queue elements may have different shapes, but the use of \texttt{dequeue\_many} is disallowed. The \texttt{min\_after\_dequeue} argument allows the caller to specify a minimum number of elements that will remain in the queue after a \texttt{dequeue} or \texttt{dequeue\_many} operation completes, to ensure a minimum level of mixing of elements. This invariant is maintained by blocking those operations until sufficient elements have been enqueued. The \texttt{min\_after\_dequeue} argument is ignored after the queue has been closed. \subparagraph{Args: }\label{args-55} \begin{itemize} \tightlist \item \texttt{capacity}: An integer. The upper bound on the number of elements that may be stored in this queue. \item \texttt{min\_after\_dequeue}: An integer (described above). \item \texttt{dtypes}: A list of \texttt{DType} objects. The length of \texttt{dtypes} must equal the number of tensors in each queue element. \item \texttt{shapes}: (Optional.) A list of fully-defined \texttt{TensorShape} objects, with the same length as \texttt{dtypes} or \texttt{None}. \item \texttt{seed}: A Python integer. Used to create a random seed. See \href{../../api_docs/python/constant_op.md\#set_random_seed}{\texttt{set\_random\_seed}} for behavior. \item \texttt{shared\_name}: (Optional.) If non-empty, this queue will be shared under the given name across multiple sessions. \item \texttt{name}: Optional name for the queue operation. \end{itemize} \subsection{Dealing with the filesystem }\label{dealing-with-the-filesystem} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.matching\_files(pattern,\ name=None)} }{tf.matching\_files(pattern, name=None) }}\label{tf.matchingux5ffilespattern-namenone} Returns the set of files matching a pattern. 
Note that this routine only supports wildcard characters in the basename portion of the pattern, not in the directory portion. \subparagraph{Args: }\label{args-56} \begin{itemize} \tightlist \item \texttt{pattern}: A \texttt{Tensor} of type \texttt{string}. A (scalar) shell wildcard pattern. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-47} A \texttt{Tensor} of type \texttt{string}. A vector of matching filenames. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.read\_file(filename,\ name=None)} }{tf.read\_file(filename, name=None) }}\label{tf.readux5ffilefilename-namenone} Reads and outputs the entire contents of the input filename. \subparagraph{Args: }\label{args-57} \begin{itemize} \tightlist \item \texttt{filename}: A \texttt{Tensor} of type \texttt{string}. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-48} A \texttt{Tensor} of type \texttt{string}. \subsection{Input pipeline }\label{input-pipeline} TensorFlow functions for setting up an input-prefetching pipeline. Please see the \href{../../how_tos/reading_data/index.md}{reading data how-to} for context. \subsubsection{Beginning of an input pipeline }\label{beginning-of-an-input-pipeline} The ``producer'' functions add a queue to the graph and a corresponding \texttt{QueueRunner} for running the subgraph that fills that queue. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.train.match\_filenames\_once(pattern,\ name=None)} }{tf.train.match\_filenames\_once(pattern, name=None) }}\label{tf.train.matchux5ffilenamesux5foncepattern-namenone} Save the list of files matching pattern, so it is only computed once. \subparagraph{Args: }\label{args-58} \begin{itemize} \tightlist \item \texttt{pattern}: A file pattern (glob). 
\item \texttt{name}: A name for the operations (optional). \end{itemize} \subparagraph{Returns: }\label{returns-49} A variable that is initialized to the list of files matching pattern. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.train.limit\_epochs(tensor,\ num\_epochs=None,\ name=None)} }{tf.train.limit\_epochs(tensor, num\_epochs=None, name=None) }}\label{tf.train.limitux5fepochstensor-numux5fepochsnone-namenone} Returns tensor num\_epochs times and then raises an OutOfRange error. \subparagraph{Args: }\label{args-59} \begin{itemize} \tightlist \item \texttt{tensor}: Any Tensor. \item \texttt{num\_epochs}: An integer (optional). If specified, limits the number of steps the output tensor may be evaluated. \item \texttt{name}: A name for the operations (optional). \end{itemize} \subparagraph{Returns: }\label{returns-50} tensor or OutOfRange. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.train.range\_input\_producer(limit,\ num\_epochs=None,\ shuffle=True,\ seed=None,\ capacity=32,\ name=None)} }{tf.train.range\_input\_producer(limit, num\_epochs=None, shuffle=True, seed=None, capacity=32, name=None) }}\label{tf.train.rangeux5finputux5fproducerlimit-numux5fepochsnone-shuffletrue-seednone-capacity32-namenone} Produces the integers from 0 to limit-1 in a queue. \subparagraph{Args: }\label{args-60} \begin{itemize} \tightlist \item \texttt{limit}: An int32 scalar tensor. \item \texttt{num\_epochs}: An integer (optional). If specified, \texttt{range\_input\_producer} produces each integer \texttt{num\_epochs} times before generating an OutOfRange error. If not specified, \texttt{range\_input\_producer} can cycle through the integers an unlimited number of times. \item \texttt{shuffle}: Boolean. If true, the integers are randomly shuffled within each epoch. \item \texttt{seed}: An integer (optional). Seed used if shuffle == True. 
\item \texttt{capacity}: An integer. Sets the queue capacity. \item \texttt{name}: A name for the operations (optional). \end{itemize} \subparagraph{Returns: }\label{returns-51} A Queue with the output integers. A QueueRunner for the Queue is added to the current Graph's QUEUE\_RUNNER collection. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.train.slice\_input\_producer(tensor\_list,\ num\_epochs=None,\ shuffle=True,\ seed=None,\ capacity=32,\ name=None)} }{tf.train.slice\_input\_producer(tensor\_list, num\_epochs=None, shuffle=True, seed=None, capacity=32, name=None) }}\label{tf.train.sliceux5finputux5fproducertensorux5flist-numux5fepochsnone-shuffletrue-seednone-capacity32-namenone} Produces a slice of each Tensor in tensor\_list. Implemented using a Queue -- a QueueRunner for the Queue is added to the current Graph's QUEUE\_RUNNER collection. \subparagraph{Args: }\label{args-61} \begin{itemize} \tightlist \item \texttt{tensor\_list}: A list of Tensors. Every Tensor in tensor\_list must have the same size in the first dimension. \item \texttt{num\_epochs}: An integer (optional). If specified, \texttt{slice\_input\_producer} produces each slice \texttt{num\_epochs} times before generating an OutOfRange error. If not specified, \texttt{slice\_input\_producer} can cycle through the slices an unlimited number of times. \item \texttt{seed}: An integer (optional). Seed used if shuffle == True. \item \texttt{capacity}: An integer. Sets the queue capacity. \item \texttt{name}: A name for the operations (optional). \end{itemize} \subparagraph{Returns: }\label{returns-52} A list of tensors, one for each element of tensor\_list. If the tensor in tensor\_list has shape {[}N, a, b, .., z{]}, then the corresponding output tensor will have shape {[}a, b, \ldots{}, z{]}. 
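The slicing behaviour described above can be modelled in plain Python. This is only a sketch of the semantics, not the TensorFlow implementation: the helper name \texttt{slice\_epochs} is hypothetical, nested lists stand in for tensors, and exhausting the generator stands in for the OutOfRange error.

\begin{verbatim}
import random


def slice_epochs(tensor_list, num_epochs, shuffle=True, seed=None):
    """Yield one slice of every tensor per step, for num_epochs passes.

    Hypothetical helper: nested lists stand in for tensors whose first
    dimension indexes the examples.
    """
    n = len(tensor_list[0])
    if any(len(t) != n for t in tensor_list):
        raise ValueError("all tensors must share the first dimension")
    rng = random.Random(seed)
    for _ in range(num_epochs):
        order = list(range(n))
        if shuffle:
            # Slices are reshuffled independently within each epoch.
            rng.shuffle(order)
        for i in order:
            # A tensor of shape [N, a, b] contributes a slice of shape [a, b].
            yield [t[i] for t in tensor_list]
\end{verbatim}

With \texttt{shuffle=False} the slices come out in index order, once per epoch.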
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.train.string\_input\_producer(string\_tensor,\ num\_epochs=None,\ shuffle=True,\ seed=None,\ capacity=32,\ name=None)} }{tf.train.string\_input\_producer(string\_tensor, num\_epochs=None, shuffle=True, seed=None, capacity=32, name=None) }}\label{tf.train.stringux5finputux5fproducerstringux5ftensor-numux5fepochsnone-shuffletrue-seednone-capacity32-namenone} Output strings (e.g.~filenames) to a queue for an input pipeline. \subparagraph{Args: }\label{args-62} \begin{itemize} \tightlist \item \texttt{string\_tensor}: A 1-D string tensor with the strings to produce. \item \texttt{num\_epochs}: An integer (optional). If specified, \texttt{string\_input\_producer} produces each string from \texttt{string\_tensor} \texttt{num\_epochs} times before generating an OutOfRange error. If not specified, \texttt{string\_input\_producer} can cycle through the strings in \texttt{string\_tensor} an unlimited number of times. \item \texttt{shuffle}: Boolean. If true, the strings are randomly shuffled within each epoch. \item \texttt{seed}: An integer (optional). Seed used if shuffle == True. \item \texttt{capacity}: An integer. Sets the queue capacity. \item \texttt{name}: A name for the operations (optional). \end{itemize} \subparagraph{Returns: }\label{returns-53} A queue with the output strings. A QueueRunner for the Queue is added to the current Graph's QUEUE\_RUNNER collection. \subsubsection{Batching at the end of an input pipeline }\label{batching-at-the-end-of-an-input-pipeline} These functions add a queue to the graph to assemble a batch of examples, with possible shuffling. They also add a \texttt{QueueRunner} for running the subgraph that fills that queue. Use \protect\hyperlink{batch}{batch} or \protect\hyperlink{batchux5fjoin}{batch\_join} for batching examples that have already been well shuffled. 
Use \protect\hyperlink{shuffleux5fbatch}{shuffle\_batch} or \protect\hyperlink{shuffleux5fbatchux5fjoin}{shuffle\_batch\_join} for examples that would benefit from additional shuffling. Use \protect\hyperlink{batch}{batch} or \protect\hyperlink{shuffleux5fbatch}{shuffle\_batch} if you want a single thread producing examples to batch, or if you have a single subgraph producing examples but you want to run it in N threads (where you increase N until it can keep the queue full). Use \protect\hyperlink{batchux5fjoin}{batch\_join} or \protect\hyperlink{shuffleux5fbatchux5fjoin}{shuffle\_batch\_join} if you have N different subgraphs producing examples to batch and you want them run by N threads. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.train.batch(tensor\_list,\ batch\_size,\ num\_threads=1,\ capacity=32,\ enqueue\_many=False,\ shapes=None,\ name=None)} }{tf.train.batch(tensor\_list, batch\_size, num\_threads=1, capacity=32, enqueue\_many=False, shapes=None, name=None) }}\label{tf.train.batchtensorux5flist-batchux5fsize-numux5fthreads1-capacity32-enqueueux5fmanyfalse-shapesnone-namenone} Creates batches of tensors in \texttt{tensor\_list}. This function is implemented using a queue. A \texttt{QueueRunner} for the queue is added to the current \texttt{Graph}'s \texttt{QUEUE\_RUNNER} collection. If \texttt{enqueue\_many} is \texttt{False}, \texttt{tensor\_list} is assumed to represent a single example. An input tensor with shape \texttt{{[}x,\ y,\ z{]}} will be output as a tensor with shape \texttt{{[}batch\_size,\ x,\ y,\ z{]}}. If \texttt{enqueue\_many} is \texttt{True}, \texttt{tensor\_list} is assumed to represent a batch of examples, where the first dimension is indexed by example, and all members of \texttt{tensor\_list} should have the same size in the first dimension. If an input tensor has shape \texttt{{[}*,\ x,\ y,\ z{]}}, the output will have shape \texttt{{[}batch\_size,\ x,\ y,\ z{]}}. 
The \texttt{capacity} argument controls how far prefetching is allowed to grow the queues. \subparagraph{Args: }\label{args-63} \begin{itemize} \tightlist \item \texttt{tensor\_list}: The list of tensors to enqueue. \item \texttt{batch\_size}: The new batch size pulled from the queue. \item \texttt{num\_threads}: The number of threads enqueuing \texttt{tensor\_list}. \item \texttt{capacity}: An integer. The maximum number of elements in the queue. \item \texttt{enqueue\_many}: Whether each tensor in \texttt{tensor\_list} is a single example. \item \texttt{shapes}: (Optional) The shapes for each example. Defaults to the inferred shapes for \texttt{tensor\_list}. \item \texttt{name}: (Optional) A name for the operations. \end{itemize} \subparagraph{Returns: }\label{returns-54} A list of tensors with the same number and types as \texttt{tensor\_list}. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.train.batch\_join(tensor\_list\_list,\ batch\_size,\ capacity=32,\ enqueue\_many=False,\ shapes=None,\ name=None)} }{tf.train.batch\_join(tensor\_list\_list, batch\_size, capacity=32, enqueue\_many=False, shapes=None, name=None) }}\label{tf.train.batchux5fjointensorux5flistux5flist-batchux5fsize-capacity32-enqueueux5fmanyfalse-shapesnone-namenone} Runs a list of tensors to fill a queue to create batches of examples. Enqueues a different list of tensors in different threads. Implemented using a queue -- a \texttt{QueueRunner} for the queue is added to the current \texttt{Graph}'s \texttt{QUEUE\_RUNNER} collection. \texttt{len(tensor\_list\_list)} threads will be started, with thread \texttt{i} enqueuing the tensors from \texttt{tensor\_list\_list{[}i{]}}. \texttt{tensor\_list\_list{[}i1{]}{[}j{]}} must match \texttt{tensor\_list\_list{[}i2{]}{[}j{]}} in type and shape, except in the first dimension if \texttt{enqueue\_many} is true.
If \texttt{enqueue\_many} is \texttt{False}, each \texttt{tensor\_list\_list{[}i{]}} is assumed to represent a single example. An input tensor \texttt{x} will be output as a tensor with shape \texttt{{[}batch\_size{]}\ +\ x.shape}. If \texttt{enqueue\_many} is \texttt{True}, \texttt{tensor\_list\_list{[}i{]}} is assumed to represent a batch of examples, where the first dimension is indexed by example, and all members of \texttt{tensor\_list\_list{[}i{]}} should have the same size in the first dimension. The slices of any input tensor \texttt{x} are treated as examples, and the output tensors will have shape \texttt{{[}batch\_size{]}\ +\ x.shape{[}1:{]}}. The \texttt{capacity} argument controls how far prefetching is allowed to grow the queues. \subparagraph{Args: }\label{args-64} \begin{itemize} \tightlist \item \texttt{tensor\_list\_list}: A list of tuples of tensors to enqueue. \item \texttt{batch\_size}: An integer. The new batch size pulled from the queue. \item \texttt{capacity}: An integer. The maximum number of elements in the queue. \item \texttt{enqueue\_many}: Whether each tensor in \texttt{tensor\_list\_list} is a single example. \item \texttt{shapes}: (Optional) The shapes for each example. Defaults to the inferred shapes for \texttt{tensor\_list\_list{[}i{]}}. \item \texttt{name}: (Optional) A name for the operations. \end{itemize} \subparagraph{Returns: }\label{returns-55} A list of tensors with the same number and types as \texttt{tensor\_list\_list{[}i{]}}.
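The shape rule for batching described above amounts to a small piece of list arithmetic. The following is a minimal sketch of that rule only; the helper \texttt{batched\_shape} is hypothetical and is not part of TensorFlow.

\begin{verbatim}
def batched_shape(shape, batch_size, enqueue_many=False):
    """Return the output shape for one input tensor of the given shape.

    Hypothetical helper that mirrors the documented rule:
      enqueue_many=False: [batch_size] + x.shape
      enqueue_many=True:  [batch_size] + x.shape[1:]
    """
    shape = list(shape)
    if enqueue_many:
        if not shape:
            raise ValueError("enqueue_many needs a leading example dimension")
        # The first dimension indexes examples and is replaced by batch_size.
        shape = shape[1:]
    return [batch_size] + shape
\end{verbatim}

For instance, a single image of shape \texttt{{[}28,\ 28,\ 3{]}} and a pre-batched input of shape \texttt{{[}100,\ 28,\ 28,\ 3{]}} with \texttt{enqueue\_many=True} both map to the same batched shape.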
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.train.shuffle\_batch(tensor\_list,\ batch\_size,\ capacity,\ min\_after\_dequeue,\ num\_threads=1,\ seed=None,\ enqueue\_many=False,\ shapes=None,\ name=None)} }{tf.train.shuffle\_batch(tensor\_list, batch\_size, capacity, min\_after\_dequeue, num\_threads=1, seed=None, enqueue\_many=False, shapes=None, name=None) }}\label{tf.train.shuffleux5fbatchtensorux5flist-batchux5fsize-capacity-minux5fafterux5fdequeue-numux5fthreads1-seednone-enqueueux5fmanyfalse-shapesnone-namenone} Creates batches by randomly shuffling tensors. This function adds the following to the current \texttt{Graph}: \begin{itemize} \tightlist \item A shuffling queue into which tensors from \texttt{tensor\_list} are enqueued. \item A \texttt{dequeue\_many} operation to create batches from the queue. \item A \texttt{QueueRunner}, added to the \texttt{QUEUE\_RUNNER} collection, to enqueue the tensors from \texttt{tensor\_list}. \end{itemize} If \texttt{enqueue\_many} is \texttt{False}, \texttt{tensor\_list} is assumed to represent a single example. An input tensor with shape \texttt{{[}x,\ y,\ z{]}} will be output as a tensor with shape \texttt{{[}batch\_size,\ x,\ y,\ z{]}}. If \texttt{enqueue\_many} is \texttt{True}, \texttt{tensor\_list} is assumed to represent a batch of examples, where the first dimension is indexed by example, and all members of \texttt{tensor\_list} should have the same size in the first dimension. If an input tensor has shape \texttt{{[}*,\ x,\ y,\ z{]}}, the output will have shape \texttt{{[}batch\_size,\ x,\ y,\ z{]}}. The \texttt{capacity} argument controls how far prefetching is allowed to grow the queues.
For example:

\begin{Shaded}
\begin{Highlighting}[]
\CommentTok{# Creates batches of 32 images and 32 labels.}
\NormalTok{image_batch, label_batch }\OperatorTok{=} \NormalTok{tf.train.shuffle_batch(}
    \NormalTok{[single_image, single_label],}
    \NormalTok{batch_size}\OperatorTok{=}\DecValTok{32}\NormalTok{,}
    \NormalTok{num_threads}\OperatorTok{=}\DecValTok{4}\NormalTok{,}
    \NormalTok{capacity}\OperatorTok{=}\DecValTok{50000}\NormalTok{,}
    \NormalTok{min_after_dequeue}\OperatorTok{=}\DecValTok{10000}\NormalTok{)}
\end{Highlighting}
\end{Shaded}

\subparagraph{Args: }\label{args-65} \begin{itemize} \tightlist \item \texttt{tensor\_list}: The list of tensors to enqueue. \item \texttt{batch\_size}: The new batch size pulled from the queue. \item \texttt{capacity}: An integer. The maximum number of elements in the queue. \item \texttt{min\_after\_dequeue}: Minimum number of elements in the queue after a dequeue, used to ensure a level of mixing of elements. \item \texttt{num\_threads}: The number of threads enqueuing \texttt{tensor\_list}. \item \texttt{seed}: Seed for the random shuffling within the queue. \item \texttt{enqueue\_many}: Whether each tensor in \texttt{tensor\_list} is a single example. \item \texttt{shapes}: (Optional) The shapes for each example. Defaults to the inferred shapes for \texttt{tensor\_list}. \item \texttt{name}: (Optional) A name for the operations. \end{itemize} \subparagraph{Returns: }\label{returns-56} A list of tensors with the same number and types as \texttt{tensor\_list}.
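The role of \texttt{capacity} and \texttt{min\_after\_dequeue} can be illustrated with a small single-threaded model in plain Python. This is a sketch under stated assumptions, not the TensorFlow queue: the class \texttt{ShuffleBuffer} and its methods are hypothetical, and an operation that would block in a real queue raises \texttt{RuntimeError} here. It follows the \texttt{RandomShuffleQueue} description, where \texttt{min\_after\_dequeue} is ignored once the queue has been closed.

\begin{verbatim}
import random


class ShuffleBuffer(object):
    """Hypothetical single-threaded model of a shuffling queue."""

    def __init__(self, capacity, min_after_dequeue, seed=None):
        if not 0 <= min_after_dequeue < capacity:
            raise ValueError("need 0 <= min_after_dequeue < capacity")
        self.capacity = capacity
        self.min_after_dequeue = min_after_dequeue
        self._items = []
        self._closed = False
        self._rng = random.Random(seed)

    def can_enqueue(self):
        return not self._closed and len(self._items) < self.capacity

    def enqueue(self, item):
        if not self.can_enqueue():
            raise RuntimeError("enqueue would block (full or closed)")
        self._items.append(item)

    def can_dequeue(self):
        # At least min_after_dequeue elements must remain afterwards,
        # unless the queue has been closed.
        floor = 0 if self._closed else self.min_after_dequeue
        return len(self._items) > floor

    def dequeue(self):
        if not self.can_dequeue():
            raise RuntimeError("dequeue would block (too few elements)")
        # Pick a random element, swap it to the end, and remove it.
        i = self._rng.randrange(len(self._items))
        self._items[i], self._items[-1] = self._items[-1], self._items[i]
        return self._items.pop()

    def close(self):
        self._closed = True
\end{verbatim}

A larger \texttt{min\_after\_dequeue} gives better mixing because each dequeue draws from a bigger pool, at the cost of more memory and a longer wait before the first batch is available.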
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.train.shuffle\_batch\_join(tensor\_list\_list,\ batch\_size,\ capacity,\ min\_after\_dequeue,\ seed=None,\ enqueue\_many=False,\ shapes=None,\ name=None)} }{tf.train.shuffle\_batch\_join(tensor\_list\_list, batch\_size, capacity, min\_after\_dequeue, seed=None, enqueue\_many=False, shapes=None, name=None) }}\label{tf.train.shuffleux5fbatchux5fjointensorux5flistux5flist-batchux5fsize-capacity-minux5fafterux5fdequeue-seednone-enqueueux5fmanyfalse-shapesnone-namenone} Create batches by randomly shuffling tensors. This version enqueues a different list of tensors in different threads. It adds the following to the current \texttt{Graph}: \begin{itemize} \tightlist \item A shuffling queue into which tensors from \texttt{tensor\_list\_list} are enqueued. \item A \texttt{dequeue\_many} operation to create batches from the queue. \item A \texttt{QueueRunner}, added to the \texttt{QUEUE\_RUNNER} collection, to enqueue the tensors from \texttt{tensor\_list\_list}. \end{itemize} \texttt{len(tensor\_list\_list)} threads will be started, with thread \texttt{i} enqueuing the tensors from \texttt{tensor\_list\_list{[}i{]}}. \texttt{tensor\_list\_list{[}i1{]}{[}j{]}} must match \texttt{tensor\_list\_list{[}i2{]}{[}j{]}} in type and shape, except in the first dimension if \texttt{enqueue\_many} is true. If \texttt{enqueue\_many} is \texttt{False}, each \texttt{tensor\_list\_list{[}i{]}} is assumed to represent a single example. An input tensor with shape \texttt{{[}x,\ y,\ z{]}} will be output as a tensor with shape \texttt{{[}batch\_size,\ x,\ y,\ z{]}}. If \texttt{enqueue\_many} is \texttt{True}, \texttt{tensor\_list\_list{[}i{]}} is assumed to represent a batch of examples, where the first dimension is indexed by example, and all members of \texttt{tensor\_list\_list{[}i{]}} should have the same size in the first dimension.
If an input tensor has shape \texttt{{[}*,\ x,\ y,\ z{]}}, the output will have shape \texttt{{[}batch\_size,\ x,\ y,\ z{]}}. The \texttt{capacity} argument controls how far prefetching is allowed to grow the queues. \subparagraph{Args: }\label{args-66} \begin{itemize} \tightlist \item \texttt{tensor\_list\_list}: A list of tuples of tensors to enqueue. \item \texttt{batch\_size}: An integer. The new batch size pulled from the queue. \item \texttt{capacity}: An integer. The maximum number of elements in the queue. \item \texttt{min\_after\_dequeue}: Minimum number of elements in the queue after a dequeue, used to ensure a level of mixing of elements. \item \texttt{seed}: Seed for the random shuffling within the queue. \item \texttt{enqueue\_many}: Whether each tensor in \texttt{tensor\_list\_list} is a single example. \item \texttt{shapes}: (Optional) The shapes for each example. Defaults to the inferred shapes for \texttt{tensor\_list\_list{[}i{]}}. \item \texttt{name}: (Optional) A name for the operations. \end{itemize} \subparagraph{Returns: }\label{returns-57} A list of tensors with the same number and types as \texttt{tensor\_list\_list{[}i{]}}.
================================================ FILE: tex_pdf/api/python/c4s10_python_io.tex ================================================ \section{Data IO (Python functions) }\label{data-io-python-functions} \subsection{Contents}\label{contents} \subsubsection{\texorpdfstring{\protect\hyperlink{AUTOGENERATED-data-io--python-functions-}{Data IO (Python functions)}}{Data IO (Python functions)}}\label{data-io-python-functions-1} \begin{itemize} \tightlist \item \protect\hyperlink{AUTOGENERATED-data-io--python-functions-}{Data IO (Python Functions)} \item \protect\hyperlink{TFRecordWriter}{\texttt{class\ tf.python\_io.TFRecordWriter}} \item \protect\hyperlink{tfux5frecordux5fiterator}{\texttt{tf.python\_io.tf\_record\_iterator(path)}} \item \protect\hyperlink{AUTOGENERATED-tfrecords-format-details}{TFRecords Format Details} \end{itemize} \subsection{Data IO (Python Functions) }\label{data-io-python-functions-2} A TFRecords file represents a sequence of (binary) strings. The format is not random access, so it is suitable for streaming large amounts of data but not suitable if fast sharding or other non-sequential access is desired. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{class\ tf.python\_io.TFRecordWriter} }{class tf.python\_io.TFRecordWriter }}\label{class-tf.pythonux5fio.tfrecordwriter} A class to write records to a TFRecords file. This class implements \texttt{\_\_enter\_\_} and \texttt{\_\_exit\_\_}, and can be used in \texttt{with} blocks like a normal file. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.python\_io.TFRecordWriter.\_\_init\_\_(path)} }{tf.python\_io.TFRecordWriter.\_\_init\_\_(path) }}\label{tf.pythonux5fio.tfrecordwriter.ux5fux5finitux5fux5fpath} Opens file \texttt{path} and creates a \texttt{TFRecordWriter} writing to it. 
\subparagraph{Args: }\label{args} \begin{itemize} \tightlist \item \texttt{path}: The path to the TFRecords file. \end{itemize} \subparagraph{Raises: }\label{raises} \begin{itemize} \tightlist \item \texttt{IOError}: If \texttt{path} cannot be opened for writing. \end{itemize} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.python\_io.TFRecordWriter.write(record)} }{tf.python\_io.TFRecordWriter.write(record) }}\label{tf.pythonux5fio.tfrecordwriter.writerecord} Write a string record to the file. \subparagraph{Args: }\label{args-1} \begin{itemize} \tightlist \item \texttt{record}: str \end{itemize} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.python\_io.TFRecordWriter.close()} }{tf.python\_io.TFRecordWriter.close() }}\label{tf.pythonux5fio.tfrecordwriter.close} Close the file. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.python\_io.tf\_record\_iterator(path)} }{tf.python\_io.tf\_record\_iterator(path) }}\label{tf.pythonux5fio.tfux5frecordux5fiteratorpath} An iterator that reads the records from a TFRecords file. \subparagraph{Args: }\label{args-2} \begin{itemize} \tightlist \item \texttt{path}: The path to the TFRecords file. \end{itemize} \subparagraph{Yields: }\label{yields} Strings. \subparagraph{Raises: }\label{raises-1} \begin{itemize} \tightlist \item \texttt{IOError}: If \texttt{path} cannot be opened for reading. \end{itemize} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{TFRecords Format Details }\label{tfrecords-format-details} A TFRecords file contains a sequence of strings with CRC hashes. Each record has the format

\begin{verbatim}
uint64 length
uint32 masked_crc32_of_length
byte   data[length]
uint32 masked_crc32_of_data
\end{verbatim}

and the records are concatenated together to produce the file.
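The record layout above can be sketched in plain Python. Several details here are assumptions that this section does not state: the checksum is taken to be CRC-32C (the Castagnoli polynomial), integers are taken to be little-endian, and the length checksum is taken over the 8 encoded length bytes. The helper names are hypothetical, and the masking step follows the rotate-and-add formula given in this section, truncated to 32 bits.

\begin{verbatim}
import struct

_CRC32C_POLY = 0x82F63B78  # reflected Castagnoli polynomial (assumed variant)


def crc32c(data):
    """Bitwise CRC-32C of a byte string; slow but dependency-free."""
    crc = 0xFFFFFFFF
    for byte in bytearray(data):
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (_CRC32C_POLY if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


def masked_crc(data):
    """Rotate the CRC right by 15 bits and add a constant, modulo 2**32."""
    crc = crc32c(data)
    rotated = ((crc >> 15) | (crc << 17)) & 0xFFFFFFFF
    return (rotated + 0xA282EAD8) & 0xFFFFFFFF


def frame_record(data):
    """Return length, masked CRC of length, data, masked CRC of data."""
    length = struct.pack("<Q", len(data))  # uint64, little-endian (assumed)
    return (length
            + struct.pack("<I", masked_crc(length))
            + data
            + struct.pack("<I", masked_crc(data)))
\end{verbatim}

Under these assumptions a record carrying \texttt{n} bytes of data occupies \texttt{n\ +\ 16} bytes in the file, and a file is the concatenation of such frames.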
The CRC32s are \href{https://en.wikipedia.org/wiki/Cyclic_redundancy_check}{described here}, and the mask of a CRC is \begin{verbatim} masked_crc = ((crc >> 15) | (crc << 17)) + 0xa282ead8ul \end{verbatim} ================================================ FILE: tex_pdf/api/python/c4s11_nn.tex ================================================ \begin{document} \section{Neural Network }\label{neural-network} Note: Functions taking \texttt{Tensor} arguments can also take anything accepted by \href{../../api_docs/python/framework.md\#convert_to_tensor}{\texttt{tf.convert\_to\_tensor}}. \subsection{Contents}\label{contents} \subsubsection{\texorpdfstring{\protect\hyperlink{AUTOGENERATED-neural-network}{Neural Network}}{Neural Network}}\label{neural-network-1} \begin{itemize} \tightlist \item \protect\hyperlink{AUTOGENERATED-activation-functions}{Activation Functions} \item \protect\hyperlink{relu}{\texttt{tf.nn.relu(features,\ name=None)}} \item \protect\hyperlink{relu6}{\texttt{tf.nn.relu6(features,\ name=None)}} \item \protect\hyperlink{softplus}{\texttt{tf.nn.softplus(features,\ name=None)}} \item \protect\hyperlink{dropout}{\texttt{tf.nn.dropout(x,\ keep\_prob,\ noise\_shape=None,\ seed=None,\ name=None)}} \item \protect\hyperlink{biasux5fadd}{\texttt{tf.nn.bias\_add(value,\ bias,\ name=None)}} \item \protect\hyperlink{sigmoid}{\texttt{tf.sigmoid(x,\ name=None)}} \item \protect\hyperlink{tanh}{\texttt{tf.tanh(x,\ name=None)}} \item \protect\hyperlink{AUTOGENERATED-convolution}{Convolution} \item \protect\hyperlink{conv2d}{\texttt{tf.nn.conv2d(input,\ filter,\ strides,\ padding,\ use\_cudnn\_on\_gpu=None,\ name=None)}} \item \protect\hyperlink{depthwiseux5fconv2d}{\texttt{tf.nn.depthwise\_conv2d(input,\ filter,\ strides,\ padding,\ name=None)}} \item \protect\hyperlink{separableux5fconv2d}{\texttt{tf.nn.separable\_conv2d(input,\ depthwise\_filter,\ pointwise\_filter,\ strides,\ padding,\ name=None)}} \item \protect\hyperlink{AUTOGENERATED-pooling}{Pooling} \item 
\protect\hyperlink{avgux5fpool}{\texttt{tf.nn.avg\_pool(value,\ ksize,\ strides,\ padding,\ name=None)}} \item \protect\hyperlink{maxux5fpool}{\texttt{tf.nn.max\_pool(value,\ ksize,\ strides,\ padding,\ name=None)}} \item \protect\hyperlink{maxux5fpoolux5fwithux5fargmax}{\texttt{tf.nn.max\_pool\_with\_argmax(input,\ ksize,\ strides,\ padding,\ Targmax=None,\ name=None)}} \item \protect\hyperlink{AUTOGENERATED-normalization}{Normalization} \item \protect\hyperlink{l2ux5fnormalize}{\texttt{tf.nn.l2\_normalize(x,\ dim,\ epsilon=1e-12,\ name=None)}} \item \protect\hyperlink{localux5fresponseux5fnormalization}{\texttt{tf.nn.local\_response\_normalization(input,\ depth\_radius=None,\ bias=None,\ alpha=None,\ beta=None,\ name=None)}} \item \protect\hyperlink{moments}{\texttt{tf.nn.moments(x,\ axes,\ name=None)}} \item \protect\hyperlink{AUTOGENERATED-losses}{Losses} \item \protect\hyperlink{l2ux5floss}{\texttt{tf.nn.l2\_loss(t,\ name=None)}} \item \protect\hyperlink{AUTOGENERATED-classification}{Classification} \item \protect\hyperlink{sigmoidux5fcrossux5fentropyux5fwithux5flogits}{\texttt{tf.nn.sigmoid\_cross\_entropy\_with\_logits(logits,\ targets,\ name=None)}} \item \protect\hyperlink{softmax}{\texttt{tf.nn.softmax(logits,\ name=None)}} \item \protect\hyperlink{softmaxux5fcrossux5fentropyux5fwithux5flogits}{\texttt{tf.nn.softmax\_cross\_entropy\_with\_logits(logits,\ labels,\ name=None)}} \item \protect\hyperlink{AUTOGENERATED-embeddings}{Embeddings} \item \protect\hyperlink{embeddingux5flookup}{\texttt{tf.nn.embedding\_lookup(params,\ ids,\ name=None)}} \item \protect\hyperlink{AUTOGENERATED-evaluation}{Evaluation} \item \protect\hyperlink{topux5fk}{\texttt{tf.nn.top\_k(input,\ k,\ name=None)}} \item \protect\hyperlink{inux5ftopux5fk}{\texttt{tf.nn.in\_top\_k(predictions,\ targets,\ k,\ name=None)}} \item \protect\hyperlink{AUTOGENERATED-candidate-sampling}{Candidate Sampling} \item \protect\hyperlink{AUTOGENERATED-sampled-loss-functions}{Sampled Loss Functions} 
\item \protect\hyperlink{nceux5floss}{\texttt{tf.nn.nce\_loss(weights,\ biases,\ inputs,\ labels,\ num\_sampled,\ num\_classes,\ num\_true=1,\ sampled\_values=None,\ remove\_accidental\_hits=False,\ name=\textquotesingle{}nce\_loss\textquotesingle{})}} \item \protect\hyperlink{sampledux5fsoftmaxux5floss}{\texttt{tf.nn.sampled\_softmax\_loss(weights,\ biases,\ inputs,\ labels,\ num\_sampled,\ num\_classes,\ num\_true=1,\ sampled\_values=None,\ remove\_accidental\_hits=True,\ name=\textquotesingle{}sampled\_softmax\_loss\textquotesingle{})}} \item \protect\hyperlink{AUTOGENERATED-candidate-samplers}{Candidate Samplers} \item \protect\hyperlink{uniformux5fcandidateux5fsampler}{\texttt{tf.nn.uniform\_candidate\_sampler(true\_classes,\ num\_true,\ num\_sampled,\ unique,\ range\_max,\ seed=None,\ name=None)}} \item \protect\hyperlink{logux5funiformux5fcandidateux5fsampler}{\texttt{tf.nn.log\_uniform\_candidate\_sampler(true\_classes,\ num\_true,\ num\_sampled,\ unique,\ range\_max,\ seed=None,\ name=None)}} \item \protect\hyperlink{learnedux5funigramux5fcandidateux5fsampler}{\texttt{tf.nn.learned\_unigram\_candidate\_sampler(true\_classes,\ num\_true,\ num\_sampled,\ unique,\ range\_max,\ seed=None,\ name=None)}} \item \protect\hyperlink{fixedux5funigramux5fcandidateux5fsampler}{\texttt{tf.nn.fixed\_unigram\_candidate\_sampler(true\_classes,\ num\_true,\ num\_sampled,\ unique,\ range\_max,\ vocab\_file=\textquotesingle{}\textquotesingle{},\ distortion=0.0,\ num\_reserved\_ids=0,\ num\_shards=1,\ shard=0,\ unigrams={[}{]},\ seed=None,\ name=None)}} \item \protect\hyperlink{AUTOGENERATED-miscellaneous-candidate-sampling-utilities}{Miscellaneous candidate sampling utilities} \item \protect\hyperlink{computeux5faccidentalux5fhits}{\texttt{tf.nn.compute\_accidental\_hits(true\_classes,\ sampled\_candidates,\ num\_true,\ seed=None,\ name=None)}} \end{itemize} \subsection{Activation Functions }\label{activation-functions} The activation ops provide different types of 
nonlinearities for use in neural networks. These include smooth nonlinearities (\texttt{sigmoid}, \texttt{tanh}, and \texttt{softplus}), continuous but not everywhere differentiable functions (\texttt{relu}, \texttt{relu6}, and \texttt{relu\_x}), and random regularization (\texttt{dropout}). All activation ops apply componentwise, and produce a tensor of the same shape as the input tensor. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.nn.relu(features,\ name=None)} }{tf.nn.relu(features, name=None) }}\label{tf.nn.relufeatures-namenone} Computes rectified linear: \texttt{max(features,\ 0)}. \subparagraph{Args: }\label{args} \begin{itemize} \tightlist \item \texttt{features}: A \texttt{Tensor}. Must be one of the following types: \texttt{float32}, \texttt{float64}, \texttt{int32}, \texttt{int64}, \texttt{uint8}, \texttt{int16}, \texttt{int8}. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns} A \texttt{Tensor}. Has the same type as \texttt{features}. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.nn.relu6(features,\ name=None)} }{tf.nn.relu6(features, name=None) }}\label{tf.nn.relu6features-namenone} Computes Rectified Linear 6: \texttt{min(max(features,\ 0),\ 6)}. \subparagraph{Args: }\label{args-1} \begin{itemize} \tightlist \item \texttt{features}: A \texttt{Tensor} with type \texttt{float}, \texttt{double}, \texttt{int32}, \texttt{int64}, \texttt{uint8}, \texttt{int16}, or \texttt{int8}. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-1} A \texttt{Tensor} with the same type as \texttt{features}. 
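The two rectifier definitions above are simple enough to check by hand. The following plain-Python sketch applies them element-wise to a flat list of numbers; it only models the arithmetic and does not call TensorFlow, and the list-based helpers are stand-ins for the real ops.

\begin{verbatim}
def relu(features):
    """Element-wise max(features, 0) over a flat list of numbers."""
    return [max(x, 0) for x in features]


def relu6(features):
    """Element-wise min(max(features, 0), 6) over a flat list of numbers."""
    return [min(max(x, 0), 6) for x in features]
\end{verbatim}

Both helpers return a list of the same length as the input, matching the statement that activation ops preserve the shape of their input.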
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.nn.softplus(features,\ name=None)} }{tf.nn.softplus(features, name=None) }}\label{tf.nn.softplusfeatures-namenone} Computes softplus: \texttt{log(exp(features)\ +\ 1)}. \subparagraph{Args: }\label{args-2} \begin{itemize} \tightlist \item \texttt{features}: A \texttt{Tensor}. Must be one of the following types: \texttt{float32}, \texttt{float64}, \texttt{int32}, \texttt{int64}, \texttt{uint8}, \texttt{int16}, \texttt{int8}. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-2} A \texttt{Tensor}. Has the same type as \texttt{features}. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.nn.dropout(x,\ keep\_prob,\ noise\_shape=None,\ seed=None,\ name=None)} }{tf.nn.dropout(x, keep\_prob, noise\_shape=None, seed=None, name=None) }}\label{tf.nn.dropoutx-keepux5fprob-noiseux5fshapenone-seednone-namenone} Computes dropout. With probability \texttt{keep\_prob}, outputs the input element scaled up by \texttt{1\ /\ keep\_prob}, otherwise outputs \texttt{0}. The scaling is so that the expected sum is unchanged. By default, each element is kept or dropped independently. If \texttt{noise\_shape} is specified, it must be \href{http://docs.scipy.org/doc/numpy/user/basics.broadcasting.html}{broadcastable} to the shape of \texttt{x}, and only dimensions with \texttt{noise\_shape{[}i{]}\ ==\ shape(x){[}i{]}} will make independent decisions. For example, if \texttt{shape(x)\ =\ {[}k,\ l,\ m,\ n{]}} and \texttt{noise\_shape\ =\ {[}k,\ 1,\ 1,\ n{]}}, each batch and channel component will be kept independently and each row and column will be kept or not kept together. \subparagraph{Args: }\label{args-3} \begin{itemize} \tightlist \item \texttt{x}: A tensor. \item \texttt{keep\_prob}: A Python float. The probability that each element is kept. 
\item \texttt{noise\_shape}: A 1-D \texttt{Tensor} of type \texttt{int32}, representing the shape for randomly generated keep/drop flags. \item \texttt{seed}: A Python integer. Used to create random seeds. See \href{../../api_docs/python/constant_op.md\#set_random_seed}{\texttt{set\_random\_seed}} for behavior. \item \texttt{name}: A name for this operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-3} A \texttt{Tensor} of the same shape as \texttt{x}. \subparagraph{Raises: }\label{raises} \begin{itemize} \tightlist \item \texttt{ValueError}: If \texttt{keep\_prob} is not in \texttt{(0,\ 1{]}}. \end{itemize} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.nn.bias\_add(value,\ bias,\ name=None)} }{tf.nn.bias\_add(value, bias, name=None) }}\label{tf.nn.biasux5faddvalue-bias-namenone} Adds \texttt{bias} to \texttt{value}. This is (mostly) a special case of \texttt{tf.add} where \texttt{bias} is restricted to 1-D. Broadcasting is supported, so \texttt{value} may have any number of dimensions. Unlike \texttt{tf.add}, the type of \texttt{bias} is allowed to differ from \texttt{value} in the case where both types are quantized. \subparagraph{Args: }\label{args-4} \begin{itemize} \tightlist \item \texttt{value}: A \texttt{Tensor} with type \texttt{float}, \texttt{double}, \texttt{int64}, \texttt{int32}, \texttt{uint8}, \texttt{int16}, \texttt{int8}, or \texttt{complex64}. \item \texttt{bias}: A 1-D \texttt{Tensor} with size matching the last dimension of \texttt{value}. Must be the same type as \texttt{value} unless \texttt{value} is a quantized type, in which case a different quantized type may be used. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-4} A \texttt{Tensor} with the same type as \texttt{value}. 
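The broadcasting behaviour of \texttt{bias\_add} can be illustrated with nested Python lists standing in for tensors. The sketch below is an assumption-laden illustration, not TensorFlow's implementation: the helper \texttt{bias\_add} is a hypothetical pure-Python function that adds a 1-D \texttt{bias} along the last dimension of \texttt{value}, whatever the number of leading dimensions.

```python
def bias_add(value, bias):
    """Add the 1-D list `bias` along the last dimension of nested-list `value`."""
    if value and isinstance(value[0], list):
        # Not yet at the innermost dimension: recurse into each sub-list.
        return [bias_add(sub, bias) for sub in value]
    if len(value) != len(bias):
        raise ValueError("bias size must match the last dimension of value")
    return [v + b for v, b in zip(value, bias)]


value = [[[1, 2, 3], [4, 5, 6]],
         [[7, 8, 9], [10, 11, 12]]]  # shape [2, 2, 3]
bias = [10, 20, 30]                  # shape [3]
print(bias_add(value, bias))
# [[[11, 22, 33], [14, 25, 36]], [[17, 28, 39], [20, 31, 42]]]
```

The same \texttt{bias} vector is reused for every innermost row, which is what restricting \texttt{bias} to 1-D and broadcasting over the remaining dimensions amounts to.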
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.sigmoid(x,\ name=None)} }{tf.sigmoid(x, name=None) }}\label{tf.sigmoidx-namenone} Computes sigmoid of \texttt{x} element-wise. Specifically, \texttt{y\ =\ 1\ /\ (1\ +\ exp(-x))}. \subparagraph{Args: }\label{args-5} \begin{itemize} \tightlist \item \texttt{x}: A Tensor with type \texttt{float}, \texttt{double}, \texttt{int32}, \texttt{complex64}, \texttt{int64}, or \texttt{qint32}. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-5} A Tensor with the same type as \texttt{x} if \texttt{x.dtype\ !=\ qint32} otherwise the return type is \texttt{quint8}. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.tanh(x,\ name=None)} }{tf.tanh(x, name=None) }}\label{tf.tanhx-namenone} Computes hyperbolic tangent of \texttt{x} element-wise. \subparagraph{Args: }\label{args-6} \begin{itemize} \tightlist \item \texttt{x}: A Tensor with type \texttt{float}, \texttt{double}, \texttt{int32}, \texttt{complex64}, \texttt{int64}, or \texttt{qint32}. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-6} A Tensor with the same type as \texttt{x} if \texttt{x.dtype\ !=\ qint32} otherwise the return type is \texttt{quint8}. \subsection{Convolution }\label{convolution} The convolution ops sweep a 2-D filter over a batch of images, applying the filter to each window of each image of the appropriate size. The different ops trade off between generic vs.~specific filters: \begin{itemize} \tightlist \item \texttt{conv2d}: Arbitrary filters that can mix channels together. \item \texttt{depthwise\_conv2d}: Filters that operate on each channel independently. \item \texttt{separable\_conv2d}: A depthwise spatial filter followed by a pointwise filter. 
\end{itemize} Note that although these ops are called ``convolution'', they are strictly speaking ``cross-correlation'' since the filter is combined with an input window without reversing the filter. For details, see \href{https://en.wikipedia.org/wiki/Cross-correlation\#Properties}{the properties of cross-correlation}. The filter is applied to image patches of the same size as the filter and strided according to the \texttt{strides} argument. \texttt{strides\ =\ {[}1,\ 1,\ 1,\ 1{]}} applies the filter to a patch at every offset, \texttt{strides\ =\ {[}1,\ 2,\ 2,\ 1{]}} applies the filter to every other image patch in each dimension, etc. Ignoring channels for the moment, the spatial semantics of the convolution ops are as follows. If the 4-D \texttt{input} has shape \texttt{{[}batch,\ in\_height,\ in\_width,\ ...{]}} and the 4-D \texttt{filter} has shape \texttt{{[}filter\_height,\ filter\_width,\ ...{]}}, then \begin{verbatim} shape(output) = [batch, (in_height - filter_height + 1) / strides[1], (in_width - filter_width + 1) / strides[2], ...] output[b, i, j, :] = sum_{di, dj} input[b, strides[1] * i + di, strides[2] * j + dj, ...] * filter[di, dj, ...] \end{verbatim} Since \texttt{input} is 4-D, each \texttt{input{[}b,\ i,\ j,\ :{]}} is a vector. For \texttt{conv2d}, these vectors are multiplied by the \texttt{filter{[}di,\ dj,\ :,\ :{]}} matrices to produce new vectors. For \texttt{depthwise\_conv2d}, each scalar component \texttt{input{[}b,\ i,\ j,\ k{]}} is multiplied by a vector \texttt{filter{[}di,\ dj,\ k{]}}, and all the vectors are concatenated. The formula for \texttt{shape(output)} above applies to \texttt{\textquotesingle{}VALID\textquotesingle{}} padding; how the borders are handled depends on \texttt{padding}: \begin{itemize} \tightlist \item \texttt{padding\ =\ \textquotesingle{}SAME\textquotesingle{}}: The input is zero-padded so that partial windows at the borders are included; the output height and width are \texttt{in\_height\ /\ strides{[}1{]}} and \texttt{in\_width\ /\ strides{[}2{]}}, rounded up. \item \texttt{padding\ =\ \textquotesingle{}VALID\textquotesingle{}}: No padding is added, so only full-size windows are considered; the quotients in the formula are rounded up. 
\end{itemize} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.nn.conv2d(input,\ filter,\ strides,\ padding,\ use\_cudnn\_on\_gpu=None,\ name=None)} }{tf.nn.conv2d(input, filter, strides, padding, use\_cudnn\_on\_gpu=None, name=None) }}\label{tf.nn.conv2dinput-filter-strides-padding-useux5fcudnnux5fonux5fgpunone-namenone} Computes a 2-D convolution given 4-D \texttt{input} and \texttt{filter} tensors. Given an input tensor of shape \texttt{{[}batch,\ in\_height,\ in\_width,\ in\_channels{]}} and a filter / kernel tensor of shape \texttt{{[}filter\_height,\ filter\_width,\ in\_channels,\ out\_channels{]}}, this op performs the following: \begin{enumerate} \def\labelenumi{\arabic{enumi}.} \tightlist \item Flattens the filter to a 2-D matrix with shape \texttt{{[}filter\_height\ *\ filter\_width\ *\ in\_channels,\ out\_channels{]}}. \item Extracts image patches from the input tensor to form a \emph{virtual} tensor of shape \texttt{{[}batch,\ out\_height,\ out\_width,\ filter\_height\ *\ filter\_width\ *\ in\_channels{]}}. \item For each patch, right-multiplies the image patch vector by the filter matrix. \end{enumerate} In detail, \begin{verbatim} output[b, i, j, k] = sum_{di, dj, q} input[b, strides[1] * i + di, strides[2] * j + dj, q] * filter[di, dj, q, k] \end{verbatim} Must have \texttt{strides{[}0{]}\ =\ strides{[}3{]}\ =\ 1}. For the most common case of the same horizontal and vertical strides, \texttt{strides\ =\ {[}1,\ stride,\ stride,\ 1{]}}. \subparagraph{Args: }\label{args-7} \begin{itemize} \tightlist \item \texttt{input}: A \texttt{Tensor}. Must be one of the following types: \texttt{float32}, \texttt{float64}. \item \texttt{filter}: A \texttt{Tensor}. Must have the same type as \texttt{input}. \item \texttt{strides}: A list of \texttt{ints}. 1-D of length 4. The stride of the sliding window for each dimension of \texttt{input}. 
\item \texttt{padding}: A \texttt{string} from: \texttt{"SAME",\ "VALID"}. The type of padding algorithm to use. \item \texttt{use\_cudnn\_on\_gpu}: An optional \texttt{bool}. Defaults to \texttt{True}. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-7} A \texttt{Tensor}. Has the same type as \texttt{input}. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.nn.depthwise\_conv2d(input,\ filter,\ strides,\ padding,\ name=None)} }{tf.nn.depthwise\_conv2d(input, filter, strides, padding, name=None) }}\label{tf.nn.depthwiseux5fconv2dinput-filter-strides-padding-namenone} Depthwise 2-D convolution. Given an input tensor of shape \texttt{{[}batch,\ in\_height,\ in\_width,\ in\_channels{]}} and a filter tensor of shape \texttt{{[}filter\_height,\ filter\_width,\ in\_channels,\ channel\_multiplier{]}} containing \texttt{in\_channels} convolutional filters of depth 1, \texttt{depthwise\_conv2d} applies a different filter to each input channel (expanding from 1 channel to \texttt{channel\_multiplier} channels for each), then concatenates the results together. The output has \texttt{in\_channels\ *\ channel\_multiplier} channels. In detail, \begin{verbatim} output[b, i, j, k * channel_multiplier + q] = sum_{di, dj} input[b, strides[1] * i + di, strides[2] * j + dj, k] * filter[di, dj, k, q] \end{verbatim} Must have \texttt{strides{[}0{]}\ =\ strides{[}3{]}\ =\ 1}. For the most common case of the same horizontal and vertical strides, \texttt{strides\ =\ {[}1,\ stride,\ stride,\ 1{]}}. \subparagraph{Args: }\label{args-8} \begin{itemize} \tightlist \item \texttt{input}: 4-D with shape \texttt{{[}batch,\ in\_height,\ in\_width,\ in\_channels{]}}. \item \texttt{filter}: 4-D with shape \texttt{{[}filter\_height,\ filter\_width,\ in\_channels,\ channel\_multiplier{]}}. \item \texttt{strides}: 1-D of size 4. 
The stride of the sliding window for each dimension of \texttt{input}. \item \texttt{padding}: A string, either \texttt{\textquotesingle{}VALID\textquotesingle{}} or \texttt{\textquotesingle{}SAME\textquotesingle{}}. The padding algorithm. \item \texttt{name}: A name for this operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-8} A 4-D \texttt{Tensor} of shape \texttt{{[}batch,\ out\_height,\ out\_width,\ in\_channels\ *\ channel\_multiplier{]}}. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.nn.separable\_conv2d(input,\ depthwise\_filter,\ pointwise\_filter,\ strides,\ padding,\ name=None)} }{tf.nn.separable\_conv2d(input, depthwise\_filter, pointwise\_filter, strides, padding, name=None) }}\label{tf.nn.separableux5fconv2dinput-depthwiseux5ffilter-pointwiseux5ffilter-strides-padding-namenone} 2-D convolution with separable filters. Performs a depthwise convolution that acts separately on channels followed by a pointwise convolution that mixes channels. Note that this is separability between dimensions \texttt{{[}1,\ 2{]}} and \texttt{3}, not spatial separability between dimensions \texttt{1} and \texttt{2}. In detail, \begin{verbatim} output[b, i, j, k] = sum_{di, dj, q, r} input[b, strides[1] * i + di, strides[2] * j + dj, q] * depthwise_filter[di, dj, q, r] * pointwise_filter[0, 0, q * channel_multiplier + r, k] \end{verbatim} \texttt{strides} controls the strides for the depthwise convolution only, since the pointwise convolution has implicit strides of \texttt{{[}1,\ 1,\ 1,\ 1{]}}. Must have \texttt{strides{[}0{]}\ =\ strides{[}3{]}\ =\ 1}. For the most common case of the same horizontal and vertical strides, \texttt{strides\ =\ {[}1,\ stride,\ stride,\ 1{]}}. \subparagraph{Args: }\label{args-9} \begin{itemize} \tightlist \item \texttt{input}: 4-D \texttt{Tensor} with shape \texttt{{[}batch,\ in\_height,\ in\_width,\ in\_channels{]}}. 
\item \texttt{depthwise\_filter}: 4-D \texttt{Tensor} with shape \texttt{{[}filter\_height,\ filter\_width,\ in\_channels,\ channel\_multiplier{]}}. Contains \texttt{in\_channels} convolutional filters of depth 1. \item \texttt{pointwise\_filter}: 4-D \texttt{Tensor} with shape \texttt{{[}1,\ 1,\ channel\_multiplier\ *\ in\_channels,\ out\_channels{]}}. Pointwise filter to mix channels after \texttt{depthwise\_filter} has convolved spatially. \item \texttt{strides}: 1-D of size 4. The strides for the depthwise convolution for each dimension of \texttt{input}. \item \texttt{padding}: A string, either \texttt{\textquotesingle{}VALID\textquotesingle{}} or \texttt{\textquotesingle{}SAME\textquotesingle{}}. The padding algorithm. \item \texttt{name}: A name for this operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-9} A 4-D \texttt{Tensor} of shape \texttt{{[}batch,\ out\_height,\ out\_width,\ out\_channels{]}}. \subsection{Pooling }\label{pooling} The pooling ops sweep a rectangular window over the input tensor, computing a reduction operation for each window (average, max, or max with argmax). Each pooling op uses rectangular windows of size \texttt{ksize} separated by offset \texttt{strides}. For example, if \texttt{strides} is all ones every window is used, if \texttt{strides} is all twos every other window is used in each dimension, etc. In detail, the output is \begin{verbatim} output[i] = reduce(value[strides * i:strides * i + ksize]) \end{verbatim} for each tuple of indices \texttt{i}. The output shape is \begin{verbatim} shape(output) = (shape(value) - ksize + 1) / strides \end{verbatim} This formula applies to \texttt{\textquotesingle{}VALID\textquotesingle{}} padding; how the borders are handled depends on \texttt{padding}: \begin{itemize} \tightlist \item \texttt{padding\ =\ \textquotesingle{}SAME\textquotesingle{}}: The input is padded so that partial windows at the borders are included; the output shape is \texttt{shape(value)\ /\ strides}, rounded up. \item \texttt{padding\ =\ \textquotesingle{}VALID\textquotesingle{}}: No padding is added, so only full-size windows are considered; the quotient in the formula is rounded up. 
\end{itemize} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.nn.avg\_pool(value,\ ksize,\ strides,\ padding,\ name=None)} }{tf.nn.avg\_pool(value, ksize, strides, padding, name=None) }}\label{tf.nn.avgux5fpoolvalue-ksize-strides-padding-namenone} Performs the average pooling on the input. Each entry in \texttt{output} is the mean of the corresponding size \texttt{ksize} window in \texttt{value}. \subparagraph{Args: }\label{args-10} \begin{itemize} \tightlist \item \texttt{value}: A 4-D \texttt{Tensor} of shape \texttt{{[}batch,\ height,\ width,\ channels{]}} and type \texttt{float32}, \texttt{float64}, \texttt{qint8}, \texttt{quint8}, or \texttt{qint32}. \item \texttt{ksize}: A list of ints that has length \textgreater{}= 4. The size of the window for each dimension of the input tensor. \item \texttt{strides}: A list of ints that has length \textgreater{}= 4. The stride of the sliding window for each dimension of the input tensor. \item \texttt{padding}: A string, either \texttt{\textquotesingle{}VALID\textquotesingle{}} or \texttt{\textquotesingle{}SAME\textquotesingle{}}. The padding algorithm. \item \texttt{name}: Optional name for the operation. \end{itemize} \subparagraph{Returns: }\label{returns-10} A \texttt{Tensor} with the same type as \texttt{value}. The average pooled output tensor. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.nn.max\_pool(value,\ ksize,\ strides,\ padding,\ name=None)} }{tf.nn.max\_pool(value, ksize, strides, padding, name=None) }}\label{tf.nn.maxux5fpoolvalue-ksize-strides-padding-namenone} Performs the max pooling on the input. \subparagraph{Args: }\label{args-11} \begin{itemize} \tightlist \item \texttt{value}: A 4-D \texttt{Tensor} with shape \texttt{{[}batch,\ height,\ width,\ channels{]}} and type \texttt{float32}, \texttt{float64}, \texttt{qint8}, \texttt{quint8}, \texttt{qint32}. 
\item \texttt{ksize}: A list of ints that has length \textgreater{}= 4. The size of the window for each dimension of the input tensor. \item \texttt{strides}: A list of ints that has length \textgreater{}= 4. The stride of the sliding window for each dimension of the input tensor. \item \texttt{padding}: A string, either \texttt{\textquotesingle{}VALID\textquotesingle{}} or \texttt{\textquotesingle{}SAME\textquotesingle{}}. The padding algorithm. \item \texttt{name}: Optional name for the operation. \end{itemize} \subparagraph{Returns: }\label{returns-11} A \texttt{Tensor} with the same type as \texttt{value}. The max pooled output tensor. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.nn.max\_pool\_with\_argmax(input,\ ksize,\ strides,\ padding,\ Targmax=None,\ name=None)} }{tf.nn.max\_pool\_with\_argmax(input, ksize, strides, padding, Targmax=None, name=None) }}\label{tf.nn.maxux5fpoolux5fwithux5fargmaxinput-ksize-strides-padding-targmaxnone-namenone} Performs max pooling on the input and outputs both max values and indices. The indices in \texttt{argmax} are flattened, so that a maximum value at position \texttt{{[}b,\ y,\ x,\ c{]}} becomes flattened index \texttt{((b\ *\ height\ +\ y)\ *\ width\ +\ x)\ *\ channels\ +\ c}. \subparagraph{Args: }\label{args-12} \begin{itemize} \tightlist \item \texttt{input}: A \texttt{Tensor} of type \texttt{float32}. 4-D with shape \texttt{{[}batch,\ height,\ width,\ channels{]}}. Input to pool over. \item \texttt{ksize}: A list of \texttt{ints} that has length \texttt{\textgreater{}=\ 4}. The size of the window for each dimension of the input tensor. \item \texttt{strides}: A list of \texttt{ints} that has length \texttt{\textgreater{}=\ 4}. The stride of the sliding window for each dimension of the input tensor. \item \texttt{padding}: A \texttt{string} from: \texttt{"SAME",\ "VALID"}. The type of padding algorithm to use. 
\item \texttt{Targmax}: An optional \texttt{tf.DType} from: \texttt{tf.int32,\ tf.int64}. Defaults to \texttt{tf.int64}. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-12} A tuple of \texttt{Tensor} objects (output, argmax). \begin{itemize} \tightlist \item \texttt{output}: A \texttt{Tensor} of type \texttt{float32}. The max pooled output tensor. \item \texttt{argmax}: A \texttt{Tensor} of type \texttt{Targmax}. 4-D. The flattened indices of the max values chosen for each output. \end{itemize} \subsection{Normalization }\label{normalization} Normalization is useful to prevent neurons from saturating when inputs may have varying scale, and to aid generalization. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.nn.l2\_normalize(x,\ dim,\ epsilon=1e-12,\ name=None)} }{tf.nn.l2\_normalize(x, dim, epsilon=1e-12, name=None) }}\label{tf.nn.l2ux5fnormalizex-dim-epsilon1e-12-namenone} Normalizes along dimension \texttt{dim} using an L2 norm. For a 1-D tensor with \texttt{dim\ =\ 0}, computes \begin{verbatim} output = x / sqrt(max(sum(x**2), epsilon)) \end{verbatim} For \texttt{x} with more dimensions, independently normalizes each 1-D slice along dimension \texttt{dim}. \subparagraph{Args: }\label{args-13} \begin{itemize} \tightlist \item \texttt{x}: A \texttt{Tensor}. \item \texttt{dim}: Dimension along which to normalize. \item \texttt{epsilon}: A lower bound value for the norm. Will use \texttt{sqrt(epsilon)} as the divisor if \texttt{norm\ \textless{}\ sqrt(epsilon)}. \item \texttt{name}: A name for this operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-13} A \texttt{Tensor} with the same shape as \texttt{x}. 
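The 1-D formula for \texttt{l2\_normalize} can be checked with a short pure-Python sketch. The helper below is a hypothetical stand-in written for this example (it does not call TensorFlow); it implements \texttt{x\ /\ sqrt(max(sum(x**2),\ epsilon))} for one vector, and the 2-D case is shown by applying it to each row, which corresponds to normalizing along the last dimension.

```python
import math


def l2_normalize(x, epsilon=1e-12):
    """Normalize a flat list of floats: x / sqrt(max(sum(x**2), epsilon))."""
    square_sum = sum(v * v for v in x)
    divisor = math.sqrt(max(square_sum, epsilon))
    return [v / divisor for v in x]


print(l2_normalize([3.0, 4.0]))  # [0.6, 0.8]
print(l2_normalize([0.0, 0.0]))  # [0.0, 0.0]; epsilon keeps the divisor nonzero

# For a 2-D input, each 1-D slice along the chosen dimension is normalized
# independently; here every row is treated as one slice.
rows = [[3.0, 4.0], [0.0, 2.0]]
print([l2_normalize(row) for row in rows])  # [[0.6, 0.8], [0.0, 1.0]]
```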
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.nn.local\_response\_normalization(input,\ depth\_radius=None,\ bias=None,\ alpha=None,\ beta=None,\ name=None)} }{tf.nn.local\_response\_normalization(input, depth\_radius=None, bias=None, alpha=None, beta=None, name=None) }}\label{tf.nn.localux5fresponseux5fnormalizationinput-depthux5fradiusnone-biasnone-alphanone-betanone-namenone} Local Response Normalization. The 4-D \texttt{input} tensor is treated as a 3-D array of 1-D vectors (along the last dimension), and each vector is normalized independently. Within a given vector, each component is divided by the weighted sum of squared inputs within \texttt{depth\_radius}, offset by \texttt{bias} and raised to the power \texttt{beta}. In detail, \begin{verbatim} sqr_sum[a, b, c, d] = sum(input[a, b, c, d - depth_radius : d + depth_radius + 1] ** 2) output = input / (bias + alpha * sqr_sum) ** beta \end{verbatim} For details, see \href{http://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks}{Krizhevsky et al., ImageNet classification with deep convolutional neural networks (NIPS 2012)}. \subparagraph{Args: }\label{args-14} \begin{itemize} \tightlist \item \texttt{input}: A \texttt{Tensor} of type \texttt{float32}. 4-D. \item \texttt{depth\_radius}: An optional \texttt{int}. Defaults to \texttt{5}. 0-D. Half-width of the 1-D normalization window. \item \texttt{bias}: An optional \texttt{float}. Defaults to \texttt{1}. An offset (usually positive to avoid dividing by 0). \item \texttt{alpha}: An optional \texttt{float}. Defaults to \texttt{1}. A scale factor, usually positive. \item \texttt{beta}: An optional \texttt{float}. Defaults to \texttt{0.5}. An exponent. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-14} A \texttt{Tensor} of type \texttt{float32}. 
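As a rough illustration of local response normalization, the sketch below normalizes a single 1-D vector, which is what the op does independently for every vector along the last dimension. It is a pure-Python approximation under stated assumptions, not TensorFlow's kernel: the exponent \texttt{beta} is applied to the whole denominator \texttt{bias\ +\ alpha\ *\ sqr\_sum}, as in Krizhevsky et al., and the window is assumed to be clipped at the ends of the vector. The function name and defaults simply mirror the argument list above.

```python
def local_response_normalization(vector, depth_radius=5, bias=1.0,
                                 alpha=1.0, beta=0.5):
    """Normalize one 1-D list of floats over a sliding depth window."""
    out = []
    for d, x in enumerate(vector):
        # Assumption of this sketch: the window is clipped at the vector ends.
        lo = max(0, d - depth_radius)
        hi = d + depth_radius + 1
        sqr_sum = sum(v * v for v in vector[lo:hi])
        out.append(x / (bias + alpha * sqr_sum) ** beta)
    return out


# With depth_radius=1 both windows cover the whole two-element vector, so
# sqr_sum = 8 and each output is 2 / (1 + 8) ** 0.5 = 2 / 3.
print(local_response_normalization([2.0, 2.0], depth_radius=1))

# With depth_radius=0 each component is normalized by its own square only.
print(local_response_normalization([0.0, 1.0], depth_radius=0))
```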
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.nn.moments(x,\ axes,\ name=None)} }{tf.nn.moments(x, axes, name=None) }}\label{tf.nn.momentsx-axes-namenone} Calculate the mean and variance of \texttt{x}. The mean and variance are calculated by aggregating the contents of \texttt{x} across \texttt{axes}. If \texttt{x} is 1-D and \texttt{axes\ =\ {[}0{]}} this is just the mean and variance of a vector. For so-called ``global normalization'' needed for convolutional filters pass \texttt{axes={[}0,\ 1,\ 2{]}} (batch, height, width). For batch normalization pass \texttt{axes={[}0{]}} (batch). \subparagraph{Args: }\label{args-15} \begin{itemize} \tightlist \item \texttt{x}: A \texttt{Tensor}. \item \texttt{axes}: array of ints. Axes along which to compute mean and variance. \item \texttt{name}: Name used to scope the operations that compute the moments. \end{itemize} \subparagraph{Returns: }\label{returns-15} Two \texttt{Tensors}: \texttt{mean} and \texttt{variance}. \subsection{Losses }\label{losses} The loss ops measure error between two tensors, or between a tensor and zero. These can be used for measuring accuracy of a network in a regression task or for regularization purposes (weight decay). \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.nn.l2\_loss(t,\ name=None)} }{tf.nn.l2\_loss(t, name=None) }}\label{tf.nn.l2ux5flosst-namenone} L2 Loss. Computes half the L2 norm of a tensor without the \texttt{sqrt}: \begin{verbatim} output = sum(t ** 2) / 2 \end{verbatim} \subparagraph{Args: }\label{args-16} \begin{itemize} \tightlist \item \texttt{t}: A \texttt{Tensor}. Must be one of the following types: \texttt{float32}, \texttt{float64}, \texttt{int64}, \texttt{int32}, \texttt{uint8}, \texttt{int16}, \texttt{int8}, \texttt{complex64}, \texttt{qint8}, \texttt{quint8}, \texttt{qint32}. Typically 2-D, but may have any dimensions. 
\item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-16} A \texttt{Tensor}. Has the same type as \texttt{t}. 0-D. \subsection{Classification }\label{classification} TensorFlow provides several operations that help you perform classification. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.nn.sigmoid\_cross\_entropy\_with\_logits(logits,\ targets,\ name=None)} }{tf.nn.sigmoid\_cross\_entropy\_with\_logits(logits, targets, name=None) }}\label{tf.nn.sigmoidux5fcrossux5fentropyux5fwithux5flogitslogits-targets-namenone} Computes sigmoid cross entropy given \texttt{logits}. Measures the probability error in discrete classification tasks in which each class is independent and not mutually exclusive. For instance, one could perform multilabel classification where a picture can contain both an elephant and a dog at the same time. For brevity, let \texttt{x\ =\ logits}, \texttt{z\ =\ targets}. The logistic loss is \begin{verbatim} x - x * z + log(1 + exp(-x)) \end{verbatim} To ensure stability and avoid overflow, the implementation uses \begin{verbatim} max(x, 0) - x * z + log(1 + exp(-abs(x))) \end{verbatim} \texttt{logits} and \texttt{targets} must have the same type and shape. \subparagraph{Args: }\label{args-17} \begin{itemize} \tightlist \item \texttt{logits}: A \texttt{Tensor} of type \texttt{float32} or \texttt{float64}. \item \texttt{targets}: A \texttt{Tensor} of the same type and shape as \texttt{logits}. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-17} A \texttt{Tensor} of the same shape as \texttt{logits} with the componentwise logistic losses. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.nn.softmax(logits,\ name=None)} }{tf.nn.softmax(logits, name=None) }}\label{tf.nn.softmaxlogits-namenone} Computes softmax activations. 
For each batch \texttt{i} and class \texttt{j} we have \begin{verbatim} softmax[i, j] = exp(logits[i, j]) / sum(exp(logits[i])) \end{verbatim} \subparagraph{Args: }\label{args-18} \begin{itemize} \tightlist \item \texttt{logits}: A \texttt{Tensor}. Must be one of the following types: \texttt{float32}, \texttt{float64}. 2-D with shape \texttt{{[}batch\_size,\ num\_classes{]}}. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-18} A \texttt{Tensor}. Has the same type as \texttt{logits}. Same shape as \texttt{logits}. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.nn.softmax\_cross\_entropy\_with\_logits(logits,\ labels,\ name=None)} }{tf.nn.softmax\_cross\_entropy\_with\_logits(logits, labels, name=None) }}\label{tf.nn.softmaxux5fcrossux5fentropyux5fwithux5flogitslogits-labels-namenone} Computes softmax cross entropy between \texttt{logits} and \texttt{labels}. Measures the probability error in discrete classification tasks in which the classes are mutually exclusive (each entry is in exactly one class). For example, each CIFAR-10 image is labeled with one and only one label: an image can be a dog or a truck, but not both. \textbf{WARNING:} This op expects unscaled logits, since it performs a \texttt{softmax} on \texttt{logits} internally for efficiency. Do not call this op with the output of \texttt{softmax}, as it will produce incorrect results. \texttt{logits} and \texttt{labels} must have the same shape \texttt{{[}batch\_size,\ num\_classes{]}} and the same dtype (either \texttt{float32} or \texttt{float64}). \subparagraph{Args: }\label{args-19} \begin{itemize} \tightlist \item \texttt{logits}: Unscaled log probabilities. \item \texttt{labels}: Each row \texttt{labels{[}i{]}} must be a valid probability distribution. \item \texttt{name}: A name for the operation (optional). 
\end{itemize} \subparagraph{Returns: }\label{returns-19} A 1-D \texttt{Tensor} of length \texttt{batch\_size} of the same type as \texttt{logits} with the softmax cross entropy loss. \subsection{Embeddings }\label{embeddings} TensorFlow provides library support for looking up values in embedding tensors. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.nn.embedding\_lookup(params,\ ids,\ name=None)} }{tf.nn.embedding\_lookup(params, ids, name=None) }}\label{tf.nn.embeddingux5flookupparams-ids-namenone} Looks up \texttt{ids} in a list of embedding tensors. This function is used to perform parallel lookups on the list of tensors in \texttt{params}. It is a generalization of \href{../../api_docs/python/array_ops.md\#gather}{\texttt{tf.gather()}}, where \texttt{params} is interpreted as a partition of a larger embedding tensor. If \texttt{len(params)\ \textgreater{}\ 1}, each element \texttt{id} of \texttt{ids} is partitioned between the elements of \texttt{params} by computing \texttt{p\ =\ id\ \%\ len(params)}, and is then used to look up the slice \texttt{params{[}p{]}{[}id\ //\ len(params),\ ...{]}}. The results of the lookup are then concatenated into a dense tensor. The returned tensor has shape \texttt{shape(ids)\ +\ shape(params){[}1:{]}}. \subparagraph{Args: }\label{args-20} \begin{itemize} \tightlist \item \texttt{params}: A list of tensors with the same shape and type. \item \texttt{ids}: A \texttt{Tensor} with type \texttt{int32} containing the ids to be looked up in \texttt{params}. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-20} A \texttt{Tensor} with the same type as the tensors in \texttt{params}. \subparagraph{Raises: }\label{raises-1} \begin{itemize} \tightlist \item \texttt{ValueError}: If \texttt{params} is empty. 
\end{itemize} \subsection{Evaluation }\label{evaluation} The evaluation ops are useful for measuring the performance of a network. Since they are nondifferentiable, they are typically used at evaluation time. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.nn.top\_k(input,\ k,\ name=None)} }{tf.nn.top\_k(input, k, name=None) }}\label{tf.nn.topux5fkinput-k-namenone} Returns the values and indices of the k largest elements for each row. \(values_{i, j}\) represents the j-th largest element in \(input_i\). \(indices_{i, j}\) gives the column index of the corresponding element, such that \(input_{i, indices_{i, j}} = values_{i, j}\). If two elements are equal, the lower-index element appears first. \subparagraph{Args: }\label{args-21} \begin{itemize} \tightlist \item \texttt{input}: A \texttt{Tensor}. Must be one of the following types: \texttt{float32}, \texttt{float64}, \texttt{int32}, \texttt{int64}, \texttt{uint8}, \texttt{int16}, \texttt{int8}. A batch\_size x classes tensor. \item \texttt{k}: An \texttt{int} that is \texttt{\textgreater{}=\ 1}. Number of top elements to look for within each row. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-21} A tuple of \texttt{Tensor} objects (values, indices). \begin{itemize} \tightlist \item \texttt{values}: A \texttt{Tensor}. Has the same type as \texttt{input}. A batch\_size x k tensor with the k largest elements for each row, sorted in descending order. \item \texttt{indices}: A \texttt{Tensor} of type \texttt{int32}. 
A batch\_size x k tensor with the index of each value within each row. \end{itemize} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.nn.in\_top\_k(predictions,\ targets,\ k,\ name=None)} }{tf.nn.in\_top\_k(predictions, targets, k, name=None) }}\label{tf.nn.inux5ftopux5fkpredictions-targets-k-namenone} Says whether the targets are in the top K predictions. This outputs a bool array of length batch\_size; entry \texttt{out{[}i{]}} is true if the prediction for the target class is among the top k predictions for example i. Note that the behavior of InTopK differs from the TopK op in its handling of ties; if multiple classes have the same prediction value and straddle the top-k boundary, all of those classes are considered to be in the top k. More formally, let \(predictions_i\) be the predictions for all classes for example i, \(targets_i\) be the target class for example i, and \(out_i\) be the output for example i. Then \[out_i = predictions_{i, targets_i} \in TopKIncludingTies(predictions_i)\] \subparagraph{Args: }\label{args-22} \begin{itemize} \tightlist \item \texttt{predictions}: A \texttt{Tensor} of type \texttt{float32}. A batch\_size x classes tensor. \item \texttt{targets}: A \texttt{Tensor} of type \texttt{int32}. A batch\_size vector of class ids. \item \texttt{k}: An \texttt{int}. Number of top elements to look at for computing precision. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-22} A \texttt{Tensor} of type \texttt{bool}. Computed precision at k as a bool \texttt{Tensor}. \subsection{Candidate Sampling }\label{candidate-sampling} Do you want to train a multiclass or multilabel model with thousands or millions of output classes (for example, a language model with a large vocabulary)? 
Training with a full Softmax is slow in this case, since all of the classes are evaluated for every training example. Candidate Sampling training algorithms can speed up your step times by only considering a small randomly-chosen subset of contrastive classes (called candidates) for each batch of training examples.

See our \href{../../extras/candidate_sampling.pdf}{Candidate Sampling Algorithms Reference}.

\subsubsection{Sampled Loss Functions }\label{sampled-loss-functions}

TensorFlow provides the following sampled loss functions for faster training.

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\subsubsection{\texorpdfstring{\texttt{tf.nn.nce\_loss(weights,\ biases,\ inputs,\ labels,\ num\_sampled,\ num\_classes,\ num\_true=1,\ sampled\_values=None,\ remove\_accidental\_hits=False,\ name=\textquotesingle{}nce\_loss\textquotesingle{})} }{tf.nn.nce\_loss(weights, biases, inputs, labels, num\_sampled, num\_classes, num\_true=1, sampled\_values=None, remove\_accidental\_hits=False, name='nce\_loss') }}\label{tf.nn.nceux5flossweights-biases-inputs-labels-numux5fsampled-numux5fclasses-numux5ftrue1-sampledux5fvaluesnone-removeux5faccidentalux5fhitsfalse-namenceux5floss}

Computes and returns the noise-contrastive estimation training loss.

See \href{http://www.jmlr.org/proceedings/papers/v9/gutmann10a/gutmann10a.pdf}{Noise-contrastive estimation: A new estimation principle for unnormalized statistical models}. Also see our \href{http://www.tensorflow.org/extras/candidate_sampling.pdf}{Candidate Sampling Algorithms Reference}.

Note: In the case where num\_true \textgreater{} 1, we assign to each target class the target probability 1 / num\_true so that the target probabilities sum to 1 per-example.

Note: It would be useful to allow a variable number of target classes per example. We hope to provide this functionality in a future release.
For now, if you have a variable number of target classes, you can pad them out to a constant number by either repeating them or by padding with an otherwise unused class.

\subparagraph{Args: }\label{args-23}

\begin{itemize}
\tightlist
\item \texttt{weights}: A \texttt{Tensor} of shape {[}num\_classes, dim{]}. The class embeddings.
\item \texttt{biases}: A \texttt{Tensor} of shape {[}num\_classes{]}. The class biases.
\item \texttt{inputs}: A \texttt{Tensor} of shape {[}batch\_size, dim{]}. The forward activations of the input network.
\item \texttt{labels}: A \texttt{Tensor} of type \texttt{int64} and shape \texttt{{[}batch\_size,\ num\_true{]}}. The target classes.
\item \texttt{num\_sampled}: An \texttt{int}. The number of classes to randomly sample per batch.
\item \texttt{num\_classes}: An \texttt{int}. The number of possible classes.
\item \texttt{num\_true}: An \texttt{int}. The number of target classes per training example.
\item \texttt{sampled\_values}: A tuple of \texttt{(sampled\_candidates,\ true\_expected\_count,\ sampled\_expected\_count)} returned by a \texttt{*\_candidate\_sampler} function. (If \texttt{None}, we default to \texttt{LogUniformCandidateSampler}.)
\item \texttt{remove\_accidental\_hits}: A \texttt{bool}. Whether to remove ``accidental hits'' where a sampled class equals one of the target classes. If set to \texttt{True}, this is a ``Sampled Logistic'' loss instead of NCE, and we are learning to generate log-odds instead of log probabilities. See our \href{http://www.tensorflow.org/extras/candidate_sampling.pdf}{Candidate Sampling Algorithms Reference}. Default is False.
\item \texttt{name}: A name for the operation (optional).
\end{itemize}

\subparagraph{Returns: }\label{returns-23}

A batch\_size 1-D tensor of per-example NCE losses.
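
The padding workaround mentioned in the note for \texttt{tf.nn.nce\_loss} can be sketched in plain Python. This is only an illustration of the idea, not part of TensorFlow: the helper name \texttt{pad\_labels} and its \texttt{pad\_class} argument are hypothetical, and the result would still need to be converted to an \texttt{int64} tensor of shape \texttt{{[}batch\_size,\ num\_true{]}} before being passed as \texttt{labels}.

```python
def pad_labels(labels_per_example, num_true, pad_class=None):
    """Pad each example's target classes to exactly num_true entries.

    Hypothetical helper, not a TensorFlow function.
    If pad_class is None, an example's own classes are repeated cyclically;
    otherwise pad_class (an otherwise unused class id) fills the gap.
    """
    padded = []
    for classes in labels_per_example:
        if not 0 < len(classes) <= num_true:
            raise ValueError("each example needs between 1 and num_true target classes")
        row = list(classes)
        i = 0
        while len(row) < num_true:
            # Either cycle through the example's own classes or use the filler id.
            row.append(classes[i % len(classes)] if pad_class is None else pad_class)
            i += 1
        padded.append(row)
    return padded


ragged = [[3], [1, 4], [2, 5, 7]]
repeated = pad_labels(ragged, num_true=3)
padded_unused = pad_labels(ragged, num_true=3, pad_class=9)
```

Both variants yield a rectangular list with \texttt{num\_true} columns. Which one is preferable depends on the model; the text above only states that either approach can be used.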
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\subsubsection{\texorpdfstring{\texttt{tf.nn.sampled\_softmax\_loss(weights,\ biases,\ inputs,\ labels,\ num\_sampled,\ num\_classes,\ num\_true=1,\ sampled\_values=None,\ remove\_accidental\_hits=True,\ name=\textquotesingle{}sampled\_softmax\_loss\textquotesingle{})} }{tf.nn.sampled\_softmax\_loss(weights, biases, inputs, labels, num\_sampled, num\_classes, num\_true=1, sampled\_values=None, remove\_accidental\_hits=True, name='sampled\_softmax\_loss') }}\label{tf.nn.sampledux5fsoftmaxux5flossweights-biases-inputs-labels-numux5fsampled-numux5fclasses-numux5ftrue1-sampledux5fvaluesnone-removeux5faccidentalux5fhitstrue-namesampledux5fsoftmaxux5floss}

Computes and returns the sampled softmax training loss.

This is a faster way to train a softmax classifier over a huge number of classes.

This operation is for training only. It is generally an underestimate of the full softmax loss.

At inference time, you can compute full softmax probabilities with the expression \texttt{tf.nn.softmax(tf.matmul(inputs,\ tf.transpose(weights))\ +\ biases)}. The transpose is needed because \texttt{weights} has shape {[}num\_classes, dim{]} while \texttt{inputs} has shape {[}batch\_size, dim{]}.

See our \href{http://www.tensorflow.org/extras/candidate_sampling.pdf}{Candidate Sampling Algorithms Reference}.

Also see Section 3 of \url{http://arxiv.org/abs/1412.2007} for the math.

\subparagraph{Args: }\label{args-24}

\begin{itemize}
\tightlist
\item \texttt{weights}: A \texttt{Tensor} of shape {[}num\_classes, dim{]}. The class embeddings.
\item \texttt{biases}: A \texttt{Tensor} of shape {[}num\_classes{]}. The class biases.
\item \texttt{inputs}: A \texttt{Tensor} of shape {[}batch\_size, dim{]}. The forward activations of the input network.
\item \texttt{labels}: A \texttt{Tensor} of type \texttt{int64} and shape \texttt{{[}batch\_size,\ num\_true{]}}. The target classes. Note that this format differs from the \texttt{labels} argument of \texttt{nn.softmax\_cross\_entropy\_with\_logits}.
\item \texttt{num\_sampled}: An \texttt{int}.
The number of classes to randomly sample per batch.
\item \texttt{num\_classes}: An \texttt{int}. The number of possible classes.
\item \texttt{num\_true}: An \texttt{int}. The number of target classes per training example.
\item \texttt{sampled\_values}: A tuple of \texttt{(sampled\_candidates,\ true\_expected\_count,\ sampled\_expected\_count)} returned by a \texttt{*\_candidate\_sampler} function. (If \texttt{None}, we default to \texttt{LogUniformCandidateSampler}.)
\item \texttt{remove\_accidental\_hits}: A \texttt{bool}. Whether to remove ``accidental hits'' where a sampled class equals one of the target classes. Default is True.
\item \texttt{name}: A name for the operation (optional).
\end{itemize}

\subparagraph{Returns: }\label{returns-24}

A batch\_size 1-D tensor of per-example sampled softmax losses.

\subsubsection{Candidate Samplers }\label{candidate-samplers}

TensorFlow provides the following samplers for randomly sampling candidate classes when using one of the sampled loss functions above.

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\subsubsection{\texorpdfstring{\texttt{tf.nn.uniform\_candidate\_sampler(true\_classes,\ num\_true,\ num\_sampled,\ unique,\ range\_max,\ seed=None,\ name=None)} }{tf.nn.uniform\_candidate\_sampler(true\_classes, num\_true, num\_sampled, unique, range\_max, seed=None, name=None) }}\label{tf.nn.uniformux5fcandidateux5fsamplertrueux5fclasses-numux5ftrue-numux5fsampled-unique-rangeux5fmax-seednone-namenone}

Samples a set of classes using a uniform base distribution.

This operation randomly samples a tensor of sampled classes (\texttt{sampled\_candidates}) from the half-open range of integers \texttt{{[}0,\ range\_max)}.

The elements of \texttt{sampled\_candidates} are drawn without replacement (if \texttt{unique=True}) or with replacement (if \texttt{unique=False}) from the base distribution.

The base distribution for this operation is the uniform distribution over the range of integers \texttt{{[}0,\ range\_max)}.
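
The difference between the two sampling modes can be illustrated with the Python standard library. This is a conceptual sketch only, not how the TensorFlow op is implemented: the function name \texttt{sample\_uniform\_candidates} is hypothetical, and it assumes class ids run from 0 up to but not including \texttt{range\_max}, since \texttt{range\_max} is described as the number of possible classes.

```python
import random


def sample_uniform_candidates(num_sampled, range_max, unique, seed=None):
    """Draw num_sampled class ids uniformly from 0 .. range_max - 1.

    Hypothetical stand-in for illustration; not the TensorFlow op.
    """
    rng = random.Random(seed)
    if unique:
        # Without replacement: every sampled id is distinct.
        return rng.sample(range(range_max), num_sampled)
    # With replacement: the same id may be drawn more than once.
    return [rng.randrange(range_max) for _ in range(num_sampled)]


distinct = sample_uniform_candidates(5, 10, unique=True, seed=0)
maybe_repeated = sample_uniform_candidates(20, 3, unique=False, seed=0)
```

With \texttt{unique=True} the request can only be satisfied when \texttt{num\_sampled} does not exceed \texttt{range\_max}; in this sketch \texttt{random.sample} raises \texttt{ValueError} otherwise.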
In addition, this operation returns tensors \texttt{true\_expected\_count} and \texttt{sampled\_expected\_count} representing the number of times each of the target classes (\texttt{true\_classes}) and the sampled classes (\texttt{sampled\_candidates}) is expected to occur in an average tensor of sampled classes. These values correspond to \texttt{Q(y\textbar{}x)} defined in \href{http://www.tensorflow.org/extras/candidate_sampling.pdf}{this document}. If \texttt{unique=True}, then these are post-rejection probabilities and we compute them approximately. \subparagraph{Args: }\label{args-25} \begin{itemize} \tightlist \item \texttt{true\_classes}: A \texttt{Tensor} of type \texttt{int64} and shape \texttt{{[}batch\_size,\ num\_true{]}}. The target classes. \item \texttt{num\_true}: An \texttt{int}. The number of target classes per training example. \item \texttt{num\_sampled}: An \texttt{int}. The number of classes to randomly sample per batch. \item \texttt{unique}: A \texttt{bool}. Determines whether all sampled classes in a batch are unique. \item \texttt{range\_max}: An \texttt{int}. The number of possible classes. \item \texttt{seed}: An \texttt{int}. An operation-specific seed. Default is 0. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-25} \begin{itemize} \tightlist \item \texttt{sampled\_candidates}: A tensor of type \texttt{int64} and shape \texttt{{[}num\_sampled{]}}. The sampled classes. \item \texttt{true\_expected\_count}: A tensor of type \texttt{float}. Same shape as \texttt{true\_classes}. The expected counts under the sampling distribution of each of \texttt{true\_classes}. \item \texttt{sampled\_expected\_count}: A tensor of type \texttt{float}. Same shape as \texttt{sampled\_candidates}. The expected counts under the sampling distribution of each of \texttt{sampled\_candidates}. 
\end{itemize}

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\subsubsection{\texorpdfstring{\texttt{tf.nn.log\_uniform\_candidate\_sampler(true\_classes,\ num\_true,\ num\_sampled,\ unique,\ range\_max,\ seed=None,\ name=None)} }{tf.nn.log\_uniform\_candidate\_sampler(true\_classes, num\_true, num\_sampled, unique, range\_max, seed=None, name=None) }}\label{tf.nn.logux5funiformux5fcandidateux5fsamplertrueux5fclasses-numux5ftrue-numux5fsampled-unique-rangeux5fmax-seednone-namenone}

Samples a set of classes using a log-uniform (Zipfian) base distribution.

This operation randomly samples a tensor of sampled classes (\texttt{sampled\_candidates}) from the half-open range of integers \texttt{{[}0,\ range\_max)}.

The elements of \texttt{sampled\_candidates} are drawn without replacement (if \texttt{unique=True}) or with replacement (if \texttt{unique=False}) from the base distribution.

The base distribution for this operation is an approximately log-uniform or Zipfian distribution:

\texttt{P(class)\ =\ (log(class\ +\ 2)\ -\ log(class\ +\ 1))\ /\ log(range\_max\ +\ 1)}

This sampler is useful when the target classes approximately follow such a distribution, for example if the classes represent words in a lexicon sorted in decreasing order of frequency. If your classes are not ordered by decreasing frequency, do not use this op.

In addition, this operation returns tensors \texttt{true\_expected\_count} and \texttt{sampled\_expected\_count} representing the number of times each of the target classes (\texttt{true\_classes}) and the sampled classes (\texttt{sampled\_candidates}) is expected to occur in an average tensor of sampled classes. These values correspond to \texttt{Q(y\textbar{}x)} defined in \href{http://www.tensorflow.org/extras/candidate_sampling.pdf}{this document}. If \texttt{unique=True}, then these are post-rejection probabilities and we compute them approximately.
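
The log-uniform formula above can be checked numerically with a few lines of plain Python. This is a sketch for intuition, not the op's implementation; the function name \texttt{log\_uniform\_prob} is hypothetical, and class ids are assumed to run from 0 up to but not including \texttt{range\_max}. Over that range the numerators telescope to \texttt{log(range\_max\ +\ 1)}, so the probabilities sum to 1.

```python
import math


def log_uniform_prob(cls, range_max):
    """P(class) = (log(class + 2) - log(class + 1)) / log(range_max + 1).

    Hypothetical helper that evaluates the formula from the text.
    """
    return (math.log(cls + 2) - math.log(cls + 1)) / math.log(range_max + 1)


range_max = 1000
probs = [log_uniform_prob(c, range_max) for c in range(range_max)]
total = sum(probs)  # telescoping sum, approximately 1.0
```

Because the probability decreases as the class id grows, low ids (the most frequent words in a frequency-sorted lexicon) are sampled far more often than high ids.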
\subparagraph{Args: }\label{args-26} \begin{itemize} \tightlist \item \texttt{true\_classes}: A \texttt{Tensor} of type \texttt{int64} and shape \texttt{{[}batch\_size,\ num\_true{]}}. The target classes. \item \texttt{num\_true}: An \texttt{int}. The number of target classes per training example. \item \texttt{num\_sampled}: An \texttt{int}. The number of classes to randomly sample per batch. \item \texttt{unique}: A \texttt{bool}. Determines whether all sampled classes in a batch are unique. \item \texttt{range\_max}: An \texttt{int}. The number of possible classes. \item \texttt{seed}: An \texttt{int}. An operation-specific seed. Default is 0. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-26} \begin{itemize} \tightlist \item \texttt{sampled\_candidates}: A tensor of type \texttt{int64} and shape \texttt{{[}num\_sampled{]}}. The sampled classes. \item \texttt{true\_expected\_count}: A tensor of type \texttt{float}. Same shape as \texttt{true\_classes}. The expected counts under the sampling distribution of each of \texttt{true\_classes}. \item \texttt{sampled\_expected\_count}: A tensor of type \texttt{float}. Same shape as \texttt{sampled\_candidates}. The expected counts under the sampling distribution of each of \texttt{sampled\_candidates}. \end{itemize} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.nn.learned\_unigram\_candidate\_sampler(true\_classes,\ num\_true,\ num\_sampled,\ unique,\ range\_max,\ seed=None,\ name=None)} }{tf.nn.learned\_unigram\_candidate\_sampler(true\_classes, num\_true, num\_sampled, unique, range\_max, seed=None, name=None) }}\label{tf.nn.learnedux5funigramux5fcandidateux5fsamplertrueux5fclasses-numux5ftrue-numux5fsampled-unique-rangeux5fmax-seednone-namenone} Samples a set of classes from a distribution learned during training. 
This operation randomly samples a tensor of sampled classes (\texttt{sampled\_candidates}) from the half-open range of integers \texttt{{[}0,\ range\_max)}.

The elements of \texttt{sampled\_candidates} are drawn without replacement (if \texttt{unique=True}) or with replacement (if \texttt{unique=False}) from the base distribution.

The base distribution for this operation is constructed on the fly during training. It is a unigram distribution over the target classes seen so far during training. Every integer in \texttt{{[}0,\ range\_max)} begins with a weight of 1, and its weight is incremented by 1 each time it is seen as a target class. The base distribution is not saved to checkpoints, so it is reset when the model is reloaded.

In addition, this operation returns tensors \texttt{true\_expected\_count} and \texttt{sampled\_expected\_count} representing the number of times each of the target classes (\texttt{true\_classes}) and the sampled classes (\texttt{sampled\_candidates}) is expected to occur in an average tensor of sampled classes. These values correspond to \texttt{Q(y\textbar{}x)} defined in \href{http://www.tensorflow.org/extras/candidate_sampling.pdf}{this document}. If \texttt{unique=True}, then these are post-rejection probabilities and we compute them approximately.

\subparagraph{Args: }\label{args-27}

\begin{itemize}
\tightlist
\item \texttt{true\_classes}: A \texttt{Tensor} of type \texttt{int64} and shape \texttt{{[}batch\_size,\ num\_true{]}}. The target classes.
\item \texttt{num\_true}: An \texttt{int}. The number of target classes per training example.
\item \texttt{num\_sampled}: An \texttt{int}. The number of classes to randomly sample per batch.
\item \texttt{unique}: A \texttt{bool}. Determines whether all sampled classes in a batch are unique.
\item \texttt{range\_max}: An \texttt{int}. The number of possible classes.
\item \texttt{seed}: An \texttt{int}. An operation-specific seed. Default is 0.
\item \texttt{name}: A name for the operation (optional).
\end{itemize}

\subparagraph{Returns: }\label{returns-27}

\begin{itemize}
\tightlist
\item \texttt{sampled\_candidates}: A tensor of type \texttt{int64} and shape \texttt{{[}num\_sampled{]}}. The sampled classes.
\item \texttt{true\_expected\_count}: A tensor of type \texttt{float}. Same shape as \texttt{true\_classes}. The expected counts under the sampling distribution of each of \texttt{true\_classes}.
\item \texttt{sampled\_expected\_count}: A tensor of type \texttt{float}. Same shape as \texttt{sampled\_candidates}. The expected counts under the sampling distribution of each of \texttt{sampled\_candidates}.
\end{itemize}

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\subsubsection{\texorpdfstring{\texttt{tf.nn.fixed\_unigram\_candidate\_sampler(true\_classes,\ num\_true,\ num\_sampled,\ unique,\ range\_max,\ vocab\_file=\textquotesingle{}\textquotesingle{},\ distortion=0.0,\ num\_reserved\_ids=0,\ num\_shards=1,\ shard=0,\ unigrams={[}{]},\ seed=None,\ name=None)} }{tf.nn.fixed\_unigram\_candidate\_sampler(true\_classes, num\_true, num\_sampled, unique, range\_max, vocab\_file='', distortion=0.0, num\_reserved\_ids=0, num\_shards=1, shard=0, unigrams={[}{]}, seed=None, name=None) }}\label{tf.nn.fixedux5funigramux5fcandidateux5fsamplertrueux5fclasses-numux5ftrue-numux5fsampled-unique-rangeux5fmax-vocabux5ffile-distortion0.0-numux5freservedux5fids0-numux5fshards1-shard0-unigrams-seednone-namenone}

Samples a set of classes using the provided (fixed) base distribution.

This operation randomly samples a tensor of sampled classes (\texttt{sampled\_candidates}) from the half-open range of integers \texttt{{[}0,\ range\_max)}.

The elements of \texttt{sampled\_candidates} are drawn without replacement (if \texttt{unique=True}) or with replacement (if \texttt{unique=False}) from the base distribution.

The base distribution is read from a file or passed in as an in-memory array.
There is also an option to skew the distribution by applying a distortion power to the weights. In addition, this operation returns tensors \texttt{true\_expected\_count} and \texttt{sampled\_expected\_count} representing the number of times each of the target classes (\texttt{true\_classes}) and the sampled classes (\texttt{sampled\_candidates}) is expected to occur in an average tensor of sampled classes. These values correspond to \texttt{Q(y\textbar{}x)} defined in \href{http://www.tensorflow.org/extras/candidate_sampling.pdf}{this document}. If \texttt{unique=True}, then these are post-rejection probabilities and we compute them approximately. \subparagraph{Args: }\label{args-28} \begin{itemize} \tightlist \item \texttt{true\_classes}: A \texttt{Tensor} of type \texttt{int64} and shape \texttt{{[}batch\_size,\ num\_true{]}}. The target classes. \item \texttt{num\_true}: An \texttt{int}. The number of target classes per training example. \item \texttt{num\_sampled}: An \texttt{int}. The number of classes to randomly sample per batch. \item \texttt{unique}: A \texttt{bool}. Determines whether all sampled classes in a batch are unique. \item \texttt{range\_max}: An \texttt{int}. The number of possible classes. \item \texttt{vocab\_file}: Each valid line in this file (which should have a CSV-like format) corresponds to a valid word ID. IDs are in sequential order, starting from num\_reserved\_ids. The last entry in each line is expected to be a value corresponding to the count or relative probability. Exactly one of \texttt{vocab\_file} and \texttt{unigrams} needs to be passed to this operation. \item \texttt{distortion}: The distortion is used to skew the unigram probability distribution. Each weight is first raised to the distortion's power before adding to the internal unigram distribution. As a result, \texttt{distortion\ =\ 1.0} gives regular unigram sampling (as defined by the vocab file), and \texttt{distortion\ =\ 0.0} gives a uniform distribution. 
\item \texttt{num\_reserved\_ids}: Optionally, users can reserve some IDs in the half-open range \texttt{{[}0,\ num\_reserved\_ids)}. One use case is a special unknown-word token that is assigned ID 0. These IDs will have a sampling probability of 0.
\item \texttt{num\_shards}: A sampler can be used to sample from a subset of the original range in order to speed up the whole computation through parallelism. This parameter (together with \texttt{shard}) indicates the number of partitions that are being used in the overall computation.
\item \texttt{shard}: A sampler can be used to sample from a subset of the original range in order to speed up the whole computation through parallelism. This parameter (together with \texttt{num\_shards}) indicates the particular partition number of the operation, when partitioning is being used.
\item \texttt{unigrams}: A list of unigram counts or probabilities, one per ID in sequential order. Exactly one of \texttt{vocab\_file} and \texttt{unigrams} should be passed to this operation.
\item \texttt{seed}: An \texttt{int}. An operation-specific seed. Default is 0.
\item \texttt{name}: A name for the operation (optional).
\end{itemize}

\subparagraph{Returns: }\label{returns-28}

\begin{itemize}
\tightlist
\item \texttt{sampled\_candidates}: A tensor of type \texttt{int64} and shape \texttt{{[}num\_sampled{]}}. The sampled classes.
\item \texttt{true\_expected\_count}: A tensor of type \texttt{float}. Same shape as \texttt{true\_classes}. The expected counts under the sampling distribution of each of \texttt{true\_classes}.
\item \texttt{sampled\_expected\_count}: A tensor of type \texttt{float}. Same shape as \texttt{sampled\_candidates}. The expected counts under the sampling distribution of each of \texttt{sampled\_candidates}.
\end{itemize} \subsubsection{Miscellaneous candidate sampling utilities }\label{miscellaneous-candidate-sampling-utilities} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.nn.compute\_accidental\_hits(true\_classes,\ sampled\_candidates,\ num\_true,\ seed=None,\ name=None)} }{tf.nn.compute\_accidental\_hits(true\_classes, sampled\_candidates, num\_true, seed=None, name=None) }}\label{tf.nn.computeux5faccidentalux5fhitstrueux5fclasses-sampledux5fcandidates-numux5ftrue-seednone-namenone} Compute the ids of positions in sampled\_candidates matching true\_classes. In Candidate Sampling, this operation facilitates virtually removing sampled classes which happen to match target classes. This is done in Sampled Softmax and Sampled Logistic. See our \href{http://www.tensorflow.org/extras/candidate_sampling.pdf}{Candidate Sampling Algorithms Reference}. We presuppose that the \texttt{sampled\_candidates} are unique. We call it an `accidental hit' when one of the target classes matches one of the sampled classes. This operation reports accidental hits as triples \texttt{(index,\ id,\ weight)}, where \texttt{index} represents the row number in \texttt{true\_classes}, \texttt{id} represents the position in \texttt{sampled\_candidates}, and weight is \texttt{-FLOAT\_MAX}. The result of this op should be passed through a \texttt{sparse\_to\_dense} operation, then added to the logits of the sampled classes. This removes the contradictory effect of accidentally sampling the true target classes as noise classes for the same example. \subparagraph{Args: }\label{args-29} \begin{itemize} \tightlist \item \texttt{true\_classes}: A \texttt{Tensor} of type \texttt{int64} and shape \texttt{{[}batch\_size,\ num\_true{]}}. The target classes. \item \texttt{sampled\_candidates}: A tensor of type \texttt{int64} and shape \texttt{{[}num\_sampled{]}}. The sampled\_candidates output of CandidateSampler. 
\item \texttt{num\_true}: An \texttt{int}. The number of target classes per training example. \item \texttt{seed}: An \texttt{int}. An operation-specific seed. Default is 0. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-29} \begin{itemize} \tightlist \item \texttt{indices}: A \texttt{Tensor} of type \texttt{int32} and shape \texttt{{[}num\_accidental\_hits{]}}. Values indicate rows in \texttt{true\_classes}. \item \texttt{ids}: A \texttt{Tensor} of type \texttt{int64} and shape \texttt{{[}num\_accidental\_hits{]}}. Values indicate positions in \texttt{sampled\_candidates}. \item \texttt{weights}: A \texttt{Tensor} of type \texttt{float} and shape \texttt{{[}num\_accidental\_hits{]}}. Each value is \texttt{-FLOAT\_MAX}. \end{itemize} ================================================ FILE: tex_pdf/api/python/c4s12_client.tex ================================================ \section{Running Graphs }\label{running-graphs} \subsection{Contents}\label{contents} \subsubsection{\texorpdfstring{\protect\hyperlink{AUTOGENERATED-running-graphs}{Running Graphs}}{Running Graphs}}\label{running-graphs-1} \begin{itemize} \tightlist \item \protect\hyperlink{AUTOGENERATED-session-management}{Session management} \item \protect\hyperlink{Session}{\texttt{class\ tf.Session}} \item \protect\hyperlink{InteractiveSession}{\texttt{class\ tf.InteractiveSession}} \item \protect\hyperlink{getux5fdefaultux5fsession}{\texttt{tf.get\_default\_session()}} \item \protect\hyperlink{AUTOGENERATED-error-classes}{Error classes} \item \protect\hyperlink{OpError}{\texttt{class\ tf.OpError}} \item \protect\hyperlink{CancelledError}{\texttt{class\ tf.errors.CancelledError}} \item \protect\hyperlink{UnknownError}{\texttt{class\ tf.errors.UnknownError}} \item \protect\hyperlink{InvalidArgumentError}{\texttt{class\ tf.errors.InvalidArgumentError}} \item \protect\hyperlink{DeadlineExceededError}{\texttt{class\ tf.errors.DeadlineExceededError}} 
\item \protect\hyperlink{NotFoundError}{\texttt{class\ tf.errors.NotFoundError}} \item \protect\hyperlink{AlreadyExistsError}{\texttt{class\ tf.errors.AlreadyExistsError}} \item \protect\hyperlink{PermissionDeniedError}{\texttt{class\ tf.errors.PermissionDeniedError}} \item \protect\hyperlink{UnauthenticatedError}{\texttt{class\ tf.errors.UnauthenticatedError}} \item \protect\hyperlink{ResourceExhaustedError}{\texttt{class\ tf.errors.ResourceExhaustedError}} \item \protect\hyperlink{FailedPreconditionError}{\texttt{class\ tf.errors.FailedPreconditionError}} \item \protect\hyperlink{AbortedError}{\texttt{class\ tf.errors.AbortedError}} \item \protect\hyperlink{OutOfRangeError}{\texttt{class\ tf.errors.OutOfRangeError}} \item \protect\hyperlink{UnimplementedError}{\texttt{class\ tf.errors.UnimplementedError}} \item \protect\hyperlink{InternalError}{\texttt{class\ tf.errors.InternalError}} \item \protect\hyperlink{UnavailableError}{\texttt{class\ tf.errors.UnavailableError}} \item \protect\hyperlink{DataLossError}{\texttt{class\ tf.errors.DataLossError}} \end{itemize} This library contains classes for launching graphs and executing operations. The \href{../../get_started/introduction.md\#basic-usage}{basic usage} guide has examples of how a graph is launched in a \protect\hyperlink{Session}{\texttt{tf.Session}}. \subsection{Session management }\label{session-management} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{class \lstinline{tf.Session}} \label{class-tf.session} A class for running TensorFlow operations. A \texttt{Session} object encapsulates the environment in which \texttt{Operation} objects are executed, and \texttt{Tensor} objects are evaluated. 
For example:

\begin{Shaded}
\begin{Highlighting}[]
\ImportTok{import} \NormalTok{tensorflow }\ImportTok{as} \NormalTok{tf}

\CommentTok{# Build a graph.}
\NormalTok{a }\OperatorTok{=} \NormalTok{tf.constant(}\FloatTok{5.0}\NormalTok{)}
\NormalTok{b }\OperatorTok{=} \NormalTok{tf.constant(}\FloatTok{6.0}\NormalTok{)}
\NormalTok{c }\OperatorTok{=} \NormalTok{a }\OperatorTok{*} \NormalTok{b}

\CommentTok{# Launch the graph in a session.}
\NormalTok{sess }\OperatorTok{=} \NormalTok{tf.Session()}

\CommentTok{# Evaluate the tensor `c`.}
\BuiltInTok{print}\NormalTok{(sess.run(c))}
\end{Highlighting}
\end{Shaded}

A session may own resources, such as \href{../../api_docs/python/state_ops.md\#Variable}{variables}, \href{../../api_docs/python/io_ops.md\#QueueBase}{queues}, and \href{../../api_docs/python/io_ops.md\#ReaderBase}{readers}. It is important to release these resources when they are no longer required. To do this, either invoke the \protect\hyperlink{Session.close}{\texttt{close()}} method on the session, or use the session as a context manager. The following two examples are equivalent:

\begin{Shaded}
\begin{Highlighting}[]
\CommentTok{# Using the `close()` method.}
\NormalTok{sess }\OperatorTok{=} \NormalTok{tf.Session()}
\NormalTok{sess.run(...)}
\NormalTok{sess.close()}

\CommentTok{# Using the context manager.}
\ControlFlowTok{with} \NormalTok{tf.Session() }\ImportTok{as} \NormalTok{sess:}
  \NormalTok{sess.run(...)}
\end{Highlighting}
\end{Shaded}

The \href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/config.proto}{\texttt{ConfigProto}} protocol buffer exposes various configuration options for a session.
For example, to create a session that uses soft constraints for device placement and logs the resulting placement decisions, construct it as follows:

\begin{Shaded}
\begin{Highlighting}[]
\CommentTok{# Launch the graph in a session that allows soft device placement and}
\CommentTok{# logs the placement decisions.}
\NormalTok{sess }\OperatorTok{=} \NormalTok{tf.Session(config}\OperatorTok{=}\NormalTok{tf.ConfigProto(allow_soft_placement}\OperatorTok{=}\VariableTok{True}\NormalTok{,}
\NormalTok{                                        log_device_placement}\OperatorTok{=}\VariableTok{True}\NormalTok{))}
\end{Highlighting}
\end{Shaded}

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\paragraph{\texorpdfstring{\texttt{tf.Session.\_\_init\_\_(target=\textquotesingle{}\textquotesingle{},\ graph=None,\ config=None)} }{tf.Session.\_\_init\_\_(target='', graph=None, config=None) }}\label{tf.session.ux5fux5finitux5fux5ftarget-graphnone-confignone}

Creates a new TensorFlow session.

If no \texttt{graph} argument is specified when constructing the session, the default graph will be launched in the session. If you are using more than one graph (created with \texttt{tf.Graph()}) in the same process, you will have to use a different session for each graph, but each graph can be used in multiple sessions. In this case, it is often clearer to pass the graph to be launched explicitly to the session constructor.

\subparagraph{Args: }\label{args}

\begin{itemize}
\tightlist
\item \texttt{target}: (Optional.) The execution engine to connect to. Defaults to using an in-process engine. At present, no value other than the empty string is supported.
\item \texttt{graph}: (Optional.) The \texttt{Graph} to be launched (described above).
\item \texttt{config}: (Optional.) A \href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/config.proto}{\texttt{ConfigProto}} protocol buffer with configuration options for the session.
\end{itemize} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.Session.run(fetches,\ feed\_dict=None)} }{tf.Session.run(fetches, feed\_dict=None) }}\label{tf.session.run} Runs the operations and evaluates the tensors in \texttt{fetches}. This method runs one ``step'' of TensorFlow computation, by running the necessary graph fragment to execute every \texttt{Operation} and evaluate every \texttt{Tensor} in \texttt{fetches}, substituting the values in \texttt{feed\_dict} for the corresponding input values. The \texttt{fetches} argument may be a list of graph elements or a single graph element, and these determine the return value of this method. A graph element can be one of the following types: \begin{itemize} \tightlist \item If the \emph{i}th element of \texttt{fetches} is an \href{../../api_docs/python/framework.md\#Operation}{\texttt{Operation}}, the \emph{i}th return value will be \texttt{None}. \item If the \emph{i}th element of \texttt{fetches} is a \href{../../api_docs/python/framework.md\#Tensor}{\texttt{Tensor}}, the \emph{i}th return value will be a numpy ndarray containing the value of that tensor. \item If the \emph{i}th element of \texttt{fetches} is a \href{../../api_docs/python/sparse_ops.md\#SparseTensor}{\texttt{SparseTensor}}, the \emph{i}th return value will be a \href{../../api_docs/python/sparse_ops.md\#SparseTensorValue}{\texttt{SparseTensorValue}} containing the value of that sparse tensor. \end{itemize} The optional \texttt{feed\_dict} argument allows the caller to override the value of tensors in the graph. Each key in \texttt{feed\_dict} can be one of the following types: \begin{itemize} \tightlist \item If the key is a \href{../../api_docs/python/framework.md\#Tensor}{\texttt{Tensor}}, the value may be a Python scalar, string, list, or numpy ndarray that can be converted to the same \texttt{dtype} as that tensor. 
Additionally, if the key is a \href{../../api_docs/python/io_ops.md\#placeholder}{placeholder}, the shape of the value will be checked for compatibility with the placeholder. \item If the key is a \href{../../api_docs/python/sparse_ops.md\#SparseTensor}{\texttt{SparseTensor}}, the value should be a \href{../../api_docs/python/sparse_ops.md\#SparseTensorValue}{\texttt{SparseTensorValue}}. \end{itemize} \subparagraph{Args: }\label{args-1} \begin{itemize} \tightlist \item \texttt{fetches}: A single graph element, or a list of graph elements (described above). \item \texttt{feed\_dict}: A dictionary that maps graph elements to values (described above). \end{itemize} \subparagraph{Returns: }\label{returns} Either a single value if \texttt{fetches} is a single graph element, or a list of values if \texttt{fetches} is a list (described above). \subparagraph{Raises: }\label{raises} \begin{itemize} \tightlist \item \texttt{RuntimeError}: If this \texttt{Session} is in an invalid state (e.g.~has been closed). \item \texttt{TypeError}: If \texttt{fetches} or \texttt{feed\_dict} keys are of an inappropriate type. \item \texttt{ValueError}: If \texttt{fetches} or \texttt{feed\_dict} keys are invalid or refer to a \texttt{Tensor} that doesn't exist. \end{itemize} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.Session.close()} }{tf.Session.close() }}\label{tf.session.close} Closes this session. Calling this method frees all resources associated with the session. \subparagraph{Raises: }\label{raises-1} \begin{itemize} \tightlist \item \texttt{RuntimeError}: If an error occurs while closing the session. \end{itemize} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.Session.graph} }{tf.Session.graph }}\label{tf.session.graph} The graph that was launched in this session. 
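
The return-value rules for \texttt{tf.Session.run()} described above can be sketched without TensorFlow. The following is a toy model only, not the library's implementation: \texttt{ToyOperation}, \texttt{ToyTensor} and \texttt{toy\_run} are hypothetical names introduced for illustration. It shows that a single fetch yields a single value, a list of fetches yields a list, an operation contributes \texttt{None}, and a fed value overrides the value the graph would otherwise compute.

```python
class ToyOperation(object):
    """Hypothetical stand-in for an Operation: running it produces no value."""


class ToyTensor(object):
    """Hypothetical stand-in for a Tensor computed from other toy tensors."""

    def __init__(self, fn, inputs=()):
        self.fn = fn          # function applied to the input values
        self.inputs = inputs  # upstream ToyTensor objects


def _evaluate(tensor, feed_dict):
    # A fed value overrides whatever the graph would have computed.
    if tensor in feed_dict:
        return feed_dict[tensor]
    args = [_evaluate(t, feed_dict) for t in tensor.inputs]
    return tensor.fn(*args)


def toy_run(fetches, feed_dict=None):
    """Mimic the return-value rules described for Session.run() (toy only)."""
    feed_dict = feed_dict or {}

    def fetch_one(element):
        if isinstance(element, ToyOperation):
            return None
        return _evaluate(element, feed_dict)

    if isinstance(fetches, list):
        return [fetch_one(element) for element in fetches]
    return fetch_one(fetches)


x = ToyTensor(lambda: 1.0)
y = ToyTensor(lambda v: v * 2.0, inputs=(x,))
init = ToyOperation()

single = toy_run(y)                          # single element in, single value out
several = toy_run([y, init])                 # list in, list out; the op yields None
overridden = toy_run(y, feed_dict={x: 5.0})  # x is replaced by the fed value
```

In the real API the fetched values are numpy ndarrays rather than Python floats; the toy keeps plain floats so that the shape of the result is the only thing on display.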
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\paragraph{\texorpdfstring{\texttt{tf.Session.as\_default()} }{tf.Session.as\_default() }}\label{tf.session.asux5fdefault}

Returns a context manager that makes this object the default session.

Use with the \texttt{with} keyword to specify that calls to \href{../../api_docs/python/framework.md\#Operation.run}{\texttt{Operation.run()}} or \href{../../api_docs/python/framework.md\#Tensor.eval}{\texttt{Tensor.eval()}} should be executed in this session.

\begin{Shaded}
\begin{Highlighting}[]
\NormalTok{c }\OperatorTok{=} \NormalTok{tf.constant(..)}
\NormalTok{sess }\OperatorTok{=} \NormalTok{tf.Session()}

\ControlFlowTok{with} \NormalTok{sess.as_default():}
  \ControlFlowTok{assert} \NormalTok{tf.get_default_session() }\OperatorTok{is} \NormalTok{sess}
  \BuiltInTok{print} \NormalTok{c.}\BuiltInTok{eval}\NormalTok{()}
\end{Highlighting}
\end{Shaded}

To get the current default session, use \protect\hyperlink{getux5fdefaultux5fsession}{\texttt{tf.get\_default\_session()}}.

\emph{N.B.} The \texttt{as\_default} context manager \emph{does not} close the session when you exit the context, and you must close the session explicitly.

\begin{Shaded}
\begin{Highlighting}[]
\NormalTok{c }\OperatorTok{=} \NormalTok{tf.constant(...)}
\NormalTok{sess }\OperatorTok{=} \NormalTok{tf.Session()}
\ControlFlowTok{with} \NormalTok{sess.as_default():}
  \BuiltInTok{print} \NormalTok{c.}\BuiltInTok{eval}\NormalTok{()}
\CommentTok{# ...}
\ControlFlowTok{with} \NormalTok{sess.as_default():}
  \BuiltInTok{print} \NormalTok{c.}\BuiltInTok{eval}\NormalTok{()}

\NormalTok{sess.close()}
\end{Highlighting}
\end{Shaded}

Alternatively, you can use \texttt{with\ tf.Session():} to create a session that is automatically closed on exiting the context, including when an uncaught exception is raised.

\emph{N.B.} The default session is a property of the current thread.
If you create a new thread, and wish to use the default session in that thread, you must explicitly add a \texttt{with\ sess.as\_default():} in that thread's function. \subparagraph{Returns: }\label{returns-1} A context manager using this session as the default session. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{class\ tf.InteractiveSession} }{class tf.InteractiveSession }}\label{class-tf.interactivesession} A TensorFlow \texttt{Session} for use in interactive contexts, such as a shell. The only difference with a regular \texttt{Session} is that an \texttt{InteractiveSession} installs itself as the default session on construction. The methods \href{../../api_docs/python/framework.md\#Tensor.eval}{\texttt{Tensor.eval()}} and \href{../../api_docs/python/framework.md\#Operation.run}{\texttt{Operation.run()}} will use that session to run ops. This is convenient in interactive shells and \href{http://ipython.org}{IPython notebooks}, as it avoids having to pass an explicit \texttt{Session} object to run ops. For example: \begin{Shaded} \begin{Highlighting}[] \NormalTok{sess }\OperatorTok{=} \NormalTok{tf.InteractiveSession()} \NormalTok{a }\OperatorTok{=} \NormalTok{tf.constant(}\FloatTok{5.0}\NormalTok{)} \NormalTok{b }\OperatorTok{=} \NormalTok{tf.constant(}\FloatTok{6.0}\NormalTok{)} \NormalTok{c }\OperatorTok{=} \NormalTok{a }\OperatorTok{*} \NormalTok{b} \CommentTok{# We can just use 'c.eval()' without passing 'sess'} \BuiltInTok{print} \NormalTok{c.}\BuiltInTok{eval}\NormalTok{()} \NormalTok{sess.close()} \end{Highlighting} \end{Shaded} Note that a regular session installs itself as the default session when it is created in a \texttt{with} statement. 
The common usage in non-interactive programs is to follow that pattern:

\begin{Shaded}
\begin{Highlighting}[]
\NormalTok{a }\OperatorTok{=} \NormalTok{tf.constant(}\FloatTok{5.0}\NormalTok{)}
\NormalTok{b }\OperatorTok{=} \NormalTok{tf.constant(}\FloatTok{6.0}\NormalTok{)}
\NormalTok{c }\OperatorTok{=} \NormalTok{a }\OperatorTok{*} \NormalTok{b}
\ControlFlowTok{with} \NormalTok{tf.Session():}
  \CommentTok{# We can also use 'c.eval()' here.}
  \BuiltInTok{print} \NormalTok{c.}\BuiltInTok{eval}\NormalTok{()}
\end{Highlighting}
\end{Shaded}

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\paragraph{\texorpdfstring{\texttt{tf.InteractiveSession.\_\_init\_\_(target=\textquotesingle{}\textquotesingle{},\ graph=None)} }{tf.InteractiveSession.\_\_init\_\_(target='', graph=None) }}\label{tf.interactivesession.ux5fux5finitux5fux5ftarget-graphnone}

Creates a new interactive TensorFlow session.

If no \texttt{graph} argument is specified when constructing the session, the default graph will be launched in the session. If you are using more than one graph (created with \texttt{tf.Graph()}) in the same process, you will have to use different sessions for each graph, but each graph can be used in multiple sessions. In this case, it is often clearer to pass the graph to be launched explicitly to the session constructor.

\subparagraph{Args: }\label{args-2}

\begin{itemize}
\tightlist
\item
  \texttt{target}: (Optional.) The execution engine to connect to. Defaults to using an in-process engine. At present, no value other than the empty string is supported.
\item
  \texttt{graph}: (Optional.) The \texttt{Graph} to be launched (described above).
\end{itemize}

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\paragraph{\texorpdfstring{\texttt{tf.InteractiveSession.close()} }{tf.InteractiveSession.close() }}\label{tf.interactivesession.close}

Closes an \texttt{InteractiveSession}.
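
The idea of a default session that belongs to one thread can be illustrated with a small pure-Python sketch. This is a toy model under stated assumptions, not TensorFlow's implementation: \texttt{get\_default}, \texttt{as\_default} and \texttt{default\_in\_new\_thread} are hypothetical helpers, and the stack is kept in a \texttt{threading.local} object so that each thread sees only its own entries.

```python
import contextlib
import threading

_state = threading.local()  # each thread gets its own attribute namespace


def _stack():
    # Lazily create the per-thread stack of default "sessions".
    if not hasattr(_state, "stack"):
        _state.stack = []
    return _state.stack


def get_default():
    """Return the innermost default for the calling thread, or None."""
    stack = _stack()
    return stack[-1] if stack else None


@contextlib.contextmanager
def as_default(session):
    """Make `session` the default for the calling thread inside the block."""
    _stack().append(session)
    try:
        yield session
    finally:
        _stack().pop()  # leaving the block restores the previous default


def default_in_new_thread(session=None):
    """Report the default seen by a fresh thread, optionally re-entering it."""
    seen = []

    def worker():
        if session is None:
            seen.append(get_default())
        else:
            with as_default(session):
                seen.append(get_default())

    thread = threading.Thread(target=worker)
    thread.start()
    thread.join()
    return seen[0]


sess = object()  # placeholder for a session object
with as_default(sess):
    in_main = get_default()
    in_plain_thread = default_in_new_thread()
    in_reentered_thread = default_in_new_thread(sess)
after_exit = get_default()
```

The plain worker thread sees no default at all, while the worker that enters the context itself sees the session, which mirrors the advice to add \texttt{with\ sess.as\_default():} inside the thread's function.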
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.get\_default\_session()} }{tf.get\_default\_session() }}\label{tf.getux5fdefaultux5fsession} Returns the default session for the current thread. The returned \texttt{Session} will be the innermost session on which a \texttt{Session} or \texttt{Session.as\_default()} context has been entered. \emph{N.B.} The default session is a property of the current thread. If you create a new thread, and wish to use the default session in that thread, you must explicitly add a \texttt{with\ sess.as\_default():} in that thread's function. \subparagraph{Returns: }\label{returns-2} The default \texttt{Session} being used in the current thread. \subsection{Error classes }\label{error-classes} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{class\ tf.OpError} }{class tf.OpError }}\label{class-tf.operror} A generic error that is raised when TensorFlow execution fails. Whenever possible, the session will raise a more specific subclass of \texttt{OpError} from the \texttt{tf.errors} module. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.OpError.op} }{tf.OpError.op }}\label{tf.operror.op} The operation that failed, if known. \emph{N.B.} If the failed op was synthesized at runtime, e.g.~a \texttt{Send} or \texttt{Recv} op, there will be no corresponding \href{../../api_docs/python/framework.md\#Operation}{\texttt{Operation}} object. In that case, this will return \texttt{None}, and you should instead use the \protect\hyperlink{OpError.nodeux5fdef}{\texttt{OpError.node\_def}} to discover information about the op. \subparagraph{Returns: }\label{returns-3} The \texttt{Operation} that failed, or None. 
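
The fallback from \texttt{op} to \texttt{node\_def} described in the note above might be handled as in the following sketch. It does not use TensorFlow: \texttt{ToyOpError} is a hypothetical stand-in that only carries the documented attributes, \texttt{ToyNode} is a hypothetical placeholder for both an \texttt{Operation} and a \texttt{NodeDef}, and the op names are made up for illustration.

```python
import collections

# Hypothetical placeholder: only a `name` field is modelled.
ToyNode = collections.namedtuple("ToyNode", ["name"])


class ToyOpError(Exception):
    """Hypothetical stand-in carrying the attributes documented for OpError."""

    def __init__(self, node_def, op, message, error_code):
        super(ToyOpError, self).__init__(message)
        self.node_def = node_def
        self.op = op
        self.message = message
        self.error_code = error_code


def failed_op_name(error):
    """Name the failed op, preferring `op` and falling back to `node_def`."""
    if error.op is not None:
        return error.op.name
    # Ops synthesized at runtime have no Operation object, only a NodeDef.
    return error.node_def.name


graph_op_error = ToyOpError(ToyNode("MatMul"), ToyNode("MatMul"), "failed", 2)
runtime_op_error = ToyOpError(ToyNode("Recv"), None, "failed", 2)

name_from_op = failed_op_name(graph_op_error)
name_from_node_def = failed_op_name(runtime_op_error)
```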
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\paragraph{\texorpdfstring{\texttt{tf.OpError.node\_def} }{tf.OpError.node\_def }}\label{tf.operror.nodeux5fdef}

The \texttt{NodeDef} proto representing the op that failed.

\paragraph{Other Methods }\label{other-methods}

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\paragraph{\texorpdfstring{\texttt{tf.OpError.\_\_init\_\_(node\_def,\ op,\ message,\ error\_code)} }{tf.OpError.\_\_init\_\_(node\_def, op, message, error\_code) }}\label{tf.operror.ux5fux5finitux5fux5fnodeux5fdef-op-message-errorux5fcode}

Creates a new OpError indicating that a particular op failed.

\subparagraph{Args: }\label{args-3}

\begin{itemize}
\tightlist
\item
  \texttt{node\_def}: The graph\_pb2.NodeDef proto representing the op that failed.
\item
  \texttt{op}: The ops.Operation that failed, if known; otherwise None.
\item
  \texttt{message}: The message string describing the failure.
\item
  \texttt{error\_code}: The error\_codes\_pb2.Code describing the error.
\end{itemize}

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\paragraph{\texorpdfstring{\texttt{tf.OpError.error\_code} }{tf.OpError.error\_code }}\label{tf.operror.errorux5fcode}

The integer error code that describes the error.

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\paragraph{\texorpdfstring{\texttt{tf.OpError.message} }{tf.OpError.message }}\label{tf.operror.message}

The error message that describes the error.

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\subsubsection{\texorpdfstring{\texttt{class\ tf.errors.CancelledError} }{class tf.errors.CancelledError }}\label{class-tf.errors.cancellederror}

Raised when an operation or step is cancelled.

For example, a long-running operation (e.g. \href{../../api_docs/python/io_ops.md\#QueueBase.enqueue}{\texttt{queue.enqueue()}}) may be cancelled by running another operation (e.g.
\href{../../api_docs/python/io_ops.md\#QueueBase.close}{\texttt{queue.close(cancel\_pending\_enqueues=True)}}), or by \href{../../api_docs/python/client.md\#Session.close}{closing the session}. A step that is running such a long-running operation will fail by raising \texttt{CancelledError}.

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\paragraph{\texorpdfstring{\texttt{tf.errors.CancelledError.\_\_init\_\_(node\_def,\ op,\ message)} }{tf.errors.CancelledError.\_\_init\_\_(node\_def, op, message) }}\label{tf.errors.cancellederror.ux5fux5finitux5fux5fnodeux5fdef-op-message}

Creates a \texttt{CancelledError}.

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\subsubsection{\texorpdfstring{\texttt{class\ tf.errors.UnknownError} }{class tf.errors.UnknownError }}\label{class-tf.errors.unknownerror}

Unknown error.

An example of where this error may be returned is if a Status value received from another address space belongs to an error-space that is not known to this address space. Also errors raised by APIs that do not return enough error information may be converted to this error.

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\paragraph{\texorpdfstring{\texttt{tf.errors.UnknownError.\_\_init\_\_(node\_def,\ op,\ message,\ error\_code=2)} }{tf.errors.UnknownError.\_\_init\_\_(node\_def, op, message, error\_code=2) }}\label{tf.errors.unknownerror.ux5fux5finitux5fux5fnodeux5fdef-op-message-errorux5fcode2}

Creates an \texttt{UnknownError}.

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\subsubsection{\texorpdfstring{\texttt{class\ tf.errors.InvalidArgumentError} }{class tf.errors.InvalidArgumentError }}\label{class-tf.errors.invalidargumenterror}

Raised when an operation receives an invalid argument.

This may occur, for example, if an operation receives an input tensor that has an invalid value or shape.
For example, the \href{../../api_docs/python/math_ops.md\#matmul}{\texttt{tf.matmul()}} op will raise this error if it receives an input that is not a matrix, and the \href{../../api_docs/python/array_ops.md\#reshape}{\texttt{tf.reshape()}} op will raise this error if the new shape does not match the number of elements in the input tensor. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.errors.InvalidArgumentError.\_\_init\_\_(node\_def,\ op,\ message)} }{tf.errors.InvalidArgumentError.\_\_init\_\_(node\_def, op, message) }}\label{tf.errors.invalidargumenterror.ux5fux5finitux5fux5fnodeux5fdef-op-message} Creates an \texttt{InvalidArgumentError}. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{class\ tf.errors.DeadlineExceededError} }{class tf.errors.DeadlineExceededError }}\label{class-tf.errors.deadlineexceedederror} Raised when a deadline expires before an operation could complete. This exception is not currently used. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.errors.DeadlineExceededError.\_\_init\_\_(node\_def,\ op,\ message)} }{tf.errors.DeadlineExceededError.\_\_init\_\_(node\_def, op, message) }}\label{tf.errors.deadlineexceedederror.ux5fux5finitux5fux5fnodeux5fdef-op-message} Creates a \texttt{DeadlineExceededError}. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{class\ tf.errors.NotFoundError} }{class tf.errors.NotFoundError }}\label{class-tf.errors.notfounderror} Raised when a requested entity (e.g., a file or directory) was not found. For example, running the \href{../../api_docs/python/io_ops.md\#WholeFileReader}{\texttt{tf.WholeFileReader.read()}} operation could raise \texttt{NotFoundError} if it receives the name of a file that does not exist. 
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.errors.NotFoundError.\_\_init\_\_(node\_def,\ op,\ message)} }{tf.errors.NotFoundError.\_\_init\_\_(node\_def, op, message) }}\label{tf.errors.notfounderror.ux5fux5finitux5fux5fnodeux5fdef-op-message} Creates a \texttt{NotFoundError}. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{class\ tf.errors.AlreadyExistsError} }{class tf.errors.AlreadyExistsError }}\label{class-tf.errors.alreadyexistserror} Raised when an entity that we attempted to create already exists. For example, running an operation that saves a file (e.g. \href{../../api_docs/python/train.md\#Saver.save}{\texttt{tf.train.Saver.save()}}) could potentially raise this exception if an explicit filename for an existing file was passed. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.errors.AlreadyExistsError.\_\_init\_\_(node\_def,\ op,\ message)} }{tf.errors.AlreadyExistsError.\_\_init\_\_(node\_def, op, message) }}\label{tf.errors.alreadyexistserror.ux5fux5finitux5fux5fnodeux5fdef-op-message} Creates an \texttt{AlreadyExistsError}. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{class\ tf.errors.PermissionDeniedError} }{class tf.errors.PermissionDeniedError }}\label{class-tf.errors.permissiondeniederror} Raised when the caller does not have permission to run an operation. For example, running the \href{../../api_docs/python/io_ops.md\#WholeFileReader}{\texttt{tf.WholeFileReader.read()}} operation could raise \texttt{PermissionDeniedError} if it receives the name of a file for which the user does not have the read file permission. 
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.errors.PermissionDeniedError.\_\_init\_\_(node\_def,\ op,\ message)} }{tf.errors.PermissionDeniedError.\_\_init\_\_(node\_def, op, message) }}\label{tf.errors.permissiondeniederror.ux5fux5finitux5fux5fnodeux5fdef-op-message} Creates a \texttt{PermissionDeniedError}. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{class\ tf.errors.UnauthenticatedError} }{class tf.errors.UnauthenticatedError }}\label{class-tf.errors.unauthenticatederror} The request does not have valid authentication credentials. This exception is not currently used. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.errors.UnauthenticatedError.\_\_init\_\_(node\_def,\ op,\ message)} }{tf.errors.UnauthenticatedError.\_\_init\_\_(node\_def, op, message) }}\label{tf.errors.unauthenticatederror.ux5fux5finitux5fux5fnodeux5fdef-op-message} Creates an \texttt{UnauthenticatedError}. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{class\ tf.errors.ResourceExhaustedError} }{class tf.errors.ResourceExhaustedError }}\label{class-tf.errors.resourceexhaustederror} Some resource has been exhausted. For example, this error might be raised if a per-user quota is exhausted, or perhaps the entire file system is out of space. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.errors.ResourceExhaustedError.\_\_init\_\_(node\_def,\ op,\ message)} }{tf.errors.ResourceExhaustedError.\_\_init\_\_(node\_def, op, message) }}\label{tf.errors.resourceexhaustederror.ux5fux5finitux5fux5fnodeux5fdef-op-message} Creates a \texttt{ResourceExhaustedError}. 
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{class\ tf.errors.FailedPreconditionError} }{class tf.errors.FailedPreconditionError }}\label{class-tf.errors.failedpreconditionerror} Operation was rejected because the system is not in a state to execute it. This exception is most commonly raised when running an operation that reads a \href{../../api_docs/python/state_ops.md\#Variable}{\texttt{tf.Variable}} before it has been initialized. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.errors.FailedPreconditionError.\_\_init\_\_(node\_def,\ op,\ message)} }{tf.errors.FailedPreconditionError.\_\_init\_\_(node\_def, op, message) }}\label{tf.errors.failedpreconditionerror.ux5fux5finitux5fux5fnodeux5fdef-op-message} Creates a \texttt{FailedPreconditionError}. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{class\ tf.errors.AbortedError} }{class tf.errors.AbortedError }}\label{class-tf.errors.abortederror} The operation was aborted, typically due to a concurrent action. For example, running a \href{../../api_docs/python/io_ops.md\#QueueBase.enqueue}{\texttt{queue.enqueue()}} operation may raise \texttt{AbortedError} if a \href{../../api_docs/python/io_ops.md\#QueueBase.close}{\texttt{queue.close()}} operation previously ran. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.errors.AbortedError.\_\_init\_\_(node\_def,\ op,\ message)} }{tf.errors.AbortedError.\_\_init\_\_(node\_def, op, message) }}\label{tf.errors.abortederror.ux5fux5finitux5fux5fnodeux5fdef-op-message} Creates an \texttt{AbortedError}. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{class\ tf.errors.OutOfRangeError} }{class tf.errors.OutOfRangeError }}\label{class-tf.errors.outofrangeerror} Raised when an operation executed past the valid range. 
This exception is raised in ``end-of-file'' conditions, such as when a \href{../../api_docs/python/io_ops.md\#QueueBase.dequeue}{\texttt{queue.dequeue()}} operation is blocked on an empty queue, and a \href{../../api_docs/python/io_ops.md\#QueueBase.close}{\texttt{queue.close()}} operation executes.

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\paragraph{\texorpdfstring{\texttt{tf.errors.OutOfRangeError.\_\_init\_\_(node\_def,\ op,\ message)} }{tf.errors.OutOfRangeError.\_\_init\_\_(node\_def, op, message) }}\label{tf.errors.outofrangeerror.ux5fux5finitux5fux5fnodeux5fdef-op-message}

Creates an \texttt{OutOfRangeError}.

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\subsubsection{\texorpdfstring{\texttt{class\ tf.errors.UnimplementedError} }{class tf.errors.UnimplementedError }}\label{class-tf.errors.unimplementederror}

Raised when an operation has not been implemented.

Some operations may raise this error when passed otherwise-valid arguments that they do not currently support. For example, running the \href{../../api_docs/python/nn.md\#max_pool}{\texttt{tf.nn.max\_pool()}} operation would raise this error if pooling was requested on the batch dimension, because this is not yet supported.

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\paragraph{\texorpdfstring{\texttt{tf.errors.UnimplementedError.\_\_init\_\_(node\_def,\ op,\ message)} }{tf.errors.UnimplementedError.\_\_init\_\_(node\_def, op, message) }}\label{tf.errors.unimplementederror.ux5fux5finitux5fux5fnodeux5fdef-op-message}

Creates an \texttt{UnimplementedError}.

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\subsubsection{\texorpdfstring{\texttt{class\ tf.errors.InternalError} }{class tf.errors.InternalError }}\label{class-tf.errors.internalerror}

Raised when the system experiences an internal error.

This exception is raised when some invariant expected by the runtime has been broken. Catching this exception is not recommended.
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.errors.InternalError.\_\_init\_\_(node\_def,\ op,\ message)} }{tf.errors.InternalError.\_\_init\_\_(node\_def, op, message) }}\label{tf.errors.internalerror.ux5fux5finitux5fux5fnodeux5fdef-op-message} Creates an \texttt{InternalError}. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{class\ tf.errors.UnavailableError} }{class tf.errors.UnavailableError }}\label{class-tf.errors.unavailableerror} Raised when the runtime is currently unavailable. This exception is not currently used. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.errors.UnavailableError.\_\_init\_\_(node\_def,\ op,\ message)} }{tf.errors.UnavailableError.\_\_init\_\_(node\_def, op, message) }}\label{tf.errors.unavailableerror.ux5fux5finitux5fux5fnodeux5fdef-op-message} Creates an \texttt{UnavailableError}. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{class\ tf.errors.DataLossError} }{class tf.errors.DataLossError }}\label{class-tf.errors.datalosserror} Raised when unrecoverable data loss or corruption is encountered. For example, this may be raised by running a \href{../../api_docs/python/io_ops.md\#WholeFileReader}{\texttt{tf.WholeFileReader.read()}} operation, if the file is truncated while it is being read. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.errors.DataLossError.\_\_init\_\_(node\_def,\ op,\ message)} }{tf.errors.DataLossError.\_\_init\_\_(node\_def, op, message) }}\label{tf.errors.datalosserror.ux5fux5finitux5fux5fnodeux5fdef-op-message} Creates a \texttt{DataLossError}. 
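
Because the specific error classes above are described as subclasses of \texttt{OpError}, a handler for a specific class has to come before a handler for the base class. The sketch below illustrates that ordering for the ``end-of-file'' case described under \texttt{OutOfRangeError}. It is a toy model that does not use TensorFlow: \texttt{ToyOpError}, \texttt{ToyOutOfRangeError}, \texttt{ToyQueue} and \texttt{drain} are hypothetical names, and, as a deliberate simplification, the toy queue raises the generic error where a real queue would block.

```python
class ToyOpError(Exception):
    """Hypothetical stand-in for the generic error base class."""


class ToyOutOfRangeError(ToyOpError):
    """Hypothetical stand-in for the end-of-input error subclass."""


class ToyQueue(object):
    """Hypothetical queue; only the closed-and-empty condition is modelled."""

    def __init__(self, items):
        self._items = list(items)
        self._closed = False

    def close(self):
        self._closed = True

    def dequeue(self):
        if self._items:
            return self._items.pop(0)
        if self._closed:
            raise ToyOutOfRangeError("queue is closed and empty")
        # Simplification: a real dequeue would block here instead of failing.
        raise ToyOpError("toy queue is empty but not closed")


def drain(queue):
    """Dequeue until an error occurs and report how the loop ended."""
    results = []
    try:
        while True:
            results.append(queue.dequeue())
    except ToyOutOfRangeError:
        # Listed first: the expected end-of-input signal.
        status = "finished"
    except ToyOpError:
        # Listed second: any other failure from the same hierarchy.
        status = "failed"
    return results, status


closed_queue = ToyQueue([1, 2, 3])
closed_queue.close()
finished = drain(closed_queue)

open_queue = ToyQueue([4])
failed = drain(open_queue)
```

If the two \texttt{except} clauses were swapped, the base-class handler would catch the end-of-input case as well and the loop would always report a failure.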
================================================ FILE: tex_pdf/api/python/c4s13_train.tex ================================================ \section{Training }\label{training} \subsection{Contents}\label{contents} \subsubsection{\texorpdfstring{\protect\hyperlink{AUTOGENERATED-training}{Training}}{Training}}\label{training-1} \begin{itemize} \tightlist \item \protect\hyperlink{AUTOGENERATED-optimizers}{Optimizers} \item \protect\hyperlink{Optimizer}{\texttt{class\ tf.train.Optimizer}} \item \protect\hyperlink{AUTOGENERATED-usage}{Usage} \item \protect\hyperlink{AUTOGENERATED-processing-gradients-before-applying-them.}{Processing gradients before applying them.} \item \protect\hyperlink{AUTOGENERATED-gating-gradients}{Gating Gradients} \item \protect\hyperlink{AUTOGENERATED-slots}{Slots} \item \protect\hyperlink{GradientDescentOptimizer}{\texttt{class\ tf.train.GradientDescentOptimizer}} \item \protect\hyperlink{AdagradOptimizer}{\texttt{class\ tf.train.AdagradOptimizer}} \item \protect\hyperlink{MomentumOptimizer}{\texttt{class\ tf.train.MomentumOptimizer}} \item \protect\hyperlink{AdamOptimizer}{\texttt{class\ tf.train.AdamOptimizer}} \item \protect\hyperlink{FtrlOptimizer}{\texttt{class\ tf.train.FtrlOptimizer}} \item \protect\hyperlink{RMSPropOptimizer}{\texttt{class\ tf.train.RMSPropOptimizer}} \item \protect\hyperlink{AUTOGENERATED-gradient-computation}{Gradient Computation} \item \protect\hyperlink{gradients}{\texttt{tf.gradients(ys,\ xs,\ grad\_ys=None,\ name=\textquotesingle{}gradients\textquotesingle{},\ colocate\_gradients\_with\_ops=False,\ gate\_gradients=False,\ aggregation\_method=None)}} \item \protect\hyperlink{AggregationMethod}{\texttt{class\ tf.AggregationMethod}} \item \protect\hyperlink{stopux5fgradient}{\texttt{tf.stop\_gradient(input,\ name=None)}} \item \protect\hyperlink{AUTOGENERATED-gradient-clipping}{Gradient Clipping} \item \protect\hyperlink{clipux5fbyux5fvalue}{\texttt{tf.clip\_by\_value(t,\ clip\_value\_min,\ clip\_value\_max,\ 
name=None)}} \item \protect\hyperlink{clipux5fbyux5fnorm}{\texttt{tf.clip\_by\_norm(t,\ clip\_norm,\ name=None)}} \item \protect\hyperlink{clipux5fbyux5faverageux5fnorm}{\texttt{tf.clip\_by\_average\_norm(t,\ clip\_norm,\ name=None)}} \item \protect\hyperlink{clipux5fbyux5fglobalux5fnorm}{\texttt{tf.clip\_by\_global\_norm(t\_list,\ clip\_norm,\ use\_norm=None,\ name=None)}} \item \protect\hyperlink{globalux5fnorm}{\texttt{tf.global\_norm(t\_list,\ name=None)}} \item \protect\hyperlink{AUTOGENERATED-decaying-the-learning-rate}{Decaying the learning rate} \item \protect\hyperlink{exponentialux5fdecay}{\texttt{tf.train.exponential\_decay(learning\_rate,\ global\_step,\ decay\_steps,\ decay\_rate,\ staircase=False,\ name=None)}} \item \protect\hyperlink{AUTOGENERATED-moving-averages}{Moving Averages} \item \protect\hyperlink{ExponentialMovingAverage}{\texttt{class\ tf.train.ExponentialMovingAverage}} \item \protect\hyperlink{AUTOGENERATED-coordinator-and-queuerunner}{Coordinator and QueueRunner} \item \protect\hyperlink{Coordinator}{\texttt{class\ tf.train.Coordinator}} \item \protect\hyperlink{QueueRunner}{\texttt{class\ tf.train.QueueRunner}} \item \protect\hyperlink{addux5fqueueux5frunner}{\texttt{tf.train.add\_queue\_runner(qr,\ collection=\textquotesingle{}queue\_runners\textquotesingle{})}} \item \protect\hyperlink{startux5fqueueux5frunners}{\texttt{tf.train.start\_queue\_runners(sess=None,\ coord=None,\ daemon=True,\ start=True,\ collection=\textquotesingle{}queue\_runners\textquotesingle{})}} \item \protect\hyperlink{AUTOGENERATED-summary-operations}{Summary Operations} \item \protect\hyperlink{scalarux5fsummary}{\texttt{tf.scalar\_summary(tags,\ values,\ collections=None,\ name=None)}} \item \protect\hyperlink{imageux5fsummary}{\texttt{tf.image\_summary(tag,\ tensor,\ max\_images=None,\ collections=None,\ name=None)}} \item \protect\hyperlink{histogramux5fsummary}{\texttt{tf.histogram\_summary(tag,\ values,\ collections=None,\ name=None)}} \item 
  \protect\hyperlink{zeroux5ffraction}{\texttt{tf.nn.zero\_fraction(value,\ name=None)}}
\item
  \protect\hyperlink{mergeux5fsummary}{\texttt{tf.merge\_summary(inputs,\ collections=None,\ name=None)}}
\item
  \protect\hyperlink{mergeux5fallux5fsummaries}{\texttt{tf.merge\_all\_summaries(key=\textquotesingle{}summaries\textquotesingle{})}}
\item
  \protect\hyperlink{AUTOGENERATED-adding-summaries-to-event-files}{Adding Summaries to Event Files}
\item
  \protect\hyperlink{SummaryWriter}{\texttt{class\ tf.train.SummaryWriter}}
\item
  \protect\hyperlink{summaryux5fiterator}{\texttt{tf.train.summary\_iterator(path)}}
\item
  \protect\hyperlink{AUTOGENERATED-training-utilities}{Training utilities}
\item
  \protect\hyperlink{globalux5fstep}{\texttt{tf.train.global\_step(sess,\ global\_step\_tensor)}}
\item
  \protect\hyperlink{writeux5fgraph}{\texttt{tf.train.write\_graph(graph\_def,\ logdir,\ name,\ as\_text=True)}}
\end{itemize}

This library provides a set of classes and functions that help train models.

\subsection{Optimizers }\label{optimizers}

The Optimizer base class provides methods to compute gradients for a loss and apply gradients to variables. A collection of subclasses implement classic optimization algorithms such as GradientDescent and Adagrad.

You never instantiate the Optimizer class itself, but instead instantiate one of the subclasses.

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\subsubsection{\texorpdfstring{\texttt{class\ tf.train.Optimizer} }{class tf.train.Optimizer }}\label{class-tf.train.optimizer}

Base class for optimizers.

This class defines the API to add Ops to train a model. You never use this class directly, but instead instantiate one of its subclasses such as \texttt{GradientDescentOptimizer}, \texttt{AdagradOptimizer}, or \texttt{MomentumOptimizer}.

\subsubsection{Usage }\label{usage}

\begin{verbatim}
# Create an optimizer with the desired parameters.
opt = GradientDescentOptimizer(learning_rate=0.1)
# Add Ops to the graph to minimize a cost by updating a list of variables.
# "cost" is a Tensor, and the list of variables contains variables.Variable
# objects.
opt_op = opt.minimize(cost, <list of variables>)
\end{verbatim}

In the training program you will just have to run the returned Op.

\begin{verbatim}
# Execute opt_op to do one step of training:
opt_op.run()
\end{verbatim}

\subsubsection{Processing gradients before applying them. }\label{processing-gradients-before-applying-them.}

Calling \texttt{minimize()} takes care of both computing the gradients and applying them to the variables. If you want to process the gradients before applying them you can instead use the optimizer in three steps:

\begin{enumerate}
\def\labelenumi{\arabic{enumi}.}
\tightlist
\item
  Compute the gradients with \texttt{compute\_gradients()}.
\item
  Process the gradients as you wish.
\item
  Apply the processed gradients with \texttt{apply\_gradients()}.
\end{enumerate}

Example:

\begin{verbatim}
# Create an optimizer.
opt = GradientDescentOptimizer(learning_rate=0.1)

# Compute the gradients for a list of variables.
grads_and_vars = opt.compute_gradients(loss, <list of variables>)

# grads_and_vars is a list of tuples (gradient, variable).  Do whatever you
# need to the 'gradient' part, for example cap them, etc.
capped_grads_and_vars = [(MyCapper(gv[0]), gv[1]) for gv in grads_and_vars]

# Ask the optimizer to apply the capped gradients.
opt.apply_gradients(capped_grads_and_vars)
\end{verbatim}

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\paragraph{\texorpdfstring{\texttt{tf.train.Optimizer.\_\_init\_\_(use\_locking,\ name)} }{tf.train.Optimizer.\_\_init\_\_(use\_locking, name) }}\label{tf.train.optimizer.ux5fux5finitux5fux5fuseux5flocking-name}

Create a new Optimizer.

This must be called by the constructors of subclasses.

\subparagraph{Args: }\label{args}

\begin{itemize}
\tightlist
\item
  \texttt{use\_locking}: Bool.
If True, use locks to prevent concurrent updates to variables.
\item \texttt{name}: A non-empty string. The name to use for accumulators created for the optimizer.
\end{itemize}

\subparagraph{Raises: }\label{raises}

\begin{itemize}
\tightlist
\item \texttt{ValueError}: if name is malformed.
\end{itemize}

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\paragraph{\texorpdfstring{\texttt{tf.train.Optimizer.minimize(loss,\ global\_step=None,\ var\_list=None,\ gate\_gradients=1,\ name=None)} }{tf.train.Optimizer.minimize(loss, global\_step=None, var\_list=None, gate\_gradients=1, name=None) }}\label{tf.train.optimizer.minimizeloss-globalux5fstepnone-varux5flistnone-gateux5fgradients1-namenone}

Add operations to minimize `loss' by updating `var\_list'.

This method simply combines calls to compute\_gradients() and apply\_gradients(). If you want to process the gradients before applying them, call compute\_gradients() and apply\_gradients() explicitly instead of using this function.

\subparagraph{Args: }\label{args-1}

\begin{itemize}
\tightlist
\item \texttt{loss}: A Tensor containing the value to minimize.
\item \texttt{global\_step}: Optional Variable to increment by one after the variables have been updated.
\item \texttt{var\_list}: Optional list of variables.Variable to update to minimize `loss'. Defaults to the list of variables collected in the graph under the key GraphKeys.TRAINABLE\_VARIABLES.
\item \texttt{gate\_gradients}: How to gate the computation of gradients. Can be GATE\_NONE, GATE\_OP, or GATE\_GRAPH.
\item \texttt{name}: Optional name for the returned operation.
\end{itemize}

\subparagraph{Returns: }\label{returns}

An Operation that updates the variables in `var\_list'. If `global\_step' was not None, that operation also increments global\_step.

\subparagraph{Raises: }\label{raises-1}

\begin{itemize}
\tightlist
\item \texttt{ValueError}: if some of the variables are not variables.Variable objects.
\end{itemize}

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\paragraph{\texorpdfstring{\texttt{tf.train.Optimizer.compute\_gradients(loss,\ var\_list=None,\ gate\_gradients=1)} }{tf.train.Optimizer.compute\_gradients(loss, var\_list=None, gate\_gradients=1) }}\label{tf.train.optimizer.computeux5fgradientsloss-varux5flistnone-gateux5fgradients1}

Compute gradients of ``loss'' for the variables in ``var\_list''.

This is the first part of minimize(). It returns a list of (gradient, variable) pairs where ``gradient'' is the gradient for ``variable''. Note that ``gradient'' can be a Tensor, an IndexedSlices, or None if there is no gradient for the given variable.

\subparagraph{Args: }\label{args-2}

\begin{itemize}
\tightlist
\item \texttt{loss}: A Tensor containing the value to minimize.
\item \texttt{var\_list}: Optional list of variables.Variable to update to minimize ``loss''. Defaults to the list of variables collected in the graph under the key GraphKeys.TRAINABLE\_VARIABLES.
\item \texttt{gate\_gradients}: How to gate the computation of gradients. Can be GATE\_NONE, GATE\_OP, or GATE\_GRAPH.
\end{itemize}

\subparagraph{Returns: }\label{returns-1}

A list of (gradient, variable) pairs.

\subparagraph{Raises: }\label{raises-2}

\begin{itemize}
\tightlist
\item \texttt{TypeError}: If var\_list contains anything other than variables.Variable.
\item \texttt{ValueError}: If some arguments are invalid.
\end{itemize}

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\paragraph{\texorpdfstring{\texttt{tf.train.Optimizer.apply\_gradients(grads\_and\_vars,\ global\_step=None,\ name=None)} }{tf.train.Optimizer.apply\_gradients(grads\_and\_vars, global\_step=None, name=None) }}\label{tf.train.optimizer.applyux5fgradientsgradsux5fandux5fvars-globalux5fstepnone-namenone}

Apply gradients to variables.

This is the second part of minimize(). It returns an Operation that applies gradients.
\subparagraph{Args: }\label{args-3}

\begin{itemize}
\tightlist
\item \texttt{grads\_and\_vars}: List of (gradient, variable) pairs as returned by compute\_gradients().
\item \texttt{global\_step}: Optional Variable to increment by one after the variables have been updated.
\item \texttt{name}: Optional name for the returned operation. Defaults to the name passed to the Optimizer constructor.
\end{itemize}

\subparagraph{Returns: }\label{returns-2}

An Operation that applies the specified gradients. If `global\_step' was not None, that operation also increments global\_step.

\subparagraph{Raises: }\label{raises-3}

\begin{itemize}
\tightlist
\item \texttt{TypeError}: if grads\_and\_vars is malformed.
\end{itemize}

\subsubsection{Gating Gradients }\label{gating-gradients}

Both \texttt{minimize()} and \texttt{compute\_gradients()} accept a \texttt{gate\_gradients} argument that controls the degree of parallelism during the application of the gradients.

The possible values are: \texttt{GATE\_NONE}, \texttt{GATE\_OP}, and \texttt{GATE\_GRAPH}.

\texttt{GATE\_NONE}: Compute and apply gradients in parallel. This provides the maximum parallelism in execution, at the cost of some non-reproducibility in the results. For example, the two gradients of MatMul depend on the input values: with \texttt{GATE\_NONE} one of the gradients could be applied to one of the inputs \emph{before} the other gradient is computed, resulting in non-reproducible results.

\texttt{GATE\_OP}: For each Op, make sure all gradients are computed before they are used. This prevents race conditions for Ops that generate gradients for multiple inputs where the gradients depend on the inputs.

\texttt{GATE\_GRAPH}: Make sure all gradients for all variables are computed before any one of them is used. This provides the least parallelism but can be useful if you want to process all gradients before applying any of them.
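
The ordering problem that \texttt{GATE\_NONE} permits can be imitated without TensorFlow. The following plain-Python sketch is only an illustration under stated assumptions: the helper names \texttt{step\_gated} and \texttt{step\_ungated} are hypothetical, scalars stand in for tensors, and the product \texttt{x * y} stands in for MatMul, whose gradient with respect to each input is the other input.

\begin{verbatim}
def step_gated(x, y, lr):
    """Compute both gradients of z = x * y first, then apply them."""
    grad_x, grad_y = y, x          # dz/dx = y, dz/dy = x
    return x - lr * grad_x, y - lr * grad_y


def step_ungated(x, y, lr):
    """One possible ungated ordering: x is updated before grad_y is computed."""
    grad_x = y
    x = x - lr * grad_x            # gradient for x is applied early
    grad_y = x                     # grad_y now sees the already-updated x
    return x, y - lr * grad_y


print(step_gated(3.0, 2.0, 0.5))    # (2.0, 0.5)
print(step_ungated(3.0, 2.0, 0.5))  # (2.0, 1.0)
\end{verbatim}

The two functions start from the same values but disagree on the new \texttt{y}, which is the kind of non-reproducibility that gating is meant to rule out.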
\subsubsection{Slots }\label{slots}

Some optimizer subclasses, such as \texttt{MomentumOptimizer} and \texttt{AdagradOptimizer}, allocate and manage additional variables associated with the variables to train. These are called Slots. Slots have names and you can ask the optimizer for the names of the slots that it uses. Once you have a slot name you can ask the optimizer for the variable it created to hold the slot value.

This can be useful if you want to debug a training algorithm, report stats about the slots, etc.

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\paragraph{\texorpdfstring{\texttt{tf.train.Optimizer.get\_slot\_names()} }{tf.train.Optimizer.get\_slot\_names() }}\label{tf.train.optimizer.getux5fslotux5fnames}

Return a list of the names of slots created by the Optimizer.

See get\_slot().

\subparagraph{Returns: }\label{returns-3}

A list of strings.

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\paragraph{\texorpdfstring{\texttt{tf.train.Optimizer.get\_slot(var,\ name)} }{tf.train.Optimizer.get\_slot(var, name) }}\label{tf.train.optimizer.getux5fslotvar-name}

Return a slot named ``name'' created for ``var'' by the Optimizer.

Some Optimizer subclasses use additional variables. For example, Momentum and Adagrad use variables to accumulate updates. This method gives access to these Variables if for some reason you need them.

Use get\_slot\_names() to get the list of slot names created by the Optimizer.

\subparagraph{Args: }\label{args-4}

\begin{itemize}
\tightlist
\item \texttt{var}: A variable passed to minimize() or apply\_gradients().
\item \texttt{name}: A string.
\end{itemize}

\subparagraph{Returns: }\label{returns-4}

The Variable for the slot if it was created, None otherwise.
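
To make the idea of a slot concrete, here is a minimal plain-Python sketch rather than the TensorFlow implementation. The class \texttt{ToyMomentum}, its \texttt{apply\_gradient} method, and the accumulate-then-step rule it uses are assumptions chosen for illustration; only the two accessor names mirror the methods documented above.

\begin{verbatim}
class ToyMomentum:
    """Toy momentum optimizer that keeps one 'momentum' slot per variable."""

    def __init__(self, learning_rate, momentum):
        self.learning_rate = learning_rate
        self.momentum = momentum
        self._slots = {"momentum": {}}   # slot name -> {variable name: value}

    def get_slot_names(self):
        return sorted(self._slots)

    def get_slot(self, var_name, slot_name):
        # Mirrors the documented behaviour: None if the slot was never created.
        return self._slots.get(slot_name, {}).get(var_name)

    def apply_gradient(self, variables, var_name, grad):
        # Assumed rule: accumulate the gradient, then step along the accumulator.
        acc = self._slots["momentum"].get(var_name, 0.0)
        acc = self.momentum * acc + grad
        self._slots["momentum"][var_name] = acc
        variables[var_name] -= self.learning_rate * acc


variables = {"w": 1.0}
opt = ToyMomentum(learning_rate=0.5, momentum=0.5)
opt.apply_gradient(variables, "w", grad=1.0)
opt.apply_gradient(variables, "w", grad=1.0)
print(opt.get_slot_names())            # ['momentum']
print(opt.get_slot("w", "momentum"))   # 1.5
print(opt.get_slot("b", "momentum"))   # None
\end{verbatim}

The accumulator lives next to the variable it belongs to, which is why inspecting it through a slot accessor is convenient when debugging.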
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\subsubsection{\texorpdfstring{\texttt{class\ tf.train.GradientDescentOptimizer} }{class tf.train.GradientDescentOptimizer }}\label{class-tf.train.gradientdescentoptimizer}

Optimizer that implements the gradient descent algorithm.

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\paragraph{\texorpdfstring{\texttt{tf.train.GradientDescentOptimizer.\_\_init\_\_(learning\_rate,\ use\_locking=False,\ name=\textquotesingle{}GradientDescent\textquotesingle{})} }{tf.train.GradientDescentOptimizer.\_\_init\_\_(learning\_rate, use\_locking=False, name='GradientDescent') }}\label{tf.train.gradientdescentoptimizer.ux5fux5finitux5fux5flearningux5frate-useux5flockingfalse-namegradientdescent}

Construct a new gradient descent optimizer.

\subparagraph{Args: }\label{args-5}

\begin{itemize}
\tightlist
\item \texttt{learning\_rate}: A Tensor or a floating point value. The learning rate to use.
\item \texttt{use\_locking}: If True, use locks for update operations.
\item \texttt{name}: Optional name prefix for the operations created when applying gradients. Defaults to ``GradientDescent''.
\end{itemize}

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\subsubsection{\texorpdfstring{\texttt{class\ tf.train.AdagradOptimizer} }{class tf.train.AdagradOptimizer }}\label{class-tf.train.adagradoptimizer}

Optimizer that implements the Adagrad algorithm.

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\paragraph{\texorpdfstring{\texttt{tf.train.AdagradOptimizer.\_\_init\_\_(learning\_rate,\ initial\_accumulator\_value=0.1,\ use\_locking=False,\ name=\textquotesingle{}Adagrad\textquotesingle{})} }{tf.train.AdagradOptimizer.\_\_init\_\_(learning\_rate, initial\_accumulator\_value=0.1, use\_locking=False, name='Adagrad') }}\label{tf.train.adagradoptimizer.ux5fux5finitux5fux5flearningux5frate-initialux5faccumulatorux5fvalue0.1-useux5flockingfalse-nameadagrad}

Construct a new Adagrad optimizer.
\subparagraph{Args: }\label{args-6} \begin{itemize} \tightlist \item \texttt{learning\_rate}: A \texttt{Tensor} or a floating point value. The learning rate. \item \texttt{initial\_accumulator\_value}: A floating point value. Starting value for the accumulators, must be positive. \item \texttt{use\_locking}: If \texttt{True} use locks for update operations. \item \texttt{name}: Optional name prefix for the operations created when applying gradients. Defaults to ``Adagrad''. \end{itemize} \subparagraph{Raises: }\label{raises-4} \begin{itemize} \tightlist \item \texttt{ValueError}: If the initial\_accumulator\_value is invalid. \end{itemize} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{class\ tf.train.MomentumOptimizer} }{class tf.train.MomentumOptimizer }}\label{class-tf.train.momentumoptimizer} Optimizer that implements the Momentum algorithm. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.train.MomentumOptimizer.\_\_init\_\_(learning\_rate,\ momentum,\ use\_locking=False,\ name=\textquotesingle{}Momentum\textquotesingle{})} }{tf.train.MomentumOptimizer.\_\_init\_\_(learning\_rate, momentum, use\_locking=False, name='Momentum') }}\label{tf.train.momentumoptimizer.ux5fux5finitux5fux5flearningux5frate-momentum-useux5flockingfalse-namemomentum} Construct a new Momentum optimizer. \subparagraph{Args: }\label{args-7} \begin{itemize} \tightlist \item \texttt{learning\_rate}: A \texttt{Tensor} or a floating point value. The learning rate. \item \texttt{momentum}: A \texttt{Tensor} or a floating point value. The momentum. \item \texttt{use\_locking}: If \texttt{True} use locks for update operations. \item \texttt{name}: Optional name prefix for the operations created when applying gradients. Defaults to ``Momentum''. 
\end{itemize}

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\subsubsection{\texorpdfstring{\texttt{class\ tf.train.AdamOptimizer} }{class tf.train.AdamOptimizer }}\label{class-tf.train.adamoptimizer}

Optimizer that implements the Adam algorithm.

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\paragraph{\texorpdfstring{\texttt{tf.train.AdamOptimizer.\_\_init\_\_(learning\_rate=0.001,\ beta1=0.9,\ beta2=0.999,\ epsilon=1e-08,\ use\_locking=False,\ name=\textquotesingle{}Adam\textquotesingle{})} }{tf.train.AdamOptimizer.\_\_init\_\_(learning\_rate=0.001, beta1=0.9, beta2=0.999, epsilon=1e-08, use\_locking=False, name='Adam') }}\label{tf.train.adamoptimizer.ux5fux5finitux5fux5flearningux5frate0.001-beta10.9-beta20.999-epsilon1e-08-useux5flockingfalse-nameadam}

Construct a new Adam optimizer.

Implementation is based on: http://arxiv.org/pdf/1412.6980v7.pdf

Initialization:

\begin{verbatim}
m_0 <- 0 (Initialize initial 1st moment vector)
v_0 <- 0 (Initialize initial 2nd moment vector)
t <- 0 (Initialize timestep)
\end{verbatim}

The update rule for \texttt{variable} with gradient \texttt{g} uses an optimization described at the end of Section 2 of the paper:

\begin{verbatim}
t <- t + 1
lr_t <- learning_rate * sqrt(1 - beta2^t) / (1 - beta1^t)

m_t <- beta1 * m_{t-1} + (1 - beta1) * g
v_t <- beta2 * v_{t-1} + (1 - beta2) * g * g
variable <- variable - lr_t * m_t / (sqrt(v_t) + epsilon)
\end{verbatim}

The default value of 1e-8 for epsilon might not be a good default in general. For example, when training an Inception network on ImageNet a current good choice is 1.0 or 0.1.

\subparagraph{Args: }\label{args-8}

\begin{itemize}
\tightlist
\item \texttt{learning\_rate}: A Tensor or a floating point value. The learning rate.
\item \texttt{beta1}: A float value or a constant float tensor. The exponential decay rate for the 1st moment estimates.
\item \texttt{beta2}: A float value or a constant float tensor.
The exponential decay rate for the 2nd moment estimates.
\item \texttt{epsilon}: A small constant for numerical stability.
\item \texttt{use\_locking}: If True, use locks for update operations.
\item \texttt{name}: Optional name for the operations created when applying gradients. Defaults to ``Adam''.
\end{itemize}

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\subsubsection{\texorpdfstring{\texttt{class\ tf.train.FtrlOptimizer} }{class tf.train.FtrlOptimizer }}\label{class-tf.train.ftrloptimizer}

Optimizer that implements the FTRL algorithm.

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\paragraph{\texorpdfstring{\texttt{tf.train.FtrlOptimizer.\_\_init\_\_(learning\_rate,\ learning\_rate\_power=-0.5,\ initial\_accumulator\_value=0.1,\ l1\_regularization\_strength=0.0,\ l2\_regularization\_strength=0.0,\ use\_locking=False,\ name=\textquotesingle{}Ftrl\textquotesingle{})} }{tf.train.FtrlOptimizer.\_\_init\_\_(learning\_rate, learning\_rate\_power=-0.5, initial\_accumulator\_value=0.1, l1\_regularization\_strength=0.0, l2\_regularization\_strength=0.0, use\_locking=False, name='Ftrl') }}\label{tf.train.ftrloptimizer.ux5fux5finitux5fux5flearningux5frate-learningux5frateux5fpower-0.5-initialux5faccumulatorux5fvalue0.1-l1ux5fregularizationux5fstrength0.0-l2ux5fregularizationux5fstrength0.0-useux5flockingfalse-nameftrl}

Construct a new FTRL optimizer.

The Ftrl-proximal algorithm (Ftrl is short for Follow-the-regularized-leader) is described in the paper \href{https://www.eecs.tufts.edu/~dsculley/papers/ad-click-prediction.pdf}{Ad Click Prediction: a View from the Trenches}.

It can give a good performance vs.~sparsity tradeoff.

Ftrl-proximal uses its own global base learning rate and can behave like Adagrad with \texttt{learning\_rate\_power=-0.5}, or like gradient descent with \texttt{learning\_rate\_power=0.0}.
The effective learning rate is adjusted per parameter, relative to this base learning rate, as:

\begin{verbatim}
effective_learning_rate_i = (learning_rate /
    pow(k + summed_squared_gradients_for_i, learning_rate_power));
\end{verbatim}

where k is the small constant \texttt{initial\_accumulator\_value}.

Note that if you pass \texttt{l2\ =\ lambda\_2} as an argument to this function, the actual regularization coefficient of \texttt{\textbar{}w\textbar{}\^{}2} in the objective function is \texttt{1\ /\ lambda\_2}.

\subparagraph{Args: }\label{args-9}

\begin{itemize}
\tightlist
\item \texttt{learning\_rate}: A float value or a constant float \texttt{Tensor}.
\item \texttt{learning\_rate\_power}: A float value, must be less than or equal to zero.
\item \texttt{initial\_accumulator\_value}: The starting value for accumulators. Only positive values are allowed.
\item \texttt{l1\_regularization\_strength}: A float value, must be greater than or equal to zero.
\item \texttt{l2\_regularization\_strength}: A float value, must be greater than or equal to zero.
\item \texttt{use\_locking}: If \texttt{True}, use locks for update operations.
\item \texttt{name}: Optional name prefix for the operations created when applying gradients. Defaults to ``Ftrl''.
\end{itemize}

\subparagraph{Raises: }\label{raises-5}

\begin{itemize}
\tightlist
\item \texttt{ValueError}: if one of the arguments is invalid.
\end{itemize}

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\subsubsection{\texorpdfstring{\texttt{class\ tf.train.RMSPropOptimizer} }{class tf.train.RMSPropOptimizer }}\label{class-tf.train.rmspropoptimizer}

Optimizer that implements the RMSProp algorithm.
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\paragraph{\texorpdfstring{\texttt{tf.train.RMSPropOptimizer.\_\_init\_\_(learning\_rate,\ decay,\ momentum=0.0,\ epsilon=1e-10,\ use\_locking=False,\ name=\textquotesingle{}RMSProp\textquotesingle{})} }{tf.train.RMSPropOptimizer.\_\_init\_\_(learning\_rate, decay, momentum=0.0, epsilon=1e-10, use\_locking=False, name='RMSProp') }}\label{tf.train.rmspropoptimizer.ux5fux5finitux5fux5flearningux5frate-decay-momentum0.0-epsilon1e-10-useux5flockingfalse-namermsprop}

Construct a new RMSProp optimizer.

\subparagraph{Args: }\label{args-10}

\begin{itemize}
\tightlist
\item \texttt{learning\_rate}: A Tensor or a floating point value. The learning rate.
\item \texttt{decay}: Discounting factor for the history/coming gradient.
\item \texttt{momentum}: A scalar tensor.
\item \texttt{epsilon}: Small value to avoid a zero denominator.
\item \texttt{use\_locking}: If True, use locks for update operations.
\item \texttt{name}: Optional name prefix for the operations created when applying gradients. Defaults to ``RMSProp''.
\end{itemize}

\subsection{Gradient Computation }\label{gradient-computation}

TensorFlow provides functions to compute the derivatives for a given TensorFlow computation graph, adding operations to the graph. The optimizer classes automatically compute derivatives on your graph, but creators of new Optimizers or expert users can call the lower-level functions below.
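
The first of these functions, \texttt{tf.gradients}, is documented as returning for each x the sum of dy/dx over all y, optionally weighted by \texttt{grad\_ys}. The arithmetic can be sketched in plain Python for a single scalar x. This is only an illustration: the helper \texttt{summed\_gradient} is hypothetical, and the derivatives are written out by hand instead of being derived from a graph.

\begin{verbatim}
def summed_gradient(dy_dx, grad_ys=None):
    """Combine per-y derivatives dy/dx into one gradient for a single x.

    dy_dx:   list of derivatives, one per y.
    grad_ys: optional list of initial gradients, one per y (defaults to 1s).
    """
    if grad_ys is None:
        grad_ys = [1.0] * len(dy_dx)
    if len(grad_ys) != len(dy_dx):
        raise ValueError("grad_ys must have the same length as ys")
    return sum(g * d for g, d in zip(grad_ys, dy_dx))


x = 2.0
# ys = [x**2, 3*x]  ->  dy/dx = [2*x, 3]
dy_dx = [2.0 * x, 3.0]
print(summed_gradient(dy_dx))               # 7.0
print(summed_gradient(dy_dx, [0.5, 2.0]))   # 8.0
\end{verbatim}

Passing explicit weights corresponds to supplying \texttt{grad\_ys}; leaving them out corresponds to the default of all ones.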
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\subsubsection{\texorpdfstring{\texttt{tf.gradients(ys,\ xs,\ grad\_ys=None,\ name=\textquotesingle{}gradients\textquotesingle{},\ colocate\_gradients\_with\_ops=False,\ gate\_gradients=False,\ aggregation\_method=None)} }{tf.gradients(ys, xs, grad\_ys=None, name='gradients', colocate\_gradients\_with\_ops=False, gate\_gradients=False, aggregation\_method=None) }}\label{tf.gradientsys-xs-gradux5fysnone-namegradients-colocateux5fgradientsux5fwithux5fopsfalse-gateux5fgradientsfalse-aggregationux5fmethodnone}

Constructs symbolic partial derivatives of \texttt{ys} w.r.t. x in \texttt{xs}.

\texttt{ys} and \texttt{xs} are each a \texttt{Tensor} or a list of tensors. \texttt{grad\_ys} is a list of \texttt{Tensor}, holding the gradients received by the \texttt{ys}. The list must be the same length as \texttt{ys}.

\texttt{gradients()} adds ops to the graph to output the partial derivatives of \texttt{ys} with respect to \texttt{xs}. It returns a list of \texttt{Tensor} of length \texttt{len(xs)} where each tensor is the \texttt{sum(dy/dx)} for y in \texttt{ys}.

\texttt{grad\_ys} is a list of tensors of the same length as \texttt{ys} that holds the initial gradients for each y in \texttt{ys}. When \texttt{grad\_ys} is None, we fill in a tensor of '1's of the shape of y for each y in \texttt{ys}. A user can provide their own initial \texttt{grad\_ys} to compute the derivatives using a different initial gradient for each y (e.g., if one wanted to weight the gradient differently for each value in each y).

\subparagraph{Args: }\label{args-11}

\begin{itemize}
\tightlist
\item \texttt{ys}: A \texttt{Tensor} or list of tensors to be differentiated.
\item \texttt{xs}: A \texttt{Tensor} or list of tensors to be used for differentiation.
\item \texttt{grad\_ys}: Optional. A \texttt{Tensor} or list of tensors the same size as \texttt{ys} and holding the gradients computed for each y in \texttt{ys}.
\item \texttt{name}: Optional name to use for grouping all the gradient ops together. Defaults to `gradients'.
\item \texttt{colocate\_gradients\_with\_ops}: If True, try colocating gradients with the corresponding op.
\item \texttt{gate\_gradients}: If True, add a tuple around the gradients returned for an operation. This avoids some race conditions.
\item \texttt{aggregation\_method}: Specifies the method used to combine gradient terms. Accepted values are constants defined in the class \texttt{AggregationMethod}.
\end{itemize}

\subparagraph{Returns: }\label{returns-5}

A list of \texttt{sum(dy/dx)} for each x in \texttt{xs}.

\subparagraph{Raises: }\label{raises-6}

\begin{itemize}
\tightlist
\item \texttt{LookupError}: if one of the operations between \texttt{x} and \texttt{y} does not have a registered gradient function.
\item \texttt{ValueError}: if the arguments are invalid.
\end{itemize}

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\subsubsection{\texorpdfstring{\texttt{class\ tf.AggregationMethod} }{class tf.AggregationMethod }}\label{class-tf.aggregationmethod}

A class listing aggregation methods used to combine gradients.

Computing partial derivatives can require aggregating gradient contributions. This class lists the various methods that can be used to combine gradients in the graph:

\begin{itemize}
\tightlist
\item \texttt{ADD\_N}: All of the gradient terms are summed as part of one operation using the ``AddN'' op. It has the property that all gradients must be ready before any aggregation is performed.
\item \texttt{DEFAULT}: The system-chosen default aggregation method.
\end{itemize}

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\subsubsection{\texorpdfstring{\texttt{tf.stop\_gradient(input,\ name=None)} }{tf.stop\_gradient(input, name=None) }}\label{tf.stopux5fgradientinput-namenone}

Stops gradient computation.

When executed in a graph, this op outputs its input tensor as-is.
When building ops to compute gradients, this op prevents the contribution of its inputs from being taken into account. Normally, the gradient generator adds ops to a graph to compute the derivatives of a specified `loss' by recursively finding the inputs that contributed to its computation. If you insert this op in the graph, its inputs are masked from the gradient generator. They are not taken into account for computing gradients.

This is useful any time you want to compute a value with TensorFlow but need to pretend that the value was a constant. Some examples include:

\begin{itemize}
\tightlist
\item The \emph{EM} algorithm where the \emph{M-step} should not involve backpropagation through the output of the \emph{E-step}.
\item Contrastive divergence training of Boltzmann machines where, when differentiating the energy function, the training must not backpropagate through the graph that generated the samples from the model.
\item Adversarial training, where no backprop should happen through the adversarial example generation process.
\end{itemize}

\subparagraph{Args: }\label{args-12}

\begin{itemize}
\tightlist
\item \texttt{input}: A \texttt{Tensor}.
\item \texttt{name}: A name for the operation (optional).
\end{itemize}

\subparagraph{Returns: }\label{returns-6}

A \texttt{Tensor}. Has the same type as \texttt{input}.

\subsection{Gradient Clipping }\label{gradient-clipping}

TensorFlow provides several operations that you can use to add clipping functions to your graph. You can use these functions to perform general data clipping, but they're particularly useful for handling exploding or vanishing gradients.
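
The clipping rules documented in the rest of this section reduce to a few lines of arithmetic. The sketch below restates three of them in plain Python on lists of floats. It is an illustration rather than the TensorFlow code: the function names are reused only for readability, and the helper \texttt{l2norm} and the list-based representation are assumptions.

\begin{verbatim}
import math


def clip_by_value(t, clip_value_min, clip_value_max):
    return [min(max(v, clip_value_min), clip_value_max) for v in t]


def l2norm(t):
    return math.sqrt(sum(v * v for v in t))


def clip_by_norm(t, clip_norm):
    norm = l2norm(t)
    if norm <= clip_norm:
        return list(t)                      # already small enough
    return [v * clip_norm / norm for v in t]


def clip_by_global_norm(t_list, clip_norm):
    present = [t for t in t_list if t is not None]   # None entries are ignored
    global_norm = math.sqrt(sum(l2norm(t) ** 2 for t in present))
    scale = clip_norm / max(global_norm, clip_norm)
    clipped = [None if t is None else [v * scale for v in t] for t in t_list]
    return clipped, global_norm


print(clip_by_value([-2.0, 0.5, 3.0], -1.0, 1.0))      # [-1.0, 0.5, 1.0]
print(clip_by_global_norm([[3.0], None, [4.0]], 2.5))  # ([[1.5], None, [2.0]], 5.0)
\end{verbatim}

Note that \texttt{clip\_by\_norm} rescales each tensor on its own, whereas \texttt{clip\_by\_global\_norm} applies one shared scale to every tensor in the list.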
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.clip\_by\_value(t,\ clip\_value\_min,\ clip\_value\_max,\ name=None)} }{tf.clip\_by\_value(t, clip\_value\_min, clip\_value\_max, name=None) }}\label{tf.clipux5fbyux5fvaluet-clipux5fvalueux5fmin-clipux5fvalueux5fmax-namenone} Clips tensor values to a specified min and max. Given a tensor \texttt{t}, this operation returns a tensor of the same type and shape as \texttt{t} with its values clipped to \texttt{clip\_value\_min} and \texttt{clip\_value\_max}. Any values less than \texttt{clip\_value\_min} are set to \texttt{clip\_value\_min}. Any values greater than \texttt{clip\_value\_max} are set to \texttt{clip\_value\_max}. \subparagraph{Args: }\label{args-13} \begin{itemize} \tightlist \item \texttt{t}: A \texttt{Tensor}. \item \texttt{clip\_value\_min}: A 0-D (scalar) \texttt{Tensor}. The minimum value to clip by. \item \texttt{clip\_value\_max}: A 0-D (scalar) \texttt{Tensor}. The maximum value to clip by. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-7} A clipped \texttt{Tensor}. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.clip\_by\_norm(t,\ clip\_norm,\ name=None)} }{tf.clip\_by\_norm(t, clip\_norm, name=None) }}\label{tf.clipux5fbyux5fnormt-clipux5fnorm-namenone} Clips tensor values to a maximum L2-norm. 
Given a tensor \texttt{t}, and a maximum clip value \texttt{clip\_norm}, this operation normalizes \texttt{t} so that its L2-norm is less than or equal to \texttt{clip\_norm}. Specifically, if the L2-norm is already less than or equal to \texttt{clip\_norm}, then \texttt{t} is not modified. If the L2-norm is greater than \texttt{clip\_norm}, then this operation returns a tensor of the same type and shape as \texttt{t} with its values set to:

\texttt{t\ *\ clip\_norm\ /\ l2norm(t)}

In this case, the L2-norm of the output tensor is \texttt{clip\_norm}.

This operation is typically used to clip gradients before applying them with an optimizer.

\subparagraph{Args: }\label{args-14}

\begin{itemize}
\tightlist
\item \texttt{t}: A \texttt{Tensor}.
\item \texttt{clip\_norm}: A 0-D (scalar) \texttt{Tensor} \textgreater{} 0. A maximum clipping value.
\item \texttt{name}: A name for the operation (optional).
\end{itemize}

\subparagraph{Returns: }\label{returns-8}

A clipped \texttt{Tensor}.

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\subsubsection{\texorpdfstring{\texttt{tf.clip\_by\_average\_norm(t,\ clip\_norm,\ name=None)} }{tf.clip\_by\_average\_norm(t, clip\_norm, name=None) }}\label{tf.clipux5fbyux5faverageux5fnormt-clipux5fnorm-namenone}

Clips tensor values to a maximum average L2-norm.

Given a tensor \texttt{t}, and a maximum clip value \texttt{clip\_norm}, this operation normalizes \texttt{t} so that its average L2-norm is less than or equal to \texttt{clip\_norm}. Specifically, if the average L2-norm is already less than or equal to \texttt{clip\_norm}, then \texttt{t} is not modified. If the average L2-norm is greater than \texttt{clip\_norm}, then this operation returns a tensor of the same type and shape as \texttt{t} with its values set to:

\texttt{t\ *\ clip\_norm\ /\ l2norm\_avg(t)}

In this case, the average L2-norm of the output tensor is \texttt{clip\_norm}.
This operation is typically used to clip gradients before applying them with an optimizer. \subparagraph{Args: }\label{args-15} \begin{itemize} \tightlist \item \texttt{t}: A \texttt{Tensor}. \item \texttt{clip\_norm}: A 0-D (scalar) \texttt{Tensor} \textgreater{} 0. A maximum clipping value. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-9} A clipped \texttt{Tensor}. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.clip\_by\_global\_norm(t\_list,\ clip\_norm,\ use\_norm=None,\ name=None)} }{tf.clip\_by\_global\_norm(t\_list, clip\_norm, use\_norm=None, name=None) }}\label{tf.clipux5fbyux5fglobalux5fnormtux5flist-clipux5fnorm-useux5fnormnone-namenone} Clips values of multiple tensors by the ratio of the sum of their norms. Given a tuple or list of tensors \texttt{t\_list}, and a clipping ratio \texttt{clip\_norm}, this operation returns a list of clipped tensors \texttt{list\_clipped} and the global norm (\texttt{global\_norm}) of all tensors in \texttt{t\_list}. Optionally, if you've already computed the global norm for \texttt{t\_list}, you can specify the global norm with \texttt{use\_norm}. To perform the clipping, the values t\_list{[}i{]} are set to: \texttt{t\_list{[}i{]}\ *\ clip\_norm\ /\ max(global\_norm,\ clip\_norm)} where: \texttt{global\_norm\ =\ sqrt(sum({[}l2norm(t)**2\ for\ t\ in\ t\_list{]}))} If \texttt{clip\_norm\ \textgreater{}\ global\_norm} then the entries in \texttt{t\_list} remain as they are, otherwise they're all shrunk by the global ratio. Any of the entries of \texttt{t\_list} that are of type None are ignored. This is the correct way to perform gradient clipping (for example, see R. Pascanu, T. Mikolov, and Y. Bengio, ``On the difficulty of training Recurrent Neural Networks''. 
http://arxiv.org/abs/1211.5063). However, it is slower than \texttt{clip\_by\_norm()} because all the parameters must be ready before the clipping operation can be performed.

\subparagraph{Args: }\label{args-16}

\begin{itemize}
\tightlist
\item \texttt{t\_list}: A tuple or list of mixed \texttt{Tensors}, \texttt{IndexedSlices}, or None.
\item \texttt{clip\_norm}: A 0-D (scalar) \texttt{Tensor} \textgreater{} 0. The clipping ratio.
\item \texttt{use\_norm}: A 0-D (scalar) \texttt{Tensor} of type \texttt{float} (optional). The global norm to use. If not provided, \texttt{global\_norm()} is used to compute the norm.
\item \texttt{name}: A name for the operation (optional).
\end{itemize}

\subparagraph{Returns: }\label{returns-10}

\begin{itemize}
\tightlist
\item \texttt{list\_clipped}: A list of \texttt{Tensors} of the same type as \texttt{t\_list}.
\item \texttt{global\_norm}: A 0-D (scalar) \texttt{Tensor} representing the global norm.
\end{itemize}

\subparagraph{Raises: }\label{raises-7}

\begin{itemize}
\tightlist
\item \texttt{TypeError}: If \texttt{t\_list} is not a sequence.
\end{itemize}

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\subsubsection{\texorpdfstring{\texttt{tf.global\_norm(t\_list,\ name=None)} }{tf.global\_norm(t\_list, name=None) }}\label{tf.globalux5fnormtux5flist-namenone}

Computes the global norm of multiple tensors.

Given a tuple or list of tensors \texttt{t\_list}, this operation returns the global norm of the elements in all tensors in \texttt{t\_list}. The global norm is computed as:

\texttt{global\_norm\ =\ sqrt(sum({[}l2norm(t)**2\ for\ t\ in\ t\_list{]}))}

Any entries in \texttt{t\_list} that are of type None are ignored.

\subparagraph{Args: }\label{args-17}

\begin{itemize}
\tightlist
\item \texttt{t\_list}: A tuple or list of mixed \texttt{Tensors}, \texttt{IndexedSlices}, or None.
\item \texttt{name}: A name for the operation (optional).
\end{itemize} \subparagraph{Returns: }\label{returns-11} A 0-D (scalar) \texttt{Tensor} of type \texttt{float}. \subparagraph{Raises: }\label{raises-8} \begin{itemize} \tightlist \item \texttt{TypeError}: If \texttt{t\_list} is not a sequence. \end{itemize} \subsection{Decaying the learning rate }\label{decaying-the-learning-rate} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.train.exponential\_decay(learning\_rate,\ global\_step,\ decay\_steps,\ decay\_rate,\ staircase=False,\ name=None)} }{tf.train.exponential\_decay(learning\_rate, global\_step, decay\_steps, decay\_rate, staircase=False, name=None) }}\label{tf.train.exponentialux5fdecaylearningux5frate-globalux5fstep-decayux5fsteps-decayux5frate-staircasefalse-namenone} Applies exponential decay to the learning rate. When training a model, it is often recommended to lower the learning rate as the training progresses. This function applies an exponential decay function to a provided initial learning rate. It requires a \texttt{global\_step} value to compute the decayed learning rate. You can just pass a TensorFlow variable that you increment at each training step. The function returns the decayed learning rate. It is computed as: \begin{Shaded} \begin{Highlighting}[] \NormalTok{decayed_learning_rate }\OperatorTok{=} \NormalTok{learning_rate }\OperatorTok{*} \NormalTok{decay_rate }\OperatorTok{^} \NormalTok{(global_step }\OperatorTok{/} \NormalTok{decay_steps)} \end{Highlighting} \end{Shaded} If the argument \texttt{staircase} is \texttt{True}, then \texttt{global\_step\ /decay\_steps} is an integer division and the decayed learning rate follows a staircase function. 
Example: decay every 100000 steps with a base of 0.96:
\begin{Shaded}
\begin{Highlighting}[]
\NormalTok{...}
\NormalTok{global_step }\OperatorTok{=} \NormalTok{tf.Variable(}\DecValTok{0}\NormalTok{, trainable}\OperatorTok{=}\VariableTok{False}\NormalTok{)}
\NormalTok{starter_learning_rate }\OperatorTok{=} \FloatTok{0.1}
\NormalTok{learning_rate }\OperatorTok{=} \NormalTok{tf.train.exponential_decay(starter_learning_rate, global_step,}
\NormalTok{                                           }\DecValTok{100000}\NormalTok{, }\FloatTok{0.96}\NormalTok{, staircase}\OperatorTok{=}\VariableTok{True}\NormalTok{)}
\NormalTok{optimizer }\OperatorTok{=} \NormalTok{tf.train.GradientDescentOptimizer(learning_rate)}
\CommentTok{# Passing global_step to minimize() will increment it at each step.}
\NormalTok{optimizer.minimize(...my loss..., global_step}\OperatorTok{=}\NormalTok{global_step)}
\end{Highlighting}
\end{Shaded}
\subparagraph{Args: }\label{args-18}
\begin{itemize}
\tightlist
\item \texttt{learning\_rate}: A scalar \texttt{float32} or \texttt{float64} \texttt{Tensor} or a Python number. The initial learning rate.
\item \texttt{global\_step}: A scalar \texttt{int32} or \texttt{int64} \texttt{Tensor} or a Python number. Global step to use for the decay computation. Must not be negative.
\item \texttt{decay\_steps}: A scalar \texttt{int32} or \texttt{int64} \texttt{Tensor} or a Python number. Must be positive. See the decay computation above.
\item \texttt{decay\_rate}: A scalar \texttt{float32} or \texttt{float64} \texttt{Tensor} or a Python number. The decay rate.
\item \texttt{staircase}: Boolean. If \texttt{True}, decay the learning rate at discrete intervals.
\item \texttt{name}: String. Optional name of the operation. Defaults to `ExponentialDecay'.
\end{itemize}
\subparagraph{Returns: }\label{returns-12}
A scalar \texttt{Tensor} of the same type as \texttt{learning\_rate}. The decayed learning rate.
\subsection{Moving Averages }\label{moving-averages}
Some training algorithms, such as GradientDescent and Momentum, often benefit from maintaining a moving average of variables during optimization. Using the moving averages for evaluations often improves results significantly.
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}
\subsubsection{\texorpdfstring{\texttt{class\ tf.train.ExponentialMovingAverage} }{class tf.train.ExponentialMovingAverage }}\label{class-tf.train.exponentialmovingaverage}
Maintains moving averages of variables by employing an exponential decay.

When training a model, it is often beneficial to maintain moving averages of the trained parameters. Evaluations that use averaged parameters sometimes produce significantly better results than the final trained values.

The \texttt{apply()} method adds shadow copies of trained variables and adds ops that maintain a moving average of the trained variables in their shadow copies. It is used when building the training model. The ops that maintain moving averages are typically run after each training step. The \texttt{average()} and \texttt{average\_name()} methods give access to the shadow variables and their names. They are useful when building an evaluation model, or when restoring a model from a checkpoint file. They help use the moving averages in place of the last trained values for evaluations.

The moving averages are computed using exponential decay. You specify the decay value when creating the \texttt{ExponentialMovingAverage} object. The shadow variables are initialized with the same initial values as the trained variables.
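To make the exponential decay concrete, here is a minimal plain-Python sketch that tracks a single scalar. The function names are illustrative only and are not TensorFlow APIs. It compares the usual weighted-average form of the update with an in-place subtraction form; under ordinary floating-point arithmetic the two should agree up to rounding.

\begin{verbatim}
```python
def ema_classic(shadow, variable, decay):
    """Weighted-average form of the moving-average update (illustrative)."""
    return decay * shadow + (1.0 - decay) * variable


def ema_assign_sub(shadow, variable, decay):
    """Subtraction form: move the shadow value toward the variable."""
    shadow -= (1.0 - decay) * (shadow - variable)
    return shadow


values = [1.0, 2.0, 4.0, 8.0]
decay = 0.9
# The shadow value starts at the variable's initial value.
shadow_a = shadow_b = values[0]
for v in values[1:]:
    shadow_a = ema_classic(shadow_a, v, decay)
    shadow_b = ema_assign_sub(shadow_b, v, decay)
```
\end{verbatim}

With a decay this far below 1.0 the average moves quickly; values closer to 1.0 make the shadow value lag further behind the variable.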
When you run the ops to maintain the moving averages, each shadow variable is updated with the formula:

\texttt{shadow\_variable\ -=\ (1\ -\ decay)\ *\ (shadow\_variable\ -\ variable)}

This is mathematically equivalent to the classic formula below, but the use of an \texttt{assign\_sub} op (the \texttt{"-="} in the formula) allows concurrent lockless updates to the variables:

\texttt{shadow\_variable\ =\ decay\ *\ shadow\_variable\ +\ (1\ -\ decay)\ *\ variable}

Reasonable values for \texttt{decay} are close to 1.0, typically in the multiple-nines range: 0.999, 0.9999, etc.

Example usage when creating a training model:
\begin{Shaded}
\begin{Highlighting}[]
\CommentTok{# Create variables.}
\NormalTok{var0 }\OperatorTok{=} \NormalTok{tf.Variable(...)}
\NormalTok{var1 }\OperatorTok{=} \NormalTok{tf.Variable(...)}
\CommentTok{# ... use the variables to build a training model...}
\NormalTok{...}
\CommentTok{# Create an op that applies the optimizer.  This is what we usually}
\CommentTok{# would use as a training op.}
\NormalTok{opt_op }\OperatorTok{=} \NormalTok{opt.minimize(my_loss, [var0, var1])}

\CommentTok{# Create an ExponentialMovingAverage object}
\NormalTok{ema }\OperatorTok{=} \NormalTok{tf.train.ExponentialMovingAverage(decay}\OperatorTok{=}\FloatTok{0.9999}\NormalTok{)}

\CommentTok{# Create the shadow variables, and add ops to maintain moving averages}
\CommentTok{# of var0 and var1.}
\NormalTok{maintain_averages_op }\OperatorTok{=} \NormalTok{ema.}\BuiltInTok{apply}\NormalTok{([var0, var1])}

\CommentTok{# Create an op that will update the moving averages after each training}
\CommentTok{# step.  This is what we will use in place of the usual training op.}
\ControlFlowTok{with} \NormalTok{tf.control_dependencies([opt_op]):}
\NormalTok{    training_op }\OperatorTok{=} \NormalTok{tf.group(maintain_averages_op)}

\NormalTok{...train the model by running training_op...}
\end{Highlighting}
\end{Shaded}
There are two ways to use the moving averages for evaluations:
\begin{itemize}
\tightlist
\item Build a model that uses the shadow variables instead of the variables. For this, use the \texttt{average()} method, which returns the shadow variable for a given variable.
\item Build a model normally but load the checkpoint files to evaluate by using the shadow variable names. For this, use the \texttt{average\_name()} method.
\end{itemize}
See the \href{../../api_docs/python/train.md\#Saver}{Saver class} for more information on restoring saved variables.

Example of restoring the shadow variable values:
\begin{Shaded}
\begin{Highlighting}[]
\CommentTok{# Create a Saver that loads variables from their saved shadow values.}
\NormalTok{shadow_var0_name }\OperatorTok{=} \NormalTok{ema.average_name(var0)}
\NormalTok{shadow_var1_name }\OperatorTok{=} \NormalTok{ema.average_name(var1)}
\NormalTok{saver }\OperatorTok{=} \NormalTok{tf.train.Saver(\{shadow_var0_name: var0, shadow_var1_name: var1\})}
\NormalTok{saver.restore(...checkpoint filename...)}
\CommentTok{# var0 and var1 now hold the moving average values}
\end{Highlighting}
\end{Shaded}
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}
\paragraph{\texorpdfstring{\texttt{tf.train.ExponentialMovingAverage.\_\_init\_\_(decay,\ num\_updates=None,\ name=\textquotesingle{}ExponentialMovingAverage\textquotesingle{})} }{tf.train.ExponentialMovingAverage.\_\_init\_\_(decay, num\_updates=None, name='ExponentialMovingAverage') }}\label{tf.train.exponentialmovingaverage.ux5fux5finitux5fux5fdecay-numux5fupdatesnone-nameexponentialmovingaverage}
Creates a new ExponentialMovingAverage object.
The \texttt{apply()} method has to be called to create shadow variables and add ops to maintain moving averages.

The optional \texttt{num\_updates} parameter allows one to tweak the decay rate dynamically. It is typical to pass the count of training steps, usually kept in a variable that is incremented at each step, in which case the decay rate is lower at the start of training. This makes moving averages move faster. If passed, the actual decay rate used is:

\texttt{min(decay,\ (1\ +\ num\_updates)\ /\ (10\ +\ num\_updates))}
\subparagraph{Args: }\label{args-19}
\begin{itemize}
\tightlist
\item \texttt{decay}: Float. The decay to use.
\item \texttt{num\_updates}: Optional count of the number of updates applied to variables.
\item \texttt{name}: String. Optional prefix name to use for the name of ops added in \texttt{apply()}.
\end{itemize}
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}
\paragraph{\texorpdfstring{\texttt{tf.train.ExponentialMovingAverage.apply(var\_list=None)} }{tf.train.ExponentialMovingAverage.apply(var\_list=None) }}\label{tf.train.exponentialmovingaverage.applyvarux5flistnone}
Maintains moving averages of variables.

\texttt{var\_list} must be a list of \texttt{Variable} or \texttt{Tensor} objects. This method creates shadow variables for all elements of \texttt{var\_list}. Shadow variables for \texttt{Variable} objects are initialized to the variable's initial value. For \texttt{Tensor} objects, the shadow variables are initialized to 0.

Shadow variables are created with \texttt{trainable=False} and added to the \texttt{GraphKeys.ALL\_VARIABLES} collection. They will be returned by calls to \texttt{tf.all\_variables()}.

Returns an op that updates all shadow variables as described above.

Note that \texttt{apply()} can be called multiple times with different lists of variables.
\subparagraph{Args: }\label{args-20}
\begin{itemize}
\tightlist
\item \texttt{var\_list}: A list of Variable or Tensor objects.
The variables and Tensors must be of types float32 or float64. \end{itemize} \subparagraph{Returns: }\label{returns-13} An Operation that updates the moving averages. \subparagraph{Raises: }\label{raises-9} \begin{itemize} \tightlist \item \texttt{TypeError}: If the arguments are not all float32 or float64. \item \texttt{ValueError}: If the moving average of one of the variables is already being computed. \end{itemize} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.train.ExponentialMovingAverage.average\_name(var)} }{tf.train.ExponentialMovingAverage.average\_name(var) }}\label{tf.train.exponentialmovingaverage.averageux5fnamevar} Returns the name of the \texttt{Variable} holding the average for \texttt{var}. The typical scenario for \texttt{ExponentialMovingAverage} is to compute moving averages of variables during training, and restore the variables from the computed moving averages during evaluations. To restore variables, you have to know the name of the shadow variables. That name and the original variable can then be passed to a \texttt{Saver()} object to restore the variable from the moving average value with: \texttt{saver\ =\ tf.train.Saver(\{ema.average\_name(var):\ var\})} \texttt{average\_name()} can be called whether or not \texttt{apply()} has been called. \subparagraph{Args: }\label{args-21} \begin{itemize} \tightlist \item \texttt{var}: A \texttt{Variable} object. \end{itemize} \subparagraph{Returns: }\label{returns-14} A string: the name of the variable that will be used or was used by the \texttt{ExponentialMovingAverage\ class} to hold the moving average of \texttt{var}. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.train.ExponentialMovingAverage.average(var)} }{tf.train.ExponentialMovingAverage.average(var) }}\label{tf.train.exponentialmovingaverage.averagevar} Returns the \texttt{Variable} holding the average of \texttt{var}. 
\subparagraph{Args: }\label{args-22}
\begin{itemize}
\tightlist
\item \texttt{var}: A \texttt{Variable} object.
\end{itemize}
\subparagraph{Returns: }\label{returns-15}
A \texttt{Variable} object or \texttt{None} if the moving average of \texttt{var} is not maintained.
\subsection{Coordinator and QueueRunner }\label{coordinator-and-queuerunner}
See \href{../../how_tos/threading_and_queues/index.md}{Threading and Queues} for how to use threads and queues. For documentation on the Queue API, see \href{../../api_docs/python/io_ops.md\#queues}{Queues}.
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}
\subsubsection{\texorpdfstring{\texttt{class\ tf.train.Coordinator} }{class tf.train.Coordinator }}\label{class-tf.train.coordinator}
A coordinator for threads.

This class implements a simple mechanism to coordinate the termination of a set of threads.
\paragraph{Usage: }\label{usage-1}
\begin{Shaded}
\begin{Highlighting}[]
\CommentTok{# Create a coordinator.}
\NormalTok{coord }\OperatorTok{=} \NormalTok{Coordinator()}
\CommentTok{# Start a number of threads, passing the coordinator to each of them.}
\NormalTok{...start thread }\DecValTok{1}\NormalTok{...(coord, ...)}
\NormalTok{...start thread N...(coord, ...)}
\CommentTok{# Wait for all the threads to terminate.}
\NormalTok{coord.join(threads)}
\end{Highlighting}
\end{Shaded}
Any of the threads can call \texttt{coord.request\_stop()} to ask for all the threads to stop. To cooperate with the requests, each thread must check for \texttt{coord.should\_stop()} on a regular basis. \texttt{coord.should\_stop()} returns \texttt{True} as soon as \texttt{coord.request\_stop()} has been called.
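The cooperative-stop protocol described above can be imitated in plain Python for experimentation. The sketch below is not the TensorFlow implementation: \texttt{MiniCoordinator} and \texttt{worker} are hypothetical names, the class is a stand-in built on \texttt{threading.Event}, and it omits exception reporting and the grace period.

\begin{verbatim}
```python
import threading


class MiniCoordinator(object):
    """Hypothetical stand-in for the stop protocol; not tf.train.Coordinator."""

    def __init__(self):
        self._stop_event = threading.Event()

    def request_stop(self):
        # Any thread may call this to ask every thread to stop.
        self._stop_event.set()

    def should_stop(self):
        return self._stop_event.is_set()

    def join(self, threads):
        # Block until every thread has returned from its target function.
        for t in threads:
            t.join()


def worker(coord, counter, limit):
    # Check should_stop() on every iteration so a stop request is honoured.
    while not coord.should_stop():
        counter.append(1)
        if len(counter) >= limit:
            coord.request_stop()


coord = MiniCoordinator()
work_done = []
threads = [threading.Thread(target=worker, args=(coord, work_done, 100))
           for _ in range(4)]
for t in threads:
    t.start()
coord.join(threads)
```
\end{verbatim}

Because every worker polls \texttt{should\_stop()} once per iteration, a single \texttt{request\_stop()} call from any thread is enough to wind all of them down.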
A typical thread running with a Coordinator will do something like:
\begin{Shaded}
\begin{Highlighting}[]
\ControlFlowTok{while} \OperatorTok{not} \NormalTok{coord.should_stop():}
\NormalTok{   ...do some work...}
\end{Highlighting}
\end{Shaded}
\paragraph{Exception handling: }\label{exception-handling}
A thread can report an exception to the Coordinator as part of the \texttt{request\_stop()} call. The exception will be re-raised from the \texttt{coord.join()} call.

Thread code:
\begin{Shaded}
\begin{Highlighting}[]
\ControlFlowTok{try}\NormalTok{:}
  \ControlFlowTok{while} \OperatorTok{not} \NormalTok{coord.should_stop():}
\NormalTok{    ...do some work...}
\ControlFlowTok{except} \PreprocessorTok{Exception} \ControlFlowTok{as} \NormalTok{e:}
\NormalTok{  coord.request_stop(e)}
\end{Highlighting}
\end{Shaded}
Main code:
\begin{Shaded}
\begin{Highlighting}[]
\ControlFlowTok{try}\NormalTok{:}
\NormalTok{  ...}
\NormalTok{  coord }\OperatorTok{=} \NormalTok{Coordinator()}
  \CommentTok{# Start a number of threads, passing the coordinator to each of them.}
\NormalTok{  ...start thread }\DecValTok{1}\NormalTok{...(coord, ...)}
\NormalTok{  ...start thread N...(coord, ...)}
  \CommentTok{# Wait for all the threads to terminate.}
\NormalTok{  coord.join(threads)}
\ControlFlowTok{except} \PreprocessorTok{Exception} \ControlFlowTok{as} \NormalTok{e:}
\NormalTok{  ...exception that was passed to coord.request_stop()}
\end{Highlighting}
\end{Shaded}
\paragraph{Grace period for stopping: }\label{grace-period-for-stopping}
After a thread has called \texttt{coord.request\_stop()}, the other threads have a fixed time to stop. This is called the `stop grace period' and defaults to 2 minutes. If any of the threads is still alive after the grace period expires, \texttt{coord.join()} raises a \texttt{RuntimeError} reporting the laggards.
\begin{verbatim}
try:
  ...
  coord = Coordinator()
  # Start a number of threads, passing the coordinator to each of them.
  ...start thread 1...(coord, ...)
  ...start thread N...(coord, ...)
  # Wait for all the threads to terminate, give them 10s grace period
  coord.join(threads, stop_grace_period_secs=10)
except RuntimeError:
  ...one of the threads took more than 10s to stop after request_stop()
  ...was called.
except Exception:
  ...exception that was passed to coord.request_stop()
\end{verbatim}
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}
\paragraph{\texorpdfstring{\texttt{tf.train.Coordinator.\_\_init\_\_()} }{tf.train.Coordinator.\_\_init\_\_() }}\label{tf.train.coordinator.ux5fux5finitux5fux5f}
Create a new Coordinator.
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}
\paragraph{\texorpdfstring{\texttt{tf.train.Coordinator.join(threads,\ stop\_grace\_period\_secs=120)} }{tf.train.Coordinator.join(threads, stop\_grace\_period\_secs=120) }}\label{tf.train.coordinator.jointhreads-stopux5fgraceux5fperiodux5fsecs120}
Wait for threads to terminate.

Blocks until all `threads' have terminated or request\_stop() is called.

After the threads stop, if an `exc\_info' was passed to request\_stop(), that exception is re-raised.

Grace period handling: When request\_stop() is called, threads are given `stop\_grace\_period\_secs' seconds to terminate. If any of them is still alive after that period expires, a RuntimeError is raised. Note that if an `exc\_info' was passed to request\_stop() then it is raised instead of that RuntimeError.
\subparagraph{Args: }\label{args-23}
\begin{itemize}
\tightlist
\item \texttt{threads}: List of threading.Threads. The started threads to join.
\item \texttt{stop\_grace\_period\_secs}: Number of seconds given to threads to stop after request\_stop() has been called.
\end{itemize}
\subparagraph{Raises: }\label{raises-10}
\begin{itemize}
\tightlist
\item \texttt{RuntimeError}: If any thread is still alive after request\_stop() is called and the grace period expires.
\end{itemize}
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}
\paragraph{\texorpdfstring{\texttt{tf.train.Coordinator.request\_stop(ex=None)} }{tf.train.Coordinator.request\_stop(ex=None) }}\label{tf.train.coordinator.requestux5fstopexnone}
Request that the threads stop.

After this is called, calls to should\_stop() will return True.
\subparagraph{Args: }\label{args-24}
\begin{itemize}
\tightlist
\item \texttt{ex}: Optional Exception, or Python `exc\_info' tuple as returned by sys.exc\_info(). If this is the first call to request\_stop(), the corresponding exception is recorded and re-raised from join().
\end{itemize}
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}
\paragraph{\texorpdfstring{\texttt{tf.train.Coordinator.should\_stop()} }{tf.train.Coordinator.should\_stop() }}\label{tf.train.coordinator.shouldux5fstop}
Check if stop was requested.
\subparagraph{Returns: }\label{returns-16}
True if a stop was requested.
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}
\paragraph{\texorpdfstring{\texttt{tf.train.Coordinator.wait\_for\_stop(timeout=None)} }{tf.train.Coordinator.wait\_for\_stop(timeout=None) }}\label{tf.train.coordinator.waitux5fforux5fstoptimeoutnone}
Wait until the Coordinator is told to stop.
\subparagraph{Args: }\label{args-25}
\begin{itemize}
\tightlist
\item \texttt{timeout}: float. Sleep for up to that many seconds waiting for should\_stop() to become True.
\end{itemize}
\subparagraph{Returns: }\label{returns-17}
True if the Coordinator is told to stop, False if the timeout expired.
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}
\subsubsection{\texorpdfstring{\texttt{class\ tf.train.QueueRunner} }{class tf.train.QueueRunner }}\label{class-tf.train.queuerunner}
Holds a list of enqueue operations for a queue, each to be run in a thread.

Queues are a convenient TensorFlow mechanism to compute tensors asynchronously using multiple threads.
For example, in the canonical `Input Reader' setup one set of threads generates filenames in a queue; a second set of threads reads records from the files, processes them, and enqueues tensors on a second queue; a third set of threads dequeues these input records to construct batches and runs them through training operations.

There are several delicate issues when running multiple threads that way: closing the queues in sequence as the input is exhausted, correctly catching and reporting exceptions, etc.

The \texttt{QueueRunner}, combined with the \texttt{Coordinator}, helps handle these issues.
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}
\paragraph{\texorpdfstring{\texttt{tf.train.QueueRunner.\_\_init\_\_(queue,\ enqueue\_ops)} }{tf.train.QueueRunner.\_\_init\_\_(queue, enqueue\_ops) }}\label{tf.train.queuerunner.ux5fux5finitux5fux5fqueue-enqueueux5fops}
Create a QueueRunner.

On construction the \texttt{QueueRunner} adds an op to close the queue. That op will be run if the enqueue ops raise exceptions.

When you later call the \texttt{create\_threads()} method, the \texttt{QueueRunner} will create one thread for each op in \texttt{enqueue\_ops}. Each thread will run its enqueue op in parallel with the other threads. The enqueue ops do not all have to be the same op, but it is expected that they all enqueue tensors in \texttt{queue}.
\subparagraph{Args: }\label{args-26}
\begin{itemize}
\tightlist
\item \texttt{queue}: A \texttt{Queue}.
\item \texttt{enqueue\_ops}: List of enqueue ops to run in threads later.
\end{itemize}
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}
\paragraph{\texorpdfstring{\texttt{tf.train.QueueRunner.create\_threads(sess,\ coord=None,\ daemon=False,\ start=False)} }{tf.train.QueueRunner.create\_threads(sess, coord=None, daemon=False, start=False) }}\label{tf.train.queuerunner.createux5fthreadssess-coordnone-daemonfalse-startfalse}
Create threads to run the enqueue ops.

This method requires a session in which the graph was launched.
It creates a list of threads, optionally starting them. There is one thread for each op passed in \texttt{enqueue\_ops}. The \texttt{coord} argument is an optional coordinator, that the threads will use to terminate together and report exceptions. If a coordinator is given, this method starts an additional thread to close the queue when the coordinator requests a stop. This method may be called again as long as all threads from a previous call have stopped. \subparagraph{Args: }\label{args-27} \begin{itemize} \tightlist \item \texttt{sess}: A \texttt{Session}. \item \texttt{coord}: Optional \texttt{Coordinator} object for reporting errors and checking stop conditions. \item \texttt{daemon}: Boolean. If \texttt{True} make the threads daemon threads. \item \texttt{start}: Boolean. If \texttt{True} starts the threads. If \texttt{False} the caller must call the \texttt{start()} method of the returned threads. \end{itemize} \subparagraph{Returns: }\label{returns-18} A list of threads. \subparagraph{Raises: }\label{raises-11} \begin{itemize} \tightlist \item \texttt{RuntimeError}: If threads from a previous call to \texttt{create\_threads()} are still running. \end{itemize} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.train.QueueRunner.exceptions\_raised} }{tf.train.QueueRunner.exceptions\_raised }}\label{tf.train.queuerunner.exceptionsux5fraised} Exceptions raised but not handled by the \texttt{QueueRunner} threads. Exceptions raised in queue runner threads are handled in one of two ways depending on whether or not a \texttt{Coordinator} was passed to \texttt{create\_threads()}: \begin{itemize} \tightlist \item With a \texttt{Coordinator}, exceptions are reported to the coordinator and forgotten by the \texttt{QueueRunner}. \item Without a \texttt{Coordinator}, exceptions are captured by the \texttt{QueueRunner} and made available in this \texttt{exceptions\_raised} property. 
\end{itemize} \subparagraph{Returns: }\label{returns-19} A list of Python \texttt{Exception} objects. The list is empty if no exception was captured. (No exceptions are captured when using a Coordinator.) \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.train.add\_queue\_runner(qr,\ collection=\textquotesingle{}queue\_runners\textquotesingle{})} }{tf.train.add\_queue\_runner(qr, collection='queue\_runners') }}\label{tf.train.addux5fqueueux5frunnerqr-collectionqueueux5frunners} Adds a \texttt{QueueRunner} to a collection in the graph. When building a complex model that uses many queues it is often difficult to gather all the queue runners that need to be run. This convenience function allows you to add a queue runner to a well known collection in the graph. The companion method \texttt{start\_queue\_runners()} can be used to start threads for all the collected queue runners. \subparagraph{Args: }\label{args-28} \begin{itemize} \tightlist \item \texttt{qr}: A \texttt{QueueRunner}. \item \texttt{collection}: A \texttt{GraphKey} specifying the graph collection to add the queue runner to. Defaults to \texttt{GraphKeys.QUEUE\_RUNNERS}. \end{itemize} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.train.start\_queue\_runners(sess=None,\ coord=None,\ daemon=True,\ start=True,\ collection=\textquotesingle{}queue\_runners\textquotesingle{})} }{tf.train.start\_queue\_runners(sess=None, coord=None, daemon=True, start=True, collection='queue\_runners') }}\label{tf.train.startux5fqueueux5frunnerssessnone-coordnone-daemontrue-starttrue-collectionqueueux5frunners} Starts all queue runners collected in the graph. This is a companion method to \texttt{add\_queue\_runner()}. It just starts threads for all queue runners collected in the graph. It returns the list of all threads. 
\subparagraph{Args: }\label{args-29} \begin{itemize} \tightlist \item \texttt{sess}: \texttt{Session} used to run the queue ops. Defaults to the default session. \item \texttt{coord}: Optional \texttt{Coordinator} for coordinating the started threads. \item \texttt{daemon}: Whether the threads should be marked as \texttt{daemons}, meaning they don't block program exit. \item \texttt{start}: Set to \texttt{False} to only create the threads, not start them. \item \texttt{collection}: A \texttt{GraphKey} specifying the graph collection to get the queue runners from. Defaults to \texttt{GraphKeys.QUEUE\_RUNNERS}. \end{itemize} \subparagraph{Returns: }\label{returns-20} A list of threads. \subsection{Summary Operations }\label{summary-operations} The following ops output \href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/summary.proto}{\texttt{Summary}} protocol buffers as serialized string tensors. You can fetch the output of a summary op in a session, and pass it to a \href{../../api_docs/python/train.md\#SummaryWriter}{SummaryWriter} to append it to an event file. Event files contain \href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/util/event.proto}{\texttt{Event}} protos that can contain \texttt{Summary} protos along with the timestamp and step. You can then use TensorBoard to visualize the contents of the event files. See \href{../../how_tos/summaries_and_tensorboard/index.md}{TensorBoard and Summaries} for more details. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.scalar\_summary(tags,\ values,\ collections=None,\ name=None)} }{tf.scalar\_summary(tags, values, collections=None, name=None) }}\label{tf.scalarux5fsummarytags-values-collectionsnone-namenone} Outputs a \texttt{Summary} protocol buffer with scalar values. The input \texttt{tags} and \texttt{values} must have the same shape. 
The generated summary has a summary value for each tag-value pair in \texttt{tags} and \texttt{values}. \subparagraph{Args: }\label{args-30} \begin{itemize} \tightlist \item \texttt{tags}: A 1-D \texttt{string} \texttt{Tensor}. Tags for the summaries. \item \texttt{values}: A 1-D \texttt{float32} or \texttt{float64} Tensor. Values for the summaries. \item \texttt{collections}: Optional list of graph collections keys. The new summary op is added to these collections. Defaults to \texttt{{[}GraphKeys.SUMMARIES{]}}. \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-21} A scalar \texttt{Tensor} of type \texttt{string}. The serialized \texttt{Summary} protocol buffer. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.image\_summary(tag,\ tensor,\ max\_images=None,\ collections=None,\ name=None)} }{tf.image\_summary(tag, tensor, max\_images=None, collections=None, name=None) }}\label{tf.imageux5fsummarytag-tensor-maxux5fimagesnone-collectionsnone-namenone} Outputs a \texttt{Summary} protocol buffer with images. The summary has up to \texttt{max\_images} summary values containing images. The images are built from \texttt{tensor} which must be 4-D with shape \texttt{{[}batch\_size,\ height,\ width,\ channels{]}} and where \texttt{channels} can be: \begin{itemize} \tightlist \item 1: \texttt{tensor} is interpreted as Grayscale. \item 3: \texttt{tensor} is interpreted as RGB. \item 4: \texttt{tensor} is interpreted as RGBA. \end{itemize} The images have the same number of channels as the input tensor. Their values are normalized, one image at a time, to fit in the range \texttt{{[}0,\ 255{]}}. The op uses two different normalization algorithms: \begin{itemize} \item If the input values are all positive, they are rescaled so the largest one is 255. \item If any input value is negative, the values are shifted so input value 0.0 is at 127. 
They are then rescaled so that either the smallest value is 0, or the largest one is 255. \end{itemize} The \texttt{tag} argument is a scalar \texttt{Tensor} of type \texttt{string}. It is used to build the \texttt{tag} of the summary values: \begin{itemize} \tightlist \item If \texttt{max\_images} is 1, the summary value tag is `\emph{tag}/image'. \item If \texttt{max\_images} is greater than 1, the summary value tags are generated sequentially as `\emph{tag}/image/0', `\emph{tag}/image/1', etc. \end{itemize} \subparagraph{Args: }\label{args-31} \begin{itemize} \tightlist \item \texttt{tag}: A scalar \texttt{Tensor} of type \texttt{string}. Used to build the \texttt{tag} of the summary values. \item \texttt{tensor}: A 4-D \texttt{float32} \texttt{Tensor} of shape \texttt{{[}batch\_size,\ height,\ width,\ \ \ \ channels{]}} where \texttt{channels} is 1, 3, or 4. \item \texttt{max\_images}: Max number of batch elements to generate images for. \item \texttt{collections}: Optional list of ops.GraphKeys. The collections to add the summary to. Defaults to {[}ops.GraphKeys.SUMMARIES{]} \item \texttt{name}: A name for the operation (optional). \end{itemize} \subparagraph{Returns: }\label{returns-22} A scalar \texttt{Tensor} of type \texttt{string}. The serialized \texttt{Summary} protocol buffer. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \subsubsection{\texorpdfstring{\texttt{tf.histogram\_summary(tag,\ values,\ collections=None,\ name=None)} }{tf.histogram\_summary(tag, values, collections=None, name=None) }}\label{tf.histogramux5fsummarytag-values-collectionsnone-namenone} Outputs a \texttt{Summary} protocol buffer with a histogram. The generated \href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/summary.proto}{\texttt{Summary}} has one summary value containing a histogram for \texttt{values}. This op reports an \texttt{OutOfRange} error if any value is not finite. 
\subparagraph{Args: }\label{args-32}
\begin{itemize}
\tightlist
\item \texttt{tag}: A \texttt{string} \texttt{Tensor}. 0-D. Tag to use for the summary value.
\item \texttt{values}: A \texttt{float32} \texttt{Tensor}. Any shape. Values to use to build the histogram.
\item \texttt{collections}: Optional list of graph collections keys. The new summary op is added to these collections. Defaults to \texttt{{[}GraphKeys.SUMMARIES{]}}.
\item \texttt{name}: A name for the operation (optional).
\end{itemize}
\subparagraph{Returns: }\label{returns-23}
A scalar \texttt{Tensor} of type \texttt{string}. The serialized \texttt{Summary} protocol buffer.
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}
\subsubsection{\texorpdfstring{\texttt{tf.nn.zero\_fraction(value,\ name=None)} }{tf.nn.zero\_fraction(value, name=None) }}\label{tf.nn.zeroux5ffractionvalue-namenone}
Returns the fraction of zeros in \texttt{value}.

If \texttt{value} is empty, the result is \texttt{nan}.

This is useful in summaries to measure and report sparsity. For example:
\begin{verbatim}
z = tf.nn.relu(...)
summ = tf.scalar_summary('sparsity', tf.nn.zero_fraction(z))
\end{verbatim}
\subparagraph{Args: }\label{args-33}
\begin{itemize}
\tightlist
\item \texttt{value}: A tensor of numeric type.
\item \texttt{name}: A name for the operation (optional).
\end{itemize}
\subparagraph{Returns: }\label{returns-24}
The fraction of zeros in \texttt{value}, with type \texttt{float32}.
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}
\subsubsection{\texorpdfstring{\texttt{tf.merge\_summary(inputs,\ collections=None,\ name=None)} }{tf.merge\_summary(inputs, collections=None, name=None) }}\label{tf.mergeux5fsummaryinputs-collectionsnone-namenone}
Merges summaries.

This op creates a \href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/summary.proto}{\texttt{Summary}} protocol buffer that contains the union of all the values in the input summaries.
When the Op is run, it reports an \texttt{InvalidArgument} error if multiple values in the summaries to merge use the same tag.

\subparagraph{Args: }\label{args-34}

\begin{itemize}
\tightlist
\item \texttt{inputs}: A list of \texttt{string} \texttt{Tensor} objects containing serialized \texttt{Summary} protocol buffers.
\item \texttt{collections}: Optional list of graph collection keys. The new summary op is added to these collections. Defaults to \texttt{{[}GraphKeys.SUMMARIES{]}}.
\item \texttt{name}: A name for the operation (optional).
\end{itemize}

\subparagraph{Returns: }\label{returns-25}

A scalar \texttt{Tensor} of type \texttt{string}. The serialized \texttt{Summary} protocol buffer resulting from the merging.

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\subsubsection{\texorpdfstring{\texttt{tf.merge\_all\_summaries(key=\textquotesingle{}summaries\textquotesingle{})} }{tf.merge\_all\_summaries(key='summaries') }}\label{tf.mergeux5fallux5fsummarieskeysummaries}

Merges all summaries collected in the default graph.

\subparagraph{Args: }\label{args-35}

\begin{itemize}
\tightlist
\item \texttt{key}: \texttt{GraphKey} used to collect the summaries. Defaults to \texttt{GraphKeys.SUMMARIES}.
\end{itemize}

\subparagraph{Returns: }\label{returns-26}

If no summaries were collected, returns \texttt{None}. Otherwise returns a scalar \texttt{Tensor} of type \texttt{string} containing the serialized \texttt{Summary} protocol buffer resulting from the merging.

\subsection{Adding Summaries to Event Files }\label{adding-summaries-to-event-files}

See \href{../../how_tos/summaries_and_tensorboard/index.md}{Summaries and TensorBoard} for an overview of summaries, event files, and visualization in TensorBoard.

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\subsubsection{\texorpdfstring{\texttt{class\ tf.train.SummaryWriter} }{class tf.train.SummaryWriter }}\label{class-tf.train.summarywriter}

Writes \texttt{Summary} protocol buffers to event files.
The \texttt{SummaryWriter} class provides a mechanism to create an event file in a given directory and add summaries and events to it. The class updates the file contents asynchronously. This allows a training program to call methods to add data to the file directly from the training loop, without slowing down training. \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.train.SummaryWriter.\_\_init\_\_(logdir,\ graph\_def=None,\ max\_queue=10,\ flush\_secs=120)} }{tf.train.SummaryWriter.\_\_init\_\_(logdir, graph\_def=None, max\_queue=10, flush\_secs=120) }}\label{tf.train.summarywriter.ux5fux5finitux5fux5flogdir-graphux5fdefnone-maxux5fqueue10-flushux5fsecs120} Creates a \texttt{SummaryWriter} and an event file. On construction the summary writer creates a new event file in \texttt{logdir}. This event file will contain \texttt{Event} protocol buffers constructed when you call one of the following functions: \texttt{add\_summary()}, \texttt{add\_event()}, or \texttt{add\_graph()}. If you pass a \texttt{graph\_def} protocol buffer to the constructor it is added to the event file. (This is equivalent to calling \texttt{add\_graph()} later). TensorBoard will pick the graph from the file and display it graphically so you can interactively explore the graph you built. 
You will usually pass the graph from the session in which you launched it:

\begin{Shaded}
\begin{Highlighting}[]
\NormalTok{...create a graph...}
\CommentTok{# Launch the graph in a session.}
\NormalTok{sess }\OperatorTok{=} \NormalTok{tf.Session()}
\CommentTok{# Create a summary writer, add the 'graph_def' to the event file.}
\NormalTok{writer }\OperatorTok{=} \NormalTok{tf.train.SummaryWriter(}\OperatorTok{<}\NormalTok{some}\OperatorTok{-}\NormalTok{directory}\OperatorTok{>}\NormalTok{, sess.graph_def)}
\end{Highlighting}
\end{Shaded}

The other arguments to the constructor control the asynchronous writes to the event file:

\begin{itemize}
\tightlist
\item \texttt{flush\_secs}: How often, in seconds, to flush the added summaries and events to disk.
\item \texttt{max\_queue}: Maximum number of summaries or events pending to be written to disk before one of the `add' calls blocks.
\end{itemize}

\subparagraph{Args: }\label{args-36}

\begin{itemize}
\tightlist
\item \texttt{logdir}: A string. Directory where the event file will be written.
\item \texttt{graph\_def}: A \texttt{GraphDef} protocol buffer.
\item \texttt{max\_queue}: Integer. Size of the queue for pending events and summaries.
\item \texttt{flush\_secs}: Number. How often, in seconds, to flush the pending events and summaries to disk.
\end{itemize}

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\paragraph{\texorpdfstring{\texttt{tf.train.SummaryWriter.add\_summary(summary,\ global\_step=None)} }{tf.train.SummaryWriter.add\_summary(summary, global\_step=None) }}\label{tf.train.summarywriter.addux5fsummarysummary-globalux5fstepnone}

Adds a \texttt{Summary} protocol buffer to the event file.

This method wraps the provided summary in an \texttt{Event} protocol buffer and adds it to the event file.

You can pass the output of any summary op, as-is, to this function. You can also pass a \texttt{Summary} protocol buffer that you manufacture with your own data.
This is commonly done to report evaluation results in event files. \subparagraph{Args: }\label{args-37} \begin{itemize} \tightlist \item \texttt{summary}: A \texttt{Summary} protocol buffer, optionally serialized as a string. \item \texttt{global\_step}: Number. Optional global step value to record with the summary. \end{itemize} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.train.SummaryWriter.add\_event(event)} }{tf.train.SummaryWriter.add\_event(event) }}\label{tf.train.summarywriter.addux5feventevent} Adds an event to the event file. \subparagraph{Args: }\label{args-38} \begin{itemize} \tightlist \item \texttt{event}: An \texttt{Event} protocol buffer. \end{itemize} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.train.SummaryWriter.add\_graph(graph\_def,\ global\_step=None)} }{tf.train.SummaryWriter.add\_graph(graph\_def, global\_step=None) }}\label{tf.train.summarywriter.addux5fgraphgraphux5fdef-globalux5fstepnone} Adds a \texttt{GraphDef} protocol buffer to the event file. The graph described by the protocol buffer will be displayed by TensorBoard. Most users pass a graph in the constructor instead. \subparagraph{Args: }\label{args-39} \begin{itemize} \tightlist \item \texttt{graph\_def}: A \texttt{GraphDef} protocol buffer. \item \texttt{global\_step}: Number. Optional global step counter to record with the graph. \end{itemize} \begin{center}\rule{0.5\linewidth}{\linethickness}\end{center} \paragraph{\texorpdfstring{\texttt{tf.train.SummaryWriter.flush()} }{tf.train.SummaryWriter.flush() }}\label{tf.train.summarywriter.flush} Flushes the event file to disk. Call this method to make sure that all pending events have been written to disk. 
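The asynchronous behaviour described for \texttt{SummaryWriter} (a bounded queue of pending events, periodic flushes, and an explicit \texttt{flush()}) can be sketched in plain Python. The class \texttt{AsyncEventWriter} below is a hypothetical toy, not TensorFlow's implementation: it assumes events are strings and that the sink is any file-like object, and it only borrows the parameter names \texttt{max\_queue} and \texttt{flush\_secs} from the constructor documented above.

```python
import io
import queue
import threading
import time


class AsyncEventWriter:
    """Toy asynchronous writer (illustrative only, not TensorFlow code)."""

    def __init__(self, sink, max_queue=10, flush_secs=120):
        self._sink = sink                     # file-like: write(), flush(), close()
        self._queue = queue.Queue(max_queue)  # add_event() blocks when it is full
        self._flush_secs = flush_secs
        self._last_flush = time.time()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def add_event(self, event):
        # Returns as soon as the event is queued; the worker does the I/O.
        self._queue.put(event)

    def _run(self):
        while True:
            event = self._queue.get()
            try:
                if event is None:             # sentinel sent by close()
                    return
                self._sink.write(event)
                if time.time() - self._last_flush >= self._flush_secs:
                    self._sink.flush()
                    self._last_flush = time.time()
            finally:
                self._queue.task_done()

    def flush(self):
        # Wait until every queued event has been written, then flush the sink.
        self._queue.join()
        self._sink.flush()

    def close(self):
        self.flush()
        self._queue.put(None)
        self._worker.join()
        self._sink.close()


sink = io.StringIO()
writer = AsyncEventWriter(sink, max_queue=2, flush_secs=0.0)
for step in range(5):
    writer.add_event("step %d\n" % step)
writer.flush()
written = sink.getvalue()  # read before close(), which discards the buffer
writer.close()
```

With \texttt{max\_queue=2}, a call to \texttt{add\_event()} blocks whenever two events are already pending, which is the back-pressure that the \texttt{max\_queue} argument describes.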
\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\paragraph{\texorpdfstring{\texttt{tf.train.SummaryWriter.close()} }{tf.train.SummaryWriter.close() }}\label{tf.train.summarywriter.close}

Flushes the event file to disk and closes the file.

Call this method when you do not need the summary writer anymore.

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\subsubsection{\texorpdfstring{\texttt{tf.train.summary\_iterator(path)} }{tf.train.summary\_iterator(path) }}\label{tf.train.summaryux5fiteratorpath}

An iterator for reading \texttt{Event} protocol buffers from an event file.

You can use this function to read events written to an event file. It returns a Python iterator that yields \texttt{Event} protocol buffers.

Example: Print the contents of an events file.

\begin{Shaded}
\begin{Highlighting}[]
\CommentTok{# path_to_events_file is a placeholder for the path to your events file.}
\ControlFlowTok{for} \NormalTok{e }\OperatorTok{in} \NormalTok{tf.train.summary_iterator(path_to_events_file):}
    \BuiltInTok{print}\NormalTok{(e)}
\end{Highlighting}
\end{Shaded}

Example: Print selected summary values.

\begin{Shaded}
\begin{Highlighting}[]
\CommentTok{# This example supposes that the events file contains summaries with a}
\CommentTok{# summary value tag 'loss'. These could have been added by calling}
\CommentTok{# `add_summary()`, passing the output of a scalar summary op created}
\CommentTok{# with: `tf.scalar_summary(['loss'], loss_tensor)`.}
\ControlFlowTok{for} \NormalTok{e }\OperatorTok{in} \NormalTok{tf.train.summary_iterator(path_to_events_file):}
    \ControlFlowTok{for} \NormalTok{v }\OperatorTok{in} \NormalTok{e.summary.value:}
        \ControlFlowTok{if} \NormalTok{v.tag }\OperatorTok{==} \StringTok{'loss'}\NormalTok{:}
            \BuiltInTok{print}\NormalTok{(v.simple_value)}
\end{Highlighting}
\end{Shaded}

See the protocol buffer definitions of \href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/util/event.proto}{Event} and \href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/summary.proto}{Summary} for more information about their attributes.

\subparagraph{Args: }\label{args-40}

\begin{itemize}
\tightlist
\item \texttt{path}: The path to an event file created by a \texttt{SummaryWriter}.
\end{itemize}

\subparagraph{Yields: }\label{yields}

\texttt{Event} protocol buffers.

\subsection{Training utilities }\label{training-utilities}

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\subsubsection{\texorpdfstring{\texttt{tf.train.global\_step(sess,\ global\_step\_tensor)} }{tf.train.global\_step(sess, global\_step\_tensor) }}\label{tf.train.globalux5fstepsess-globalux5fstepux5ftensor}

Small helper to get the global step.
\begin{Shaded}
\begin{Highlighting}[]
\CommentTok{# Creates a variable to hold the global_step.}
\NormalTok{global_step_tensor }\OperatorTok{=} \NormalTok{tf.Variable(}\DecValTok{10}\NormalTok{, trainable}\OperatorTok{=}\VariableTok{False}\NormalTok{, name}\OperatorTok{=}\StringTok{'global_step'}\NormalTok{)}
\CommentTok{# Creates a session.}
\NormalTok{sess }\OperatorTok{=} \NormalTok{tf.Session()}
\CommentTok{# Initializes the variable.}
\NormalTok{sess.run(global_step_tensor.initializer)}
\BuiltInTok{print}\NormalTok{(}\StringTok{'global_step: }\SpecialCharTok{%s}\StringTok{'} \OperatorTok{%} \NormalTok{tf.train.global_step(sess, global_step_tensor))}
\CommentTok{# Output: global_step: 10}
\end{Highlighting}
\end{Shaded}

\subparagraph{Args: }\label{args-41}

\begin{itemize}
\tightlist
\item \texttt{sess}: A TensorFlow \texttt{Session} object.
\item \texttt{global\_step\_tensor}: \texttt{Tensor} or the \texttt{name} of the operation that contains the global step.
\end{itemize}

\subparagraph{Returns: }\label{returns-27}

The global step value.

\begin{center}\rule{0.5\linewidth}{\linethickness}\end{center}

\subsubsection{\texorpdfstring{\texttt{tf.train.write\_graph(graph\_def,\ logdir,\ name,\ as\_text=True)} }{tf.train.write\_graph(graph\_def, logdir, name, as\_text=True) }}\label{tf.train.writeux5fgraphgraphux5fdef-logdir-name-asux5ftexttrue}

Writes a graph proto to disk.

The graph is written as a binary proto unless \texttt{as\_text} is \texttt{True}.

\begin{Shaded}
\begin{Highlighting}[]
\NormalTok{v }\OperatorTok{=} \NormalTok{tf.Variable(}\DecValTok{0}\NormalTok{, name}\OperatorTok{=}\StringTok{'my_variable'}\NormalTok{)}
\NormalTok{sess }\OperatorTok{=} \NormalTok{tf.Session()}
\NormalTok{tf.train.write_graph(sess.graph_def, }\StringTok{'/tmp/my-model'}\NormalTok{, }\StringTok{'train.pbtxt'}\NormalTok{)}
\end{Highlighting}
\end{Shaded}

\subparagraph{Args: }\label{args-42}

\begin{itemize}
\tightlist
\item \texttt{graph\_def}: A \texttt{GraphDef} protocol buffer.
\item \texttt{logdir}: Directory where to write the graph.
\item \texttt{name}: Filename for the graph. \item \texttt{as\_text}: If \texttt{True}, writes the graph as an ASCII proto. \end{itemize} ================================================ FILE: tex_pdf/api/python/index.tex ================================================ \section{TensorFlow Python reference documentation }\label{tensorflow-python-reference-documentation} \begin{itemize} \item \textbf{\href{../../api_docs/python/framework.md}{Building Graphs}}: \item \href{../../api_docs/python/framework.md\#add_to_collection}{\texttt{add\_to\_collection}} \item \href{../../api_docs/python/framework.md\#as_dtype}{\texttt{as\_dtype}} \item \href{../../api_docs/python/framework.md\#control_dependencies}{\texttt{control\_dependencies}} \item \href{../../api_docs/python/framework.md\#convert_to_tensor}{\texttt{convert\_to\_tensor}} \item \href{../../api_docs/python/framework.md\#device}{\texttt{device}} \item \href{../../api_docs/python/framework.md\#Dimension}{\texttt{Dimension}} \item \href{../../api_docs/python/framework.md\#DType}{\texttt{DType}} \item \href{../../api_docs/python/framework.md\#get_collection}{\texttt{get\_collection}} \item \href{../../api_docs/python/framework.md\#get_default_graph}{\texttt{get\_default\_graph}} \item \href{../../api_docs/python/framework.md\#get_seed}{\texttt{get\_seed}} \item \href{../../api_docs/python/framework.md\#Graph}{\texttt{Graph}} \item \href{../../api_docs/python/framework.md\#GraphKeys}{\texttt{GraphKeys}} \item \href{../../api_docs/python/framework.md\#import_graph_def}{\texttt{import\_graph\_def}} \item \href{../../api_docs/python/framework.md\#name_scope}{\texttt{name\_scope}} \item \href{../../api_docs/python/framework.md\#NoGradient}{\texttt{NoGradient}} \item \href{../../api_docs/python/framework.md\#op_scope}{\texttt{op\_scope}} \item \href{../../api_docs/python/framework.md\#Operation}{\texttt{Operation}} \item \href{../../api_docs/python/framework.md\#RegisterGradient}{\texttt{RegisterGradient}} \item 
\href{../../api_docs/python/framework.md\#RegisterShape}{\texttt{RegisterShape}} \item \href{../../api_docs/python/framework.md\#Tensor}{\texttt{Tensor}} \item \href{../../api_docs/python/framework.md\#TensorShape}{\texttt{TensorShape}} \item \textbf{\href{../../api_docs/python/constant_op.md}{Constants, Sequences, and Random Values}}: \item \href{../../api_docs/python/constant_op.md\#constant}{\texttt{constant}} \item \href{../../api_docs/python/constant_op.md\#fill}{\texttt{fill}} \item \href{../../api_docs/python/constant_op.md\#linspace}{\texttt{linspace}} \item \href{../../api_docs/python/constant_op.md\#ones}{\texttt{ones}} \item \href{../../api_docs/python/constant_op.md\#ones_like}{\texttt{ones\_like}} \item \href{../../api_docs/python/constant_op.md\#random_normal}{\texttt{random\_normal}} \item \href{../../api_docs/python/constant_op.md\#random_shuffle}{\texttt{random\_shuffle}} \item \href{../../api_docs/python/constant_op.md\#random_uniform}{\texttt{random\_uniform}} \item \href{../../api_docs/python/constant_op.md\#range}{\texttt{range}} \item \href{../../api_docs/python/constant_op.md\#set_random_seed}{\texttt{set\_random\_seed}} \item \href{../../api_docs/python/constant_op.md\#truncated_normal}{\texttt{truncated\_normal}} \item \href{../../api_docs/python/constant_op.md\#zeros}{\texttt{zeros}} \item \href{../../api_docs/python/constant_op.md\#zeros_like}{\texttt{zeros\_like}} \item \textbf{\href{../../api_docs/python/state_ops.md}{Variables}}: \item \href{../../api_docs/python/state_ops.md\#all_variables}{\texttt{all\_variables}} \item \href{../../api_docs/python/state_ops.md\#assert_variables_initialized}{\texttt{assert\_variables\_initialized}} \item \href{../../api_docs/python/state_ops.md\#assign}{\texttt{assign}} \item \href{../../api_docs/python/state_ops.md\#assign_add}{\texttt{assign\_add}} \item \href{../../api_docs/python/state_ops.md\#assign_sub}{\texttt{assign\_sub}} \item 
\href{../../api_docs/python/state_ops.md\#constant_initializer}{\texttt{constant\_initializer}} \item \href{../../api_docs/python/state_ops.md\#count_up_to}{\texttt{count\_up\_to}} \item \href{../../api_docs/python/state_ops.md\#device}{\texttt{device}} \item \href{../../api_docs/python/state_ops.md\#get_checkpoint_state}{\texttt{get\_checkpoint\_state}} \item \href{../../api_docs/python/state_ops.md\#get_variable}{\texttt{get\_variable}} \item \href{../../api_docs/python/state_ops.md\#get_variable_scope}{\texttt{get\_variable\_scope}} \item \href{../../api_docs/python/state_ops.md\#IndexedSlices}{\texttt{IndexedSlices}} \item \href{../../api_docs/python/state_ops.md\#initialize_all_variables}{\texttt{initialize\_all\_variables}} \item \href{../../api_docs/python/state_ops.md\#initialize_variables}{\texttt{initialize\_variables}} \item \href{../../api_docs/python/state_ops.md\#latest_checkpoint}{\texttt{latest\_checkpoint}} \item \href{../../api_docs/python/state_ops.md\#random_normal_initializer}{\texttt{random\_normal\_initializer}} \item \href{../../api_docs/python/state_ops.md\#random_uniform_initializer}{\texttt{random\_uniform\_initializer}} \item \href{../../api_docs/python/state_ops.md\#Saver}{\texttt{Saver}} \item \href{../../api_docs/python/state_ops.md\#scatter_add}{\texttt{scatter\_add}} \item \href{../../api_docs/python/state_ops.md\#scatter_sub}{\texttt{scatter\_sub}} \item \href{../../api_docs/python/state_ops.md\#scatter_update}{\texttt{scatter\_update}} \item \href{../../api_docs/python/state_ops.md\#sparse_mask}{\texttt{sparse\_mask}} \item \href{../../api_docs/python/state_ops.md\#trainable_variables}{\texttt{trainable\_variables}} \item \href{../../api_docs/python/state_ops.md\#truncated_normal_initializer}{\texttt{truncated\_normal\_initializer}} \item \href{../../api_docs/python/state_ops.md\#uniform_unit_scaling_initializer}{\texttt{uniform\_unit\_scaling\_initializer}} \item 
\href{../../api_docs/python/state_ops.md\#update_checkpoint_state}{\texttt{update\_checkpoint\_state}} \item \href{../../api_docs/python/state_ops.md\#Variable}{\texttt{Variable}} \item \href{../../api_docs/python/state_ops.md\#variable_scope}{\texttt{variable\_scope}} \item \href{../../api_docs/python/state_ops.md\#zeros_initializer}{\texttt{zeros\_initializer}} \item \textbf{\href{../../api_docs/python/array_ops.md}{Tensor Transformations}}: \item \href{../../api_docs/python/array_ops.md\#cast}{\texttt{cast}} \item \href{../../api_docs/python/array_ops.md\#concat}{\texttt{concat}} \item \href{../../api_docs/python/array_ops.md\#dynamic_partition}{\texttt{dynamic\_partition}} \item \href{../../api_docs/python/array_ops.md\#dynamic_stitch}{\texttt{dynamic\_stitch}} \item \href{../../api_docs/python/array_ops.md\#expand_dims}{\texttt{expand\_dims}} \item \href{../../api_docs/python/array_ops.md\#gather}{\texttt{gather}} \item \href{../../api_docs/python/array_ops.md\#pack}{\texttt{pack}} \item \href{../../api_docs/python/array_ops.md\#pad}{\texttt{pad}} \item \href{../../api_docs/python/array_ops.md\#rank}{\texttt{rank}} \item \href{../../api_docs/python/array_ops.md\#reshape}{\texttt{reshape}} \item \href{../../api_docs/python/array_ops.md\#reverse}{\texttt{reverse}} \item \href{../../api_docs/python/array_ops.md\#reverse_sequence}{\texttt{reverse\_sequence}} \item \href{../../api_docs/python/array_ops.md\#shape}{\texttt{shape}} \item \href{../../api_docs/python/array_ops.md\#size}{\texttt{size}} \item \href{../../api_docs/python/array_ops.md\#slice}{\texttt{slice}} \item \href{../../api_docs/python/array_ops.md\#split}{\texttt{split}} \item \href{../../api_docs/python/array_ops.md\#squeeze}{\texttt{squeeze}} \item \href{../../api_docs/python/array_ops.md\#string_to_number}{\texttt{string\_to\_number}} \item \href{../../api_docs/python/array_ops.md\#tile}{\texttt{tile}} \item \href{../../api_docs/python/array_ops.md\#to_bfloat16}{\texttt{to\_bfloat16}} \item 
\href{../../api_docs/python/array_ops.md\#to_double}{\texttt{to\_double}} \item \href{../../api_docs/python/array_ops.md\#to_float}{\texttt{to\_float}} \item \href{../../api_docs/python/array_ops.md\#to_int32}{\texttt{to\_int32}} \item \href{../../api_docs/python/array_ops.md\#to_int64}{\texttt{to\_int64}} \item \href{../../api_docs/python/array_ops.md\#transpose}{\texttt{transpose}} \item \href{../../api_docs/python/array_ops.md\#unpack}{\texttt{unpack}} \item \textbf{\href{../../api_docs/python/math_ops.md}{Math}}: \item \href{../../api_docs/python/math_ops.md\#abs}{\texttt{abs}} \item \href{../../api_docs/python/math_ops.md\#accumulate_n}{\texttt{accumulate\_n}} \item \href{../../api_docs/python/math_ops.md\#add}{\texttt{add}} \item \href{../../api_docs/python/math_ops.md\#add_n}{\texttt{add\_n}} \item \href{../../api_docs/python/math_ops.md\#argmax}{\texttt{argmax}} \item \href{../../api_docs/python/math_ops.md\#argmin}{\texttt{argmin}} \item \href{../../api_docs/python/math_ops.md\#batch_cholesky}{\texttt{batch\_cholesky}} \item \href{../../api_docs/python/math_ops.md\#batch_matmul}{\texttt{batch\_matmul}} \item \href{../../api_docs/python/math_ops.md\#batch_matrix_determinant}{\texttt{batch\_matrix\_determinant}} \item \href{../../api_docs/python/math_ops.md\#batch_matrix_inverse}{\texttt{batch\_matrix\_inverse}} \item \href{../../api_docs/python/math_ops.md\#ceil}{\texttt{ceil}} \item \href{../../api_docs/python/math_ops.md\#cholesky}{\texttt{cholesky}} \item \href{../../api_docs/python/math_ops.md\#complex}{\texttt{complex}} \item \href{../../api_docs/python/math_ops.md\#complex_abs}{\texttt{complex\_abs}} \item \href{../../api_docs/python/math_ops.md\#conj}{\texttt{conj}} \item \href{../../api_docs/python/math_ops.md\#cos}{\texttt{cos}} \item \href{../../api_docs/python/math_ops.md\#diag}{\texttt{diag}} \item \href{../../api_docs/python/math_ops.md\#div}{\texttt{div}} \item \href{../../api_docs/python/math_ops.md\#edit_distance}{\texttt{edit\_distance}} 
\item \href{../../api_docs/python/math_ops.md\#exp}{\texttt{exp}} \item \href{../../api_docs/python/math_ops.md\#floor}{\texttt{floor}} \item \href{../../api_docs/python/math_ops.md\#imag}{\texttt{imag}} \item \href{../../api_docs/python/math_ops.md\#inv}{\texttt{inv}} \item \href{../../api_docs/python/math_ops.md\#invert_permutation}{\texttt{invert\_permutation}} \item \href{../../api_docs/python/math_ops.md\#listdiff}{\texttt{listdiff}} \item \href{../../api_docs/python/math_ops.md\#log}{\texttt{log}} \item \href{../../api_docs/python/math_ops.md\#matmul}{\texttt{matmul}} \item \href{../../api_docs/python/math_ops.md\#matrix_determinant}{\texttt{matrix\_determinant}} \item \href{../../api_docs/python/math_ops.md\#matrix_inverse}{\texttt{matrix\_inverse}} \item \href{../../api_docs/python/math_ops.md\#maximum}{\texttt{maximum}} \item \href{../../api_docs/python/math_ops.md\#minimum}{\texttt{minimum}} \item \href{../../api_docs/python/math_ops.md\#mod}{\texttt{mod}} \item \href{../../api_docs/python/math_ops.md\#mul}{\texttt{mul}} \item \href{../../api_docs/python/math_ops.md\#neg}{\texttt{neg}} \item \href{../../api_docs/python/math_ops.md\#pow}{\texttt{pow}} \item \href{../../api_docs/python/math_ops.md\#real}{\texttt{real}} \item \href{../../api_docs/python/math_ops.md\#reduce_all}{\texttt{reduce\_all}} \item \href{../../api_docs/python/math_ops.md\#reduce_any}{\texttt{reduce\_any}} \item \href{../../api_docs/python/math_ops.md\#reduce_max}{\texttt{reduce\_max}} \item \href{../../api_docs/python/math_ops.md\#reduce_mean}{\texttt{reduce\_mean}} \item \href{../../api_docs/python/math_ops.md\#reduce_min}{\texttt{reduce\_min}} \item \href{../../api_docs/python/math_ops.md\#reduce_prod}{\texttt{reduce\_prod}} \item \href{../../api_docs/python/math_ops.md\#reduce_sum}{\texttt{reduce\_sum}} \item \href{../../api_docs/python/math_ops.md\#round}{\texttt{round}} \item \href{../../api_docs/python/math_ops.md\#rsqrt}{\texttt{rsqrt}} \item 
\href{../../api_docs/python/math_ops.md\#segment_max}{\texttt{segment\_max}} \item \href{../../api_docs/python/math_ops.md\#segment_mean}{\texttt{segment\_mean}} \item \href{../../api_docs/python/math_ops.md\#segment_min}{\texttt{segment\_min}} \item \href{../../api_docs/python/math_ops.md\#segment_prod}{\texttt{segment\_prod}} \item \href{../../api_docs/python/math_ops.md\#segment_sum}{\texttt{segment\_sum}} \item \href{../../api_docs/python/math_ops.md\#sign}{\texttt{sign}} \item \href{../../api_docs/python/math_ops.md\#sin}{\texttt{sin}} \item \href{../../api_docs/python/math_ops.md\#sparse_segment_mean}{\texttt{sparse\_segment\_mean}} \item \href{../../api_docs/python/math_ops.md\#sparse_segment_sum}{\texttt{sparse\_segment\_sum}} \item \href{../../api_docs/python/math_ops.md\#sqrt}{\texttt{sqrt}} \item \href{../../api_docs/python/math_ops.md\#square}{\texttt{square}} \item \href{../../api_docs/python/math_ops.md\#sub}{\texttt{sub}} \item \href{../../api_docs/python/math_ops.md\#transpose}{\texttt{transpose}} \item \href{../../api_docs/python/math_ops.md\#unique}{\texttt{unique}} \item \href{../../api_docs/python/math_ops.md\#unsorted_segment_sum}{\texttt{unsorted\_segment\_sum}} \item \href{../../api_docs/python/math_ops.md\#where}{\texttt{where}} \item \textbf{\href{../../api_docs/python/control_flow_ops.md}{Control Flow}}: \item \href{../../api_docs/python/control_flow_ops.md\#add_check_numerics_ops}{\texttt{add\_check\_numerics\_ops}} \item \href{../../api_docs/python/control_flow_ops.md\#Assert}{\texttt{Assert}} \item \href{../../api_docs/python/control_flow_ops.md\#check_numerics}{\texttt{check\_numerics}} \item \href{../../api_docs/python/control_flow_ops.md\#count_up_to}{\texttt{count\_up\_to}} \item \href{../../api_docs/python/control_flow_ops.md\#equal}{\texttt{equal}} \item \href{../../api_docs/python/control_flow_ops.md\#greater}{\texttt{greater}} \item \href{../../api_docs/python/control_flow_ops.md\#greater_equal}{\texttt{greater\_equal}} \item 
\href{../../api_docs/python/control_flow_ops.md\#group}{\texttt{group}} \item \href{../../api_docs/python/control_flow_ops.md\#identity}{\texttt{identity}} \item \href{../../api_docs/python/control_flow_ops.md\#is_finite}{\texttt{is\_finite}} \item \href{../../api_docs/python/control_flow_ops.md\#is_inf}{\texttt{is\_inf}} \item \href{../../api_docs/python/control_flow_ops.md\#is_nan}{\texttt{is\_nan}} \item \href{../../api_docs/python/control_flow_ops.md\#less}{\texttt{less}} \item \href{../../api_docs/python/control_flow_ops.md\#less_equal}{\texttt{less\_equal}} \item \href{../../api_docs/python/control_flow_ops.md\#logical_and}{\texttt{logical\_and}} \item \href{../../api_docs/python/control_flow_ops.md\#logical_not}{\texttt{logical\_not}} \item \href{../../api_docs/python/control_flow_ops.md\#logical_or}{\texttt{logical\_or}} \item \href{../../api_docs/python/control_flow_ops.md\#logical_xor}{\texttt{logical\_xor}} \item \href{../../api_docs/python/control_flow_ops.md\#no_op}{\texttt{no\_op}} \item \href{../../api_docs/python/control_flow_ops.md\#not_equal}{\texttt{not\_equal}} \item \href{../../api_docs/python/control_flow_ops.md\#Print}{\texttt{Print}} \item \href{../../api_docs/python/control_flow_ops.md\#select}{\texttt{select}} \item \href{../../api_docs/python/control_flow_ops.md\#tuple}{\texttt{tuple}} \item \href{../../api_docs/python/control_flow_ops.md\#verify_tensor_all_finite}{\texttt{verify\_tensor\_all\_finite}} \item \href{../../api_docs/python/control_flow_ops.md\#where}{\texttt{where}} \item \textbf{\href{../../api_docs/python/image.md}{Images}}: \item \href{../../api_docs/python/image.md\#adjust_brightness}{\texttt{adjust\_brightness}} \item \href{../../api_docs/python/image.md\#adjust_contrast}{\texttt{adjust\_contrast}} \item \href{../../api_docs/python/image.md\#crop_to_bounding_box}{\texttt{crop\_to\_bounding\_box}} \item \href{../../api_docs/python/image.md\#decode_jpeg}{\texttt{decode\_jpeg}} \item 
\href{../../api_docs/python/image.md\#decode_png}{\texttt{decode\_png}} \item \href{../../api_docs/python/image.md\#encode_jpeg}{\texttt{encode\_jpeg}} \item \href{../../api_docs/python/image.md\#encode_png}{\texttt{encode\_png}} \item \href{../../api_docs/python/image.md\#extract_glimpse}{\texttt{extract\_glimpse}} \item \href{../../api_docs/python/image.md\#flip_left_right}{\texttt{flip\_left\_right}} \item \href{../../api_docs/python/image.md\#flip_up_down}{\texttt{flip\_up\_down}} \item \href{../../api_docs/python/image.md\#pad_to_bounding_box}{\texttt{pad\_to\_bounding\_box}} \item \href{../../api_docs/python/image.md\#per_image_whitening}{\texttt{per\_image\_whitening}} \item \href{../../api_docs/python/image.md\#random_brightness}{\texttt{random\_brightness}} \item \href{../../api_docs/python/image.md\#random_contrast}{\texttt{random\_contrast}} \item \href{../../api_docs/python/image.md\#random_crop}{\texttt{random\_crop}} \item \href{../../api_docs/python/image.md\#random_flip_left_right}{\texttt{random\_flip\_left\_right}} \item \href{../../api_docs/python/image.md\#random_flip_up_down}{\texttt{random\_flip\_up\_down}} \item \href{../../api_docs/python/image.md\#resize_area}{\texttt{resize\_area}} \item \href{../../api_docs/python/image.md\#resize_bicubic}{\texttt{resize\_bicubic}} \item \href{../../api_docs/python/image.md\#resize_bilinear}{\texttt{resize\_bilinear}} \item \href{../../api_docs/python/image.md\#resize_image_with_crop_or_pad}{\texttt{resize\_image\_with\_crop\_or\_pad}} \item \href{../../api_docs/python/image.md\#resize_images}{\texttt{resize\_images}} \item \href{../../api_docs/python/image.md\#resize_nearest_neighbor}{\texttt{resize\_nearest\_neighbor}} \item \href{../../api_docs/python/image.md\#transpose_image}{\texttt{transpose\_image}} \item \textbf{\href{../../api_docs/python/sparse_ops.md}{Sparse Tensors}}: \item \href{../../api_docs/python/sparse_ops.md\#shape}{\texttt{shape}} \item 
\href{../../api_docs/python/sparse_ops.md\#sparse_concat}{\texttt{sparse\_concat}} \item \href{../../api_docs/python/sparse_ops.md\#sparse_fill_empty_rows}{\texttt{sparse\_fill\_empty\_rows}} \item \href{../../api_docs/python/sparse_ops.md\#sparse_reorder}{\texttt{sparse\_reorder}} \item \href{../../api_docs/python/sparse_ops.md\#sparse_retain}{\texttt{sparse\_retain}} \item \href{../../api_docs/python/sparse_ops.md\#sparse_tensor_to_dense}{\texttt{sparse\_tensor\_to\_dense}} \item \href{../../api_docs/python/sparse_ops.md\#sparse_to_dense}{\texttt{sparse\_to\_dense}} \item \href{../../api_docs/python/sparse_ops.md\#sparse_to_indicator}{\texttt{sparse\_to\_indicator}} \item \href{../../api_docs/python/sparse_ops.md\#SparseTensor}{\texttt{SparseTensor}} \item \href{../../api_docs/python/sparse_ops.md\#SparseTensorValue}{\texttt{SparseTensorValue}} \item \textbf{\href{../../api_docs/python/io_ops.md}{Inputs and Readers}}: \item \href{../../api_docs/python/io_ops.md\#batch}{\texttt{batch}} \item \href{../../api_docs/python/io_ops.md\#batch_join}{\texttt{batch\_join}} \item \href{../../api_docs/python/io_ops.md\#decode_csv}{\texttt{decode\_csv}} \item \href{../../api_docs/python/io_ops.md\#decode_raw}{\texttt{decode\_raw}} \item \href{../../api_docs/python/io_ops.md\#FIFOQueue}{\texttt{FIFOQueue}} \item \href{../../api_docs/python/io_ops.md\#FixedLengthRecordReader}{\texttt{FixedLengthRecordReader}} \item \href{../../api_docs/python/io_ops.md\#IdentityReader}{\texttt{IdentityReader}} \item \href{../../api_docs/python/io_ops.md\#limit_epochs}{\texttt{limit\_epochs}} \item \href{../../api_docs/python/io_ops.md\#match_filenames_once}{\texttt{match\_filenames\_once}} \item \href{../../api_docs/python/io_ops.md\#matching_files}{\texttt{matching\_files}} \item \href{../../api_docs/python/io_ops.md\#parse_example}{\texttt{parse\_example}} \item \href{../../api_docs/python/io_ops.md\#parse_single_example}{\texttt{parse\_single\_example}} \item 
\href{../../api_docs/python/io_ops.md\#placeholder}{\texttt{placeholder}} \item \href{../../api_docs/python/io_ops.md\#QueueBase}{\texttt{QueueBase}} \item \href{../../api_docs/python/io_ops.md\#RandomShuffleQueue}{\texttt{RandomShuffleQueue}} \item \href{../../api_docs/python/io_ops.md\#range_input_producer}{\texttt{range\_input\_producer}} \item \href{../../api_docs/python/io_ops.md\#read_file}{\texttt{read\_file}} \item \href{../../api_docs/python/io_ops.md\#ReaderBase}{\texttt{ReaderBase}} \item \href{../../api_docs/python/io_ops.md\#shuffle_batch}{\texttt{shuffle\_batch}} \item \href{../../api_docs/python/io_ops.md\#shuffle_batch_join}{\texttt{shuffle\_batch\_join}} \item \href{../../api_docs/python/io_ops.md\#size}{\texttt{size}} \item \href{../../api_docs/python/io_ops.md\#slice_input_producer}{\texttt{slice\_input\_producer}} \item \href{../../api_docs/python/io_ops.md\#string_input_producer}{\texttt{string\_input\_producer}} \item \href{../../api_docs/python/io_ops.md\#TextLineReader}{\texttt{TextLineReader}} \item \href{../../api_docs/python/io_ops.md\#TFRecordReader}{\texttt{TFRecordReader}} \item \href{../../api_docs/python/io_ops.md\#WholeFileReader}{\texttt{WholeFileReader}} \item \textbf{\href{../../api_docs/python/python_io.md}{Data IO (Python functions)}}: \item \href{../../api_docs/python/python_io.md\#tf_record_iterator}{\texttt{tf\_record\_iterator}} \item \href{../../api_docs/python/python_io.md\#TFRecordWriter}{\texttt{TFRecordWriter}} \item \textbf{\href{../../api_docs/python/nn.md}{Neural Network}}: \item \href{../../api_docs/python/nn.md\#avg_pool}{\texttt{avg\_pool}} \item \href{../../api_docs/python/nn.md\#bias_add}{\texttt{bias\_add}} \item \href{../../api_docs/python/nn.md\#compute_accidental_hits}{\texttt{compute\_accidental\_hits}} \item \href{../../api_docs/python/nn.md\#conv2d}{\texttt{conv2d}} \item \href{../../api_docs/python/nn.md\#depthwise_conv2d}{\texttt{depthwise\_conv2d}} \item 
\href{../../api_docs/python/nn.md\#dropout}{\texttt{dropout}} \item \href{../../api_docs/python/nn.md\#embedding_lookup}{\texttt{embedding\_lookup}} \item \href{../../api_docs/python/nn.md\#fixed_unigram_candidate_sampler}{\texttt{fixed\_unigram\_candidate\_sampler}} \item \href{../../api_docs/python/nn.md\#in_top_k}{\texttt{in\_top\_k}} \item \href{../../api_docs/python/nn.md\#l2_loss}{\texttt{l2\_loss}} \item \href{../../api_docs/python/nn.md\#l2_normalize}{\texttt{l2\_normalize}} \item \href{../../api_docs/python/nn.md\#learned_unigram_candidate_sampler}{\texttt{learned\_unigram\_candidate\_sampler}} \item \href{../../api_docs/python/nn.md\#local_response_normalization}{\texttt{local\_response\_normalization}} \item \href{../../api_docs/python/nn.md\#log_uniform_candidate_sampler}{\texttt{log\_uniform\_candidate\_sampler}} \item \href{../../api_docs/python/nn.md\#max_pool}{\texttt{max\_pool}} \item \href{../../api_docs/python/nn.md\#max_pool_with_argmax}{\texttt{max\_pool\_with\_argmax}} \item \href{../../api_docs/python/nn.md\#moments}{\texttt{moments}} \item \href{../../api_docs/python/nn.md\#nce_loss}{\texttt{nce\_loss}} \item \href{../../api_docs/python/nn.md\#relu}{\texttt{relu}} \item \href{../../api_docs/python/nn.md\#relu6}{\texttt{relu6}} \item \href{../../api_docs/python/nn.md\#sampled_softmax_loss}{\texttt{sampled\_softmax\_loss}} \item \href{../../api_docs/python/nn.md\#separable_conv2d}{\texttt{separable\_conv2d}} \item \href{../../api_docs/python/nn.md\#sigmoid}{\texttt{sigmoid}} \item \href{../../api_docs/python/nn.md\#sigmoid_cross_entropy_with_logits}{\texttt{sigmoid\_cross\_entropy\_with\_logits}} \item \href{../../api_docs/python/nn.md\#softmax}{\texttt{softmax}} \item \href{../../api_docs/python/nn.md\#softmax_cross_entropy_with_logits}{\texttt{softmax\_cross\_entropy\_with\_logits}} \item \href{../../api_docs/python/nn.md\#softplus}{\texttt{softplus}} \item \href{../../api_docs/python/nn.md\#tanh}{\texttt{tanh}} \item 
\href{../../api_docs/python/nn.md\#top_k}{\texttt{top\_k}} \item \href{../../api_docs/python/nn.md\#uniform_candidate_sampler}{\texttt{uniform\_candidate\_sampler}} \item \textbf{\href{../../api_docs/python/client.md}{Running Graphs}}: \item \href{../../api_docs/python/client.md\#AbortedError}{\texttt{AbortedError}} \item \href{../../api_docs/python/client.md\#AlreadyExistsError}{\texttt{AlreadyExistsError}} \item \href{../../api_docs/python/client.md\#CancelledError}{\texttt{CancelledError}} \item \href{../../api_docs/python/client.md\#DataLossError}{\texttt{DataLossError}} \item \href{../../api_docs/python/client.md\#DeadlineExceededError}{\texttt{DeadlineExceededError}} \item \href{../../api_docs/python/client.md\#FailedPreconditionError}{\texttt{FailedPreconditionError}} \item \href{../../api_docs/python/client.md\#get_default_session}{\texttt{get\_default\_session}} \item \href{../../api_docs/python/client.md\#InteractiveSession}{\texttt{InteractiveSession}} \item \href{../../api_docs/python/client.md\#InternalError}{\texttt{InternalError}} \item \href{../../api_docs/python/client.md\#InvalidArgumentError}{\texttt{InvalidArgumentError}} \item \href{../../api_docs/python/client.md\#NotFoundError}{\texttt{NotFoundError}} \item \href{../../api_docs/python/client.md\#OpError}{\texttt{OpError}} \item \href{../../api_docs/python/client.md\#OutOfRangeError}{\texttt{OutOfRangeError}} \item \href{../../api_docs/python/client.md\#PermissionDeniedError}{\texttt{PermissionDeniedError}} \item \href{../../api_docs/python/client.md\#ResourceExhaustedError}{\texttt{ResourceExhaustedError}} \item \href{../../api_docs/python/client.md\#Session}{\texttt{Session}} \item \href{../../api_docs/python/client.md\#UnauthenticatedError}{\texttt{UnauthenticatedError}} \item \href{../../api_docs/python/client.md\#UnavailableError}{\texttt{UnavailableError}} \item \href{../../api_docs/python/client.md\#UnimplementedError}{\texttt{UnimplementedError}} \item 
\href{../../api_docs/python/client.md\#UnknownError}{\texttt{UnknownError}} \item \textbf{\href{../../api_docs/python/train.md}{Training}}: \item \href{../../api_docs/python/train.md\#AdagradOptimizer}{\texttt{AdagradOptimizer}} \item \href{../../api_docs/python/train.md\#AdamOptimizer}{\texttt{AdamOptimizer}} \item \href{../../api_docs/python/train.md\#add_queue_runner}{\texttt{add\_queue\_runner}} \item \href{../../api_docs/python/train.md\#AggregationMethod}{\texttt{AggregationMethod}} \item \href{../../api_docs/python/train.md\#clip_by_average_norm}{\texttt{clip\_by\_average\_norm}} \item \href{../../api_docs/python/train.md\#clip_by_global_norm}{\texttt{clip\_by\_global\_norm}} \item \href{../../api_docs/python/train.md\#clip_by_norm}{\texttt{clip\_by\_norm}} \item \href{../../api_docs/python/train.md\#clip_by_value}{\texttt{clip\_by\_value}} \item \href{../../api_docs/python/train.md\#Coordinator}{\texttt{Coordinator}} \item \href{../../api_docs/python/train.md\#exponential_decay}{\texttt{exponential\_decay}} \item \href{../../api_docs/python/train.md\#ExponentialMovingAverage}{\texttt{ExponentialMovingAverage}} \item \href{../../api_docs/python/train.md\#FtrlOptimizer}{\texttt{FtrlOptimizer}} \item \href{../../api_docs/python/train.md\#global_norm}{\texttt{global\_norm}} \item \href{../../api_docs/python/train.md\#global_step}{\texttt{global\_step}} \item \href{../../api_docs/python/train.md\#GradientDescentOptimizer}{\texttt{GradientDescentOptimizer}} \item \href{../../api_docs/python/train.md\#gradients}{\texttt{gradients}} \item \href{../../api_docs/python/train.md\#histogram_summary}{\texttt{histogram\_summary}} \item \href{../../api_docs/python/train.md\#image_summary}{\texttt{image\_summary}} \item \href{../../api_docs/python/train.md\#merge_all_summaries}{\texttt{merge\_all\_summaries}} \item \href{../../api_docs/python/train.md\#merge_summary}{\texttt{merge\_summary}} \item 
\href{../../api_docs/python/train.md\#MomentumOptimizer}{\texttt{MomentumOptimizer}} \item \href{../../api_docs/python/train.md\#Optimizer}{\texttt{Optimizer}} \item \href{../../api_docs/python/train.md\#QueueRunner}{\texttt{QueueRunner}} \item \href{../../api_docs/python/train.md\#RMSPropOptimizer}{\texttt{RMSPropOptimizer}} \item \href{../../api_docs/python/train.md\#scalar_summary}{\texttt{scalar\_summary}} \item \href{../../api_docs/python/train.md\#start_queue_runners}{\texttt{start\_queue\_runners}} \item \href{../../api_docs/python/train.md\#stop_gradient}{\texttt{stop\_gradient}} \item \href{../../api_docs/python/train.md\#summary_iterator}{\texttt{summary\_iterator}} \item \href{../../api_docs/python/train.md\#SummaryWriter}{\texttt{SummaryWriter}} \item \href{../../api_docs/python/train.md\#write_graph}{\texttt{write\_graph}} \item \href{../../api_docs/python/train.md\#zero_fraction}{\texttt{zero\_fraction}} \end{itemize} --\textgreater{} ================================================ FILE: tex_pdf/get_started/c1s01_introduction.tex ================================================ %!TEX program = xelatex % Encoding: UTF8 % SEIKA 2016 | seika@live.ca % Chapter 1 % Section 1.1 Introduction \section{Introduction || 简介} Ⓔ \textcolor{etc}{Let's get you up and running with TensorFlow!} Ⓒ 本章的目的是让你了解和运行 TensorFlow! Ⓔ \textcolor{etc}{But before we even get started, let's peek at what TensorFlow code looks like in the Python API, so you have a sense of where we're headed.} Ⓒ 在开始之前,让我们先看一段使用 Python API 撰写的 TensorFlow 示例代码, 让你对将要学习的内容有初步的印象. Ⓔ \textcolor{etc}{Here's a little Python program that makes up some data in two dimensions, and then fits a line to it.} Ⓒ 下面这段短小的Python程序将把一些数据放入二维空间,再用一条线来拟合这些数据. 
\begin{lstlisting}
import tensorflow as tf
import numpy as np

# Create 100 phony x, y data points in NumPy, y = x * 0.1 + 0.3
x_data = np.random.rand(100).astype("float32")
y_data = x_data * 0.1 + 0.3

# Try to find values for W and b that compute y_data = W * x_data + b
# (We know that W should be 0.1 and b 0.3, but TensorFlow will
# figure that out for us.)
W = tf.Variable(tf.random_uniform([1], -1.0, 1.0))
b = tf.Variable(tf.zeros([1]))
y = W * x_data + b

# Minimize the mean squared errors.
loss = tf.reduce_mean(tf.square(y - y_data))
optimizer = tf.train.GradientDescentOptimizer(0.5)
train = optimizer.minimize(loss)

# Before starting, initialize the variables. We will 'run' this first.
init = tf.global_variables_initializer()

# Launch the graph.
sess = tf.Session()
sess.run(init)

# Fit the line. range() works under both Python 2.7 and Python 3.
for step in range(201):
    sess.run(train)
    if step % 20 == 0:
        print(step, sess.run(W), sess.run(b))

# Learns best fit is W: [0.1], b: [0.3]
\end{lstlisting}

Ⓔ \textcolor{etc}{The first part of this code builds the data flow graph. TensorFlow does not actually run any computation until the session is created and the run function is called.}

Ⓒ 以上代码的第一部分构建了数据的流向图(flow graph).在一个session被建立并且\lstinline{run()}函数被运行前,TensorFlow不会进行任何实质的计算.

Ⓔ \textcolor{etc}{To whet your appetite further, we suggest you check out what a classical machine learning problem looks like in TensorFlow. In the land of neural networks the most "classic" classical problem is the MNIST handwritten digit classification. We offer two introductions here, one for machine learning newbies, and one for pros. If you've already trained dozens of MNIST models in other software packages, please take the red pill. If you've never even heard of MNIST, definitely take the blue pill.
If you're somewhere in between, we suggest skimming blue, then red.} Ⓒ 为了进一步激发你的学习欲望,我们想让你先看一下TensorFlow是如何解决一个经典的机器学习问题的.在神经网络领域,最为经典的问题莫过于MNIST手写数字分类.为此,我们准备了两篇不同的教程,分别面向初学者和专家.如果你已经使用其它软件训练过许多MNIST模型,请参阅\hyperref[MINIST_pros]{高级教程(红色药丸)}.如果你以前从未听说过 MNIST ,请先阅读\hyperref[MINIST_beginner]{初级教程(蓝色药丸)}.如果你的水平介于这两类人之间,我们建议你先快速浏览\hyperref[MINIST_beginner]{初级教程},然后再阅读\hyperref[MINIST_pros]{高级教程}. % Add pics and links here. Ⓔ \textcolor{etc}{If you're already sure you want to learn and install TensorFlow you can skip these and charge ahead. Don't worry, you'll still get to see MNIST -- we'll also use MNIST as an example in our technical tutorial where we elaborate on TensorFlow features.} \begin{center} \includegraphics[width=.45\textwidth]{../SOURCE/images/blue_pill.png} \includegraphics[width=.45\textwidth]{../SOURCE/images/red_pill.png} \end{center} Ⓒ 如果你已下定决心准备学习和安装 TensorFlow ,你可以略过这些文字,直接阅读 后面的章节\footnote{\textbf{推荐随后阅读内容}:\hyperref[download_install]{1 下载与安装}, \hyperref[basic_usage]{2 基本使用}, \hyperref[tf_mech101]{3 TensorFlow 101}.}.不用担心,你仍然会看到 MNIST --- 在阐述TensorFlow的特性时, 我们还会使用MNIST作为一个样例. ================================================ FILE: tex_pdf/get_started/c1s02_os_setup.tex ================================================ %!TEX program = xelatex % Encoding: UTF8 % SEIKA 2016 | seika@live.ca % Chapter 1 % Section 1.2 OS_Setup \section {Download and Setup || 下载与安装} \label{download_install} Ⓔ \textcolor{etc}{You can install TensorFlow either from our provided binary packages or from the github source.} Ⓒ 您可以使用我们提供的二进制包,或者源代码,安装 TensorFlow. % %% \subsection {Requirements | 安装需求} Ⓔ \textcolor{etc}{The TensorFlow Python API currently supports Python 2.7 and Python 3.3+ from source.} Ⓒ TensorFlow Python API 目前支持 Python 2.7 和 python 3.3以上版本. Ⓔ \textcolor{etc}{The GPU version (Linux only) currently requires the Cuda Toolkit 7.0 and CUDNN 6.5 V2. Please see \hyperref[install_cuda]{Cuda installation}.} Ⓒ 支持 GPU 运算的版本 (仅限Linux) 需要 Cuda Toolkit 7.0 和 CUDNN 6.5 V2. 
具体请参考\hyperref[install_cuda]{Cuda安装}.

%
%%
\subsection {Overview | 安装总述}

Ⓔ \textcolor{etc}{We support different ways to install TensorFlow:}

Ⓒ TensorFlow 支持通过以下不同的方式安装:

\begin{itemize}
\item \hyperref[pip_install]{Pip Install}: \textcolor{etc}{Install TensorFlow on your machine, possibly upgrading previously installed Python packages. May impact existing Python programs on your machine.}\\
\item \hyperref[pip_install]{Pip 安装}: 在你的机器上安装TensorFlow,可能会同时更新之前安装的Python包,并且影响到你机器当前可运行的Python程序.\\
\item \hyperref[virtualenv_install]{Virtualenv Install}: \textcolor{etc}{Install TensorFlow in its own directory, not impacting any existing Python programs on your machine.}\\
\item \hyperref[virtualenv_install]{Virtualenv 安装}: 在一个独立的路径下安装TensorFlow,不会影响到你机器当前运行的Python程序.\\
\item \hyperref[docker_install]{Docker Install}: \textcolor{etc}{Run TensorFlow in a Docker container isolated from all other programs on your machine.}\\
\item \hyperref[docker_install]{Docker 安装}: 在一个独立的Docker容器中安装TensorFlow,并且不会影响到你机器上的任何其他程序.
\end{itemize}

Ⓔ \textcolor{etc}{If you are familiar with Pip, Virtualenv, or Docker, please feel free to adapt the instructions to your particular needs. The names of the pip and Docker images are listed in the corresponding installation sections.}

Ⓒ 如果你已经很熟悉Pip、Virtualenv、Docker这些工具的使用,请利用教程中提供的代码,根据你的需求安装TensorFlow.你会在下文的对应的安装教程中找到Pip或Docker安装所需的镜像.

Ⓔ \textcolor{etc}{If you encounter installation errors, see \hyperref[comm_prob]{common problems} for some solutions.}

Ⓒ 如果你遇到了安装错误,请参考章节\hyperref[comm_prob]{常见问题}寻找解决方案.

%
%%
\subsection {Pip Installation | Pip 安装}
\label{pip_install}

Ⓔ \textcolor{etc}{\href{https://en.wikipedia.org/wiki/Pip_(package_manager)}{Pip} is a package management system used to install and manage software packages written in Python.}

Ⓒ \href{https://en.wikipedia.org/wiki/Pip_(package_manager)}{Pip} 是一个用于安装和管理Python软件包的管理系统.
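Before running pip, it can help to confirm that the interpreter meets the version requirement stated under Requirements (Python 2.7 or Python 3.3+). The snippet below is only a sketch of such a check. The helper name \lstinline{python_version_supported} is hypothetical and is not part of TensorFlow or pip.

```python
import sys


def python_version_supported(version_info=None):
    """Return True for Python 2.7 or for Python 3.3 and newer.

    Hypothetical helper for illustration; `version_info` defaults to the
    running interpreter and may be any (major, minor, ...) sequence.
    """
    major, minor = tuple(version_info or sys.version_info)[:2]
    if major == 2:
        return minor == 7
    return major == 3 and minor >= 3


if __name__ == "__main__":
    # Report whether the current interpreter satisfies the stated requirement.
    print(python_version_supported())
```

Running the file with the same interpreter that pip will use (\lstinline{python} or \lstinline{python3}) shows whether that interpreter falls inside the supported range.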
Ⓔ \textcolor{etc}{The packages that will be installed or upgraded during the pip install are listed in the \href{https://github.com/tensorflow/tensorflow/blob/master/tensorflow/tools/pip_package/setup.py}{REQUIRED\_PACKAGES section of setup.py}} Ⓒ 安装依赖包(\href{https://github.com/tensorflow/tensorflow/blob/master/tensorflow/tools/pip_package/setup.py}{REQUIRED\_PACKAGES section of setup.py}) 列出了pip安装时将会被安装或更新的库文件. Ⓔ \textcolor{etc}{Install pip (or pip3 for python3) if it is not already installed:} Ⓒ 如果pip尚未被安装,请使用以下代码先安装pip(如果你使用的是Python 3请安装 pip3 ): \begin{lstlisting}[language = bash] # Ubuntu/Linux 64-bit $ sudo apt-get install python-pip python-dev \end{lstlisting} \begin{lstlisting} # Mac OS X $ sudo easy_install pip \end{lstlisting} Ⓔ \textcolor{etc}{Install TensorFlow:} Ⓒ 安装 TensorFlow: \begin{lstlisting} # Ubuntu/Linux 64-bit, CPU only: $ sudo pip install --upgrade https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.6.0-cp27-none-linux_x86_64.whl \end{lstlisting} \begin{lstlisting} # Ubuntu/Linux 64-bit, GPU enabled: $ sudo pip install --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.6.0-cp27-none-linux_x86_64.whl \end{lstlisting} \begin{lstlisting}[language = bash] # Mac OS X, CPU only: $ sudo easy_install --upgrade six $ sudo pip install --upgrade https://storage.googleapis.com/tensorflow/mac/tensorflow-0.6.0-py2-none-any.whl \end{lstlisting} Ⓔ \textcolor{etc}{For Python 3:} Ⓒ 基于 Python 3 的 TensorFlow 安装: \begin{lstlisting} # Ubuntu/Linux 64-bit, CPU only: $ sudo pip3 install --upgrade https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.6.0-cp34-none-linux_x86_64.whl \end{lstlisting} \begin{lstlisting} # Ubuntu/Linux 64-bit, GPU enabled: $ sudo pip3 install --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.6.0-cp34-none-linux_x86_64.whl \end{lstlisting} \begin{lstlisting}[language = bash] # Mac OS X, CPU only: $ sudo easy_install --upgrade six $ sudo pip3 install --upgrade 
https://storage.googleapis.com/tensorflow/mac/tensorflow-0.6.0-py3-none-any.whl \end{lstlisting} Ⓔ \textcolor{etc}{You can now test your \hyperref[test_install]{installation}.} Ⓒ 至此你可以\hyperref[test_install]{测试安装}是否成功. % %% \subsection {Virtualenv installation | 基于 Virtualenv 安装} \label{virtualenv_install} Ⓔ \textcolor{etc}{\href{http://docs.python-guide.org/en/latest/dev/virtualenvs/}{Virtualenv} is a tool to keep the dependencies required by different Python projects in separate places. The Virtualenv installation of TensorFlow will not override pre-existing version of the Python packages needed by TensorFlow.} Ⓒ \href{http://docs.python-guide.org/en/latest/dev/virtualenvs/}{Virtualenv} 是一个管理在不同位置存放和调用 Python 项目所需依赖库的工具. TensorFlow 的 Virtualenv 安装不会覆盖先前已安装的 TensorFlow Python依赖包. Ⓔ \textcolor{etc}{With \href{https://pypi.python.org/pypi/virtualenv}{Virtualenv} the installation is as follows:} Ⓒ 基于\href{https://pypi.python.org/pypi/virtualenv}{Virtualenv}的安装分为以下几步: \begin{itemize} \item \textcolor{etc}{Install pip and Virtualenv.} \item \textcolor{etc}{Create a Virtualenv environment.} \item \textcolor{etc}{Activate the Virtualenv environment and install TensorFlow in it.} \item \textcolor{etc}{After the install you will activate the Virtualenv environment each time you want to use TensorFlow.} \item 安装 pip 和 Virtualenv. \item 建立一个 Virtualenv 环境. \item 激活这个 Virtualenv 环境,并且在此环境下安装 TensorFlow. \item 安装完成之后,每次你需要使用 TensorFlow 之前必须激活这个 Virtualenv 环境. 
\end{itemize}

Ⓔ \textcolor{etc}{Install pip and Virtualenv:}

Ⓒ 安装 pip 和 Virtualenv:

\begin{lstlisting}
# Ubuntu/Linux 64-bit
$ sudo apt-get install python-pip python-dev python-virtualenv
\end{lstlisting}

\begin{lstlisting}
# Mac OS X
$ sudo easy_install pip
$ sudo pip install --upgrade virtualenv
\end{lstlisting}

Ⓔ \textcolor{etc}{Create a Virtualenv environment in the directory} \lstinline{~/tensorflow}:

Ⓒ 在\lstinline{~/tensorflow}路径下建立一个 Virtualenv 环境:

\begin{lstlisting}
$ virtualenv --system-site-packages ~/tensorflow
\end{lstlisting}

Ⓔ \textcolor{etc}{Activate the environment and use pip to install TensorFlow inside it:}

Ⓒ 激活 Virtualenv 环境并使用pip在该环境下安装TensorFlow:

\begin{lstlisting}
$ source ~/tensorflow/bin/activate      # If using bash
$ source ~/tensorflow/bin/activate.csh  # If using csh
(tensorflow)$  # Your prompt should change

# Ubuntu/Linux 64-bit, CPU only:
(tensorflow)$ pip install --upgrade https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.5.0-cp27-none-linux_x86_64.whl

# Ubuntu/Linux 64-bit, GPU enabled:
(tensorflow)$ pip install --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.5.0-cp27-none-linux_x86_64.whl

# Mac OS X, CPU only:
(tensorflow)$ pip install --upgrade https://storage.googleapis.com/tensorflow/mac/tensorflow-0.5.0-py2-none-any.whl
\end{lstlisting}

Ⓔ \textcolor{etc}{And again for Python 3:}

\begin{lstlisting}
$ source ~/tensorflow/bin/activate      # If using bash
$ source ~/tensorflow/bin/activate.csh  # If using csh
(tensorflow)$  # Your prompt should change

# Ubuntu/Linux 64-bit, CPU only:
(tensorflow)$ pip3 install --upgrade https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.6.0-cp34-none-linux_x86_64.whl

# Ubuntu/Linux 64-bit, GPU enabled:
(tensorflow)$ pip3 install --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.6.0-cp34-none-linux_x86_64.whl

# Mac OS X, CPU only:
(tensorflow)$ pip3 install --upgrade https://storage.googleapis.com/tensorflow/mac/tensorflow-0.6.0-py3-none-any.whl
\end{lstlisting} Ⓔ \textcolor{etc}{With the Virtualenv environment activated, you can now \hyperref[test_install]{test your installation}.} Ⓒ 在 Virtualenv 环境被激活时,您可以\hyperref[test_install]{测试安装}. Ⓔ \textcolor{etc}{When you are done using TensorFlow, deactivate the environment.} Ⓒ 当您无需使用 TensorFlow 时,取消激活该环境. \begin{lstlisting} (tensorflow)$ deactivate $ # Your prompt should change back \end{lstlisting} Ⓔ \textcolor{etc}{To use TensorFlow later you will have to activate the Virtualenv environment again:} Ⓒ 如果需要再次使用 TensorFlow 您需要先再次激活 Virtualenv 环境: \begin{lstlisting} $ source ~/tensorflow/bin/activate # If using bash. $ source ~/tensorflow/bin/activate.csh # If using csh. (tensorflow)$ # Your prompt should change. # Run Python programs that use TensorFlow. ... # When you are done using TensorFlow, deactivate the environment. (tensorflow)$ deactivate \end{lstlisting} %%%%%%%%%%%%%%%% % Done to here % %%%%%%%%%%%%%%%% % %% \subsection {Docker Installation} \label{docker_install} \href{http://docker.com/}{Docker} is a system to build self contained versions of a Linux operating system running on your machine. When you install and run TensorFlow via Docker it completely isolates the installation from pre-existing packages on your machine. We provide 4 Docker images: \begin{itemize} \item \lstinline{b.gcr.io/tensorflow/tensorflow}: TensorFlow CPU binary image. \item \lstinline{b.gcr.io/tensorflow/tensorflow:latest-devel}:CPU Binary image plus source code. \item \lstinline{b.gcr.io/tensorflow/tensorflow:latest-gpu}:TensorFlow GPU binary image. \item \lstinline{b.gcr.io/tensorflow/tensorflow:latest-devel-gpu}:GPU Binary image plus source code. \end{itemize} We also have tags with latest replaced by a released version (eg \lstinline{0.6.0-gpu}). With Docker the installation is as follows: \begin{itemize} \item Install Docker on your machine. 
\item Create a \href{http://docs.docker.com/engine/installation/ubuntulinux/#create-a-docker-group}{Docker group} to allow launching containers without sudo. \item Launch a Docker container with the TensorFlow image. The image gets downloaded automatically on first launch. \end{itemize} See \href{http://docs.docker.com/engine/installation/}{installing Docker} for instructions on installing Docker on your machine. After Docker is installed, launch a Docker container with the TensorFlow binary image as follows. \begin{lstlisting} $ docker run -it b.gcr.io/tensorflow/tensorflow \end{lstlisting} If you're using a container with GPU support, some additional flags must be passed to expose the GPU device to the container. For the default config, we include a \href{https://github.com/tensorflow/tensorflow/blob/master/tensorflow/tools/docker/docker_run_gpu.sh}{script} in the repo with these flags, so the command-line would look like: \begin{lstlisting} $ path/to/repo/tensorflow/tools/docker/docker_run_gpu.sh b.gcr.io/tensorflow/tensorflow:gpu \end{lstlisting} You can now \hyperref[test_install]{test your installation} within the Docker container. % %% \subsection {Test the TensorFlow installation | 测试 TensorFlow 安装} \label{test_install} %%% \subsubsection {(Optional, Linux) Enable GPU Support} \textcolor{etc}{If you installed the GPU version of TensorFlow, you must also install the Cuda Toolkit~7.0 and CUDNN~6.5~V2. Please see \hyperref[install_cuda]{Cuda installation}.} Ⓒ 如果您安装了GPU版本的TensorFlow, 您还需要安装 Cuda Toolkit~7.0 和 CUDNN~6.5~V2.请参阅\hyperref[install_cuda]{Cuda 安装}. \textcolor{etc}{You also need to set the \lstinline{LD_LIBRARY_PATH} and \lstinline{CUDA_HOME} environment variables. Consider adding the commands below to your} \lstinline{~/.bash_profile}. 
\textcolor{etc}{These assume your CUDA installation is in \lstinline{/usr/local/cuda}:}

Ⓒ 您需要先在环境变量中设置\lstinline{LD_LIBRARY_PATH} 和 \lstinline{CUDA_HOME}.您可以在\lstinline{~/.bash_profile}中追加以下命令,假设您的CUDA安装位置为\lstinline{/usr/local/cuda}:

\begin{lstlisting}
export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/cuda/lib64"
export CUDA_HOME=/usr/local/cuda
\end{lstlisting}

%%%
\subsubsection {Run TensorFlow from the Command Line | 从命令行运行TensorFlow}

Ⓔ \textcolor{etc}{See \hyperref[comm_prob]{common problems} if an error happens.}

Ⓒ 如果遇到任何报错,请参考\hyperref[comm_prob]{常见问题}.

Ⓔ \textcolor{etc}{Open a terminal and type the following:}

Ⓒ 打开终端,输入以下指令:

\begin{lstlisting}
$ python
...
>>> import tensorflow as tf
>>> hello = tf.constant('Hello, TensorFlow!')
>>> sess = tf.Session()
>>> print(sess.run(hello))
Hello, TensorFlow!
>>> a = tf.constant(10)
>>> b = tf.constant(32)
>>> print(sess.run(a + b))
42
>>>
\end{lstlisting}

%%%
\subsubsection {Run a TensorFlow demo model | 运行一个TensorFlow的演示模型}

Ⓔ \textcolor{etc}{All TensorFlow packages, including the demo models, are installed in the Python library. The exact location of the Python library depends on your system, but is usually one of:}

Ⓒ 所有 TensorFlow 包(包括演示模型)都安装在 Python 库中. 具体位置取决于您的系统,它们通常会在以下位置出现:

\begin{lstlisting}
/usr/local/lib/python2.7/dist-packages/tensorflow
/usr/local/lib/python2.7/site-packages/tensorflow
\end{lstlisting}

Ⓔ \textcolor{etc}{You can find out the directory with the following command:}

Ⓒ 您可以用以下指令找到它的路径:

\begin{lstlisting}
$ python -c 'import site; print("\n".join(site.getsitepackages()))'
\end{lstlisting}

Ⓔ \textcolor{etc}{The simple demo model for classifying handwritten digits from the MNIST dataset is in the sub-directory \lstinline{models/image/mnist/convolutional.py}.
You can run it from the command line as follows: Ⓒ 在子目录\lstinline{models/image/mnist/convolutional.py}可以找到一个使用MNIST数据集进行手写数字识别的简单案例.您可以使用以下指令在命令行中直接运行: \begin{lstlisting} # Using 'python -m' to find the program in the python search path: $ python -m tensorflow.models.image.mnist.convolutional Extracting data/train-images-idx3-ubyte.gz Extracting data/train-labels-idx1-ubyte.gz Extracting data/t10k-images-idx3-ubyte.gz Extracting data/t10k-labels-idx1-ubyte.gz ...etc... # You can alternatively pass the path to the model program file to the python interpreter. $ python /usr/local/lib/python2.7/dist-packages/tensorflow/models/image/mnist/convolutional.py ... \end{lstlisting} % %% \subsection {Installing from source} Ⓔ \textcolor{etc}{When installing from source you will build a pip wheel that you then install using pip. You'll need pip for that, so install it as described \hyperref[pip_install]{above}.} %%% \subsubsection {Clone the TensorFlow repository} \begin{lstlisting} $ git clone --recurse-submodules https://github.com/tensorflow/tensorflow \end{lstlisting} \lstinline{--recurse-submodules} is required to fetch the protobuf library that TensorFlow depends on. %%% \subsubsection {Installation for Linux} \paragraph{Install Bazel} Follow instructions here to install the dependencies for Bazel. Then download bazel version 0.1.1 using the installer for your system and run the installer as mentioned there: \begin{lstlisting} $ chmod +x PATH_TO_INSTALL.SH $ ./PATH_TO_INSTALL.SH --user \end{lstlisting} Remember to replace \lstinline{PATH_TO_INSTALL.SH} with the location where you downloaded the installer. Finally, follow the instructions in that script to place bazel into your binary path. \paragraph{Install other dependencies}\\ \begin{lstlisting} $ sudo apt-get install python-numpy swig python-dev \end{lstlisting} \paragraph{Configure the installation} Run the configure script at the root of the tree. 
The configure script asks you for the path to your python interpreter and allows (optional) configuration of the CUDA libraries (see \hyperref[install_cuda]{below}). This step is used to locate the python and numpy header files. \begin{lstlisting} $ ./configure Please specify the location of python. [Default is /usr/bin/python]: \end{lstlisting} \paragraph{Optional: Install CUDA (GPUs on Linux)} \label{install_cuda} In order to build or run TensorFlow with GPU support, both Cuda Toolkit 7.0 and CUDNN 6.5 V2 from NVIDIA need to be installed. TensorFlow GPU support requires having a GPU card with NVidia Compute Capability >= 3.5. Supported cards include but are not limited to: \begin{itemize} \item NVidia Titan \item NVidia Titan X \item NVidia K20 \item NVidia K40 \end{itemize} Download and install Cuda Toolkit 7.0 https://developer.nvidia.com/cuda-toolkit-70 Install the toolkit into e.g. /usr/local/cuda Download and install CUDNN Toolkit 6.5 https://developer.nvidia.com/rdp/cudnn-archive Uncompress and copy the cudnn files into the toolkit directory. Assuming the toolkit is installed in /usr/local/cuda: \begin{lstlisting} tar xvzf cudnn-6.5-linux-x64-v2.tgz sudo cp cudnn-6.5-linux-x64-v2/cudnn.h /usr/local/cuda/include sudo cp cudnn-6.5-linux-x64-v2/libcudnn* /usr/local/cuda/lib64 \end{lstlisting} Configure TensorFlow's canonical view of Cuda libraries When running the configure script from the root of your source tree, select the option Y when asked to build TensorFlow with GPU support. \begin{lstlisting} $ ./configure Please specify the location of python. [Default is /usr/bin/python]: Do you wish to build TensorFlow with GPU support? [y/N] y GPU support will be enabled for TensorFlow Please specify the location where CUDA 7.0 toolkit is installed. Refer to README.md for more details. [default is: /usr/local/cuda]: /usr/local/cuda Please specify the location where CUDNN 6.5 V2 library is installed. Refer to README.md for more details. 
[default is: /usr/local/cuda]: /usr/local/cuda Setting up Cuda include Setting up Cuda lib64 Setting up Cuda bin Setting up Cuda nvvm Configuration finished \end{lstlisting} This creates a canonical set of symbolic links to the Cuda libraries on your system. Every time you change the Cuda library paths you need to run this step again before you invoke the bazel build command. Build your target with GPU support From the root of your source tree, run: \begin{lstlisting} $ bazel build -c opt --config=cuda //tensorflow/cc:tutorials_example_trainer $ bazel-bin/tensorflow/cc/tutorials_example_trainer --use_gpu # Lots of output. This tutorial iteratively calculates the major eigenvalue of # a 2x2 matrix, on GPU. The last few lines look like this. 000009/000005 lambda = 2.000000 x = [0.894427 -0.447214] y = [1.788854 -0.894427] 000006/000001 lambda = 2.000000 x = [0.894427 -0.447214] y = [1.788854 -0.894427] 000009/000009 lambda = 2.000000 x = [0.894427 -0.447214] y = [1.788854 -0.894427] \end{lstlisting} Note that "--config=cuda" is needed to enable the GPU support. Enabling Cuda 3.0 TensorFlow officially supports Cuda devices with 3.5 and 5.2 compute capabilities. In order to enable earlier Cuda devices such as Grid K520, you need to target Cuda 3.0. This can be done through TensorFlow unofficial settings with "configure". \begin{lstlisting} $ TF_UNOFFICIAL_SETTING=1 ./configure # Same as the official settings above WARNING: You are configuring unofficial settings in TensorFlow. Because some external libraries are not backward compatible, these settings are largely untested and unsupported. Please specify a list of comma-separated Cuda compute capabilities you want to build with. You can find the compute capability of your device at: https://developer.nvidia.com/cuda-gpus. Please note that each additional compute capability significantly increases your build time and binary size. 
[Default is: "3.5,5.2"]: 3.0

Setting up Cuda include
Setting up Cuda lib64
Setting up Cuda bin
Setting up Cuda nvvm
Configuration finished
\end{lstlisting}

\paragraph{Known issues}

\begin{itemize}
\item Although it is possible to build both Cuda and non-Cuda configs under the same source tree, we recommend running \lstinline{bazel clean} when switching between these two configs in the same source tree.
\item You have to run \lstinline{configure} before running \lstinline{bazel build}. Otherwise, the build will fail with a clear error message. In the future, we might make this more convenient by including the configure step in our build process, once Bazel supports the features this requires.
\end{itemize}

%%%
\subsubsection {Installation for Mac OS X}

We recommend using \href{http://brew.sh/}{homebrew} to install the bazel and SWIG dependencies, and installing python dependencies using \lstinline{easy_install} or pip.

\paragraph{Dependencies}

Follow instructions here to install the dependencies for Bazel. You can then use homebrew to install bazel and SWIG:

\begin{lstlisting}
$ brew install bazel swig
\end{lstlisting}

You can install the python dependencies using \lstinline{easy_install} or pip. Using \lstinline{easy_install}, run

\begin{lstlisting}
$ sudo easy_install -U six
$ sudo easy_install -U numpy
$ sudo easy_install wheel
\end{lstlisting}

We also recommend the \href{https://ipython.org/}{ipython} enhanced python shell, so it is best to install that too:

\begin{lstlisting}
$ sudo easy_install ipython
\end{lstlisting}

\paragraph{Configure the installation}

Run the \lstinline{configure} script at the root of the tree. The configure script asks you for the path to your python interpreter. This step is used to locate the python and numpy header files.

\begin{lstlisting}
$ ./configure
Please specify the location of python. [Default is /usr/bin/python]:
Do you wish to build TensorFlow with GPU support?
[y/N]
\end{lstlisting}

%%%
\subsubsection {Create the pip package and install}

\begin{lstlisting}
$ bazel build -c opt //tensorflow/tools/pip_package:build_pip_package

# To build with GPU support:
$ bazel build -c opt --config=cuda //tensorflow/tools/pip_package:build_pip_package

$ bazel-bin/tensorflow/tools/pip_package/build_pip_package /tmp/tensorflow_pkg

# The name of the .whl file will depend on your platform.
$ pip install /tmp/tensorflow_pkg/tensorflow-0.5.0-cp27-none-linux_x86_64.whl
\end{lstlisting}

%
%%
\subsection {Train your first TensorFlow neural net model | 训练第一个TensorFlow模型}

Ⓔ \textcolor{etc}{Starting from the root of your source tree, run:}

Ⓒ 从源码树的根目录开始,运行以下指令:

\begin{lstlisting}
$ cd tensorflow/models/image/mnist
$ python convolutional.py
Succesfully downloaded train-images-idx3-ubyte.gz 9912422 bytes.
Succesfully downloaded train-labels-idx1-ubyte.gz 28881 bytes.
Succesfully downloaded t10k-images-idx3-ubyte.gz 1648877 bytes.
Succesfully downloaded t10k-labels-idx1-ubyte.gz 4542 bytes.
Extracting data/train-images-idx3-ubyte.gz
Extracting data/train-labels-idx1-ubyte.gz
Extracting data/t10k-images-idx3-ubyte.gz
Extracting data/t10k-labels-idx1-ubyte.gz
Initialized!
Epoch 0.00
Minibatch loss: 12.054, learning rate: 0.010000
Minibatch error: 90.6%
Validation error: 84.6%
Epoch 0.12
Minibatch loss: 3.285, learning rate: 0.010000
Minibatch error: 6.2%
Validation error: 7.0%
...
...
\end{lstlisting}

%
%%
\subsection {Common Problems | 常见问题}
\label{comm_prob}

%%%
\subsubsection {GPU-related issues | GPU有关问题}

If you encounter the following when trying to run a TensorFlow program:

\begin{lstlisting}
ImportError: libcudart.so.7.0: cannot open shared object file: No such file or directory
\end{lstlisting}

make sure you followed the GPU installation \hyperref[install_cuda]{instructions}.
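One way to narrow down this import error is to check whether any directory listed in \lstinline{LD_LIBRARY_PATH} actually contains the CUDA runtime library. The following is a minimal sketch, assuming \lstinline{LD_LIBRARY_PATH} was set as described in the GPU setup section. The helper \lstinline{find_cudart} is a hypothetical name used for illustration and is not a TensorFlow function.

```python
import os


def find_cudart(library_path, prefix="libcudart.so"):
    """Return paths of files whose names start with `prefix`.

    `library_path` is a list of directories joined by os.pathsep (':' on
    Linux), in the same form as LD_LIBRARY_PATH. Entries that are empty
    or are not existing directories are skipped.
    """
    matches = []
    for directory in library_path.split(os.pathsep):
        if not directory or not os.path.isdir(directory):
            continue
        for name in sorted(os.listdir(directory)):
            if name.startswith(prefix):
                matches.append(os.path.join(directory, name))
    return matches


if __name__ == "__main__":
    found = find_cudart(os.environ.get("LD_LIBRARY_PATH", ""))
    print(found if found else "no libcudart found on LD_LIBRARY_PATH")
```

An empty result does not prove that the library is missing, because the dynamic loader also searches other locations, but it is a strong hint that \lstinline{LD_LIBRARY_PATH} does not yet include the \lstinline{lib64} directory of the CUDA installation.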
%%%
\subsubsection {Pip installation issues | Pip安装中的问题}

\paragraph{\lstinline{Can't find setup.py}}

If, during pip install, you encounter an error like:

\begin{lstlisting}
...
IOError: [Errno 2] No such file or directory: '/tmp/pip-o6Tpui-build/setup.py'
\end{lstlisting}

Solution: upgrade your version of pip:

\begin{lstlisting}
pip install --upgrade pip
\end{lstlisting}

This may require sudo, depending on how pip is installed.

\paragraph{\lstinline{SSLError: SSL_VERIFY_FAILED}}

If, during pip install from a URL, you encounter an error like:

\begin{lstlisting}
...
SSLError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed
\end{lstlisting}

Solution: Download the wheel manually via curl or wget, and pip install locally.

%%%
\subsubsection {Linux issues}

If you encounter:

\begin{lstlisting}
...
"__add__", "__radd__",
^
SyntaxError: invalid syntax
\end{lstlisting}

Solution: make sure you are using Python 2.7.

%%%
\subsubsection {Mac OS X: ImportError: No module named copyreg}

On Mac OS X, you may encounter the following when importing tensorflow.

\begin{lstlisting}
>>> import tensorflow as tf
...
ImportError: No module named copyreg
\end{lstlisting}

Solution: TensorFlow depends on protobuf, which requires the Python package \lstinline{six-1.10.0}. Apple's default Python installation only provides \lstinline{six-1.4.1}.

You can resolve the issue in one of the following ways:

\begin{itemize}
\item Upgrade the Python installation with the current version of \lstinline{six}:
\begin{lstlisting}
$ sudo easy_install -U six
\end{lstlisting}
\item Install TensorFlow with a separate Python library:
\begin{itemize}
\item Virtualenv
\item Docker
\end{itemize}
\item Install a separate copy of Python via Homebrew or MacPorts and re-install TensorFlow in that copy of Python.
\end{itemize}

%%%
\subsubsection {Mac OS X: TypeError: \lstinline{__init__()} got an unexpected keyword argument 'syntax'}

On Mac OS X, you may encounter the following when importing tensorflow.
\begin{lstlisting}
>>> import tensorflow as tf
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/site-packages/tensorflow/__init__.py", line 4, in <module>
    from tensorflow.python import *
  File "/usr/local/lib/python2.7/site-packages/tensorflow/python/__init__.py", line 13, in <module>
    from tensorflow.core.framework.graph_pb2 import *
...
  File "/usr/local/lib/python2.7/site-packages/tensorflow/core/framework/tensor_shape_pb2.py", line 22, in <module>
    serialized_pb=_b('\n,tensorflow/core/framework/tensor_shape.proto\x12\ntensorflow\"d\n\x10TensorShapeProto\x12-\n\x03\x64im\x18\x02 \x03(\x0b\x32 .tensorflow.TensorShapeProto.Dim\x1a!\n\x03\x44im\x12\x0c\n\x04size\x18\x01 \x01(\x03\x12\x0c\n\x04name\x18\x02 \x01(\tb\x06proto3')
TypeError: __init__() got an unexpected keyword argument 'syntax'
\end{lstlisting}

This is due to a conflict between protobuf versions (we require protobuf 3.0.0). The best current solution is to make sure that no older version of protobuf remains installed, for example by upgrading it:

\begin{lstlisting}
$ pip install --upgrade protobuf
\end{lstlisting}

原文:\href{http://tensorflow.org/get_started/os_setup.md}{Download and Setup}

================================================
FILE: tex_pdf/get_started/c1s03_basic_usage.tex
================================================
%!TEX program = xelatex
% Encoding: UTF8
% SEIKA 2016 | seika@live.ca
% Chapter 1
% Section 1.3 Introduction

\section{Basic Usage || 使用基础} \label{basic_usage}

Ⓔ \textcolor{etc}{To use TensorFlow you need to understand how TensorFlow:}

\begin{itemize}
\item \textcolor{etc}{Represents computations as \emph{graphs}.}
\item \textcolor{etc}{Executes graphs in the context of \lstinline{Sessions}.}
\item \textcolor{etc}{Represents data as \emph{tensors}.}
\item \textcolor{etc}{Maintains state with \lstinline{Variables}.}
\item \textcolor{etc}{Uses feeds and fetches to get data into and out of arbitrary operations.}
\end{itemize}

Ⓒ 使用TensorFlow之前你需要了解关于TensorFlow的以下基础知识:

\begin{itemize}
\item
使用\emph{图}(\emph{graphs})来表示计算.\index{graph} \item 在\emph{会话}(\lstinline{Session})中执行图.\index{session} \item 使用\emph{张量}(\emph{tensors})来代表数据.\index{tensor} \item 通过\emph{变量}(\lstinline{Variables})维护状态.\index{variable} \item 使用\emph{供给}(\lstinline{feeds})和\emph{取回}(\lstinline{fetches})将数据传入或传出任何操作. \end{itemize} % %% \subsection{Overview | 总览} Ⓔ \textcolor{etc}{TensorFlow is a programming system in which you represent computations as graphs. Nodes in the graph are called ops (short for operations). An op takes zero or more \emph{Tensors}, performs some computation, and produces zero or more \emph{Tensors}. A Tensor is a typed multi-dimensional array. For example, you can represent a mini-batch of images as a 4-D array of floating point numbers with dimensions \lstinline{[batch, height, width, channels]}.} Ⓒ TensorFlow 是一个以\emph{图}(\emph{graphs})来表示计算的编程系统,图中的节点被称之为 op~(operation的缩写). 一个 op 获得零或多个\emph{张量}(\emph{tensors})执行计算,产生零或多个\emph{张量}。\emph{张量} 是一个按类型划分的多维数组。例如, 你可以将一小组图像集表示为一个四维浮点数数组,这四个维度分别是\lstinline{[batch, height, width, channels]}。 Ⓔ \textcolor{etc}{A TensorFlow graph is a description of computations. To compute anything, a graph must be launched in a \lstinline{Session}. A Session places the graph ops onto \lstinline{Devices}, such as CPUs or GPUs, and provides methods to execute them. 
These methods return tensors produced by ops as \href{http://www.numpy.org/}{numpy} ndarray objects in Python, and as \lstinline{tensorflow::Tensor} instances in C and C++.} Ⓒ TensorFlow的图是一种对计算的抽象描述。在计算开始前, 图必须在 \emph{会话}(\lstinline{Session()}) 中被启动.\emph{会话}将图的op分发到如 CPU 或 GPU 之类的 \emph{设备}(\lstinline{Devices()}) 上, 同时提供执行 op 的方法。这些方法执行后, 将产生的\emph{张量}(\emph{tensor})返回。在 Python 语言中, 将返回\href{http://www.numpy.org}{numpy}的\lstinline{ndarray} 对象; 在 C 和 C++ 语言中, 将返回\lstinline{tensorflow::Tensor}实例。 % %% %✠ \subsection{The computation graph} \subsection {The computation graph | 计算图} \label{computation_graph} Ⓔ \textcolor{etc}{TensorFlow programs are usually structured into a \emph{construction phase}, that assembles a graph, and an \emph{execution phase} that uses a session to execute ops in the graph.} Ⓒ 通常,TensorFlow编程可按两个阶段组织起来:\emph{构建阶段}和\emph{执行阶段};前者用于组织\emph{计算图},而后者利用session中执行\emph{计算图}中的op操作。 Ⓔ \textcolor{etc}{For example, it is common to create a graph to represent and train a neural network in the construction phase, and then repeatedly execute a set of training ops in the graph in the execution phase.} Ⓒ 例如,在构建阶段创建一个图来表示和训练神经网络,然后在执行阶段反复执行一组op来实现图中的训练。 Ⓔ \textcolor{etc}{TensorFlow can be used from C, C++, and Python programs. It is presently much easier to use the Python library to assemble graphs, as it provides a large set of helper functions not available in the C and C++ libraries.} Ⓒ TensorFlow 支持 C、 C++、 Python 编程语言。 目前, TensorFlow 的 Python 库更加易用,它提供了大量的辅助函数来简化构建图的工作, 而这些函数在 C 和 C++ 库中尚不被支持。 Ⓔ \textcolor{etc}{The session libraries have equivalent functionalities for the three languages.} Ⓒ 这三种语言的会话库 (session libraries) 是一致的. 
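
The split between a construction phase and an execution phase can be illustrated without TensorFlow at all. The toy sketch below is an illustration under stated assumptions, not TensorFlow's implementation: \lstinline{Node}, \lstinline{ToySession}, \lstinline{constant}, \lstinline{add} and \lstinline{multiply} are hypothetical names invented for this example. Building the nodes computes nothing; values appear only when the session walks the graph.

```python
class Node:
    """A deferred computation: an op name, its input nodes, and a function.

    Hypothetical class for illustration; it is not part of TensorFlow.
    """

    def __init__(self, name, inputs, fn):
        self.name = name
        self.inputs = inputs
        self.fn = fn


def constant(value):
    # A source op: it takes no inputs and produces a fixed value.
    return Node("Const", [], lambda: value)


def add(a, b):
    return Node("Add", [a, b], lambda x, y: x + y)


def multiply(a, b):
    return Node("Mul", [a, b], lambda x, y: x * y)


class ToySession:
    """Evaluates a node by first evaluating every node it depends on."""

    def run(self, node):
        cache = {}

        def evaluate(n):
            # Each node is evaluated at most once per run() call.
            if id(n) not in cache:
                args = [evaluate(i) for i in n.inputs]
                cache[id(n)] = n.fn(*args)
            return cache[id(n)]

        return evaluate(node)


# Construction phase: only the graph is assembled, nothing is computed yet.
total = add(constant(2.0), constant(5.0))
product = multiply(constant(3.0), total)

# Execution phase: the session runs the ops needed to produce 'product'.
sess = ToySession()
result = sess.run(product)
print(result)  # 21.0
```

The real library follows the same shape: op constructors return handles to outputs, and a session decides when and where the ops actually run.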
%%%
% \subsubsection {Building the graph}
\subsubsection {Building the graph | 构建计算图}

Ⓔ \textcolor{etc}{To build a graph, start with ops that do not need any input (source ops), such as \lstinline{Constant}, and pass their output to other ops that do computation.}

Ⓒ 构建图的第一步是创建不需要任何输入的源 op(source op),例如常量(\lstinline{Constant}),再将它们的输出传递给其它 op 执行运算。

Ⓔ \textcolor{etc}{The op constructors in the Python library return objects that stand for the output of the constructed ops. You can pass these to other op constructors to use as inputs.}

Ⓒ Python 库中的 op 构造函数返回的对象代表所构造 op 的输出,这些对象可以传递给其它 op 构造函数作为输入。

Ⓔ \textcolor{etc}{The TensorFlow Python library has a default graph to which op constructors add nodes. The default graph is sufficient for many applications. See the \hyperref[class-tf.graph]{Graph class} documentation for how to explicitly manage multiple graphs.}

Ⓒ TensorFlow Python 库有一个默认图 (default graph),op 构造函数会向其中加入节点。对大多数应用来说,这个默认图已经足够用了。阅读 \hyperref[class-tf.graph]{Graph 类}文档来了解如何显式地管理多个图.

\begin{lstlisting}
import tensorflow as tf

# Create a Constant op that produces a 1x2 matrix. The op is
# added as a node to the default graph.
#
# The value returned by the constructor represents the output
# of the Constant op.
matrix1 = tf.constant([[3., 3.]])

# Create another Constant that produces a 2x1 matrix.
matrix2 = tf.constant([[2.],[2.]])

# Create a Matmul op that takes 'matrix1' and 'matrix2' as inputs.
# The returned value, 'product', represents the result of the matrix
# multiplication.
product = tf.matmul(matrix1, matrix2)
\end{lstlisting}

Ⓔ \textcolor{etc}{The default graph now has three nodes: two constant() ops and one matmul() op. To actually multiply the matrices, and get the result of the multiplication, you must launch the graph in a session.}

Ⓒ 默认图现在拥有三个节点,两个\lstinline{constant()} op,一个\lstinline{matmul()} op.
为了真正进行矩阵乘法运算,得到乘法结果, 你必须在一个会话(session)中启动这个图。

%%%
\subsubsection {Launching the graph in a session | 在会话中载入图} \label{launching_graph}

Ⓔ \textcolor{etc}{Launching follows construction. To launch a graph, create a Session object. Without arguments the session constructor launches the default graph.}

Ⓔ \textcolor{etc}{See the \hyperref[class-tf.session]{Session class} for the complete session API.}

Ⓒ 构建过程完成后就可运行执行过程。为了载入之前所构建的图,必须先创建一个\emph{会话}对象(\lstinline{Session} object)。会话构建器在未指明参数时会载入默认的图。

Ⓒ 完整的会话API资料,请参见\hyperref[class-tf.session]{\emph{会话类}(Session object)}。

\begin{lstlisting}
# Launch the default graph.
sess = tf.Session()

# To run the matmul op we call the session 'run()' method, passing 'product'
# which represents the output of the matmul op. This indicates to the call
# that we want to get the output of the matmul op back.
#
# All inputs needed by the op are run automatically by the session. They
# typically are run in parallel.
#
# The call 'run(product)' thus causes the execution of three ops in the
# graph: the two constants and matmul.
#
# The output of the op is returned in 'result' as a numpy `ndarray` object.
result = sess.run(product)
print(result)
# ==> [[ 12.]]

# Close the Session when we're done.
sess.close()
\end{lstlisting}

Ⓔ \textcolor{etc}{Sessions should be closed to release resources. You can also enter a Session with a "with" block. The Session closes automatically at the end of the with block.}

Ⓒ 会话在完成后必须关闭以释放资源。你也可以使用\lstinline{"with"}句块开始一个会话,该会话将在\lstinline{"with"}句块结束时自动关闭。

\begin{lstlisting}
with tf.Session() as sess:
  result = sess.run([product])
  print(result)
\end{lstlisting}

Ⓔ \textcolor{etc}{The TensorFlow implementation translates the graph definition into executable operations distributed across available compute resources, such as the CPU or one of your computer's GPU cards. In general you do not have to specify CPUs or GPUs explicitly.
TensorFlow uses your first GPU, if you have one, for as many operations as possible.} Ⓒ TensorFlow 事实上通过一个“翻译”过程,将定义的图转化为不同的可用计算资源间实现分布计算的操作,如CPU或是显卡GPU。通常不需要用户指定具体使用的CPU或GPU,TensorFlow 能自动检测并尽可能的充分利用找到的第一个 GPU 进行运算。 Ⓔ \textcolor{etc}{If you have more than one GPU available on your machine, to use a GPU beyond the first you must assign ops to it explicitly. Use with...Device statements to specify which CPU or GPU to use for operations:} Ⓒ 如果你的设备上有不止一个GPU,你需要明确指定op操作到不同的运算设备以调用它们。使用\lstinline{with...Device}语句明确指定哪个CPU或GPU将被调用: \begin{lstlisting} with tf.Session() as sess: with tf.device("/gpu:1"): matrix1 = tf.constant([[3., 3.]]) matrix2 = tf.constant([[2.],[2.]]) product = tf.matmul(matrix1, matrix2) ... \end{lstlisting} Ⓔ \textcolor{etc}{Devices are specified with strings. The currently supported devices are:}\\ Ⓔ \textcolor{etc}{\lstinline{"/cpu:0"}: The CPU of your machine.}\\ Ⓔ \textcolor{etc}{\lstinline{"/gpu:0"}: The GPU of your machine, if you have one.}\\ Ⓔ \textcolor{etc}{\lstinline{"/gpu:1"}: The second GPU of your machine, etc.} Ⓔ \textcolor{etc}{See Using GPUs for more information about GPUs and TensorFlow.} Ⓒ 使用字符串指定设备,目前支持的设备包括:\\ Ⓒ \lstinline{"/cpu:0"}:计算机的CPU;\\ Ⓒ \lstinline{"/gpu:0"}:计算机的第一个GPU,如果可用;\\ Ⓒ \lstinline{"/gpu:1"}:计算机的第二个GPU,以此类推。 Ⓒ 关于使用GPU的更多信息,请参阅\textbf{GPU使用}。 % %% \subsection{Interactive Usage | 交互式使用} Ⓔ \textcolor{etc}{The Python examples in the documentation launch the graph with a \hyperref[class-tf.session]{\lstinline{Session}} and use the \hyperref[tf.session.run]{\lstinline{Session.run()}} method to execute operations.} Ⓔ \textcolor{etc}{For ease of use in interactive Python environments, such as \href{http://ipython.org/}{\lstinline{IPython}} you can instead use the \hyperref[class-tf.interactivesession]{\lstinline{InteractiveSession}} class, and the \hyperref[tf.tensor.eval]{\lstinline{Tensor.eval()}} and \hyperref[tf.operation.run]{\lstinline{Operation.run()}} methods. 
This avoids having to keep a variable holding the session.}

Ⓒ 文档中的 Python 示例使用一个会话 \hyperref[class-tf.session]{\lstinline{Session}} 来启动图, 并调用 \hyperref[tf.session.run]{\lstinline{Session.run()}} 方法执行操作。

Ⓒ 考虑到如\href{http://ipython.org}{IPython}这样的交互式Python环境的易用, 可以使用\hyperref[class-tf.interactivesession]{\lstinline{InteractiveSession}} 代替\lstinline{Session}类, 使用 \hyperref[tf.tensor.eval]{\lstinline{Tensor.eval()}}和 \hyperref[tf.operation.run]{\lstinline{Operation.run()}} 方法代替 \lstinline{Session.run()}. 这样可以避免使用一个变量来持有会话.

\begin{lstlisting}
# Enter an interactive TensorFlow Session.
import tensorflow as tf
sess = tf.InteractiveSession()

x = tf.Variable([1.0, 2.0])
a = tf.constant([3.0, 3.0])

# Initialize 'x' using the run() method of its initializer op.
x.initializer.run()

# Add an op to subtract 'a' from 'x'. Run it and print the result
sub = tf.subtract(x, a)
print(sub.eval())
# ==> [-2. -1.]

# Close the Session when we're done.
sess.close()
\end{lstlisting}

%
%%
\subsection{Tensors | 张量}

Ⓔ \textcolor{etc}{TensorFlow programs use a tensor data structure to represent all data -- only tensors are passed between operations in the computation graph. You can think of a TensorFlow tensor as an n-dimensional array or list. A tensor has a static type, a rank, and a shape. To learn more about how TensorFlow handles these concepts, see the \href{https://www.tensorflow.org/versions/master/resources/dims_types.html#tensor-ranks-shapes-and-types}{Rank, Shape, and Type} reference.}

Ⓒ TensorFlow 程序使用 tensor 数据结构来代表所有的数据, 计算图中, 操作间传递的数据都是 tensor. 你可以把 TensorFlow 的张量看作是一个 n 维的数组或列表. 一个 tensor 包含一个静态类型 (type)、一个 rank 和一个 shape. 想了解 TensorFlow 是如何处理这些概念的, 参见 Rank, Shape, 和 Type。

%
%%
\subsection{Variables | 变量}

Ⓔ \textcolor{etc}{Variables maintain state across executions of the graph. The following example shows a variable serving as a simple counter.
See \hyperref[variables]{Variables} for more details.}

Ⓒ \emph{变量}维持了图执行过程中的状态信息。下面的例子演示了如何使用变量实现一个简单的计数器,更多细节详见\hyperref[variables]{变量}章节。

\begin{lstlisting}
# Create a Variable, that will be initialized to the scalar value 0.
# 建立一个变量,用0初始化它的值
state = tf.Variable(0, name="counter")

# Create an Op to add one to `state`.
one = tf.constant(1)
new_value = tf.add(state, one)
update = tf.assign(state, new_value)

# Variables must be initialized by running an `init` Op after having
# launched the graph. We first have to add the `init` Op to the graph.
init_op = tf.global_variables_initializer()

# Launch the graph and run the ops.
with tf.Session() as sess:
  # Run the 'init' op
  sess.run(init_op)
  # Print the initial value of 'state'
  print(sess.run(state))
  # Run the op that updates 'state' and print 'state'.
  for _ in range(3):
    sess.run(update)
    print(sess.run(state))

# output:

# 0
# 1
# 2
# 3
\end{lstlisting}

Ⓔ \textcolor{etc}{The \lstinline{assign()} operation in this code is a part of the expression graph just like the \lstinline{add()} operation, so it does not actually perform the assignment until \lstinline{run()} executes the expression.}

Ⓒ 代码中\lstinline{assign()}操作是图所描绘的表达式的一部分, 正如\lstinline{add()}操作一样. 所以在调用\lstinline{run()}执行表达式之前, 它并不会真正执行赋值操作.

Ⓔ \textcolor{etc}{You typically represent the parameters of a statistical model as a set of Variables. For example, you would store the weights for a neural network as a tensor in a Variable. During training you update this tensor by running a training graph repeatedly.}

Ⓒ 通常会将一个统计模型中的参数表示为一组变量. 例如, 你可以将一个神经网络的权重作为一个 tensor 存储在某个变量中. 在训练过程中, 通过重复运行训练图, 更新这个 tensor.

%
%%
\subsection{Fetches | 取回}

Ⓔ \textcolor{etc}{To fetch the outputs of operations, execute the graph with a \lstinline{run()} call on the \lstinline{Session} object and pass in the tensors to retrieve.
In the previous example we fetched the single node \lstinline{state}, but you can also fetch multiple tensors:}

Ⓒ 为了取回操作的输出内容, 可以在使用 \lstinline{Session} 对象的 \lstinline{run()} 调用执行图时, 传入需要取回的 tensor. 在之前的例子里, 我们只取回了单个节点\lstinline{state}, 但是你也可以取回多个tensor:

\begin{lstlisting}
input1 = tf.constant(3.0)
input2 = tf.constant(2.0)
input3 = tf.constant(5.0)
intermed = tf.add(input2, input3)
mul = tf.multiply(input1, intermed)

with tf.Session() as sess:
  result = sess.run([mul, intermed])
  print(result)

# output:
# [array([ 21.], dtype=float32), array([ 7.], dtype=float32)]
\end{lstlisting}

Ⓔ \textcolor{etc}{All the ops needed to produce the values of the requested tensors are run once (not once per requested tensor).}

Ⓒ 产生所请求 tensor 的值所需的全部 op 只运行一次(而不是为每个请求的 tensor 各运行一次)。

%
%%
\subsection{Feeds | 供给}

Ⓔ \textcolor{etc}{The examples above introduce tensors into the computation graph by storing them in \lstinline{Constants} and \lstinline{Variables}. TensorFlow also provides a feed mechanism for patching a tensor directly into any operation in the graph.}

Ⓒ 上述示例在计算图中引入了 tensor, 以 \emph{常量} (\lstinline{Constants}) 或 \emph{变量}(\lstinline{Variables}) 的形式存储. TensorFlow 还提供了 \emph{供给}(\emph{feed}) 机制, 该机制可以将一个 tensor 直接插入到图中的任意操作.

Ⓔ \textcolor{etc}{A feed temporarily replaces the output of an operation with a tensor value. You supply feed data as an argument to a \lstinline{run()} call. The feed is only used for the run call to which it is passed. The most common use case involves designating specific operations to be "feed" operations by using \lstinline{tf.placeholder()} to create them:}

Ⓒ feed 使用一个 tensor 值临时替换一个操作的输出结果. 你可以提供 feed 数据作为 \lstinline{run()} 调用的参数. feed 只在接收它的那次 run 调用内有效, 调用结束, feed 就会消失. 最常见的用例是将某些特殊的操作指定为 "feed" 操作, 标记的方法是使用\lstinline{tf.placeholder()}为这些操作创建占位符.
\begin{lstlisting}
input1 = tf.placeholder(tf.float32)
input2 = tf.placeholder(tf.float32)
output = tf.multiply(input1, input2)

with tf.Session() as sess:
  print(sess.run([output], feed_dict={input1:[7.], input2:[2.]}))

# output:
# [array([ 14.], dtype=float32)]
\end{lstlisting}

Ⓔ \textcolor{etc}{A \lstinline{placeholder()} operation generates an error if you do not supply a feed for it. See the \hyperref[minist_tf]{MNIST fully-connected feed tutorial} (\href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/g3doc/tutorials/mnist/fully_connected_feed.py}{source code}) for a larger-scale example of feeds.}

Ⓒ 如果没有正确供给, \lstinline{placeholder()} 操作将会产生一个错误提示.关于feed的规模更大的案例,参见\hyperref[minist_tf]{MNIST 全连通 feed 教程}以及其\href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/g3doc/tutorials/mnist/fully_connected_feed.py}{源代码}。

\href{http://tensorflow.org/get_started/basic_usage.md}{原文:Basic Usage}

================================================
FILE: tex_pdf/how_tos/c3s00_overview.tex
================================================
% \section{Overview}\label{ux7efcux8ff0-overview}

Overview

\subsection{Variables: Creation, Initialization, Saving, and Restoring}\label{variables-ux521bux5efaux521dux59cbux5316ux4fddux5b58ux548cux6062ux590d}

TensorFlow Variables are in-memory buffers that hold tensors. This section describes how to use them to create, save, and update model parameters during training. \href{../how_tos/variables.md}{View Tutorial}

\subsection{TensorFlow Mechanics 101}\label{tensorflow-ux673aux5236-101}

A step-by-step walk through the details of using the TensorFlow infrastructure to train models at scale, using MNIST handwritten digit recognition as a small example. \href{../tutorials/mnist_tf.md}{View Tutorial}

\subsection{TensorBoard: Visualizing Learning}\label{tensorboard-ux5b66ux4e60ux8fc7ux7a0bux7684ux53efux89c6ux5316}

TensorBoard is a useful visualization tool for training and evaluating your model. This tutorial explains how to build and run TensorBoard, and how to add Summary ops so that data is automatically written to the event files that TensorBoard uses for display. \href{../how_tos/summaries_and_tensorboard.md}{View Tutorial}

\subsection{TensorBoard: Graph Visualization}\label{tensorboard-ux56feux7684ux53efux89c6ux5316}

This tutorial describes how to use the graph visualizer in TensorBoard, which helps you understand the dataflow graph and debug it. \href{../how_tos/graph_viz.md}{View Tutorial}

\subsection{Reading Data}\label{ux6570ux636eux8bfbux5165}

This tutorial describes the three main methods of getting data into your TensorFlow program: Feeding, Reading and Preloading. \href{../how_tos/reading_data.md}{View Tutorial}

\subsection{Threading and Queues}\label{ux7ebfux7a0bux548cux961fux5217}

This tutorial describes the various constructs in TensorFlow that make it easier to perform asynchronous and concurrent training. \href{../how_tos/threading_and_queues.md}{View Tutorial}

\subsection{Adding a New Op}\label{ux6dfbux52a0ux65b0ux7684-op}

TensorFlow already provides a large suite of node operations that you can freely use in your graph, but here are the details of how to add a custom op. \href{../how_tos/adding_an_op.md}{View Tutorial}.

\subsection{Custom Data Readers}\label{ux81eaux5b9aux4e49ux6570ux636eux7684-readers}

If you have a sizable custom data set, you may want to extend TensorFlow's Data Readers so that they read your data directly in its native format. \href{../how_tos/new_data_formats.md}{View Tutorial}.

\subsection{Using GPUs}\label{ux4f7fux7528-gpus}

This tutorial describes how to construct and execute models on multiple GPUs. \href{../how_tos/using_gpu.md}{View Tutorial}

\subsection{Sharing Variables}\label{ux5171ux4eabux53d8ux91cf-sharing-variables}

When deploying large models on multiple GPUs, or when unrolling complex LSTMs or RNNs, it is often necessary to access the same Variable objects from different locations in the model construction code. The Variable Scope mechanism is designed to facilitate that. \href{../how_tos/variable_scope/index.md}{View Tutorial}.

Original: \href{http://tensorflow.org/how_tos/index.html}{How-to} Translated by: \href{https://github.com/TerenceCooper}{Terence Cooper} Proofread by: \href{https://github.com/lonlonago}{lonlonago}

================================================
FILE: tex_pdf/how_tos/c3s01_variables.tex
================================================
%!TEX program = xelatex
% Encoding: UTF8
% SEIKA 2015
% Chapter 3 How to ...
% Section 3.1

\section{Variables: Creation, Initialization, Saving, and Loading}\label{variables}

When you train a model, you use variables to hold and update parameters. Variables are in-memory buffers containing tensors. They must be explicitly initialized and can be saved to disk during and after training. You can later restore the saved values to exercise or analyze the model.

This document references the following TensorFlow classes. See their reference documentation for a complete description of their API:

\begin{itemize}
\item The \lstinline{tf.Variable} class % add link here
\item The \lstinline{tf.train.Saver} class % add link here
\end{itemize}

\subsection {Creation}

When you create a variable, you pass a tensor as its initial value to the \lstinline{Variable()} constructor. TensorFlow provides a collection of ops that produce tensors often used for initialization, from constants or from random values. % add link here

\begin{lstlisting}
# Create two variables.
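weights = tf.Variable(tf.random_normal([784, 200], stddev=0.35),
                      name="weights")
biases = tf.Variable(tf.zeros([200]), name="biases")
\end{lstlisting}

Calling \lstinline{tf.Variable()} adds several ops to the graph:

\begin{itemize}
\item A variable op that holds the variable value.
\item An initializer op that sets the variable to its initial value. This is actually a \lstinline{tf.assign} op.
\item The ops for the initial value, such as the \lstinline{zeros} op for the \lstinline{biases} variable in the example, are also added to the graph.
\end{itemize}

The value returned by \lstinline{tf.Variable()} is an instance of the Python class \lstinline{tf.Variable}.

\subsection {Initialization}

Variable initializers must be run explicitly before other ops in your model can be run. The easiest way to do that is to add an op that runs all the variable initializers, and to run that op before using the model.

You can alternatively restore variable values from a checkpoint file, as described below.

Use \lstinline{tf.initialize_all_variables()} to add an op that runs the variable initializers. Only run that op after you have fully constructed your model and launched it in a session.

\begin{lstlisting}
# Create two variables.
weights = tf.Variable(tf.random_normal([784, 200], stddev=0.35),
                      name="weights")
biases = tf.Variable(tf.zeros([200]), name="biases")
...
# Add an op to initialize the variables.
init_op = tf.initialize_all_variables()

# Later, when launching the model
with tf.Session() as sess:
  # Run the init operation.
  sess.run(init_op)
  ...
  # Use the model
  ...
\end{lstlisting}

\subsubsection{Initialization from another Variable}

You sometimes need to initialize a variable from the initial value of another variable. Because the op added by \lstinline{tf.initialize_all_variables()} initializes all variables in parallel, you have to be careful when this is needed.

To initialize a new variable from the value of another variable, use the other variable's \lstinline{initialized_value()} method. You can use the initialized value directly as the initial value for the new variable, or you can use it as any other tensor to compute a value for the new variable.

\begin{lstlisting}
# Create a variable with a random value.
weights = tf.Variable(tf.random_normal([784, 200], stddev=0.35),
                      name="weights")
# Create another variable with the same value as 'weights'.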
w2 = tf.Variable(weights.initialized_value(), name="w2")
# Create another variable with twice the value of 'weights'
w_twice = tf.Variable(weights.initialized_value() * 2.0, name="w_twice")
\end{lstlisting}

\subsubsection{Custom Initialization}

The convenience function \lstinline{tf.initialize_all_variables()} adds an op that initializes all variables in the model. To initialize only an explicit list of variables, see \lstinline{tf.initialize_variables()}. See the Variables Documentation for more options, including checking whether variables are initialized.
% add link of Variables Documentaion here

\subsection {Saving and Restoring}

The easiest way to save and restore a model is to use a \lstinline{tf.train.Saver} object. The constructor adds \lstinline{save} and \lstinline{restore} ops to the graph for all, or a specified list, of the variables in the graph. The saver object provides methods to run these ops, specifying paths for the checkpoint files to write to or read from.

\subsubsection{Checkpoint Files}

Variables are saved in binary files that, roughly, contain a map from variable names to tensor values.

When you create a Saver object, you can optionally choose names for the variables in the checkpoint files. By default, it uses the value of the \lstinline{Variable.name} property for each variable.
% add link of Variable.name here
% https://www.tensorflow.org/versions/master/api_docs/python/state_ops.html#Variable.name

\subsubsection{Saving Variables}

Create a \lstinline{Saver} with \lstinline{tf.train.Saver()} to manage all variables in the model.

\begin{lstlisting}
# Create some variables.
v1 = tf.Variable(..., name="v1")
v2 = tf.Variable(..., name="v2")
...
# Add an op to initialize the variables.
init_op = tf.initialize_all_variables()

# Add ops to save and restore all the variables.
saver = tf.train.Saver()

# Later, launch the model, initialize the variables, do some work, save the
# variables to disk.
with tf.Session() as sess:
  sess.run(init_op)
  # Do some work with the model.
  ..
  # Save the variables to disk.
  save_path = saver.save(sess, "/tmp/model.ckpt")
  print("Model saved in file: %s" % save_path)
\end{lstlisting}

\subsubsection{Restoring Variables}

The same \lstinline{Saver} object is used to restore variables. Note that when you restore variables from a file you do not have to initialize them beforehand.

\begin{lstlisting}
# Create some variables.
v1 = tf.Variable(..., name="v1")
v2 = tf.Variable(..., name="v2")
...
# Add ops to save and restore all the variables.
saver = tf.train.Saver()

# Later, launch the model, use the saver to restore variables from disk, and
# do some work with the model.
with tf.Session() as sess:
  # Restore variables from disk.
  saver.restore(sess, "/tmp/model.ckpt")
  print("Model restored.")
  # Do some work with the model
  ...
\end{lstlisting}

\subsubsection{Choosing which Variables to Save and Restore}

If you do not pass any argument to \lstinline{tf.train.Saver()}, the saver handles all variables in the graph. Each one of them is saved under the name that was passed when the variable was created.

It is sometimes useful to explicitly specify names for variables in the checkpoint files. For example, you may have trained a model with a variable named \lstinline{"weights"} whose value you want to restore into a new variable named \lstinline{"params"}.

It is also sometimes useful to save or restore only a subset of the variables used by a model. For example, you may have trained a neural net with 5 layers, and you now want to train a new model with 6 layers, restoring the parameters of the 5 previously trained layers into the first 5 layers of the new model.

You can easily specify the names and variables to save by passing a Python dictionary to the \lstinline{tf.train.Saver()} constructor: keys are the names to use, values are the variables to manage.

\textbf{Notes}:

\begin{quote}
You can create as many saver objects as you want if you need to save and restore different subsets of the model variables. The same variable can be listed in multiple saver objects; its value is only changed when the saver \lstinline{restore()} method is run.

If you only restore a subset of the model variables at the start of a session, you have to run an initialize op for the other variables. See \lstinline{tf.initialize_variables()} for more information.
\end{quote}

% add link of tf.initialize_variables() here
% https://www.tensorflow.org/versions/master/api_docs/python/state_ops.html#initialize_variables

\begin{lstlisting}
# Create some variables.
v1 = tf.Variable(..., name="v1")
v2 = tf.Variable(..., name="v2")
...
# Add ops to save and restore only 'v2' using the name "my_v2"
saver = tf.train.Saver({"my_v2": v2})
# Use the saver object normally after that.
...
\end{lstlisting}

================================================
FILE: tex_pdf/how_tos/c3s02_variable_scope.tex
================================================
\section{Sharing Variables}\label{ux5171ux4eabux53d8ux91cf}

You can create, initialize, save and load single variables in the way described in the \href{tensorflow-zh/SOURCE/how_tos/variables/index.md}{Variables HowTo}. But when building complex models you often need to share large sets of variables, and you might want to initialize all of them in one place. This tutorial shows how this can be done using \texttt{tf.variable\_scope()} and \texttt{tf.get\_variable()}.

\subsection{The Problem}\label{ux95eeux9898}

Imagine you create a simple model for image filters, similar to our \href{tensorflow-zh/SOURCE/tutorials/deep_cnn/index.md}{Convolutional Neural Networks Tutorial} model but with only 2 convolutions (for simplicity of this example). If you use just \texttt{tf.Variable}, as explained in the \href{tensorflow-zh/SOURCE/how_tos/variables/index.md}{Variables HowTo}, your model might look like this.

\begin{Shaded}
\begin{Highlighting}[]
\KeywordTok{def} \NormalTok{my_image_filter(input_images):}
    \NormalTok{conv1_weights }\OperatorTok{=} \NormalTok{tf.Variable(tf.random_normal([}\DecValTok{5}\NormalTok{, }\DecValTok{5}\NormalTok{, }\DecValTok{32}\NormalTok{, }\DecValTok{32}\NormalTok{]),}
        \NormalTok{name}\OperatorTok{=}\StringTok{"conv1_weights"}\NormalTok{)}
    \NormalTok{conv1_biases }\OperatorTok{=} \NormalTok{tf.Variable(tf.zeros([}\DecValTok{32}\NormalTok{]), name}\OperatorTok{=}\StringTok{"conv1_biases"}\NormalTok{)}
    \NormalTok{conv1 }\OperatorTok{=} \NormalTok{tf.nn.conv2d(input_images, conv1_weights,}
        \NormalTok{strides}\OperatorTok{=}\NormalTok{[}\DecValTok{1}\NormalTok{, }\DecValTok{1}\NormalTok{, }\DecValTok{1}\NormalTok{, }\DecValTok{1}\NormalTok{], padding}\OperatorTok{=}\StringTok{'SAME'}\NormalTok{)}
    \NormalTok{relu1 }\OperatorTok{=} \NormalTok{tf.nn.relu(conv1 }\OperatorTok{+} \NormalTok{conv1_biases)}

    \NormalTok{conv2_weights }\OperatorTok{=} \NormalTok{tf.Variable(tf.random_normal([}\DecValTok{5}\NormalTok{, }\DecValTok{5}\NormalTok{, }\DecValTok{32}\NormalTok{, }\DecValTok{32}\NormalTok{]),}
        \NormalTok{name}\OperatorTok{=}\StringTok{"conv2_weights"}\NormalTok{)}
    \NormalTok{conv2_biases }\OperatorTok{=} \NormalTok{tf.Variable(tf.zeros([}\DecValTok{32}\NormalTok{]), name}\OperatorTok{=}\StringTok{"conv2_biases"}\NormalTok{)}
    \NormalTok{conv2 }\OperatorTok{=} \NormalTok{tf.nn.conv2d(relu1, conv2_weights,}
        \NormalTok{strides}\OperatorTok{=}\NormalTok{[}\DecValTok{1}\NormalTok{, }\DecValTok{1}\NormalTok{, }\DecValTok{1}\NormalTok{, }\DecValTok{1}\NormalTok{], padding}\OperatorTok{=}\StringTok{'SAME'}\NormalTok{)}
    \ControlFlowTok{return} \NormalTok{tf.nn.relu(conv2 }\OperatorTok{+} \NormalTok{conv2_biases)}
\end{Highlighting}
\end{Shaded}

As you can easily imagine, models quickly get much more complicated than this one, and even here we already have 4 different variables: \texttt{conv1\_weights}, \texttt{conv1\_biases}, \texttt{conv2\_weights}, and \texttt{conv2\_biases}.

The problem arises when you want to reuse this model. Assume you want to apply your image filter to 2 different images, \texttt{image1} and \texttt{image2}. You want both images processed by the same filter with the same parameters. You can call \texttt{my\_image\_filter()} twice, but this will create two sets of variables:

\begin{Shaded}
\begin{Highlighting}[]
\CommentTok{# First call creates one set of variables.}
\NormalTok{result1 }\OperatorTok{=} \NormalTok{my_image_filter(image1)}
\CommentTok{# Another set is created in the second call.}
\NormalTok{result2 }\OperatorTok{=} \NormalTok{my_image_filter(image2)}
\end{Highlighting}
\end{Shaded}

A common way to share variables is to create them in a separate piece of code and pass them to the functions that use them, for example by using a dictionary:

\begin{Shaded}
\begin{Highlighting}[]
\NormalTok{variables_dict }\OperatorTok{=} \NormalTok{\{}
    \StringTok{"conv1_weights"}\NormalTok{: tf.Variable(tf.random_normal([}\DecValTok{5}\NormalTok{, }\DecValTok{5}\NormalTok{, }\DecValTok{32}\NormalTok{, }\DecValTok{32}\NormalTok{]),}
        \NormalTok{name}\OperatorTok{=}\StringTok{"conv1_weights"}\NormalTok{),}
    \StringTok{"conv1_biases"}\NormalTok{: tf.Variable(tf.zeros([}\DecValTok{32}\NormalTok{]), name}\OperatorTok{=}\StringTok{"conv1_biases"}\NormalTok{),}
    \NormalTok{... etc. ...}
\NormalTok{\}}

\KeywordTok{def} \NormalTok{my_image_filter(input_images, variables_dict):}
    \NormalTok{conv1 }\OperatorTok{=} \NormalTok{tf.nn.conv2d(input_images, variables_dict[}\StringTok{"conv1_weights"}\NormalTok{],}
        \NormalTok{strides}\OperatorTok{=}\NormalTok{[}\DecValTok{1}\NormalTok{, }\DecValTok{1}\NormalTok{, }\DecValTok{1}\NormalTok{, }\DecValTok{1}\NormalTok{], padding}\OperatorTok{=}\StringTok{'SAME'}\NormalTok{)}
    \NormalTok{relu1 }\OperatorTok{=} \NormalTok{tf.nn.relu(conv1 }\OperatorTok{+} \NormalTok{variables_dict[}\StringTok{"conv1_biases"}\NormalTok{])}

    \NormalTok{conv2 }\OperatorTok{=} \NormalTok{tf.nn.conv2d(relu1, variables_dict[}\StringTok{"conv2_weights"}\NormalTok{],}
        \NormalTok{strides}\OperatorTok{=}\NormalTok{[}\DecValTok{1}\NormalTok{, }\DecValTok{1}\NormalTok{, }\DecValTok{1}\NormalTok{, }\DecValTok{1}\NormalTok{], padding}\OperatorTok{=}\StringTok{'SAME'}\NormalTok{)}
    \ControlFlowTok{return} \NormalTok{tf.nn.relu(conv2 }\OperatorTok{+} \NormalTok{variables_dict[}\StringTok{"conv2_biases"}\NormalTok{])}

\CommentTok{# The 2 calls to my_image_filter() now use the same variables}
\NormalTok{result1 }\OperatorTok{=} \NormalTok{my_image_filter(image1, variables_dict)}
\NormalTok{result2 }\OperatorTok{=} \NormalTok{my_image_filter(image2, variables_dict)}
\end{Highlighting}
\end{Shaded}

While convenient, creating the variables as above, outside of the model code, breaks encapsulation:

\begin{itemize}
\tightlist
\item The code that builds the graph must document the names, types, and shapes of the variables to create.
\item When the code changes, the callers may have to create more, fewer, or different variables.
\end{itemize}

One way to address the problem is to use classes to create a model, where the classes take care of managing the variables they need. For a lighter solution that does not involve classes, TensorFlow provides a \emph{Variable Scope} mechanism that makes it easy to share named variables while constructing a graph.

\subsection{Variable Scope Example}\label{ux53d8ux91cfux4f5cux7528ux57dfux5b9eux4f8b}

The Variable Scope mechanism in TensorFlow consists of 2 main functions:

\begin{itemize}
\tightlist
\item \texttt{tf.get\_variable(\textless{}name\textgreater{},\ \textless{}shape\textgreater{},\ \textless{}initializer\textgreater{})}: Creates or returns a variable with the given name.
\item \texttt{tf.variable\_scope(\textless{}scope\_name\textgreater{})}: manages the namespace for names passed to \texttt{tf.get\_variable()}.
\end{itemize}

The function \texttt{tf.get\_variable()} is used to get or create a variable, instead of calling \texttt{tf.Variable} directly. Rather than taking an initial value as \texttt{tf.Variable} does, it takes an \emph{initializer}: a function that receives a shape and returns a tensor of that shape. Some of the initializers available in TensorFlow:

\begin{itemize}
\tightlist
\item \texttt{tf.constant\_initializer(value)} initializes every element to the given value,
\item \texttt{tf.random\_uniform\_initializer(a,\ b)} initializes uniformly from the range a to b,
\item \texttt{tf.random\_normal\_initializer(mean,\ stddev)} initializes from a normal distribution with the given mean and standard deviation.
\end{itemize}

To see how \texttt{tf.get\_variable()} solves the problem discussed above, let us refactor the code that creates one convolution into a separate function named \texttt{conv\_relu}:

\begin{Shaded}
\begin{Highlighting}[]
\KeywordTok{def} \NormalTok{conv_relu(}\BuiltInTok{input}\NormalTok{, kernel_shape, bias_shape):}
    \CommentTok{# Create variable named "weights".}
    \NormalTok{weights }\OperatorTok{=} \NormalTok{tf.get_variable(}\StringTok{"weights"}\NormalTok{, kernel_shape,}
        \NormalTok{initializer}\OperatorTok{=}\NormalTok{tf.random_normal_initializer())}
    \CommentTok{# Create variable named "biases".}
    \NormalTok{biases }\OperatorTok{=} \NormalTok{tf.get_variable(}\StringTok{"biases"}\NormalTok{, bias_shape,}
        \NormalTok{initializer}\OperatorTok{=}\NormalTok{tf.constant_initializer(}\FloatTok{0.0}\NormalTok{))}
    \NormalTok{conv }\OperatorTok{=} \NormalTok{tf.nn.conv2d(}\BuiltInTok{input}\NormalTok{, weights,}
        \NormalTok{strides}\OperatorTok{=}\NormalTok{[}\DecValTok{1}\NormalTok{, }\DecValTok{1}\NormalTok{, }\DecValTok{1}\NormalTok{, }\DecValTok{1}\NormalTok{], padding}\OperatorTok{=}\StringTok{'SAME'}\NormalTok{)}
    \ControlFlowTok{return} \NormalTok{tf.nn.relu(conv }\OperatorTok{+} \NormalTok{biases)}
\end{Highlighting}
\end{Shaded}

This function uses the short names \texttt{"weights"} and \texttt{"biases"}. We would like to use it for both \texttt{conv1} and \texttt{conv2}, but the variables need distinct names. This is where \texttt{tf.variable\_scope()} comes in: it pushes a namespace for variables.
\begin{Shaded}
\begin{Highlighting}[]
\KeywordTok{def} \NormalTok{my_image_filter(input_images):}
    \ControlFlowTok{with} \NormalTok{tf.variable_scope(}\StringTok{"conv1"}\NormalTok{):}
        \CommentTok{# Variables created here will be named "conv1/weights", "conv1/biases".}
        \NormalTok{relu1 }\OperatorTok{=} \NormalTok{conv_relu(input_images, [}\DecValTok{5}\NormalTok{, }\DecValTok{5}\NormalTok{, }\DecValTok{32}\NormalTok{, }\DecValTok{32}\NormalTok{], [}\DecValTok{32}\NormalTok{])}
    \ControlFlowTok{with} \NormalTok{tf.variable_scope(}\StringTok{"conv2"}\NormalTok{):}
        \CommentTok{# Variables created here will be named "conv2/weights", "conv2/biases".}
        \ControlFlowTok{return} \NormalTok{conv_relu(relu1, [}\DecValTok{5}\NormalTok{, }\DecValTok{5}\NormalTok{, }\DecValTok{32}\NormalTok{, }\DecValTok{32}\NormalTok{], [}\DecValTok{32}\NormalTok{])}
\end{Highlighting}
\end{Shaded}

Now let us see what happens when we call \texttt{my\_image\_filter()} twice.

\begin{verbatim}
result1 = my_image_filter(image1)
result2 = my_image_filter(image2)
# Raises ValueError(... conv1/weights already exists ...)
\end{verbatim}

As you can see, \texttt{tf.get\_variable()} checks that already existing variables are not shared by accident. If you do want to share them, you have to say so explicitly by calling \texttt{reuse\_variables()}, as follows.

\begin{verbatim}
with tf.variable_scope("image_filters") as scope:
    result1 = my_image_filter(image1)
    scope.reuse_variables()
    result2 = my_image_filter(image2)
\end{verbatim}

This is a good way to share variables: lightweight and safe.

\subsection{How Does Variable Scope Work?}\label{ux53d8ux91cfux4f5cux7528ux57dfux662fux600eux4e48ux5de5ux4f5cux7684}

\subsubsection{\texorpdfstring{Understanding \texttt{tf.get\_variable()} }{Understanding tf.get\_variable() }}\label{ux7406ux89e3-tf.getux5fvariable}

To understand variable scope, it is necessary first to understand fully how \texttt{tf.get\_variable()} works. This is how \texttt{tf.get\_variable} is usually called.

\begin{Shaded}
\begin{Highlighting}[]
\NormalTok{v }\OperatorTok{=} \NormalTok{tf.get_variable(name, shape, dtype, initializer)}
\end{Highlighting}
\end{Shaded}

This call does one of two things, depending on the scope it is called in. The two cases are:
\begin{itemize}
\tightlist
\item Case 1: the scope is set for creating new variables, as indicated by \texttt{tf.get\_variable\_scope().reuse\ ==\ False}.
\end{itemize}

In this case, \texttt{v} is a newly created \texttt{tf.Variable} with the provided shape and data type. The full name of the created variable is the current variable scope name plus the provided \texttt{name}, and a check is performed to ensure that no variable with this full name exists yet. If a variable with this full name already exists, the function raises a \texttt{ValueError}. If a new variable is created, it is initialized to the value \texttt{initializer(shape)}. For example:

\begin{Shaded}
\begin{Highlighting}[]
\ControlFlowTok{with} \NormalTok{tf.variable_scope(}\StringTok{"foo"}\NormalTok{):}
    \NormalTok{v }\OperatorTok{=} \NormalTok{tf.get_variable(}\StringTok{"v"}\NormalTok{, [}\DecValTok{1}\NormalTok{])}
\ControlFlowTok{assert} \NormalTok{v.name }\OperatorTok{==} \StringTok{"foo/v:0"}
\end{Highlighting}
\end{Shaded}

\begin{itemize}
\tightlist
\item Case 2: the scope is set for reusing variables, as indicated by \texttt{tf.get\_variable\_scope().reuse\ ==\ True}.
\end{itemize}

In this case, the call searches for an already existing variable whose full name equals the current variable scope name plus the provided \texttt{name}. If no such variable exists, a \texttt{ValueError} is raised. If the variable is found, it is returned. For example:

\begin{Shaded}
\begin{Highlighting}[]
\ControlFlowTok{with} \NormalTok{tf.variable_scope(}\StringTok{"foo"}\NormalTok{):}
    \NormalTok{v }\OperatorTok{=} \NormalTok{tf.get_variable(}\StringTok{"v"}\NormalTok{, [}\DecValTok{1}\NormalTok{])}
\ControlFlowTok{with} \NormalTok{tf.variable_scope(}\StringTok{"foo"}\NormalTok{, reuse}\OperatorTok{=}\VariableTok{True}\NormalTok{):}
    \NormalTok{v1 }\OperatorTok{=} \NormalTok{tf.get_variable(}\StringTok{"v"}\NormalTok{, [}\DecValTok{1}\NormalTok{])}
\ControlFlowTok{assert} \NormalTok{v1 }\OperatorTok{==} \NormalTok{v}
\end{Highlighting}
\end{Shaded}

\subsubsection{\texorpdfstring{Basics of \texttt{tf.variable\_scope()}}{Basics of tf.variable\_scope()}}\label{tf.variableux5fscope-ux57faux7840}

Knowing how \texttt{tf.get\_variable()} works makes variable scope easy to understand. The primary job of a variable scope is to carry a name that is used as a prefix for variable names, and a reuse flag that distinguishes the two cases described above. Nested variable scopes append their names in much the same way as directories do:

\begin{Shaded}
\begin{Highlighting}[]
\ControlFlowTok{with} \NormalTok{tf.variable_scope(}\StringTok{"foo"}\NormalTok{):}
    \ControlFlowTok{with} \NormalTok{tf.variable_scope(}\StringTok{"bar"}\NormalTok{):}
        \NormalTok{v
}\OperatorTok{=} \NormalTok{tf.get_variable(}\StringTok{"v"}\NormalTok{, [}\DecValTok{1}\NormalTok{])}
\ControlFlowTok{assert} \NormalTok{v.name }\OperatorTok{==} \StringTok{"foo/bar/v:0"}
\end{Highlighting}
\end{Shaded}

The current variable scope can be retrieved with \texttt{tf.get\_variable\_scope()}, and its \texttt{reuse} flag can be set to \texttt{True} by calling \texttt{tf.get\_variable\_scope().reuse\_variables()}.

\begin{Shaded}
\begin{Highlighting}[]
\ControlFlowTok{with} \NormalTok{tf.variable_scope(}\StringTok{"foo"}\NormalTok{):}
    \NormalTok{v }\OperatorTok{=} \NormalTok{tf.get_variable(}\StringTok{"v"}\NormalTok{, [}\DecValTok{1}\NormalTok{])}
    \NormalTok{tf.get_variable_scope().reuse_variables()}
    \NormalTok{v1 }\OperatorTok{=} \NormalTok{tf.get_variable(}\StringTok{"v"}\NormalTok{, [}\DecValTok{1}\NormalTok{])}
\ControlFlowTok{assert} \NormalTok{v1 }\OperatorTok{==} \NormalTok{v}
\end{Highlighting}
\end{Shaded}

Note that you \emph{cannot} set the \texttt{reuse} flag to \texttt{False}. The reason is to allow functions that create models to be composed. Imagine you write a function \texttt{my\_image\_filter(inputs)} as before. Someone calling it inside a variable scope with \texttt{reuse=True} expects all inner variables to be reused as well. Allowing \texttt{reuse=False} to be forced inside the function would break this contract and make it hard to share parameters in this way.

Although you cannot set \texttt{reuse} to \texttt{False} explicitly, you can enter a reusing variable scope and then exit it, which takes you back to a non-reusing one. This is done by passing \texttt{reuse=True} when opening a variable scope. Note also that, for the same reason as above, the \texttt{reuse} parameter is inherited: when you open a reusing variable scope, all of its sub-scopes are reusing too.
\begin{Shaded}
\begin{Highlighting}[]
\ControlFlowTok{with} \NormalTok{tf.variable_scope(}\StringTok{"root"}\NormalTok{):}
    \CommentTok{# At start, the scope is not reusing.}
    \ControlFlowTok{assert} \NormalTok{tf.get_variable_scope().reuse }\OperatorTok{==} \VariableTok{False}
    \ControlFlowTok{with} \NormalTok{tf.variable_scope(}\StringTok{"foo"}\NormalTok{):}
        \CommentTok{# Opened a sub-scope, still not reusing.}
        \ControlFlowTok{assert} \NormalTok{tf.get_variable_scope().reuse }\OperatorTok{==} \VariableTok{False}
    \ControlFlowTok{with} \NormalTok{tf.variable_scope(}\StringTok{"foo"}\NormalTok{, reuse}\OperatorTok{=}\VariableTok{True}\NormalTok{):}
        \CommentTok{# Explicitly opened a reusing scope.}
        \ControlFlowTok{assert} \NormalTok{tf.get_variable_scope().reuse }\OperatorTok{==} \VariableTok{True}
        \ControlFlowTok{with} \NormalTok{tf.variable_scope(}\StringTok{"bar"}\NormalTok{):}
            \CommentTok{# Now sub-scope inherits the reuse flag.}
            \ControlFlowTok{assert} \NormalTok{tf.get_variable_scope().reuse }\OperatorTok{==} \VariableTok{True}
    \CommentTok{# Exited the reusing scope, back to a non-reusing one.}
    \ControlFlowTok{assert} \NormalTok{tf.get_variable_scope().reuse }\OperatorTok{==} \VariableTok{False}
\end{Highlighting}
\end{Shaded}

\subsubsection{Capturing variable scope}\label{ux83b7ux53d6ux53d8ux91cfux4f5cux7528ux57df}

In all the examples above, parameters were shared only because their names agreed, that is, because we opened a reusing variable scope with exactly the same string. In more complex cases it can be useful to pass a variable scope object around rather than rely on getting the names right. To this end, a variable scope can be captured and then used in place of a name when opening a new variable scope.
\begin{Shaded}
\begin{Highlighting}[]
\ControlFlowTok{with} \NormalTok{tf.variable_scope(}\StringTok{"foo"}\NormalTok{) }\ImportTok{as} \NormalTok{foo_scope:}
    \NormalTok{v }\OperatorTok{=} \NormalTok{tf.get_variable(}\StringTok{"v"}\NormalTok{, [}\DecValTok{1}\NormalTok{])}
\ControlFlowTok{with} \NormalTok{tf.variable_scope(foo_scope):}
    \NormalTok{w }\OperatorTok{=} \NormalTok{tf.get_variable(}\StringTok{"w"}\NormalTok{, [}\DecValTok{1}\NormalTok{])}
\ControlFlowTok{with} \NormalTok{tf.variable_scope(foo_scope, reuse}\OperatorTok{=}\VariableTok{True}\NormalTok{):}
    \NormalTok{v1 }\OperatorTok{=} \NormalTok{tf.get_variable(}\StringTok{"v"}\NormalTok{, [}\DecValTok{1}\NormalTok{])}
    \NormalTok{w1 }\OperatorTok{=} \NormalTok{tf.get_variable(}\StringTok{"w"}\NormalTok{, [}\DecValTok{1}\NormalTok{])}
\ControlFlowTok{assert} \NormalTok{v1 }\OperatorTok{==} \NormalTok{v}
\ControlFlowTok{assert} \NormalTok{w1 }\OperatorTok{==} \NormalTok{w}
\end{Highlighting}
\end{Shaded}

When a variable scope is opened from a previously captured scope, we jump out of the current variable scope prefix into an entirely different one. This works regardless of where we do it.
\begin{Shaded}
\begin{Highlighting}[]
\ControlFlowTok{with} \NormalTok{tf.variable_scope(}\StringTok{"foo"}\NormalTok{) }\ImportTok{as} \NormalTok{foo_scope:}
    \ControlFlowTok{assert} \NormalTok{foo_scope.name }\OperatorTok{==} \StringTok{"foo"}
\ControlFlowTok{with} \NormalTok{tf.variable_scope(}\StringTok{"bar"}\NormalTok{):}
    \ControlFlowTok{with} \NormalTok{tf.variable_scope(}\StringTok{"baz"}\NormalTok{) }\ImportTok{as} \NormalTok{other_scope:}
        \ControlFlowTok{assert} \NormalTok{other_scope.name }\OperatorTok{==} \StringTok{"bar/baz"}
        \ControlFlowTok{with} \NormalTok{tf.variable_scope(foo_scope) }\ImportTok{as} \NormalTok{foo_scope2:}
            \ControlFlowTok{assert} \NormalTok{foo_scope2.name }\OperatorTok{==} \StringTok{"foo"}  \CommentTok{# Not changed.}
\end{Highlighting}
\end{Shaded}

\subsubsection{Initializers in variable scope}\label{ux53d8ux91cfux4f5cux7528ux57dfux4e2dux7684ux521dux59cbux5316ux5668}

Using \texttt{tf.get\_variable()} lets you write functions that create or reuse variables and that can be called transparently from outside. But what if we want to change the initializer of the created variables? Do we need to pass an extra argument to every function that creates variables? And what about the most common case, in which we want to set a default initializer for all variables of all functions in one place? To help with these cases, a variable scope can carry a default initializer. It is inherited by sub-scopes and passed to every \texttt{tf.get\_variable()} call, but it is overridden if another initializer is specified explicitly.
\begin{Shaded}
\begin{Highlighting}[]
\ControlFlowTok{with} \NormalTok{tf.variable_scope(}\StringTok{"foo"}\NormalTok{, initializer}\OperatorTok{=}\NormalTok{tf.constant_initializer(}\FloatTok{0.4}\NormalTok{)):}
    \NormalTok{v }\OperatorTok{=} \NormalTok{tf.get_variable(}\StringTok{"v"}\NormalTok{, [}\DecValTok{1}\NormalTok{])}
    \ControlFlowTok{assert} \NormalTok{v.}\BuiltInTok{eval}\NormalTok{() }\OperatorTok{==} \FloatTok{0.4}  \CommentTok{# Default initializer as set above.}
    \NormalTok{w }\OperatorTok{=} \NormalTok{tf.get_variable(}\StringTok{"w"}\NormalTok{, [}\DecValTok{1}\NormalTok{], initializer}\OperatorTok{=}\NormalTok{tf.constant_initializer(}\FloatTok{0.3}\NormalTok{))}
    \ControlFlowTok{assert} \NormalTok{w.}\BuiltInTok{eval}\NormalTok{() }\OperatorTok{==} \FloatTok{0.3}  \CommentTok{# Specific initializer overrides the default.}
    \ControlFlowTok{with} \NormalTok{tf.variable_scope(}\StringTok{"bar"}\NormalTok{):}
        \NormalTok{v }\OperatorTok{=} \NormalTok{tf.get_variable(}\StringTok{"v"}\NormalTok{, [}\DecValTok{1}\NormalTok{])}
        \ControlFlowTok{assert} \NormalTok{v.}\BuiltInTok{eval}\NormalTok{() }\OperatorTok{==} \FloatTok{0.4}  \CommentTok{# Inherited default initializer.}
    \ControlFlowTok{with} \NormalTok{tf.variable_scope(}\StringTok{"baz"}\NormalTok{, initializer}\OperatorTok{=}\NormalTok{tf.constant_initializer(}\FloatTok{0.2}\NormalTok{)):}
        \NormalTok{v }\OperatorTok{=} \NormalTok{tf.get_variable(}\StringTok{"v"}\NormalTok{, [}\DecValTok{1}\NormalTok{])}
        \ControlFlowTok{assert} \NormalTok{v.}\BuiltInTok{eval}\NormalTok{() }\OperatorTok{==} \FloatTok{0.2}  \CommentTok{# Changed default initializer.}
\end{Highlighting}
\end{Shaded}

\subsubsection{\texorpdfstring{Names of ops in \texttt{tf.variable\_scope()}}{Names of ops in tf.variable\_scope()}}\label{ux5728tf.variableux5fscopeux4e2dopsux7684ux540dux79f0}

We have discussed how \texttt{tf.variable\_scope} governs the names of variables. But how does it influence the names of other ops in the scope? It is natural for ops created inside a variable scope to share that name as well. For this reason, \texttt{with\ 
tf.variable\_scope("name")} implicitly opens a \texttt{tf.name\_scope("name")}. For example:

\begin{Shaded}
\begin{Highlighting}[]
\ControlFlowTok{with} \NormalTok{tf.variable_scope(}\StringTok{"foo"}\NormalTok{):}
    \NormalTok{x }\OperatorTok{=} \FloatTok{1.0} \OperatorTok{+} \NormalTok{tf.get_variable(}\StringTok{"v"}\NormalTok{, [}\DecValTok{1}\NormalTok{])}
\ControlFlowTok{assert} \NormalTok{x.op.name }\OperatorTok{==} \StringTok{"foo/add"}
\end{Highlighting}
\end{Shaded}

Name scopes can be opened in addition to a variable scope. They then affect only the names of ops, not the names of variables.

\begin{Shaded}
\begin{Highlighting}[]
\ControlFlowTok{with} \NormalTok{tf.variable_scope(}\StringTok{"foo"}\NormalTok{):}
    \ControlFlowTok{with} \NormalTok{tf.name_scope(}\StringTok{"bar"}\NormalTok{):}
        \NormalTok{v }\OperatorTok{=} \NormalTok{tf.get_variable(}\StringTok{"v"}\NormalTok{, [}\DecValTok{1}\NormalTok{])}
        \NormalTok{x }\OperatorTok{=} \FloatTok{1.0} \OperatorTok{+} \NormalTok{v}
\ControlFlowTok{assert} \NormalTok{v.name }\OperatorTok{==} \StringTok{"foo/v:0"}
\ControlFlowTok{assert} \NormalTok{x.op.name }\OperatorTok{==} \StringTok{"foo/bar/add"}
\end{Highlighting}
\end{Shaded}

When a variable scope is opened with a captured scope object instead of a string, the current name scope for ops is not altered.

\subsection{Examples of Use}\label{ux4f7fux7528ux5b9eux4f8b}

Here are pointers to a few files that make use of variable scope. It is used particularly heavily for \href{https://zh.wikipedia.org/wiki/\%E9\%80\%92\%E5\%BD\%92\%E7\%A5\%9E\%E7\%BB\%8F\%E7\%BD\%91\%E7\%BB\%9C}{recurrent neural networks} and \texttt{sequence-to-sequence} models.

\begin{longtable}[c]{@{}ll@{}}
\toprule
File & What's in it?\tabularnewline
\midrule
\endhead
\texttt{models/image/cifar10.py} & Model for detecting objects in images.\tabularnewline
\texttt{models/rnn/rnn\_cell.py} & Cell functions for recurrent neural networks.\tabularnewline
\texttt{models/rnn/seq2seq.py} & Functions for building \texttt{sequence-to-sequence} models.\tabularnewline
\bottomrule
\end{longtable}
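
The two cases of \texttt{tf.get\_variable()} and the inheritance of the reuse flag described in this section can be modelled in a few lines of plain Python. The following is only a minimal sketch and not TensorFlow's implementation: the names \texttt{ToyVariableStore}, \texttt{scope} and the default initializer are hypothetical, variables are represented by plain lists, and only the naming and reuse rules are modelled.

\begin{verbatim}
import contextlib


class ToyVariableStore:
    """Hypothetical model of the naming and reuse rules; not TensorFlow code."""

    def __init__(self):
        self.variables = {}   # full name -> value
        self.prefix = ""      # current scope name, e.g. "foo/bar"
        self.reuse = False    # reuse flag of the current scope

    @contextlib.contextmanager
    def scope(self, name, reuse=None):
        saved = (self.prefix, self.reuse)
        self.prefix = name if not self.prefix else self.prefix + "/" + name
        # The flag is inherited: it can be switched on but never off.
        self.reuse = self.reuse or bool(reuse)
        try:
            yield self
        finally:
            # Leaving the scope restores the outer prefix and flag.
            self.prefix, self.reuse = saved

    def reuse_variables(self):
        self.reuse = True

    def get_variable(self, name, shape,
                     initializer=lambda shape: [0.0] * shape[0]):
        full_name = name if not self.prefix else self.prefix + "/" + name
        if self.reuse:
            # Case 2: return an existing variable or fail.
            if full_name not in self.variables:
                raise ValueError("Variable %s does not exist" % full_name)
            return self.variables[full_name]
        # Case 1: create a new variable, refusing accidental sharing.
        if full_name in self.variables:
            raise ValueError("Variable %s already exists" % full_name)
        self.variables[full_name] = initializer(shape)
        return self.variables[full_name]


store = ToyVariableStore()
with store.scope("foo"):
    v = store.get_variable("v", [1])
with store.scope("foo", reuse=True):
    v1 = store.get_variable("v", [1])
assert v1 is v
\end{verbatim}

Calling \texttt{get\_variable("v", {[}1{]})} a second time inside a non-reusing \texttt{"foo"} scope raises \texttt{ValueError} in this sketch, which mirrors the accidental-sharing check described above.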
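Original: \href{http://www.tensorflow.org/how_tos/variable_scope/index.md}{Sharing Variables} Translation: \href{https://github.com/nb312}{nb312} Proofreading: \href{https://github.com/jikexueyuanwiki}{Wiki}

================================================ FILE: tex_pdf/how_tos/c3s03_viz_learning.tex ================================================

\section{TensorBoard: Visualizing Learning }\label{vis_learning}

The computations you use TensorFlow for, such as training a massive deep neural network, can be complex and confusing. To make TensorFlow programs easier to understand, debug, and optimize, we have included a suite of visualization tools called TensorBoard. You can use TensorBoard to visualize your TensorFlow graph, plot quantitative metrics about the execution of the graph, and show additional data.

When TensorBoard is fully configured, it looks like this:

\begin{figure}[htbp]
\centering
\includegraphics[width=.95\textwidth]{../SOURCE/images/mnist_tensorboard.png}
\caption{MNIST TensorBoard}
\end{figure}

\subsection{Serializing the data }\label{ux6570ux636eux5e8fux5217ux5316}

TensorBoard operates by reading TensorFlow events files, which contain the summary data you generate while running TensorFlow. Here is the general lifecycle of summary data within TensorBoard.

First, create the TensorFlow graph you would like to collect summary data from, and decide which nodes to annotate with \href{../api_docs/python/train.md\#summary_options}{summary operations}.

For example, suppose you are training a convolutional neural network to recognize MNIST digits. You may want to record how the learning rate varies over time and how the objective function changes. Collect these by attaching \href{../api_docs/python/train.md\#scalary_summary}{scalar\_summary} ops to the nodes that output the learning rate and the loss, respectively. Then give each scalar\_summary a meaningful \texttt{tag}, such as \texttt{\textquotesingle{}learning\ rate\textquotesingle{}} and \texttt{\textquotesingle{}loss\ function\textquotesingle{}}.

Perhaps you would also like to visualize the distribution of activations coming off a particular layer, or the distribution of gradients or weights. Collect this data by attaching \href{../api_docs/python/train.md\#histogram_summary}{histogram\_summary} ops to the gradient outputs and to the variable that holds the weights, respectively.

For details on all available summary operations, see the \href{../api_docs/python/train.md\#summary_operation}{summary\_operation} documentation.

Operations in TensorFlow do nothing until you run them, or run an op that depends on their output. The summary nodes we have just created are peripheral to your graph: none of the ops you are currently running depend on them. So, to generate summaries, we need to run all of these summary nodes. Managing them by hand would be tedious, so use \href{../api_docs/python/train.md\#scalary_summary}{tf.merge\_all\_summaries} to combine them into a single op.

You can then run the merged summary op, which generates a serialized \texttt{Summary} protobuf object containing all of your summary data at a given step. Finally, to write this summary data to disk, pass the Summary protobuf to a \href{../api_docs/python/train.md\#SummaryWriter}{tf.train.SummaryWriter}.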
The \texttt{SummaryWriter} takes a logdir argument in its constructor. This logdir is important: it is the directory to which all events are written. The \texttt{SummaryWriter} can also optionally take a \texttt{GraphDef} in its constructor. If it receives one, TensorBoard visualizes your graph as well.

Now that you have modified your graph and have a \texttt{SummaryWriter}, you are ready to run your network. If you want, you can run the merged summary op at every single step and record a large amount of training data. That is likely to be more data than you need, though. Instead, consider running the merged summary op every hundred steps or so, as in the following code example.

\begin{Shaded}
\begin{Highlighting}[]
\NormalTok{merged_summary_op }\OperatorTok{=} \NormalTok{tf.merge_all_summaries()}
\NormalTok{summary_writer }\OperatorTok{=} \NormalTok{tf.train.SummaryWriter(}\StringTok{'/tmp/mnist_logs'}\NormalTok{, sess.graph)}
\NormalTok{total_step }\OperatorTok{=} \DecValTok{0}
\ControlFlowTok{while} \NormalTok{training:}
  \NormalTok{total_step }\OperatorTok{+=} \DecValTok{1}
  \NormalTok{sess.run(training_op)}
  \ControlFlowTok{if} \NormalTok{total_step }\OperatorTok{%} \DecValTok{100} \OperatorTok{==} \DecValTok{0}\NormalTok{:}
    \NormalTok{summary_str }\OperatorTok{=} \NormalTok{sess.run(merged_summary_op)}
    \NormalTok{summary_writer.add_summary(summary_str, total_step)}
\end{Highlighting}
\end{Shaded}

You are now ready to visualize this data with TensorBoard.

\subsection{Launching TensorBoard }\label{ux542fux52a8tensorboard}

To run TensorBoard, use the following command:

\begin{verbatim}
python tensorflow/tensorboard/tensorboard.py --logdir=path/to/log-directory
\end{verbatim}

Here \texttt{logdir} points to the directory where the \texttt{SummaryWriter} serialized its data. If the \texttt{logdir} directory contains subdirectories with serialized data from separate runs, TensorBoard visualizes the data from all of those runs. Once TensorBoard is running, open \texttt{localhost:6006} in your web browser to view it.

If you installed TensorBoard with pip, you can start it with the simpler command:

\begin{verbatim}
tensorboard --logdir=/path/to/log-directory
\end{verbatim}

In TensorBoard you will see navigation tabs in the top right corner. Each tab represents a set of serialized data that can be visualized. For any tab you open, if the logs TensorBoard is reading contain no data relevant to that tab, a message is displayed explaining how to serialize data that applies to it.

For in-depth information on how to use the graph tab to visualize your graph, see \href{./graph_viz.md}{TensorBoard: Graph Visualization}.

Original: \href{http://tensorflow.org/how_tos/summaries_and_tensorboard/index.html\#tensorboard-visualizing-learning}{TensorBoard:Visualizing Learning} Translation: \href{https://github.com/thylaco1eo}{thylaco1eo}
Proofreading: \href{https://github.com/lucky521}{lucky521}

================================================ FILE: tex_pdf/how_tos/c3s04_graph_viz.tex ================================================

\section{TensorBoard: Graph Visualization }\label{tensorboard-ux56feux8868ux53efux89c6ux5316}

TensorFlow computation graphs are powerful but complicated. The graph visualization can help you understand and debug them. Here is an example of the visualization at work.

\includegraphics{../images/graph_vis_animation.gif}

\emph{Visualization of a TensorFlow graph.}

To see your own graph, run TensorBoard pointing it to the log directory of the job, click on the graph tab in the top pane, and select the appropriate run using the menu in the upper left corner. For in-depth information on how to run TensorBoard and make sure all the necessary information is logged, see \href{tensorflow-zh/SOURCE/how_tos/summaries_and_tensorboard/index.md}{Summaries and TensorBoard}.

\subsection{Name scoping and nodes }\label{ux540dux79f0ux57dfname-scopingux548cux8282ux70b9node}

A typical TensorFlow graph can have many thousands of nodes, far too many to see at once or even to lay out with standard graph tools. To simplify, variable names can be scoped, and the visualization uses this information to define a hierarchy on the nodes in the graph. By default, only the top of this hierarchy is shown. The following example uses \href{tensorflow-zh/SOURCE/api_docs/python/framework.md\#name_scope}{\texttt{tf.name\_scope}} to define three operations under the \texttt{hidden} name scope:

\begin{Shaded}
\begin{Highlighting}[]
\ImportTok{import} \NormalTok{tensorflow }\ImportTok{as} \NormalTok{tf}

\ControlFlowTok{with} \NormalTok{tf.name_scope(}\StringTok{'hidden'}\NormalTok{) }\ImportTok{as} \NormalTok{scope:}
  \NormalTok{a }\OperatorTok{=} \NormalTok{tf.constant(}\DecValTok{5}\NormalTok{, name}\OperatorTok{=}\StringTok{'alpha'}\NormalTok{)}
  \NormalTok{W }\OperatorTok{=} \NormalTok{tf.Variable(tf.random_uniform([}\DecValTok{1}\NormalTok{, }\DecValTok{2}\NormalTok{], }\OperatorTok{-}\FloatTok{1.0}\NormalTok{, }\FloatTok{1.0}\NormalTok{), name}\OperatorTok{=}\StringTok{'weights'}\NormalTok{)}
  \NormalTok{b }\OperatorTok{=} \NormalTok{tf.Variable(tf.zeros([}\DecValTok{1}\NormalTok{]), name}\OperatorTok{=}\StringTok{'biases'}\NormalTok{)}
\end{Highlighting}
\end{Shaded}

This results in the following three op names:

\begin{itemize}
\tightlist
\item \emph{hidden}/alpha
\item \emph{hidden}/weights
\item \emph{hidden}/biases
\end{itemize}
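
The hierarchy that the visualization derives from such names can be sketched in plain Python: ops are grouped by the leading component of their name. This is only an illustration of the idea, not TensorBoard's code; the helper name \texttt{group\_by\_top\_scope} and the extra op name \texttt{loss} are hypothetical.

\begin{verbatim}
from collections import defaultdict


def group_by_top_scope(op_names):
    """Group op names by their first path component (the top-level scope)."""
    groups = defaultdict(list)
    for name in op_names:
        top, _, rest = name.partition("/")
        # Ops outside any scope contain no "/" and form their own node.
        groups[top].append(rest if rest else top)
    return dict(groups)


ops = ["hidden/alpha", "hidden/weights", "hidden/biases", "loss"]
print(group_by_top_scope(ops))
# {'hidden': ['alpha', 'weights', 'biases'], 'loss': ['loss']}
\end{verbatim}

Each key corresponds to one top-level node in the initial view, and the list under it holds the children that appear when that node is expanded.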
默认地,三个操作名会折叠为一个节点并标注为\texttt{hidden}。其额外细节并没有丢失,你可以双击,或点击右上方橙色的\texttt{+}来展开节点,然后就会看到三个子节点\texttt{alpha},\texttt{weights}和\texttt{biases}了。 这有一个生动的例子,例中有一个更复杂的节点,节点处于其初始和展开状态。 \begin{verbatim} 未展开的名称域 展开的名称域 \end{verbatim} \begin{verbatim} 顶级名称域的初始视图pool_1,点击右上方橙色的+按钮或双击节点来展开。 展开的pool_1名称域视图,点击右上方橙色的-按钮或双击节点来收起此名称域。 \end{verbatim} 通过名称域把节点分组来得到可读性高的图表很关键的。如果你在构建一个模型,名称域就可以用来控制可视化结果。\textbf{你的名称域越好,可视性就越好。} 上面的图像例子说明了可视化的另一方面, TensorFlow 图表有两种连接关系:数据依赖和控制依赖。数据依赖显示两个操作之间的tensor流程,用实心箭头指示,而控制依赖用点线表示。在已展开的视图(上面的右图)中,除了用点线连接的\texttt{CheckNumerics}和\texttt{control\_dependency}之外,所有连接都是数据依赖的。 还有一种手段用来简化布局。大多数 TensorFlow 图表有一部分节点,这部分节点和其他节点之间有很多连接。比如,许多节点在初始化阶段可能会有一个控制依赖,而绘制所有\texttt{init}节点的边缘和其依赖可能会创造出一个混乱的视图。 为了减少混乱,可视化把所有 high-degree 节点分离到右边的一个\emph{从属}区域, 而不会绘制线条来表示他们的边缘。线条也不用来表示连接了,我们绘制了小\emph{节点图标}来指示这些连接关系。分离出从属节点通常不会把关键信息删除掉,因为这些节点和内构功能是相关的。 \begin{verbatim} conv_1是主图表的部分 save被抽出为从属节点 \end{verbatim} \begin{verbatim} 节点conv_1被连接到save,注意其右边save节点图标。 save has a high degree, 并会作为从属节点出现,与conv_1的连接作为一个节点图标显示在其左边。为了继续减少杂乱,既然save有很多连接,我们则只显示前5个,而把其余的缩略为... 
12 more。 \end{verbatim} 最后一个结构上的简化法叫做\emph{序列折叠(series collapsing)}。 序列基序(Sequential motifs)是拥有相同结构并且其名称结尾的数字不同的节点,它们被折叠进一个单独的节点块(stack)中。对长序列网络来说,序列折叠极大地简化了视图,对于已层叠的节点,双击会展开序列。 \begin{verbatim} 节点序列 展开的节点序列 \end{verbatim} \begin{verbatim} 一个节点序列的折叠视图。 视图的一小块, 双击后展开。 \end{verbatim} 最后,针对易读性的最后一点要说到的是,可视化为常节点和摘要节点使用了特别的图标,总结起来有下面这些节点符号: \begin{longtable}[c]{@{}ll@{}} \toprule \begin{minipage}[b]{0.05\columnwidth}\raggedright\strut 符号 \strut\end{minipage} & \begin{minipage}[b]{0.05\columnwidth}\raggedright\strut 意义 \strut\end{minipage}\tabularnewline \midrule \endhead \begin{minipage}[t]{0.05\columnwidth}\raggedright\strut \includegraphics{../images/namespace_node.png} \strut\end{minipage} & \begin{minipage}[t]{0.05\columnwidth}\raggedright\strut \emph{High-level}节点代表一个名称域,双击则展开一个高层节点。 \strut\end{minipage}\tabularnewline \begin{minipage}[t]{0.05\columnwidth}\raggedright\strut \includegraphics{../images/horizontal_stack.png} \strut\end{minipage} & \begin{minipage}[t]{0.05\columnwidth}\raggedright\strut 彼此之间不连接的有限个节点序列。 \strut\end{minipage}\tabularnewline \begin{minipage}[t]{0.05\columnwidth}\raggedright\strut \includegraphics{../images/vertical_stack.png} \strut\end{minipage} & \begin{minipage}[t]{0.05\columnwidth}\raggedright\strut 彼此之间相连的有限个节点序列。 \strut\end{minipage}\tabularnewline \begin{minipage}[t]{0.05\columnwidth}\raggedright\strut \includegraphics{../images/op_node.png} \strut\end{minipage} & \begin{minipage}[t]{0.05\columnwidth}\raggedright\strut 一个单独的操作节点。 \strut\end{minipage}\tabularnewline \begin{minipage}[t]{0.05\columnwidth}\raggedright\strut \includegraphics{../images/constant.png} \strut\end{minipage} & \begin{minipage}[t]{0.05\columnwidth}\raggedright\strut 一个常量结点。 \strut\end{minipage}\tabularnewline \begin{minipage}[t]{0.05\columnwidth}\raggedright\strut \includegraphics{../images/summary.png} \strut\end{minipage} & \begin{minipage}[t]{0.05\columnwidth}\raggedright\strut 一个摘要节点。 \strut\end{minipage}\tabularnewline 
\begin{minipage}[t]{0.05\columnwidth}\raggedright\strut \includegraphics{../images/dataflow_edge.png} \strut\end{minipage} & \begin{minipage}[t]{0.05\columnwidth}\raggedright\strut 显示各操作间的数据流边。 \strut\end{minipage}\tabularnewline \begin{minipage}[t]{0.05\columnwidth}\raggedright\strut \includegraphics{../images/control_edge.png} \strut\end{minipage} & \begin{minipage}[t]{0.05\columnwidth}\raggedright\strut 显示各操作间的控制依赖边。 \strut\end{minipage}\tabularnewline \begin{minipage}[t]{0.05\columnwidth}\raggedright\strut \includegraphics{../images/reference_edge.png} \strut\end{minipage} & \begin{minipage}[t]{0.05\columnwidth}\raggedright\strut 引用边,表示出度操作节点可以使入度tensor发生变化。 \strut\end{minipage}\tabularnewline \bottomrule \end{longtable} \subsection{交互 }\label{ux4ea4ux4e92} 通过平移和缩放来导航图表,点击和拖动用于平移,滚动手势用于缩放。双击一个节点或点击其\texttt{+}按钮来展开代表一组操作的名称域。右下角有一个小地图可以在缩放和平移时方便的改变当前视角。 要关闭一个打开的节点,再次双击它或点击它的\texttt{-}按钮,你也可以只点击一次来选中一个节点,节点的颜色会加深,并且会看到节点的详情,其连接到的节点会在可视化右上角的详情卡片显现。 \begin{verbatim} 一个名称域的详情卡片 操作节点的详情卡片 \end{verbatim} \begin{verbatim} 详情卡片展示conv2名称域的详细信息,名称域中操作节点的输入和输出被结合在一起,适用于不显示属性的名称域。 详情卡片展示DecodeRaw操作节点,除了输入和输出,卡片也会展示与当前节点相关的设备和属性。 \end{verbatim} 选择对于 high-degree 节点的理解也很有帮助,选择任意节点,则与它的其余连接相应的节点也会选中,这使得在进行例如查看哪一个节点是否已保存等操作时非常容易。 点击详情卡片中的一个节点名称时会选中该节点,必要的话,视角会自动平移以使该节点可见。 最后,使用图例上方的颜色菜单,你可以给你的图表选择两个颜色方案。默认的\emph{结构视图}下,当两个 high-level 节点颜色一样时,其会以相同的彩虹色彩出现,而结构唯一的节点颜色是灰色。还有一个视图则展示了不同的操作运行于什么设备之上。名称域被恰当的根据其中的操作节点的设备片件来着色。 下图是一张真实图表的图解: \begin{verbatim} 按结构着色 按设备着色 \end{verbatim} \begin{verbatim} 结构视图:灰色节点的结构是唯一的。橙色的conv1conv2节点有相同的结构, 其他颜色的节点也类似。 设备视图:名称域根据其中的操作节点的设备片件来着色,在此紫色代表GPU,绿色代表CPU。 \end{verbatim} 原文: \href{../images/index.html\#tensorboard-graph-visualization}{TensorBoard: Graph Visualization} 翻译: {[}@Warln{]}(https://github.com/Warln) 校对: \href{https://github.com/lucky521}{lucky521} ================================================ FILE: tex_pdf/how_tos/c3s05_reading_data.tex ================================================ \section{数据读取 }\label{ux6570ux636eux8bfbux53d6} 
TensorFlow程序读取数据一共有3种方法: \begin{itemize} \tightlist \item 供给数据(Feeding): 在TensorFlow程序运行的每一步, 让Python代码来供给数据。 \item 从文件读取数据: 在TensorFlow图的起始, 让一个输入管线从文件中读取数据。 \item 预加载数据: 在TensorFlow图中定义常量或变量来保存所有数据(仅适用于数据量比较小的情况)。 \end{itemize} \subsection{目录}\label{ux76eeux5f55} \subsubsection{\texorpdfstring{\protect\hyperlink{AUTOGENERATED-reading-data}{数据读取}}{数据读取}}\label{ux6570ux636eux8bfbux53d6-1} \begin{itemize} \tightlist \item \protect\hyperlink{Feeding}{供给数据(Feeding)} \item \protect\hyperlink{AUTOGENERATED-reading-from-files}{从文件读取数据} \item \protect\hyperlink{AUTOGENERATED-filenames--shuffling--and-epoch-limits}{文件名, 乱序(shuffling), 和最大训练迭代数(epoch limits)} \item \protect\hyperlink{AUTOGENERATED-file-formats}{文件格式} \item \protect\hyperlink{AUTOGENERATED-preprocessing}{预处理} \item \protect\hyperlink{AUTOGENERATED-batching}{批处理} \item \protect\hyperlink{QueueRunner}{使用\texttt{QueueRunner}创建预读线程} \item \protect\hyperlink{AUTOGENERATED-filtering-records-or-producing-multiple-examples-per-record}{对记录进行过滤或者为每个纪录创建多个样本} \item \protect\hyperlink{AUTOGENERATED-sparse-input-data}{序列化输入数据(Sparse input data)} \item \protect\hyperlink{AUTOGENERATED-preloaded-data}{预加载数据} \item \protect\hyperlink{AUTOGENERATED-multiple-input-pipelines}{多管线输入} \end{itemize} \subsection{供给数据 }\label{ux4f9bux7ed9ux6570ux636e} TensorFlow的数据供给机制允许你在TensorFlow运算图中将数据注入到任一张量中。因此,python运算可以把数据直接设置到TensorFlow图中。 通过给run()或者eval()函数输入\texttt{feed\_dict}参数, 可以启动运算过程。 \begin{Shaded} \begin{Highlighting}[] \ControlFlowTok{with} \NormalTok{tf.Session():} \BuiltInTok{input} \OperatorTok{=} \NormalTok{tf.placeholder(tf.float32)} \NormalTok{classifier }\OperatorTok{=} \NormalTok{...} \BuiltInTok{print} \NormalTok{classifier.}\BuiltInTok{eval}\NormalTok{(feed_dict}\OperatorTok{=}\NormalTok{\{}\BuiltInTok{input}\NormalTok{: my_python_preprocessing_fn()\})} \end{Highlighting} \end{Shaded} 虽然你可以使用常量和变量来替换任何一个张量, 但是最好的做法应该是使用\href{tensorflow-zh/SOURCE/api_docs/python/io_ops.md\#placeholder}{\texttt{placeholder} 
op}节点。设计\texttt{placeholder}节点的唯一的意图就是为了提供数据供给(feeding)的方法。\texttt{placeholder}节点被声明的时候是未初始化的, 也不包含数据, 如果没有为它供给数据, 则TensorFlow运算的时候会产生错误, 所以千万不要忘了为\texttt{placeholder}提供数据。 可以在\href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/g3doc/tutorials/mnist/fully_connected_feed.py}{\texttt{tensorflow/g3doc/tutorials/mnist/fully\_connected\_feed.py}}找到使用\texttt{placeholder}和MNIST训练的例子,\href{tensorflow-zh/SOURCE/tutorials/mnist/tf/index.md}{MNIST tutorial}也讲述了这一例子。 \subsection{从文件读取数据 }\label{ux4eceux6587ux4ef6ux8bfbux53d6ux6570ux636e} 一共典型的文件读取管线会包含下面这些步骤: \begin{enumerate} \def\labelenumi{\arabic{enumi}.} \tightlist \item 文件名列表 \item \emph{可配置的} 文件名乱序(shuffling) \item \emph{可配置的} 最大训练迭代数(epoch limit) \item 文件名队列 \item 针对输入文件格式的阅读器 \item 纪录解析器 \item \emph{可配置的}预处理器 \item 样本队列 \end{enumerate} \subsubsection{文件名, 乱序(shuffling), 和最大训练迭代数(epoch limits) }\label{ux6587ux4ef6ux540d-ux4e71ux5e8fshuffling-ux548cux6700ux5927ux8badux7ec3ux8fedux4ee3ux6570epoch-limits} 可以使用字符串张量(比如\texttt{{[}"file0",\ "file1"{]}}, \texttt{{[}("file\%d"\ \%\ i)\ for\ i\ in\ range(2){]}}, \texttt{{[}("file\%d"\ \%\ i)\ for\ i\ in\ range(2){]}}) 或者\href{tensorflow-zh/SOURCE/api_docs/python/io_ops.md\#match_filenames_once}{\texttt{tf.train.match\_filenames\_once} 函数}来产生文件名列表。 将文件名列表交给\href{tensorflow-zh/SOURCE/api_docs/python/io_ops.md\#string_input_producer}{\texttt{tf.train.string\_input\_producer} 函数}.\texttt{string\_input\_producer}来生成一个先入先出的队列, 文件阅读器会需要它来读取数据。 \texttt{string\_input\_producer} 提供的可配置参数来设置文件名乱序和最大的训练迭代数, \texttt{QueueRunner}会为每次迭代(epoch)将所有的文件名加入文件名队列中, 如果\texttt{shuffle=True}的话, 会对文件名进行乱序处理。这一过程是比较均匀的,因此它可以产生均衡的文件名队列。 这个\texttt{QueueRunner}的工作线程是独立于文件阅读器的线程, 因此乱序和将文件名推入到文件名队列这些过程不会阻塞文件阅读器运行。 \subsubsection{文件格式 }\label{ux6587ux4ef6ux683cux5f0f} 根据你的文件格式, 选择对应的文件阅读器, 然后将文件名队列提供给阅读器的\texttt{read}方法。阅读器的\texttt{read}方法会输出一个key来表征输入的文件和其中的纪录(对于调试非常有用),同时得到一个字符串标量, 这个字符串标量可以被一个或多个解析器,或者转换操作将其解码为张量并且构造成为样本。 \paragraph{CSV 文件 }\label{csv-ux6587ux4ef6} 从CSV文件中读取数据, 
需要使用\href{tensorflow-zh/SOURCE/api_docs/python/io_ops.md\#TextLineReader}{\texttt{TextLineReader}}和\href{tensorflow-zh/SOURCE/api_docs/python/io_ops.md\#decode_csv}{\texttt{decode\_csv}} 操作, 如下面的例子所示: \begin{Shaded} \begin{Highlighting}[] \NormalTok{filename_queue }\OperatorTok{=} \NormalTok{tf.train.string_input_producer([}\StringTok{"file0.csv"}\NormalTok{, }\StringTok{"file1.csv"}\NormalTok{])} \NormalTok{reader }\OperatorTok{=} \NormalTok{tf.TextLineReader()} \NormalTok{key, value }\OperatorTok{=} \NormalTok{reader.read(filename_queue)} \CommentTok{# Default values, in case of empty columns. Also specifies the type of the} \CommentTok{# decoded result.} \NormalTok{record_defaults }\OperatorTok{=} \NormalTok{[[}\DecValTok{1}\NormalTok{], [}\DecValTok{1}\NormalTok{], [}\DecValTok{1}\NormalTok{], [}\DecValTok{1}\NormalTok{], [}\DecValTok{1}\NormalTok{]]} \NormalTok{col1, col2, col3, col4, col5 }\OperatorTok{=} \NormalTok{tf.decode_csv(} \NormalTok{value, record_defaults}\OperatorTok{=}\NormalTok{record_defaults)} \NormalTok{features }\OperatorTok{=} \NormalTok{tf.concat(}\DecValTok{0}\NormalTok{, [col1, col2, col3, col4])} \ControlFlowTok{with} \NormalTok{tf.Session() }\ImportTok{as} \NormalTok{sess:} \CommentTok{# Start populating the filename queue.} \NormalTok{coord }\OperatorTok{=} \NormalTok{tf.train.Coordinator()} \NormalTok{threads }\OperatorTok{=} \NormalTok{tf.train.start_queue_runners(coord}\OperatorTok{=}\NormalTok{coord)} \ControlFlowTok{for} \NormalTok{i }\OperatorTok{in} \BuiltInTok{range}\NormalTok{(}\DecValTok{1200}\NormalTok{):} \CommentTok{# Retrieve a single instance:} \NormalTok{example, label }\OperatorTok{=} \NormalTok{sess.run([features, col5])} \NormalTok{coord.request_stop()} \NormalTok{coord.join(threads)} \end{Highlighting} \end{Shaded} 每次\texttt{read}的执行都会从文件中读取一行内容, \texttt{decode\_csv} 操作会解析这一行内容并将其转为张量列表。如果输入的参数有缺失,\texttt{record\_default}参数可以根据张量的类型来设置默认值。 在调用\texttt{run}或者\texttt{eval}去执行\texttt{read}之前, 
你必须调用\texttt{tf.train.start\_queue\_runners}来将文件名填充到队列。否则\texttt{read}操作会被阻塞到文件名队列中有值为止。 \paragraph{固定长度的记录 }\label{ux56faux5b9aux957fux5ea6ux7684ux8bb0ux5f55} 从二进制文件中读取固定长度纪录, 可以使用\href{tensorflow-zh/SOURCE/api_docs/python/io_ops.md\#FixedLengthRecordReader}{\texttt{tf.FixedLengthRecordReader}}的\href{tensorflow-zh/SOURCE/api_docs/python/io_ops.md\#decode_raw}{\texttt{tf.decode\_raw}}操作。\texttt{decode\_raw}操作可以讲一个字符串转换为一个uint8的张量。 举例来说,\href{http://www.cs.toronto.edu/~kriz/cifar.html}{the CIFAR-10 dataset}的文件格式定义是:每条记录的长度都是固定的,一个字节的标签,后面是3072字节的图像数据。uint8的张量的标准操作就可以从中获取图像片并且根据需要进行重组。 例子代码可以在\href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/models/image/cifar10/cifar10_input.py}{\texttt{tensorflow/models/image/cifar10/cifar10\_input.py}}找到,具体讲述可参见\href{tensorflow-zh/SOURCE/tutorials/deep_cnn/index.md\#prepare-the-data}{教程}. \paragraph{标准TensorFlow格式 }\label{ux6807ux51c6tensorflowux683cux5f0f} 另一种保存记录的方法可以允许你讲任意的数据转换为TensorFlow所支持的格式, 这种方法可以使TensorFlow的数据集更容易与网络应用架构相匹配。这种建议的方法就是使用TFRecords文件,TFRecords文件包含了\href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/example/example.proto}{\texttt{tf.train.Example} 协议内存块(protocol buffer)}(协议内存块包含了字段 \href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/example/feature.proto}{\texttt{Features}})。你可以写一段代码获取你的数据, 将数据填入到\texttt{Example}协议内存块(protocol buffer),将协议内存块序列化为一个字符串, 并且通过\href{tensorflow-zh/SOURCE/api_docs/python/python_io.md\#TFRecordWriter}{\texttt{tf.python\_io.TFRecordWriter} class}写入到TFRecords文件。\href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/g3doc/how_tos/reading_data/convert_to_records.py}{\texttt{tensorflow/g3doc/how\_tos/reading\_data/convert\_to\_records.py}}就是这样的一个例子。 从TFRecords文件中读取数据, 
To read a file of TFRecords, use \href{tensorflow-zh/SOURCE/api_docs/python/io_ops.md\#TFRecordReader}{\texttt{tf.TFRecordReader}} together with the \href{tensorflow-zh/SOURCE/api_docs/python/io_ops.md\#parse_single_example}{\texttt{tf.parse\_single\_example}} decoder. The \texttt{parse\_single\_example} op decodes \texttt{Example} protocol buffers into tensors. An MNIST example that uses the data produced by \texttt{convert\_to\_records} can be found in \href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/g3doc/how_tos/reading_data/fully_connected_reader.py}{\texttt{tensorflow/g3doc/how\_tos/reading\_data/fully\_connected\_reader.py}}; you can compare it with the \texttt{fully\_connected\_feed} version.

\subsubsection{Preprocessing }\label{ux9884ux5904ux7406}

You can then apply any preprocessing you like to these examples, as long as it does not depend on trainable parameters. Typical cases are normalizing the data, picking a random slice, and adding noise or distortions. See \href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/models/image/cifar10/cifar10.py}{\texttt{tensorflow/models/image/cifar10/cifar10.py}} for an example.

\subsubsection{Batching }\label{ux6279ux5904ux7406}

At the end of the input pipeline we use another queue to batch examples together for training, evaluation, or inference. For this we use a queue that randomizes the order of the examples, via the \href{tensorflow-zh/SOURCE/api_docs/python/io_ops.md\#shuffle_batch}{\texttt{tf.train.shuffle\_batch} function}.

Example:

\begin{verbatim}
def read_my_file_format(filename_queue):
  reader = tf.SomeReader()
  key, record_string = reader.read(filename_queue)
  example, label = tf.some_decoder(record_string)
  processed_example = some_processing(example)
  return processed_example, label

def input_pipeline(filenames, batch_size, num_epochs=None):
  filename_queue = tf.train.string_input_producer(
      filenames, num_epochs=num_epochs, shuffle=True)
  example, label = read_my_file_format(filename_queue)
  # min_after_dequeue defines how big a buffer we will randomly sample
  #   from -- bigger means better shuffling but slower start up and more
  #   memory used.
  # capacity must be larger than min_after_dequeue and the amount larger
  #   determines the maximum we will prefetch.
  # Recommendation:
  #   min_after_dequeue + (num_threads + a small safety margin) * batch_size
  min_after_dequeue = 10000
  capacity = min_after_dequeue + 3 * batch_size
  example_batch, label_batch = tf.train.shuffle_batch(
      [example, label], batch_size=batch_size, capacity=capacity,
      min_after_dequeue=min_after_dequeue)
  return example_batch, label_batch
\end{verbatim}

If you need more parallelism or more shuffling of examples between files, use multiple reader instances with the \href{tensorflow-zh/SOURCE/api_docs/python/io_ops.md\#shuffle_batch_join}{\texttt{tf.train.shuffle\_batch\_join} function}.

Example:

\begin{verbatim}
def read_my_file_format(filename_queue):
  # Same as above

def input_pipeline(filenames, batch_size, read_threads, num_epochs=None):
  filename_queue = tf.train.string_input_producer(
      filenames, num_epochs=num_epochs, shuffle=True)
  example_list = [read_my_file_format(filename_queue)
                  for _ in range(read_threads)]
  min_after_dequeue = 10000
  capacity = min_after_dequeue + 3 * batch_size
  example_batch, label_batch = tf.train.shuffle_batch_join(
      example_list, batch_size=batch_size, capacity=capacity,
      min_after_dequeue=min_after_dequeue)
  return example_batch, label_batch
\end{verbatim}

In this example you still use a single filename queue, shared by all the readers. This ensures that the different readers work on different files from the same epoch until every file of that epoch has been started. (A single thread is usually enough to keep the filename queue filled.)

An alternative is to use a single reader with the \href{tensorflow-zh/SOURCE/api_docs/python/io_ops.md\#shuffle_batch}{\texttt{tf.train.shuffle\_batch} function} and set \texttt{num\_threads} to a value greater than 1. All threads then read from one file at a time (still faster than a single thread) instead of from several files at once. This can matter in two cases:

\begin{itemize}
\tightlist
\item
  You have more reading threads than input files, and want to avoid two threads reading the same example from the same file at nearly the same time.
\item
  Reading several files in parallel causes too many disk seeks.
\end{itemize}

How many reading threads do you need? The \texttt{tf.train.shuffle\_batch*} functions add a summary to the graph that indicates how full the example queue is. If you have enough reading threads, that summary stays above zero. You can watch your summaries as training progresses with \href{tensorflow-zh/SOURCE/how_tos/summaries_and_tensorboard/index.md}{TensorBoard}.
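
The capacity recommendation given in the comments of \texttt{input\_pipeline} above can be written down as a small helper. The sketch below is plain Python and not part of TensorFlow: the name \texttt{recommended\_capacity} and the default safety margin of two batches are assumptions made for illustration. With one thread and that margin it reproduces the \texttt{min\_after\_dequeue + 3 * batch\_size} used in the examples.

```python
def recommended_capacity(min_after_dequeue, batch_size,
                         num_threads=1, safety_margin=2):
    """Queue capacity following the rule of thumb quoted in the examples.

    capacity = min_after_dequeue + (num_threads + safety_margin) * batch_size

    `safety_margin` is counted in batches; its default of 2 is an assumption
    of this sketch, chosen so that one thread gives the 3 * batch_size slack
    used in the how-to.
    """
    if min_after_dequeue < 0 or safety_margin < 0:
        raise ValueError("min_after_dequeue and safety_margin must be >= 0")
    if batch_size <= 0 or num_threads <= 0:
        raise ValueError("batch_size and num_threads must be positive")
    return min_after_dequeue + (num_threads + safety_margin) * batch_size


# One reader thread, batches of 128 examples.
print(recommended_capacity(10000, 128))  # 10384
```

A larger result means more prefetching at the cost of memory; the helper only does the arithmetic and leaves the choice of \texttt{min\_after\_dequeue} to you.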
\subsubsection{\texorpdfstring{Creating threads to prefetch using \texttt{QueueRunner} objects }{Creating threads to prefetch using QueueRunner objects }}\label{ux521bux5efaux7ebfux7a0bux5e76ux4f7fux7528queuerunnerux5bf9ux8c61ux6765ux9884ux53d6}

The short version: many of the \texttt{tf.train} functions listed above add \href{../../api_docs/python/train.md\#QueueRunner}{\texttt{QueueRunner}} objects to your graph. You must call \href{../../api_docs/python/train.md\#start_queue_runners}{\texttt{tf.train.start\_queue\_runners}} before running any training step, otherwise the graph hangs forever. \href{../../api_docs/python/train.md\#start_queue_runners}{\texttt{tf.train.start\_queue\_runners}} starts the threads of the input pipeline, which fill the example queue so that the dequeue operation can obtain examples. This is best combined with a \href{../../api_docs/python/train.md\#Coordinator}{\texttt{tf.train.Coordinator}}, which shuts these threads down cleanly when an error occurs. If you set a limit on the number of epochs, an epoch counter is used, and it must be initialized. The recommended code pattern is:

\begin{Shaded}
\begin{Highlighting}[]
\CommentTok{# Create the graph, etc.}
\NormalTok{init_op }\OperatorTok{=} \NormalTok{tf.initialize_all_variables()}

\CommentTok{# Create a session for running operations in the Graph.}
\NormalTok{sess }\OperatorTok{=} \NormalTok{tf.Session()}

\CommentTok{# Initialize the variables (like the epoch counter).}
\NormalTok{sess.run(init_op)}

\CommentTok{# Start input enqueue threads.}
\NormalTok{coord }\OperatorTok{=} \NormalTok{tf.train.Coordinator()}
\NormalTok{threads }\OperatorTok{=} \NormalTok{tf.train.start_queue_runners(sess}\OperatorTok{=}\NormalTok{sess, coord}\OperatorTok{=}\NormalTok{coord)}

\ControlFlowTok{try}\NormalTok{:}
    \ControlFlowTok{while} \OperatorTok{not} \NormalTok{coord.should_stop():}
        \CommentTok{# Run training steps or whatever}
        \NormalTok{sess.run(train_op)}

\ControlFlowTok{except} \NormalTok{tf.errors.OutOfRangeError:}
    \BuiltInTok{print}\NormalTok{(}\StringTok{'Done training -- epoch limit reached'}\NormalTok{)}
\ControlFlowTok{finally}\NormalTok{:}
    \CommentTok{# When done, ask the threads to stop.}
    \NormalTok{coord.request_stop()}

\CommentTok{# Wait for threads to finish.}
\NormalTok{coord.join(threads)}
\NormalTok{sess.close()}
\end{Highlighting}
\end{Shaded}

\paragraph{Aside: What is happening here?
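}\label{ux7591ux95ee-ux8fd9ux662fux600eux4e48ux56deux4e8b}

First we create the graph. It consists of a few pipeline stages connected by queues. The first stage generates the filenames to read and enqueues them in the filename queue. The second stage consumes filenames (using a \texttt{Reader}), produces examples, and enqueues them in an example queue. Depending on how you set things up, you may have several independent copies of the second stage so that you can read from multiple files in parallel. Each of these stages ends with an enqueue operation into a queue from which the next stage dequeues. We want to start threads that run these enqueue operations, so that the training loop can keep dequeuing examples from the example queue.

The helpers in \texttt{tf.train} that create these queues and enqueue operations add a \href{../../api_docs/python/train.md\#QueueRunner}{\texttt{tf.train.QueueRunner}} to the graph using the \href{../../api_docs/python/train.md\#add_queue_runner}{\texttt{tf.train.add\_queue\_runner}} function. Each \texttt{QueueRunner} is responsible for one stage and holds the list of enqueue operations that need to be run in threads. Once the graph is constructed, the \href{../../api_docs/python/train.md\#start_queue_runners}{\texttt{tf.train.start\_queue\_runners}} function asks each \texttt{QueueRunner} in the graph to start its threads running the enqueue operations.

If all goes well, you can now run your training steps while the background threads fill the queues. If you set an epoch limit, at some point an attempt to dequeue examples will raise a \href{../../api_docs/python/client.md\#OutOfRangeError}{\texttt{tf.OutOfRangeError}}. This is the TensorFlow equivalent of ``end of file'' (EOF): the epoch limit has been reached and no more examples are available.

The last ingredient is the \href{../../api_docs/python/train.md\#Coordinator}{\texttt{Coordinator}}. It is responsible for letting all the threads know when anything has signalled a shutdown. Most commonly this happens because an exception was raised, for example when one of the threads hit an error while running some operation (or an ordinary Python exception).

For more about threading, queues, QueueRunners, and Coordinators \href{../../how_tos/threading_and_queues/index.md}{see here}.

\paragraph{Aside: How does clean shutdown work when the number of epochs is limited?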
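}\label{ux7591ux95ee-ux5728ux8fbeux5230ux6700ux5927ux8badux7ec3ux8fedux4ee3ux6570ux7684ux65f6ux5019ux5982ux4f55ux6e05ux7406ux5173ux95edux7ebfux7a0b}

Imagine you have a model with a limit on the number of epochs to train on. The thread that generates filenames will then run only that many times before it produces an \texttt{OutOfRange} error. The \texttt{QueueRunner} catches that error, closes the filename queue, and exits the thread. Closing the queue does two things:

\begin{itemize}
\tightlist
\item
  Any further attempt to enqueue in the filename queue generates an error. No thread should be trying to do that at this point, but this is useful when a queue is closed because of some other error.
\item
  Any current or future dequeue either succeeds (if enough elements are left) or fails immediately (with an \texttt{OutOfRange} error). It does not block waiting for more elements to be enqueued, since by the previous point that can no longer happen.
\end{itemize}

The point is that when the filename queue is closed there are probably still many filenames in it, so the next stage of the pipeline (the reader and other preprocessing) may keep running for some time. Once the filename queue is exhausted, the next attempt to dequeue a filename (for example from a reader that has finished the file it was working on) triggers an \texttt{OutOfRange} error. In this case, however, you may have several threads associated with a single QueueRunner. If the failing thread is not the last one in the QueueRunner, the \texttt{OutOfRange} error only causes that one thread to exit. The other threads, which are still finishing their last file, continue until they are done as well. (Assuming you are using a \href{../../api_docs/python/train.md\#Coordinator}{\texttt{tf.train.Coordinator}}, other types of errors cause all the threads to stop.) Only when all reader threads have hit the \texttt{OutOfRange} error is the next queue, the example queue, closed.

Again, the example queue still holds some enqueued elements, so training continues until they are exhausted. If the example queue is a \href{../../api_docs/python/io_ops.md\#RandomShuffleQueue}{\texttt{RandomShuffleQueue}}, for instance because you are using \texttt{shuffle\_batch} or \texttt{shuffle\_batch\_join}, it normally never holds fewer elements than its \texttt{min\_after\_dequeue} setting. Once the queue is closed, however, that restriction is lifted and the queue eventually empties. At that point the actual training threads start getting \texttt{OutOfRange} errors when they try to dequeue from the example queue, and they exit. Once all training threads are done, \href{../../api_docs/python/train.md\#Coordinator.join}{\texttt{tf.train.Coordinator.join}} returns and you can exit cleanly.

\subsubsection{Filtering records or producing multiple examples per record }\label{ux7b5bux9009ux8bb0ux5f55ux6216ux4ea7ux751fux6bcfux4e2aux8bb0ux5f55ux7684ux591aux4e2aux6837ux672c}

Instead of examples with shape \texttt{{[}x,\ y,\ z{]}}, you produce a batch of examples with shape \texttt{{[}batch,\ x,\ y,\ z{]}}. The batch size can be 0 if you want to filter the record out (perhaps because it belongs to a hold-out set), or greater than 1 if you produce multiple examples per record. Then simply set \texttt{enqueue\_many=True} when calling one of the batching functions (such as \texttt{shuffle\_batch} or \texttt{shuffle\_batch\_join}).

\subsubsection{Sparse input data }\label{ux7a00ux758fux8f93ux5165ux6570ux636e}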
SparseTensors do not work well with queues. If you use SparseTensors you have to decode the string records with \href{../../api_docs/python/io_ops.md\#parse_example}{\texttt{tf.parse\_example}} \textbf{after} batching (instead of using \texttt{tf.parse\_single\_example} \textbf{before} batching).

\subsection{Preloaded data }\label{ux9884ux53d6ux6570ux636e}

This approach is only suitable for small data sets that fit entirely in memory. There are two ways to do it:

\begin{itemize}
\tightlist
\item
  Store the data in a constant.
\item
  Store the data in a variable that you initialize once and then never change.
\end{itemize}

Using a constant is a bit simpler but uses more memory, because the constant is stored inline in the graph data structure, which may be duplicated a few times.

\begin{Shaded}
\begin{Highlighting}[]
\NormalTok{training_data }\OperatorTok{=} \NormalTok{...}
\NormalTok{training_labels }\OperatorTok{=} \NormalTok{...}
\ControlFlowTok{with} \NormalTok{tf.Session():}
  \NormalTok{input_data }\OperatorTok{=} \NormalTok{tf.constant(training_data)}
  \NormalTok{input_labels }\OperatorTok{=} \NormalTok{tf.constant(training_labels)}
  \NormalTok{...}
\end{Highlighting}
\end{Shaded}

To use a variable instead, you also need to initialize it after the graph has been built.

\begin{Shaded}
\begin{Highlighting}[]
\NormalTok{training_data }\OperatorTok{=} \NormalTok{...}
\NormalTok{training_labels }\OperatorTok{=} \NormalTok{...}
\ControlFlowTok{with} \NormalTok{tf.Session() }\ImportTok{as} \NormalTok{sess:}
  \NormalTok{data_initializer }\OperatorTok{=} \NormalTok{tf.placeholder(dtype}\OperatorTok{=}\NormalTok{training_data.dtype,}
                                    \NormalTok{shape}\OperatorTok{=}\NormalTok{training_data.shape)}
  \NormalTok{label_initializer }\OperatorTok{=} \NormalTok{tf.placeholder(dtype}\OperatorTok{=}\NormalTok{training_labels.dtype,}
                                     \NormalTok{shape}\OperatorTok{=}\NormalTok{training_labels.shape)}
  \NormalTok{input_data }\OperatorTok{=} \NormalTok{tf.Variable(data_initializer, trainable}\OperatorTok{=}\VariableTok{False}\NormalTok{, collections}\OperatorTok{=}\NormalTok{[])}
  \NormalTok{input_labels }\OperatorTok{=} \NormalTok{tf.Variable(label_initializer, trainable}\OperatorTok{=}\VariableTok{False}\NormalTok{, collections}\OperatorTok{=}\NormalTok{[])}
  \NormalTok{...}
  \NormalTok{sess.run(input_data.initializer,}
           \NormalTok{feed_dict}\OperatorTok{=}\NormalTok{\{data_initializer: 
training_data\})}
  \NormalTok{sess.run(input_labels.initializer,}
           \NormalTok{feed_dict}\OperatorTok{=}\NormalTok{\{label_initializer: training_labels\})}
\end{Highlighting}
\end{Shaded}

Setting \texttt{trainable=False} keeps the variable out of the \texttt{GraphKeys.TRAINABLE\_VARIABLES} collection of the graph, so we do not try to update it during training. Setting \texttt{collections={[}{]}} keeps the variable out of the \texttt{GraphKeys.VARIABLES} collection, which is used for saving and restoring checkpoints.

Either way, the \href{../../api_docs/python/io_ops.md\#slice_input_producer}{\texttt{tf.train.slice\_input\_producer}} function can be used to produce one slice at a time. It shuffles the examples across an entire epoch, so further shuffling when batching is undesirable. Instead of the \texttt{shuffle\_batch} functions we therefore use the plain \href{../../api_docs/python/io_ops.md\#batch}{\texttt{tf.train.batch} function}. To use multiple preprocessing threads, set the \texttt{num\_threads} parameter to a number greater than 1.

An MNIST example that preloads the data using constants can be found in \href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/g3doc/how_tos/reading_data/fully_connected_preloaded.py}{\texttt{tensorflow/g3doc/how\_tos/reading\_data/fully\_connected\_preloaded.py}}, and one that preloads the data using variables in \href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/g3doc/how_tos/reading_data/fully_connected_preloaded_var.py}{\texttt{tensorflow/g3doc/how\_tos/reading\_data/fully\_connected\_preloaded\_var.py}}. You can compare both with the \texttt{fully\_connected\_feed} and \texttt{fully\_connected\_reader} versions described above.

\subsection{Multiple input pipelines }\label{ux591aux8f93ux5165ux7ba1ux9053}

Commonly you will want to train on one data set and evaluate (or ``eval'') on another. One way to do this is to have two separate processes:

\begin{itemize}
\tightlist
\item
  The training process reads training input data and periodically writes checkpoint files with all the trained variables.
\item
  The evaluation process restores the checkpoint files into an inference model that reads validation input data.
\end{itemize}

This is what \href{../../tutorials/deep_cnn/index.md\#save-and-restore-checkpoints}{the example CIFAR-10 model} does. It has a couple of benefits:

\begin{itemize}
\tightlist
\item
  The eval is performed on a single snapshot of the trained variables.
\item
  You can perform the eval even after training has completed and exited.
\end{itemize}

You can also have the train and eval in the same graph in the same process and share their trained variables. See \href{../../how_tos/variable_scope/index.md}{the shared variables tutorial}.
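
Returning to the preloaded-data case described earlier in this section: the idea of producing one slice at a time, shuffled once per epoch, and then batching without further shuffling can be illustrated without TensorFlow. The sketch below is plain Python and only an analogy. The names \texttt{slice\_producer} and \texttt{batch} are hypothetical helpers written for this illustration, not the \texttt{tf.train} functions, and dropping an incomplete final batch is a simplification chosen for the sketch.

```python
import random


def slice_producer(data, labels, num_epochs=1, shuffle=True, seed=None):
    """Yield one (example, label) pair at a time, reshuffling every epoch."""
    if len(data) != len(labels):
        raise ValueError("data and labels must have the same length")
    rng = random.Random(seed)
    indices = list(range(len(data)))
    for _ in range(num_epochs):
        if shuffle:
            rng.shuffle(indices)  # new order for each pass over the data
        for i in indices:
            yield data[i], labels[i]


def batch(slices, batch_size):
    """Group consecutive slices into batches without any extra shuffling."""
    examples, targets = [], []
    for example, label in slices:
        examples.append(example)
        targets.append(label)
        if len(examples) == batch_size:
            yield examples, targets
            examples, targets = [], []
    # An incomplete final batch is dropped (a choice made for this sketch).


data = [10, 11, 12, 13]
labels = [0, 1, 0, 1]
for examples, targets in batch(slice_producer(data, labels, num_epochs=2, seed=0), 2):
    print(examples, targets)
```

Because the producer already visits every example exactly once per epoch in random order, shuffling again at the batching step would add nothing, which is the reason the text above prefers plain batching here.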
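Source: \href{https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/how_tos/reading_data/index.md}{Reading data} Translation: \href{https://github.com/volvet}{volvet} and \href{https://github.com/zhangkom}{zhangkom} Proofreading:

================================================
FILE: tex_pdf/how_tos/c3s06_threading_and_queues.tex
================================================
\section{Threading and Queues }\label{ux7ebfux7a0bux548cux961fux5217}

Queues are a powerful mechanism for asynchronous computation with TensorFlow.

Like everything else in TensorFlow, a queue is a node in a TensorFlow graph. It is a stateful node, like a variable: other nodes can modify its content. In particular, other nodes can enqueue new items at the rear of the queue or dequeue existing items from its front.

To get a feel for queues, consider a simple example. We create a ``first in, first out'' queue (FIFOQueue) and fill it with zeros. Then we construct a graph that takes an item off the front of the queue, adds one to it, and puts it back at the rear. Slowly, the numbers in the queue increase.

\texttt{Enqueue}, \texttt{EnqueueMany}, and \texttt{Dequeue} are special nodes. They take a pointer to the queue instead of a normal value, which allows them to change it. We recommend thinking of them as methods of the queue. In fact, in the Python API they are methods of the queue object (for example \texttt{q.enqueue(...)}).

Now that you have a feel for queues, let's dive into the details\ldots{}

\subsection{Queue Use Overview }\label{ux961fux5217ux4f7fux7528ux6982ux8ff0}

Queues such as \texttt{FIFOQueue} and \texttt{RandomShuffleQueue} are important TensorFlow objects for computing tensors asynchronously in a graph.

For example, a typical input architecture uses a \texttt{RandomShuffleQueue} to prepare inputs for training a model:

\begin{itemize}
\tightlist
\item
  Multiple threads prepare training examples and push them into the queue.
\item
  A training thread executes a training op that dequeues mini-batches from the queue.
\end{itemize}

This architecture has many benefits, as highlighted in the \href{../reading_data}{Reading data how to}, which also gives an overview of functions that simplify the construction of input pipelines.

The TensorFlow \texttt{Session} object is multithreaded, so multiple threads can easily use the same session and run ops in parallel. However, it is not always easy to write a Python program that drives threads in this way. All threads must be able to stop together, exceptions must be caught and reported, and queues must be closed properly when stopping.

TensorFlow provides two classes to help: \href{tensorflow-zh/SOURCE/api_docs/python/train.md\#Coordinator}{tf.Coordinator} and \href{tensorflow-zh/SOURCE/api_docs/python/train.md\#QueueRunner}{tf.QueueRunner}. The two classes are designed to be used together. The \texttt{Coordinator} class helps multiple threads stop together and reports exceptions to the program that waits for them to stop. The \texttt{QueueRunner} class creates a number of threads that cooperate to enqueue tensors in the same queue.

\subsection{Coordinator }\label{coordinator}

The Coordinator class helps multiple threads stop together.

Its key methods are:

\begin{itemize}
\tightlist
\item
  \texttt{should\_stop()}: returns True if the threads should stop.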
\item
  \texttt{request\_stop(\textless{}exception\textgreater{})}: requests that the threads stop.
\item
  \texttt{join(\textless{}list\ of\ threads\textgreater{})}: waits until the specified threads have stopped.
\end{itemize}

You first create a \texttt{Coordinator} object and then create a number of threads that use it. The threads typically run loops that stop when \texttt{should\_stop()} returns True.

Any thread can decide that the computation should stop. It only has to call \texttt{request\_stop()}; \texttt{should\_stop()} then returns \texttt{True} in the other threads, and they stop as well.

\begin{Shaded}
\begin{Highlighting}[]
\CommentTok{# Thread body: loop until the coordinator indicates a stop was requested.}
\CommentTok{# If some condition becomes true, ask the coordinator to stop.}
\KeywordTok{def} \NormalTok{MyLoop(coord):}
  \ControlFlowTok{while} \OperatorTok{not} \NormalTok{coord.should_stop():}
    \NormalTok{...do something...}
    \ControlFlowTok{if} \NormalTok{...some condition...:}
      \NormalTok{coord.request_stop()}

\CommentTok{# Main code: create a coordinator.}
\NormalTok{coord }\OperatorTok{=} \NormalTok{tf.train.Coordinator()}

\CommentTok{# Create 10 threads that run 'MyLoop()'}
\NormalTok{threads }\OperatorTok{=} \NormalTok{[threading.Thread(target}\OperatorTok{=}\NormalTok{MyLoop, args}\OperatorTok{=}\NormalTok{(coord,)) }\ControlFlowTok{for} \NormalTok{i }\OperatorTok{in} \BuiltInTok{xrange}\NormalTok{(}\DecValTok{10}\NormalTok{)]}

\CommentTok{# Start the threads and wait for all of them to stop.}
\ControlFlowTok{for} \NormalTok{t }\OperatorTok{in} \NormalTok{threads: t.start()}
\NormalTok{coord.join(threads)}
\end{Highlighting}
\end{Shaded}

Obviously, the coordinator can manage threads that do very different things; they do not all have to be the same as in this simple example. The coordinator also supports capturing and reporting exceptions. See the \href{tensorflow-zh/SOURCE/api_docs/python/train.md\#Coordinator}{Coordinator class} documentation for details.

\subsection{QueueRunner }\label{queuerunner}

The \texttt{QueueRunner} class creates a number of threads that repeatedly run an enqueue op. These threads can use a coordinator to stop together. In addition, a queue runner runs a \emph{closer thread} that automatically closes the queue if an exception is reported to the coordinator.

You can use a queue runner to implement the architecture described above.

First build a graph that uses a queue for input examples. Add ops that process examples and enqueue them in the queue. Add training ops that start by dequeuing from the queue.

\begin{Shaded}
\begin{Highlighting}[]
\NormalTok{example }\OperatorTok{=} \NormalTok{...ops to create one 
example...}
\CommentTok{# Create a queue, and an op that enqueues examples one at a time in the queue.}
\NormalTok{queue }\OperatorTok{=} \NormalTok{tf.RandomShuffleQueue(...)}
\NormalTok{enqueue_op }\OperatorTok{=} \NormalTok{queue.enqueue(example)}
\CommentTok{# Create a training graph that starts by dequeuing a batch of examples.}
\NormalTok{inputs }\OperatorTok{=} \NormalTok{queue.dequeue_many(batch_size)}
\NormalTok{train_op }\OperatorTok{=} \NormalTok{...use }\StringTok{'inputs'} \NormalTok{to build the training part of the graph...}
\end{Highlighting}
\end{Shaded}

In the Python training program, create a \texttt{QueueRunner} that runs a few threads to process and enqueue examples. Create a \texttt{Coordinator} and ask the queue runner to start its threads with that coordinator. Then write a training loop that also uses the coordinator to control when the \texttt{QueueRunner} threads terminate.

\begin{verbatim}
# Create a queue runner that will run 4 threads in parallel to enqueue
# examples.
qr = tf.train.QueueRunner(queue, [enqueue_op] * 4)

# Launch the graph.
sess = tf.Session()
# Create a coordinator, launch the queue runner threads.
coord = tf.train.Coordinator()
enqueue_threads = qr.create_threads(sess, coord=coord, start=True)
# Run the training loop, controlling termination with the coordinator.
for step in xrange(1000000):
    if coord.should_stop():
        break
    sess.run(train_op)
# When done, ask the threads to stop.
coord.request_stop()
# And wait for them to actually do it.
coord.join(enqueue_threads)
\end{verbatim}

\subsection{Handling Exceptions }\label{ux5f02ux5e38ux5904ux7406}

Threads started by queue runners do more than just run the enqueue ops. They also catch and handle exceptions generated by queues, including \texttt{OutOfRangeError}, which is used to report that a queue was closed.

A training program that uses a \texttt{Coordinator} must similarly catch and report exceptions in its main loop.

Here is an improved version of the training loop above.

\begin{Shaded}
\begin{Highlighting}[]
\ControlFlowTok{try}\NormalTok{:}
    \ControlFlowTok{for} \NormalTok{step }\OperatorTok{in} \BuiltInTok{xrange}\NormalTok{(}\DecValTok{1000000}\NormalTok{):}
        \ControlFlowTok{if} \NormalTok{coord.should_stop():}
            \ControlFlowTok{break}
        \NormalTok{sess.run(train_op)}
\ControlFlowTok{except} \PreprocessorTok{Exception} \ImportTok{as} \NormalTok{e:}
    \CommentTok{# Report exceptions to the coordinator.}
    \NormalTok{coord.request_stop(e)}

\CommentTok{# Terminate as usual. It is innocuous to request stop twice.}
\NormalTok{coord.request_stop()}
\NormalTok{coord.join(enqueue_threads)}
\end{Highlighting}
\end{Shaded}

Source: \href{https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/how_tos/threading_and_queues/index.md}{Threading and Queues} Translation: \href{https://github.com/zhangkom}{zhangkom} Proofreading: \href{https://github.com/volvet}{volvet}

================================================
FILE: tex_pdf/how_tos/c3s07_adding_an_op.tex
================================================
\section{Adding a New Op }\label{ux589eux52a0ux4e00ux4e2aux65b0-op}

Prerequisites:

\begin{itemize}
\tightlist
\item
  Some familiarity with C++.
\item
  You have \href{tensorflow-zh/SOURCE/get_started/introduction.md\#source}{downloaded the TensorFlow source} and are able to build it.
\end{itemize}

If the existing library does not cover an operation you need, you can create a custom Op. To make your custom Op work with the rest of the library, you need to:

\begin{itemize}
\tightlist
\item
  Register the new Op in a C++ file. The registration is independent of the implementation and describes how the Op is invoked. For example, it defines the name of the Op and specifies its inputs and outputs.
\item
  Implement the Op in C++. Each implementation is called a ``kernel'', and there can be multiple kernels for different architectures (CPU, GPU, etc.) or different input/output types.
\item
  Create a Python wrapper. This wrapper is the public API for creating the Op. A default wrapper is generated from the Op registration; you can use it directly or add your own.
\item
  (Optional) Write a function that computes gradients for the Op.
\item
  (Optional) Write a function that describes the input and output shapes of the Op. This allows shape inference to work with your Op.
\item
  Test the Op, typically in Python. If you define gradients, you can verify them with the Python \href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/python/kernel_tests/gradient_checker.py}{GradientChecker}.
\end{itemize}

\subsection{Contents}\label{ux5185ux5bb9}

\subsubsection{\texorpdfstring{\protect\hyperlink{AUTOGENERATED-adding-a-new-op}{Adding a New Op}}{Adding a New Op}}\label{ux589eux52a0ux4e00ux4e2aux65b0-op-1}

\begin{itemize}
\tightlist
\item
  \protect\hyperlink{defineux5finterface}{Define the Op's interface}
\item
  \protect\hyperlink{AUTOGENERATED-implement-the-kernel-for-the-op}{Implement the kernel for the Op}
\item
  \protect\hyperlink{AUTOGENERATED-generate-the-client-wrapper}{Generate the client wrapper}
\item
  \protect\hyperlink{AUTOGENERATED-the-python-op-wrapper}{The Python Op wrapper}
\item
  \protect\hyperlink{AUTOGENERATED-the-c---op-wrapper}{The C++ Op wrapper}
\item
  \protect\hyperlink{AUTOGENERATED-verify-it-works}{Verify it works}
\item
  \protect\hyperlink{Validation}{Validation}
\item
  \protect\hyperlink{AUTOGENERATED-op-registration}{Op registration}
\item
  \protect\hyperlink{Attrs}{Attrs}
\item
  \protect\hyperlink{AUTOGENERATED-attr-types}{Attr types}
\item
  \protect\hyperlink{Polymorphism}{Polymorphism}
\item
  \protect\hyperlink{AUTOGENERATED-inputs-and-outputs}{Inputs and outputs}
\item
  \protect\hyperlink{AUTOGENERATED-backwards-compatibility}{Backwards compatibility}
\item
  \protect\hyperlink{mult-archs}{GPU support}
\item
  \protect\hyperlink{AUTOGENERATED-implement-the-gradient-in-python}{Implement the gradient in Python}
\item
  \protect\hyperlink{AUTOGENERATED-implement-a-shape-function-in-python}{Implement a shape function in Python}
\end{itemize}

\subsection{Define the Op's interface }\label{ux5b9aux4e49-op-ux7684ux63a5ux53e3}

You define the interface of an Op by registering it with the TensorFlow system. In the registration you specify the name of the Op, its inputs (types and names) and outputs (types and names), as well as docstrings and any \protect\hyperlink{Attrs}{attrs} the Op might require.

To see how this works, create a simple Op as an example. It takes a tensor of \texttt{int32} values and outputs a copy of that tensor in which all elements except the first are set to 0. Create the file \texttt{tensorflow/core/user\_ops/zero\_out.cc} and add a call to the \texttt{REGISTER\_OP} macro that defines the interface of the Op.
\begin{verbatim}
#include "tensorflow/core/framework/op.h"

REGISTER_OP("ZeroOut")
    .Input("to_zero: int32")
    .Output("zeroed: int32");
\end{verbatim}

This \texttt{ZeroOut} Op takes one tensor \texttt{to\_zero} of 32-bit integers as input and outputs a tensor \texttt{zeroed} of 32-bit integers.

\begin{quote}
A note on naming: the name of the Op must be unique and written in CamelCase. Names starting with an underscore (\texttt{\_}) are reserved for internal use.
\end{quote}

\subsection{Implement the kernel for the Op }\label{ux4e3a-op-ux5b9eux73b0-kernel}

After you define the interface, provide one or more implementations of the Op. For each of these kernels, create a class that extends \texttt{OpKernel} and overrides the \texttt{Compute} method. The \texttt{Compute} method receives one argument \texttt{context} of type \texttt{OpKernelContext*}, through which you can access useful things such as the input and output tensors.

Add your kernel to the file you created above. The kernel might look something like this:

\begin{verbatim}
#include "tensorflow/core/framework/op_kernel.h"

using namespace tensorflow;

class ZeroOutOp : public OpKernel {
 public:
  explicit ZeroOutOp(OpKernelConstruction* context) : OpKernel(context) {}

  void Compute(OpKernelContext* context) override {
    // Grab the input tensor.
    const Tensor& input_tensor = context->input(0);
    auto input = input_tensor.flat<int32>();

    // Create an output tensor.
    Tensor* output_tensor = NULL;
    OP_REQUIRES_OK(context, context->allocate_output(0, input_tensor.shape(),
                                                     &output_tensor));
    auto output = output_tensor->template flat<int32>();

    // Set all but the first element of the output tensor to 0.
    const int N = input.size();
    for (int i = 1; i < N; i++) {
      output(i) = 0;
    }

    // Preserve the first input value if possible.
    if (N > 0) output(0) = input(0);
  }
};
\end{verbatim}

Once you have implemented your kernel, register it with the TensorFlow system. In the registration you specify the constraints under which this kernel runs. For example, you might have one kernel for CPUs and a separate one for GPUs.

To do this for the \texttt{ZeroOut} op, add the following to \texttt{zero\_out.cc}:

\begin{verbatim}
REGISTER_KERNEL_BUILDER(Name("ZeroOut").Device(DEVICE_CPU), ZeroOutOp);
\end{verbatim}

Once you \href{tensorflow-zh/SOURCE/get_started/os_setup.md\#create-pip}{build and reinstall TensorFlow}, the TensorFlow system can reference and use the Op when requested.
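
As a reading aid, the behaviour of the \texttt{Compute} method above can be modelled in a few lines of plain Python. This is only a reference model over a flat list, not the generated TensorFlow wrapper; the name \texttt{zero\_out\_reference} is hypothetical and introduced here for illustration.

```python
def zero_out_reference(values):
    """Return a copy of `values` with every element but the first set to 0.

    Mirrors the kernel logic: zero positions 1..N-1, then keep the first
    input value if the input is non-empty.
    """
    result = [0] * len(values)
    if values:
        result[0] = values[0]
    return result


print(zero_out_reference([5, 4, 3, 2, 1]))  # [5, 0, 0, 0, 0]
```

A model like this can serve as an oracle when you write tests for the real Op, since the expected output for any flat input is easy to compute by hand.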
\subsection{Generate the client wrapper }\label{ux751fux6210ux5ba2ux6237ux7aefux5305ux88c5ux5668}

\subsubsection{The Python Op wrapper }\label{python-op-ux5305ux88c5ux5668}

When you build TensorFlow, Python Op wrappers are created automatically in \texttt{bazel-genfiles/tensorflow/python/ops/gen\_user\_ops.py} for all Ops placed in the \href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/user_ops/}{\texttt{tensorflow/core/user\_ops}} directory. Those Ops are imported into \href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/python/user_ops/user_ops.py}{\texttt{tensorflow/python/user\_ops/user\_ops.py}} with the statement:

\begin{Shaded}
\begin{Highlighting}[]
\ImportTok{from} \NormalTok{tensorflow.python.ops.gen_user_ops }\ImportTok{import} \OperatorTok{*}
\end{Highlighting}
\end{Shaded}

You may optionally replace some of these functions with your own implementation. To do so, first hide the generated code for that Op by adding its name to the \texttt{hidden} list of the \texttt{"user\_ops"} rule in \href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/python/BUILD}{\texttt{tensorflow/python/BUILD}}.

\begin{Shaded}
\begin{Highlighting}[]
\NormalTok{tf_gen_op_wrapper_py(}
    \NormalTok{name }\OperatorTok{=} \StringTok{"user_ops"}\NormalTok{,}
    \NormalTok{hidden }\OperatorTok{=} \NormalTok{[}
        \StringTok{"Fact"}\NormalTok{,}
    \NormalTok{],}
    \NormalTok{require_shape_functions }\OperatorTok{=} \VariableTok{False}\NormalTok{,}
\NormalTok{)}
\end{Highlighting}
\end{Shaded}

List your Op next to \texttt{"Fact"}. Then add your replacement function to \href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/python/user_ops/user_ops.py}{\texttt{tensorflow/python/user\_ops/user\_ops.py}}. Typically your function calls the generated function to actually add the Op to the graph. The hidden version of the generated function lives in the \texttt{gen\_user\_ops} package and its name starts with an underscore (``\texttt{\_}'').
For example:

\begin{Shaded}
\begin{Highlighting}[]
\KeywordTok{def} \NormalTok{my_fact():}
  \CommentTok{"""Example of overriding the generated code for an Op."""}
  \ControlFlowTok{return} \NormalTok{gen_user_ops._fact()}
\end{Highlighting}
\end{Shaded}

\subsubsection{The C++ Op wrapper }\label{c-op-ux5305ux88c5ux5668}

When you build TensorFlow, C++ Op wrappers are created automatically for all Ops placed in the \href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/user_ops/}{\texttt{tensorflow/core/user\_ops}} directory. For example, Ops in \texttt{tensorflow/core/user\_ops/zero\_out.cc} generate wrappers in \texttt{bazel-genfiles/tensorflow/cc/ops/user\_ops.\{h,cc\}}.

All generated wrappers for user Ops are automatically imported into \href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/cc/ops/standard_ops.h}{\texttt{tensorflow/cc/ops/standard\_ops.h}} with the statement:

\begin{verbatim}
#include "tensorflow/cc/ops/user_ops.h"
\end{verbatim}

\subsection{Verify it works }\label{ux68c0ux67e5-op-ux80fdux5426ux6b63ux5e38ux5de5ux4f5c}

A good way to verify that you have implemented your Op successfully is to write a test for it. Create the file \texttt{tensorflow/python/kernel\_tests/zero\_out\_op\_test.py} with the following contents:

\begin{Shaded}
\begin{Highlighting}[]
\ImportTok{import} \NormalTok{tensorflow }\ImportTok{as} \NormalTok{tf}

\KeywordTok{class} \NormalTok{ZeroOutTest(tf.test.TestCase):}
  \KeywordTok{def} \NormalTok{testZeroOut(}\VariableTok{self}\NormalTok{):}
    \ControlFlowTok{with} \VariableTok{self}\NormalTok{.test_session():}
      \NormalTok{result }\OperatorTok{=} \NormalTok{tf.user_ops.zero_out([}\DecValTok{5}\NormalTok{, }\DecValTok{4}\NormalTok{, }\DecValTok{3}\NormalTok{, }\DecValTok{2}\NormalTok{, }\DecValTok{1}\NormalTok{])}
      \VariableTok{self}\NormalTok{.assertAllEqual(result.}\BuiltInTok{eval}\NormalTok{(), [}\DecValTok{5}\NormalTok{, }\DecValTok{0}\NormalTok{, }\DecValTok{0}\NormalTok{, }\DecValTok{0}\NormalTok{, }\DecValTok{0}\NormalTok{])}
\end{Highlighting}
\end{Shaded}

Then run your test:

\begin{verbatim}
$ bazel test tensorflow/python:zero_out_op_test
\end{verbatim}

\subsection{Validation }\label{ux9a8cux8bc1ux6761ux4ef6}

The example above assumes that the Op can be applied to a tensor of any shape. What if it should only apply to vectors?
That means adding a check to the OpKernel implementation above.

\begin{verbatim}
  void Compute(OpKernelContext* context) override {
    // Grab the input tensor.
    const Tensor& input_tensor = context->input(0);

    OP_REQUIRES(context, TensorShapeUtils::IsVector(input_tensor.shape()),
                errors::InvalidArgument("ZeroOut expects a 1-D vector."));
    // ...
  }
\end{verbatim}

\texttt{OP\_REQUIRES} asserts that the input is a vector; if it is not, it sets the \texttt{InvalidArgument} status and returns. The \href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/lib/core/errors.h}{\texttt{OP\_REQUIRES} macro} takes three arguments:

\begin{itemize}
\tightlist
\item
  The \texttt{context}, which can be either an \texttt{OpKernelContext} or an \texttt{OpKernelConstruction} pointer (see \href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/op_kernel.h}{\texttt{tensorflow/core/framework/op\_kernel.h}}); its \texttt{SetStatus()} method is used.
\item
  The condition. \href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/public/tensor_shape.h}{\texttt{tensorflow/core/public/tensor\_shape.h}} contains functions for validating the shape of a tensor.
\item
  The error produced when the condition does not hold. It is represented by a \texttt{Status} object, see \href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/public/status.h}{\texttt{tensorflow/core/public/status.h}}. A \texttt{Status} has a type (frequently \texttt{InvalidArgument}, though other types are possible) and a message. Functions for constructing an error can be found in \href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/lib/core/errors.h}{\texttt{tensorflow/core/lib/core/errors.h}}.
\end{itemize}

Alternatively, if you want to test whether a \texttt{Status} object returned from some function is an error, use \href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/lib/core/errors.h}{\texttt{OP\_REQUIRES\_OK}}. Both macros return from the enclosing function as soon as they detect an error.

\subsection{Op registration }\label{op-ux6ce8ux518c}

\subsubsection{Attrs }\label{ux5c5eux6027}

Ops can have attrs, whose values are set when the Op is added to a graph. Attrs are used to configure the Op, and their values can be accessed both in the kernel implementation and in the input and output types of the Op registration. Prefer an input over an attr whenever possible, since inputs are more flexible: they can change at every step, can be set using a feed, and so on. Attrs are for things that inputs cannot do, namely any configuration that affects the signature (the number or types of inputs or outputs) or that cannot change from step to step.
You define an attr when you register the Op, by specifying its name and type with the \texttt{Attr} method, which expects a spec of the form:

\begin{verbatim}
<name>: <attr-type-expr>
\end{verbatim}

\texttt{\textless{}name\textgreater{}} must begin with a letter and may consist of letters, digits, and underscores. \texttt{\textless{}attr-type-expr\textgreater{}} is a type expression of the form \protect\hyperlink{attr-types}{described below}.

For example, if you would like the \texttt{ZeroOut} Op to preserve a user-specified index instead of only the 0th element, you can register the Op like this:

\begin{verbatim}
REGISTER_OP("ZeroOut")
    .Attr("preserve_index: int")
    .Input("to_zero: int32")
    .Output("zeroed: int32");
\end{verbatim}

Your kernel can then access this attr in its constructor through the \texttt{context} parameter:

\begin{verbatim}
class ZeroOutOp : public OpKernel {
 public:
  explicit ZeroOutOp(OpKernelConstruction* context) : OpKernel(context) {
    // Get the index of the value to preserve.
    OP_REQUIRES_OK(context,
                   context->GetAttr("preserve_index", &preserve_index_));
    // Check that preserve_index is non-negative.
    OP_REQUIRES(context, preserve_index_ >= 0,
                errors::InvalidArgument("Need preserve_index >= 0, got ",
                                        preserve_index_));
  }
  void Compute(OpKernelContext* context) override {
    // ...
  }
 private:
  int preserve_index_;
};
\end{verbatim}

The value can then be used in the \texttt{Compute} method:

\begin{verbatim}
  void Compute(OpKernelContext* context) override {
    // ...

    // Check that preserve_index is in range.
    OP_REQUIRES(context, preserve_index_ < input.dimension(0),
                errors::InvalidArgument("preserve_index out of range"));

    // Set all the elements of the output tensor to 0.
    const int N = input.size();
    for (int i = 0; i < N; i++) {
      output_flat(i) = 0;
    }

    // Preserve the requested input value.
    output_flat(preserve_index_) = input(preserve_index_);
  }
\end{verbatim}

\begin{quote}
To preserve \protect\hyperlink{backwards-compatibility}{backwards compatibility}, you must specify a \protect\hyperlink{default-values-constraints}{default value} when adding an attr to an existing Op:
\end{quote}

\begin{verbatim}
REGISTER_OP("ZeroOut")
    .Attr("preserve_index: int = 0")
    .Input("to_zero: int32")
    .Output("zeroed: int32");
\end{verbatim}

\subsubsection{Attr types }\label{ux5c5eux6027ux7c7bux578b}

The following types are supported in an attr:

\begin{itemize}
\tightlist
\item
  \texttt{string}: any sequence of bytes (not required to be UTF8).
\item
  \texttt{int}: a signed integer.
\item
  \texttt{float}: a floating point number.
\item
  \texttt{bool}: true or false.
\item
  \texttt{type}: one of the (non-ref) values of \href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/types.cc\#DataTypeString}{\texttt{DataType}}.
\item
  \texttt{shape}: a \href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/tensor_shape.proto}{\texttt{TensorShapeProto}}.
\item
  \texttt{tensor}: a \href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/tensor.proto}{\texttt{TensorProto}}.
\item
  \texttt{list(\textless{}type\textgreater{})}: a list of \texttt{\textless{}type\textgreater{}}, where \texttt{\textless{}type\textgreater{}} is one of the types above. Note that \texttt{list(list(\textless{}type\textgreater{}))} is invalid.
\end{itemize}

For the definitive list, see \href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/op_def_builder.cc\#FinalizeAttr}{\texttt{op\_def\_builder.cc:FinalizeAttr}}.

\paragraph{Default values and constraints }\label{ux9ed8ux8ba4ux503cux548cux7ea6ux675fux6761ux4ef6}

Attrs may have default values, and some types of attrs can have constraints. To define an attr with constraints, use one of the following forms of \texttt{\textless{}attr-type-expr\textgreater{}}:

\begin{itemize}
\tightlist
\item
  \texttt{\{\textquotesingle{}\textless{}string1\textgreater{}\textquotesingle{},\ \textquotesingle{}\textless{}string2\textgreater{}\textquotesingle{}\}}: the value must be a string that is either \texttt{\textless{}string1\textgreater{}} or \texttt{\textless{}string2\textgreater{}}. The type \texttt{string} is implied by this syntax. The following emulates an enum:
\end{itemize}

\begin{verbatim}
REGISTER_OP("EnumExample")
    .Attr("e: {'apple', 'orange'}");
\end{verbatim}

\begin{itemize}
\tightlist
\item
  \texttt{\{\textless{}type1\textgreater{},\ \textless{}type2\textgreater{}\}}: the value is of type \texttt{type} and must be one of \texttt{\textless{}type1\textgreater{}} or \texttt{\textless{}type2\textgreater{}}, where \texttt{\textless{}type1\textgreater{}} and \texttt{\textless{}type2\textgreater{}} are supported \href{tensorflow-zh/SOURCE/resources/dims_types.md\#data-types}{tensor types}. You do not state that the type of the attr is \texttt{type}; this is implied by the list of types in \texttt{\{...\}}.
例如, 在下面的例子里, 属性 \texttt{t} 的类型必须为 \texttt{int32}, \texttt{float}, 或 \texttt{bool}: \end{itemize} \begin{verbatim} REGISTER_OP("RestrictedTypeExample") .Attr("t: {int32, float, bool}"); \end{verbatim} \begin{itemize} \item 这里有一些常见类型约束条件的快捷方式: \item \texttt{numbertype}: 限制类型为数字类型, 即非 string 非 bool 的类型. \item \texttt{realnumbertype}: 与 \texttt{numbertype} 区别是不支持复杂类型. \item \texttt{quantizedtype}: 与 \texttt{numbertype} 区别是只支持量化数值 (quantized number type). \end{itemize} 这些类型的列表在 \href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/types.h}{\texttt{tensorflow/core/framework/types.h}} 文件中通过函数定义 (如 \texttt{NumberTypes()}). 本例中属性 \texttt{t} 必须为某种数字类型: \begin{verbatim} REGISTER_OP("NumberType") .Attr("t: numbertype"); \end{verbatim} 对于这个 Op: \begin{Shaded} \begin{Highlighting}[] \NormalTok{tf.number_type(t}\OperatorTok{=}\NormalTok{tf.int32) }\CommentTok{# 有效} \NormalTok{tf.number_type(t}\OperatorTok{=}\NormalTok{tf.}\BuiltInTok{bool}\NormalTok{) }\CommentTok{# 无效} \end{Highlighting} \end{Shaded} \begin{itemize} \tightlist \item \texttt{int\ \textgreater{}=\ \textless{}n\textgreater{}}: 值必须是一个整数, 且取值大于等于 \texttt{\textless{}n\textgreater{}}, \texttt{\textless{}n\textgreater{}} 是一个自然数. \end{itemize} 例如, 下列 Op 注册操作指定了属性 \texttt{a} 的取值至少为 \texttt{2}. \begin{verbatim} REGISTER_OP("MinIntExample") .Attr("a: int >= 2"); \end{verbatim} \begin{itemize} \tightlist \item \texttt{list(\textless{}type\textgreater{})\ \textgreater{}=\ \textless{}n\textgreater{}}: 一个 \texttt{\textless{}type\textgreater{}} 类型列表, 列表长度必须大于等于 \texttt{\textless{}n\textgreater{}}. \end{itemize} 例如, 下面的 Op 注册操作指定属性 \texttt{a} 是一个列表, 列表中的元素类型是 \texttt{int32} 或 \texttt{float}列表长度至少为3. 
\begin{verbatim}
REGISTER_OP("TypeListExample")
    .Attr("a: list({int32, float}) >= 3");
\end{verbatim}

To set a default value for an attribute (making it optional in the generated code), add \texttt{=\ \textless{}default\textgreater{}} to the end of the constraint, as in:

\begin{verbatim}
REGISTER_OP("AttrDefaultExample")
    .Attr("i: int = 0");
\end{verbatim}

The supported syntax of the default value is the one used in the protobuf representation of the resulting GraphDef definition. Here are examples of how to specify a default for each type:

\begin{verbatim}
REGISTER_OP("AttrDefaultExampleForAllTypes")
    .Attr("s: string = 'foo'")
    .Attr("i: int = 0")
    .Attr("f: float = 1.0")
    .Attr("b: bool = true")
    .Attr("ty: type = DT_INT32")
    .Attr("sh: shape = { dim { size: 1 } dim { size: 2 } }")
    .Attr("te: tensor = { dtype: DT_INT32 int_val: 5 }")
    .Attr("l_empty: list(int) = []")
    .Attr("l_int: list(int) = [2, 3, 5, 7]");
\end{verbatim}

Note in particular that values of type \texttt{type} use the \href{tensorflow-zh/SOURCE/resources/dims_types.md\#data-types}{\texttt{DT\_*} names}.

\subsubsection{Polymorphism }\label{ux591aux6001}

\hypertarget{type-polymorphism}{\paragraph{Type Polymorphism }\label{type-polymorphism}}

For Ops that can take different types as input or produce different output types, you can specify an \protect\hyperlink{attrs}{attribute} in an input or output type in the Op registration. Typically you would then register an \texttt{OpKernel} for each supported type.

For example, if you would like the \texttt{ZeroOut} Op to work on \texttt{float} in addition to \texttt{int32}, the Op registration might look like this:

\begin{verbatim}
REGISTER_OP("ZeroOut")
    .Attr("T: {float, int32}")
    .Input("to_zero: T")
    .Output("zeroed: T");
\end{verbatim}

This Op registration now specifies that the input's type must be \texttt{float} or \texttt{int32}, and that the output has the same type, since both are declared with type \texttt{T}.

\begin{quote}
\hypertarget{naming}{A note on naming:} Inputs, outputs, and attributes generally should be given snake\_case names. The one exception is attributes that are used as the type of an input or in the type of an input. Those attributes can be inferred when the Op is added to the graph and so do not appear in the Op's function. For example, this last definition of ZeroOut generates a Python function that looks like this:
\end{quote}

\begin{Shaded}
\begin{Highlighting}[]
\KeywordTok{def} \NormalTok{zero_out(to_zero, name}\OperatorTok{=}\VariableTok{None}\NormalTok{):}
\CommentTok{  """...}
\CommentTok{  Args:}
\CommentTok{    to_zero: A `Tensor`. Must be one of the following types:}
\CommentTok{        `float32`, `int32`.}
\CommentTok{    name: A name for the operation (optional).}

\CommentTok{  Returns:}
\CommentTok{    A `Tensor`. Has the same type as `to_zero`.}
\CommentTok{  """}
\end{Highlighting}
\end{Shaded}

\begin{quote}
If \texttt{to\_zero} is passed an \texttt{int32} tensor, then \texttt{T} is automatically set to \texttt{int32} (well, actually \texttt{DT\_INT32}). Those inferred attributes are given capitalized or CamelCase names.

Compare this with an Op whose type attribute determines the output type:
\end{quote}

\begin{verbatim}
REGISTER_OP("StringToNumber")
    .Input("string_tensor: string")
    .Output("output: out_type")
    .Attr("out_type: {float, int32}")
    .Doc(R"doc(
Converts each string in the input Tensor to the specified numeric type.
)doc");
\end{verbatim}

\begin{quote}
In this case, the user has to specify the output type, as in the generated Python:
\end{quote}

\begin{Shaded}
\begin{Highlighting}[]
\KeywordTok{def} \NormalTok{string_to_number(string_tensor, out_type}\OperatorTok{=}\VariableTok{None}\NormalTok{, name}\OperatorTok{=}\VariableTok{None}\NormalTok{):}
\CommentTok{  """Converts each string in the input Tensor to the specified numeric type.}

\CommentTok{  Args:}
\CommentTok{    string_tensor: A `Tensor` of type `string`.}
\CommentTok{    out_type: An optional `tf.DType` from: `tf.float32, tf.int32`.}
\CommentTok{      Defaults to `tf.float32`.}
\CommentTok{    name: A name for the operation (optional).}

\CommentTok{  Returns:}
\CommentTok{    A `Tensor` of type `out_type`.}
\CommentTok{  """}
\end{Highlighting}
\end{Shaded}

\begin{verbatim}
#include "tensorflow/core/framework/op_kernel.h"

class ZeroOutInt32Op : public OpKernel {
  // as before
};

class ZeroOutFloatOp : public OpKernel {
 public:
  explicit ZeroOutFloatOp(OpKernelConstruction* context)
      : OpKernel(context) {}

  void Compute(OpKernelContext* context) override {
    // Grab the input tensor
    const Tensor& input_tensor = context->input(0);
    auto input = input_tensor.flat<float>();

    // Create an output tensor
    Tensor* output = NULL;
    OP_REQUIRES_OK(context,
                   context->allocate_output(0, input_tensor.shape(), &output));
    auto output_flat = output->template flat<float>();

    // Set all the elements of the output tensor to 0
    const int N = input.size();
    for (int i = 0; i < N; i++) {
      output_flat(i) = 0;
    }
    // Preserve the first input value
    if (N > 0) output_flat(0) = input(0);
  }
};

// Note that TypeConstraint<int32>("T") means that attr "T" (defined
// in the Op registration above) must be "int32" to use this template
// instantiation.
REGISTER_KERNEL_BUILDER(
    Name("ZeroOut")
    .Device(DEVICE_CPU)
    .TypeConstraint<int32>("T"),
    ZeroOutInt32Op);
REGISTER_KERNEL_BUILDER(
    Name("ZeroOut")
    .Device(DEVICE_CPU)
    .TypeConstraint<float>("T"),
    ZeroOutFloatOp);
\end{verbatim}

\begin{quote}
To preserve \protect\hyperlink{backwards-compatibility}{backwards compatibility}, you must specify a \protect\hyperlink{default-values-constraints}{default value} when adding an attribute to an existing Op:
\end{quote}

\begin{verbatim}
REGISTER_OP("ZeroOut")
    .Attr("T: {float, int32} = DT_INT32")
    .Input("to_zero: T")
    .Output("zeroed: T");
\end{verbatim}

Suppose you want to add more types, say \texttt{double}:

\begin{verbatim}
REGISTER_OP("ZeroOut")
    .Attr("T: {float, double, int32}")
    .Input("to_zero: T")
    .Output("zeroed: T");
\end{verbatim}

Instead of writing another \texttt{OpKernel} with redundant code for each new type, you can usually use a C++ template. You still need one kernel registration (a \texttt{REGISTER\_KERNEL\_BUILDER} call) per overload.

\begin{verbatim}
template <typename T>
class ZeroOutOp : public OpKernel {
 public:
  explicit ZeroOutOp(OpKernelConstruction* context) : OpKernel(context) {}

  void Compute(OpKernelContext* context) override {
    // Grab the input tensor
    const Tensor& input_tensor = context->input(0);
    auto input = input_tensor.flat<T>();

    // Create an output tensor
    Tensor* output = NULL;
    OP_REQUIRES_OK(context,
                   context->allocate_output(0, input_tensor.shape(), &output));
    auto output_flat = output->template flat<T>();

    // Set all the elements of the output tensor to 0
    const int N = input.size();
    for (int i = 0; i < N; i++) {
      output_flat(i) = 0;
    }

    // Preserve the first input value
    if (N > 0) output_flat(0) = input(0);
  }
};

// Note that TypeConstraint<int32>("T") means that attr "T" (defined
// in the Op registration above) must be "int32" to use this template
// instantiation.
REGISTER_KERNEL_BUILDER(
    Name("ZeroOut")
    .Device(DEVICE_CPU)
    .TypeConstraint<int32>("T"),
    ZeroOutOp<int32>);
REGISTER_KERNEL_BUILDER(
    Name("ZeroOut")
    .Device(DEVICE_CPU)
    .TypeConstraint<float>("T"),
    ZeroOutOp<float>);
REGISTER_KERNEL_BUILDER(
    Name("ZeroOut")
    .Device(DEVICE_CPU)
    .TypeConstraint<double>("T"),
    ZeroOutOp<double>);
\end{verbatim}

If you have more than a couple of overloads, you can put the registration in a macro.

\begin{verbatim}
#include "tensorflow/core/framework/op_kernel.h"

#define REGISTER_KERNEL(type)                                       \
  REGISTER_KERNEL_BUILDER(                                          \
      Name("ZeroOut").Device(DEVICE_CPU).TypeConstraint<type>("T"), \
      ZeroOutOp<type>)

REGISTER_KERNEL(int32);
REGISTER_KERNEL(float);
REGISTER_KERNEL(double);

#undef REGISTER_KERNEL
\end{verbatim}

Depending on the list of types you are registering the kernel for, you may be able to use a macro provided by \href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/register_types.h}{\texttt{tensorflow/core/framework/register\_types.h}}:

\begin{verbatim}
#include "tensorflow/core/framework/op_kernel.h"
#include "tensorflow/core/framework/register_types.h"

REGISTER_OP("ZeroOut")
    .Attr("T: realnumbertype")
    .Input("to_zero: T")
    .Output("zeroed: T");

template <typename T>
class ZeroOutOp : public OpKernel { ... };

#define REGISTER_KERNEL(type)                                       \
  REGISTER_KERNEL_BUILDER(                                          \
      Name("ZeroOut").Device(DEVICE_CPU).TypeConstraint<type>("T"), \
      ZeroOutOp<type>)

TF_CALL_REAL_NUMBER_TYPES(REGISTER_KERNEL);

#undef REGISTER_KERNEL
\end{verbatim}

\paragraph{List inputs and outputs }\label{ux5217ux8868ux8f93ux5165ux548cux8f93ux51fa}

In addition to accepting or producing different types, Ops can consume or produce a variable number of tensors.

In the next example, the attribute \texttt{T} holds a \emph{list} of types and is used as the type of both the input \texttt{in} and the output \texttt{out}. The input and output are lists of tensors of those types, and because both have type \texttt{T}, the number and types of tensors in the output match the input.

\begin{verbatim}
REGISTER_OP("PolymorphicListExample")
    .Attr("T: list(type)")
    .Input("in: T")
    .Output("out: T");
\end{verbatim}

You can also place restrictions on which types may appear in the list. In the next case, the input is a list of \texttt{float} and \texttt{double} tensors. The Op accepts, for example, input types \texttt{(float,\ double,\ float)}, and in that case the output type would also be \texttt{(float,\ double,\ float)}.
\begin{verbatim}
REGISTER_OP("ListTypeRestrictionExample")
    .Attr("T: list({float, double})")
    .Input("in: T")
    .Output("out: T");
\end{verbatim}

If you want all the tensors in a list to be of the same type, you might write something like:

\begin{verbatim}
REGISTER_OP("IntListInputExample")
    .Attr("N: int")
    .Input("in: N * int32")
    .Output("out: int32");
\end{verbatim}

This accepts a list of \texttt{int32} tensors and uses an \texttt{int} attribute \texttt{N} to specify the length of the list.

This can be made \protect\hyperlink{type-polymorphism}{type polymorphic} as well. In the next example, the input is a list of tensors (of length \texttt{"N"}) of the same (but unspecified) type \texttt{"T"}, and the output is a single tensor of matching type:

\begin{verbatim}
REGISTER_OP("SameListInputExample")
    .Attr("N: int")
    .Attr("T: type")
    .Input("in: N * T")
    .Output("out: T");
\end{verbatim}

By default, tensor lists have a minimum length of 1. You can change that default \protect\hyperlink{default-values-constraints}{by adding a \texttt{"\textgreater{}="} constraint to the corresponding attribute}:

\begin{verbatim}
REGISTER_OP("MinLengthIntListExample")
    .Attr("N: int >= 2")
    .Input("in: N * int32")
    .Output("out: int32");
\end{verbatim}

The same syntax works with \texttt{"list(type)"} attributes:

\begin{verbatim}
REGISTER_OP("MinimumLengthPolymorphicListExample")
    .Attr("T: list(type) >= 3")
    .Input("in: T")
    .Output("out: T");
\end{verbatim}

\subsubsection{Inputs and outputs }\label{ux8f93ux5165ux548cux8f93ux51fa}

To summarize the above, an Op registration can have multiple inputs and outputs:

\begin{verbatim}
REGISTER_OP("MultipleInsAndOuts")
    .Input("y: int32")
    .Input("z: float")
    .Output("a: string")
    .Output("b: int32");
\end{verbatim}

Each input or output spec has the form:

\begin{verbatim}
<name>: <io-type-expr>
\end{verbatim}

where \texttt{\textless{}name\textgreater{}} begins with a letter and may consist only of letters, digits, and underscores. \texttt{\textless{}io-type-expr\textgreater{}} is one of the following type expressions:

\begin{itemize}
\tightlist
\item
  \texttt{\textless{}type\textgreater{}}, a supported input type such as \texttt{float}, \texttt{int32}, or \texttt{string}. This specifies a single tensor of the given type.
\end{itemize}

See the \href{tensorflow-zh/SOURCE/resources/dims_types.md\#data-types}{list of supported Tensor types}.
\begin{verbatim}
REGISTER_OP("BuiltInTypesExample")
    .Input("integers: int32")
    .Input("complex_numbers: complex64");
\end{verbatim}

\begin{itemize}
\tightlist
\item
  \texttt{\textless{}attr-type\textgreater{}}, the name of an \protect\hyperlink{attrs}{attribute} whose type is \texttt{type} or \texttt{list(type)} (possibly with a type restriction). This syntax allows for \protect\hyperlink{Polymorphism}{polymorphic Ops}.
\end{itemize}

\begin{verbatim}
REGISTER_OP("PolymorphicSingleInput")
    .Attr("T: type")
    .Input("in: T");

REGISTER_OP("RestrictedPolymorphicSingleInput")
    .Attr("T: {int32, int64}")
    .Input("in: T");
\end{verbatim}

Referencing an attribute of type \texttt{list(type)} lets you accept a sequence of tensors.

\begin{verbatim}
REGISTER_OP("ArbitraryTensorSequenceExample")
    .Attr("T: list(type)")
    .Input("in: T")
    .Output("out: T");

REGISTER_OP("RestrictedTensorSequenceExample")
    .Attr("T: list({int32, int64})")
    .Input("in: T")
    .Output("out: T");
\end{verbatim}

Note that the number and types of tensors in the output \texttt{out} are the same as in the input \texttt{in}, since both are of type \texttt{T}.

\begin{itemize}
\tightlist
\item
  \texttt{\textless{}number\textgreater{}\ *\ \textless{}type\textgreater{}}, a sequence of tensors that all have the same type, where \texttt{\textless{}number\textgreater{}} is the name of an attribute of type \texttt{int}. \texttt{\textless{}type\textgreater{}} can be either \href{tensorflow-zh/SOURCE/resources/dims_types.md\#data-types}{a specific type such as \texttt{int32} or \texttt{float}}, or the name of an attribute of type \texttt{type}. As an example of the first, this Op accepts a list of \texttt{int32} tensors:
\end{itemize}

\begin{verbatim}
REGISTER_OP("Int32SequenceExample")
    .Attr("NumTensors: int")
    .Input("in: NumTensors * int32");
\end{verbatim}

As an example of the second, this Op accepts a list of tensors of any type, as long as they are all the same:

\begin{verbatim}
REGISTER_OP("SameTypeSequenceExample")
    .Attr("NumTensors: int")
    .Attr("T: type")
    .Input("in: NumTensors * T");
\end{verbatim}

\begin{itemize}
\tightlist
\item
  A reference to a tensor is written \texttt{Ref(\textless{}type\textgreater{})}, where \texttt{\textless{}type\textgreater{}} is one of the previous types.
\end{itemize}

\begin{quote}
A note on naming: any attribute used in the type of an input will be inferred. By convention those inferred attributes use capitalized names (such as \texttt{T} or \texttt{N}). Otherwise, inputs, outputs, and attributes are named like function parameters. See the earlier \protect\hyperlink{naming}{note on naming} for more details.
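\end{quote}

For more details, see \href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/op_def_builder.h}{\texttt{tensorflow/core/framework/op\_def\_builder.h}}.

\subsubsection{Backwards compatibility }\label{ux5411ux540eux517cux5bb9ux6027}

In general, changes to specifications must be backwards compatible: changing the specification of an Op must not break serialized GraphDefs constructed from older specifications.

There are several ways to preserve backwards compatibility:

\begin{enumerate}
\def\labelenumi{\arabic{enumi}.}
\tightlist
\item
  Any new attribute added to an Op must have a default value, and with that default value the Op must keep its original behavior. To change an operation from non-polymorphic to polymorphic, you \emph{must} give the new type attribute a default value so that the original signature is preserved. For example, if your operation was:
\end{enumerate}

\begin{verbatim}
REGISTER_OP("MyGeneralUnaryOp")
    .Input("in: float")
    .Output("out: float");
\end{verbatim}

you can make it polymorphic in a backwards-compatible way using:

\begin{verbatim}
REGISTER_OP("MyGeneralUnaryOp")
    .Input("in: T")
    .Output("out: T")
    .Attr("T: numbertype = DT_FLOAT");
\end{verbatim}

\begin{enumerate}
\def\labelenumi{\arabic{enumi}.}
\setcounter{enumi}{1}
\tightlist
\item
  You can safely make a constraint on an attribute less restrictive. For example, you can change \texttt{\{int32,\ int64\}} to \texttt{\{int32,\ int64,\ float\}}, or \texttt{\{"apple",\ "orange"\}} to \texttt{\{"apple",\ "banana",\ "orange"\}}.
\item
  Namespace any new Ops you create by prefixing the Op names with something unique to your project. This avoids collisions between your Op and any built-in Ops that might be added in future versions of TensorFlow.
\item
  Plan ahead! Try to anticipate future uses of the Op. Some signature changes cannot be done in a compatible way (for example, adding a new input, or turning a single-tensor input into a list).
\end{enumerate}

If you cannot change an operation in a backwards-compatible way, create a new operation with a new name and the new semantics.

\subsection{GPU Support }\label{gpu-ux652fux6301}

You can implement different OpKernels and register one for the CPU and another for the GPU, just as you can \protect\hyperlink{Polymorphism}{register kernels for different types}. There are several examples of kernels with GPU support in \href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/kernels/}{\texttt{tensorflow/core/kernels/}}. Notice that some kernels have a CPU version in a \texttt{.cc} file, a GPU version in a file ending in \texttt{\_gpu.cu.cc}, and shared code in a \texttt{.h} file.

For example, the \href{tensorflow-zh/SOURCE/api_docs/python/array_ops.md\#pad}{\texttt{pad} op} has everything but the GPU kernel in \href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/kernels/pad_op.cc}{\texttt{tensorflow/core/kernels/pad\_op.cc}}.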
The GPU kernel is in \href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/kernels/pad_op_gpu.cu.cc}{\texttt{tensorflow/core/kernels/pad\_op\_gpu.cu.cc}}, and the shared code is a templated class defined in \href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/kernels/pad_op.h}{\texttt{tensorflow/core/kernels/pad\_op.h}}. One thing to note: even when the GPU kernel version of \texttt{pad} is used, it still needs its \texttt{"paddings"} input in CPU (host) memory. To mark that an input or output is kept in host memory, add a \texttt{HostMemory()} call to the kernel registration, for example:

\begin{verbatim}
#define REGISTER_GPU_KERNEL(T)                         \
  REGISTER_KERNEL_BUILDER(Name("Pad")                  \
                              .Device(DEVICE_GPU)      \
                              .TypeConstraint<T>("T")  \
                              .HostMemory("paddings"), \
                          PadOp<GPUDevice, T>)
\end{verbatim}

\subsection{Implement the gradient in Python }\label{ux4f7fux7528-python-ux5b9eux73b0ux68afux5ea6}

Given a graph of Ops, TensorFlow uses automatic differentiation (backpropagation) to add new Ops representing gradients with respect to the existing Ops (see \href{tensorflow-zh/SOURCE/api_docs/python/train.md\#gradient-computation}{Gradient Computation}). To make automatic differentiation work for a new Op, you must register a gradient function which computes gradients with respect to the Op's inputs, given gradients with respect to the Op's outputs.

Mathematically, if an Op computes \(y = f(x)\), the registered gradient Op converts gradients \(\partial / \partial y\) with respect to \(y\) into gradients \(\partial / \partial x\) with respect to \(x\) via the chain rule:

\[\frac{\partial}{\partial x} = \frac{\partial}{\partial y} \frac{\partial y}{\partial x} = \frac{\partial}{\partial y} \frac{\partial f}{\partial x}.\]

In the case of \texttt{ZeroOut}, only one entry of the input affects the output, so the gradient with respect to the input is a sparse tensor with a single nonzero entry.
This is expressed as follows:

\begin{Shaded}
\begin{Highlighting}[]
\ImportTok{from} \NormalTok{tensorflow.python.framework }\ImportTok{import} \NormalTok{ops}
\ImportTok{from} \NormalTok{tensorflow.python.ops }\ImportTok{import} \NormalTok{array_ops}
\ImportTok{from} \NormalTok{tensorflow.python.ops }\ImportTok{import} \NormalTok{sparse_ops}

\AttributeTok{@ops.RegisterGradient}\NormalTok{(}\StringTok{"ZeroOut"}\NormalTok{)}
\KeywordTok{def} \NormalTok{_zero_out_grad(op, grad):}
\CommentTok{  """The gradients for `zero_out`.}

\CommentTok{  Args:}
\CommentTok{    op: The `zero_out` `Operation` that we are differentiating, which we can use}
\CommentTok{      to find the inputs and outputs of the original op.}
\CommentTok{    grad: Gradient with respect to the output of the `zero_out` op.}

\CommentTok{  Returns:}
\CommentTok{    Gradients with respect to the input of `zero_out`.}
\CommentTok{  """}
\NormalTok{  to_zero }\OperatorTok{=} \NormalTok{op.inputs[}\DecValTok{0}\NormalTok{]}
\NormalTok{  shape }\OperatorTok{=} \NormalTok{array_ops.shape(to_zero)}
\NormalTok{  index }\OperatorTok{=} \NormalTok{array_ops.zeros_like(shape)}
\NormalTok{  first_grad }\OperatorTok{=} \NormalTok{array_ops.reshape(grad, [}\OperatorTok{-}\DecValTok{1}\NormalTok{])[}\DecValTok{0}\NormalTok{]}
\NormalTok{  to_zero_grad }\OperatorTok{=} \NormalTok{sparse_ops.sparse_to_dense(index, shape, first_grad, }\DecValTok{0}\NormalTok{)}
  \ControlFlowTok{return} \NormalTok{[to_zero_grad]  }\CommentTok{# List of one Tensor, since we have one input}
\end{Highlighting}
\end{Shaded}

Details about registering gradient functions with \href{tensorflow-zh/SOURCE/api_docs/python/framework.md\#RegisterGradient}{\texttt{ops.RegisterGradient}}:

\begin{itemize}
\item
  For an Op with one output, the gradient function takes an \href{tensorflow-zh/SOURCE/api_docs/python/framework.md\#Operation}{\texttt{Operation}} \texttt{op} and a \href{tensorflow-zh/SOURCE/api_docs/python/framework.md\#Tensor}{\texttt{Tensor}} \texttt{grad}, and builds new Ops out of the tensors \href{tensorflow-zh/SOURCE/api_docs/python/framework.md\#Operation.inputs}{\texttt{op.inputs{[}i{]}}}, \href{tensorflow-zh/SOURCE/api_docs/python/framework.md\#Operation.outputs}{\texttt{op.outputs{[}i{]}}}, and \texttt{grad}.
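  Information about any attributes can be found via \href{tensorflow-zh/SOURCE/api_docs/python/framework.md\#Operation.get_attr}{\texttt{op.get\_attr}}.
\item
  If the Op has multiple outputs, the gradient function takes \texttt{op} and \texttt{grads}, where \texttt{grads} is a list of gradients with respect to each output. The result of the gradient function must be a list of \texttt{Tensor} objects representing the gradients with respect to each input.
\item
  If there is no well-defined gradient for some input, such as an integer input used as an index, the corresponding returned gradient should be \texttt{None}. For example, for an Op taking a floating point tensor \texttt{x} and an integer index \texttt{i}, the gradient function would return \texttt{{[}x\_grad,\ None{]}}.
\item
  If there is no meaningful gradient for the Op at all, use \texttt{ops.NoGradient("OpName")} to disable automatic differentiation.
\end{itemize}

Note that at the time the gradient function is called, only the data flow graph of Ops is available, not the tensor data itself. All computation must therefore be expressed with other TensorFlow Ops, which run at graph execution time.

\subsection{Implement a shape function in Python }\label{ux5728-python-ux4e2dux5b9eux73b0ux4e00ux4e2aux5f62ux72b6ux51fdux6570}

The TensorFlow Python API has a feature called ``shape inference'' that provides information about the shapes of tensors without having to execute the graph. Shape inference is supported by ``shape functions'' that are registered for each Op type and perform two roles: asserting that the shapes of the inputs are compatible, and specifying the shapes of the outputs. A shape function takes an \href{tensorflow-zh/SOURCE/api_docs/python/framework.md\#Operation}{\texttt{Operation}} as input and returns a list of \href{tensorflow-zh/SOURCE/api_docs/python/framework.md\#TensorShape}{\texttt{TensorShape}} objects (one per output of the Op). To register a shape function, apply the \href{tensorflow-zh/SOURCE/api_docs/python/framework.md\#RegisterShape}{\texttt{tf.RegisterShape} decorator} to it. For example, the \protect\hyperlink{defineux5finterface}{\texttt{ZeroOut} Op defined above} would have a shape function like the following:

\begin{Shaded}
\begin{Highlighting}[]
\AttributeTok{@tf.RegisterShape}\NormalTok{(}\StringTok{"ZeroOut"}\NormalTok{)}
\KeywordTok{def} \NormalTok{_zero_out_shape(op):}
\CommentTok{  """Shape function for the ZeroOut op.}

\CommentTok{  This is the unconstrained version of ZeroOut, which produces an output}
\CommentTok{  with the same shape as its input.}
\CommentTok{  """}
  \ControlFlowTok{return} \NormalTok{[op.inputs[}\DecValTok{0}\NormalTok{].get_shape()]}
\end{Highlighting}
\end{Shaded}

A shape function can also constrain the shape of an input.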
For the version of \protect\hyperlink{Validation}{\texttt{ZeroOut} with a vector input constraint}, the shape function would be as follows:

\begin{Shaded}
\begin{Highlighting}[]
\AttributeTok{@tf.RegisterShape}\NormalTok{(}\StringTok{"ZeroOut"}\NormalTok{)}
\KeywordTok{def} \NormalTok{_zero_out_shape(op):}
\CommentTok{  """Shape function for the ZeroOut op.}

\CommentTok{  This is the constrained version of ZeroOut, which requires the input to}
\CommentTok{  have rank 1 (i.e. to be a vector).}
\CommentTok{  """}
\NormalTok{  input_shape }\OperatorTok{=} \NormalTok{op.inputs[}\DecValTok{0}\NormalTok{].get_shape().with_rank(}\DecValTok{1}\NormalTok{)}
  \ControlFlowTok{return} \NormalTok{[input_shape]}
\end{Highlighting}
\end{Shaded}

If your Op is \protect\hyperlink{Polymorphism}{polymorphic with multiple inputs}, use the properties of the operation to determine the number of shapes to check:

\begin{verbatim}
@tf.RegisterShape("IntListInputExample")
def _int_list_input_example_shape(op):
  """Shape function for the "IntListInputExample" op.

  All inputs and the output are matrices of the same size.
  """
  output_shape = tf.TensorShape(None)
  for input in op.inputs:
    output_shape = output_shape.merge_with(input.get_shape().with_rank(2))
  return [output_shape]
\end{verbatim}

Since shape inference is an optional feature, and the shapes of tensors may vary dynamically, shape functions must be robust to incomplete shape information for any of the inputs. The \href{tensorflow-zh/SOURCE/api_docs/python/framework.md}{\texttt{merge\_with}} method lets the caller assert that two shapes are the same, even if either or both of them are only partially known. Shape functions are defined for all of the \href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/python/ops/}{standard Python Ops}, and provide many different usage examples.
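To make the idea of merging partially known shapes concrete, here is a minimal pure-Python sketch. It does not use TensorFlow: \texttt{merge\_dims} and \texttt{merge\_shapes} are hypothetical helpers written only for this illustration. A shape is modelled as a list of sizes with \texttt{None} for an unknown dimension, and a whole shape of \texttt{None} stands for unknown rank. It is meant to mirror the behaviour described for \texttt{merge\_with}, not to reproduce the library's implementation.

\begin{verbatim}
def merge_dims(a, b):
    """Merge two dimension sizes; None means "unknown" (assumed convention)."""
    if a is None:
        return b
    if b is None:
        return a
    if a != b:
        raise ValueError("Dimensions %s and %s are not compatible" % (a, b))
    return a


def merge_shapes(s, t):
    """Merge two shapes given as lists of sizes; None means unknown rank."""
    if s is None:
        return t
    if t is None:
        return s
    if len(s) != len(t):
        raise ValueError("Shapes %s and %s have different ranks" % (s, t))
    return [merge_dims(a, b) for a, b in zip(s, t)]


# Start from a completely unknown shape and fold in what is known about
# each input, in the same spirit as the shape function shown earlier.
input_shapes = [[None, 3], [2, None], [2, 3]]
output_shape = None
for shape in input_shapes:
    output_shape = merge_shapes(output_shape, shape)
print(output_shape)  # [2, 3]
\end{verbatim}

Each input contributes only the dimensions it knows about, and two inputs that disagree on a known dimension raise an error, which is the check a shape function wants to perform before the graph runs.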
\begin{quote}
Original: \href{http://www.tensorflow.org/how_tos/adding_an_op/index.html\#adding-a-new-op}{Adding a New Op} Translation: \href{https://github.com/PFZheng}{@doc001} Proofreading: \href{https://github.com/ZHNathanielLee}{@ZHNathanielLee}
\end{quote}

================================================
FILE: tex_pdf/how_tos/c3s08_new_data_formats.tex
================================================
\section{Custom Data Readers }\label{ux81eaux5b9aux4e49ux6570ux636eux8bfbux53d6}

Prerequisites:

\begin{itemize}
\tightlist
\item
  Some familiarity with C++.
\item
  You must have \href{tensorflow-zh/SOURCE/get_started/os_setup.md\#source}{downloaded the TensorFlow source} and be able to build it.
\end{itemize}

We divide the task of supporting a file format into two pieces:

\begin{itemize}
\tightlist
\item
  File formats: we use a \emph{Reader} Op to read a \emph{record} (which can be any string) from a file.
\item
  Record formats: we use decoder or parsing Ops to turn a string record into tensors usable by TensorFlow.
\end{itemize}

For example, to read a \href{https://en.wikipedia.org/wiki/Comma-separated_values}{CSV file}, we use \href{tensorflow-zh/SOURCE/api_docs/python/io_ops.md\#TextLineReader}{a reader for text files} followed by \href{tensorflow-zh/SOURCE/api_docs/python/io_ops.md\#decode_csv}{an Op that parses CSV data from a line of text}.

\subsection{Contents}\label{ux4e3bux8981ux5185ux5bb9}

\subsubsection{\texorpdfstring{\protect\hyperlink{AUTOGENERATED-custom-data-readers}{Custom Data Readers}}{Custom Data Readers}}\label{ux81eaux5b9aux4e49ux6570ux636eux8bfbux53d6-1}

\begin{itemize}
\tightlist
\item
  \protect\hyperlink{AUTOGENERATED-writing-a-reader-for-a-file-format}{Writing a reader for a file format}
\item
  \protect\hyperlink{AUTOGENERATED-writing-an-op-for-a-record-format}{Writing an Op for a record format}
\end{itemize}

\subsection{Writing a reader for a file format }\label{ux7f16ux5199ux4e00ux4e2aux6587ux4ef6ux683cux5f0fux8bfbux5199ux5668}

A Reader is something that reads records from a file. Some examples of Reader Ops are already built into TensorFlow:

\begin{itemize}
\tightlist
\item
  \href{tensorflow-zh/SOURCE/api_docs/python/io_ops.md\#TFRecordReader}{tf.TFRecordReader} (\href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/kernels/tf_record_reader_op.cc}{source in kernels/tf\_record\_reader\_op.cc})
\item
  \href{tensorflow-zh/SOURCE/api_docs/python/io_ops.md\#FixedLengthRecordReader}{tf.FixedLengthRecordReader}
  (\href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/kernels/fixed_length_record_reader_op.cc}{source in kernels/fixed\_length\_record\_reader\_op.cc})
\item
  \href{tensorflow-zh/SOURCE/api_docs/python/io_ops.md\#TextLineReader}{tf.TextLineReader} (\href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/kernels/text_line_reader_op.cc}{source in kernels/text\_line\_reader\_op.cc})
\end{itemize}

You can see that these all expose the same interface; the only differences are in their constructors. The most important method is Read. It takes a queue argument, which is where it gets filenames to read from whenever it needs one (for example, when the Read Op first runs, or when the previous Read reads the last record from a file). It produces two scalar tensors: a string key and a string value.

To create a new reader called SomeReader, you will need to:

\begin{enumerate}
\def\labelenumi{\arabic{enumi}.}
\tightlist
\item
  In C++, define a subclass of \href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/kernels/reader_base.h}{tensorflow::ReaderBase} called ``SomeReader''.
\item
  In C++, register a new reader Op and kernel with the name ``SomeReader''.
\item
  In Python, define a subclass of \href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/python/ops/io_ops.py}{tf.ReaderBase} called ``SomeReader''.
\end{enumerate}

You can put all the C++ code in a file in \texttt{tensorflow/core/user\_ops/some\_reader\_op.cc}.
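The code that reads a file lives in a descendant of the C++ ReaderBase class, which is defined in \href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/kernels/reader_base.h}{tensorflow/core/kernels/reader\_base.h}. You will need to implement the following methods:

\begin{itemize}
\tightlist
\item
  OnWorkStartedLocked: open the next file
\item
  ReadLocked: read a record or report EOF/error
\item
  OnWorkFinishedLocked: close the current file
\item
  ResetLocked: get a clean slate after, for example, an error
\end{itemize}

These methods have names ending in ``Locked'' because ReaderBase makes sure to acquire a mutex before calling any of them, so you generally do not have to worry about thread safety (though the mutex only protects the members of the class, not global state).

For OnWorkStartedLocked, the name of the file to open is the value returned by the \texttt{current\_work()} method. ReadLocked has this signature:

\begin{verbatim}
Status ReadLocked(string* key, string* value, bool* produced, bool* at_end)
\end{verbatim}

If ReadLocked successfully reads a record from the file, it should fill in:

\begin{itemize}
\tightlist
\item
  *key: an identifier for the record that a human could use to find it again. You can include the filename from current\_work() and append a record number or similar.
\item
  *value: the contents of the record.
\item
  *produced: set to true.
\end{itemize}

If you hit the end of the file (EOF), set *at\_end to true. In either case, return Status::OK(). If there is an error, simply return it using one of the helper functions from \href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/lib/core/errors.h}{tensorflow/core/lib/core/errors.h} without modifying any arguments.

Next you will create the actual Reader Op. It will help if you are familiar with \href{tensorflow-zh/SOURCE/how_tos/adding_an_op/index.md}{adding an Op}. The main steps are:

\begin{itemize}
\tightlist
\item
  Registering the Op.
\item
  Defining and registering an OpKernel.
\end{itemize}

To register the Op, use a REGISTER\_OP call, defined in \href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/op.h}{tensorflow/core/framework/op.h}. Reader Ops never take any input and always have a single output of type Ref(string). They should always call SetIsStateful(), and have string container and shared\_name attributes.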
You may optionally define additional attributes for configuration, or include documentation in a Doc. For examples, see \href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/ops/io_ops.cc}{tensorflow/core/ops/io\_ops.cc}, such as:

\begin{verbatim}
#include "tensorflow/core/framework/op.h"

REGISTER_OP("TextLineReader")
    .Output("reader_handle: Ref(string)")
    .Attr("skip_header_lines: int = 0")
    .Attr("container: string = ''")
    .Attr("shared_name: string = ''")
    .SetIsStateful()
    .Doc(R"doc(
A Reader that outputs the lines of a file delimited by '\n'.
)doc");
\end{verbatim}

To define an OpKernel, Readers can use the shortcut of descending from ReaderOpKernel, defined in \href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/framework/reader_op_kernel.h}{tensorflow/core/framework/reader\_op\_kernel.h}, and implement a constructor that calls SetReaderFactory. After defining your class, you need to register it using REGISTER\_KERNEL\_BUILDER(\ldots{}). An example with no attributes:

\begin{verbatim}
#include "tensorflow/core/framework/reader_op_kernel.h"

class TFRecordReaderOp : public ReaderOpKernel {
 public:
  explicit TFRecordReaderOp(OpKernelConstruction* context)
      : ReaderOpKernel(context) {
    Env* env = context->env();
    SetReaderFactory([this, env]() { return new TFRecordReader(name(), env); });
  }
};

REGISTER_KERNEL_BUILDER(Name("TFRecordReader").Device(DEVICE_CPU),
                        TFRecordReaderOp);
\end{verbatim}

An example with attributes:

\begin{verbatim}
#include "tensorflow/core/framework/reader_op_kernel.h"

class TextLineReaderOp : public ReaderOpKernel {
 public:
  explicit TextLineReaderOp(OpKernelConstruction* context)
      : ReaderOpKernel(context) {
    int skip_header_lines = -1;
    OP_REQUIRES_OK(context,
                   context->GetAttr("skip_header_lines", &skip_header_lines));
    OP_REQUIRES(context, skip_header_lines >= 0,
                errors::InvalidArgument("skip_header_lines must be >= 0 not ",
                                        skip_header_lines));
    Env* env = context->env();
    SetReaderFactory([this, skip_header_lines, env]() {
      return new TextLineReader(name(), skip_header_lines, env);
    });
  }
};

REGISTER_KERNEL_BUILDER(Name("TextLineReader").Device(DEVICE_CPU),
                        TextLineReaderOp);
\end{verbatim}

The last step is to add the Python wrapper. You will import tensorflow.python.ops.io\_ops in \href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/python/user_ops/user_ops.py}{tensorflow/python/user\_ops/user\_ops.py} and add a descendant of \href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/python/ops/io_ops.py}{io\_ops.ReaderBase}.

\begin{verbatim}
from tensorflow.python.framework import ops
from tensorflow.python.ops import common_shapes
from tensorflow.python.ops import io_ops

class SomeReader(io_ops.ReaderBase):

    def __init__(self, name=None):
        rr = gen_user_ops.some_reader(name=name)
        super(SomeReader, self).__init__(rr)


ops.NoGradient("SomeReader")
ops.RegisterShape("SomeReader")(common_shapes.scalar_shape)
\end{verbatim}

You can see some examples in \href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/python/ops/io_ops.py}{tensorflow/python/ops/io\_ops.py}.

\subsection{Writing an Op for a record format }\label{ux7f16ux5199ux4e00ux4e2aux8bb0ux5f55ux683cux5f0fop}

Generally this is an ordinary Op that takes a scalar string record as input, so follow \href{tensorflow-zh/SOURCE/how_tos/adding_an_op/index.md}{the instructions for adding an Op}. You may optionally take a scalar string key as input as well, and include it in error messages that report improperly formatted data, so that users can more easily track down where the bad data came from.

Examples of Ops useful for decoding records:

\begin{itemize}
\tightlist
\item
  \href{tensorflow-zh/SOURCE/api_docs/python/io_ops.md\#parse_single_example}{tf.parse\_single\_example} (and \href{tensorflow-zh/SOURCE/api_docs/python/io_ops.md\#parse_example}{tf.parse\_example})
\item
  \href{tensorflow-zh/SOURCE/api_docs/python/io_ops.md\#decode_csv}{tf.decode\_csv}
\item
  \href{tensorflow-zh/SOURCE/api_docs/python/io_ops.md\#decode_raw}{tf.decode\_raw}
\end{itemize}

Note that it can be useful to use multiple Ops to decode a particular record format. For example, you may have an image saved as a string in a \href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/core/example/example.proto}{tf.train.Example protocol buffer}. Depending on the format of that image, you might take the corresponding output from a \href{tensorflow-zh/SOURCE/api_docs/python/io_ops.md\#parse_single_example}{tf.parse\_single\_example} Op and call \href{tensorflow-zh/SOURCE/api_docs/python/image.md\#decode_jpeg}{tf.decode\_jpeg}, \href{tensorflow-zh/SOURCE/api_docs/python/image.md\#decode_png}{tf.decode\_png}, or
\href{tensorflow-zh/SOURCE/api_docs/python/io_ops.md\#decode_raw}{tf.decode\_raw}. It is common to take the output of tf.decode\_raw and use \href{tensorflow-zh/SOURCE/api_docs/python/array_ops.md\#slice}{tf.slice} and \href{tensorflow-zh/SOURCE/api_docs/python/array_ops.md\#reshape}{tf.reshape} to extract pieces of the data.

\begin{quote}
Original: \href{http://tensorflow.org/how_tos/new_data_formats/index.html\#custom-data-readers}{Custom Data Readers} Translation: \href{https://github.com/derekshang}{@derekshang} Proofreading: \href{https://github.com/jikexueyuanwiki}{Wiki}
\end{quote}

================================================
FILE: tex_pdf/how_tos/c3s09_using_gpu.tex
================================================
\section{Using GPUs }\label{ux4f7fux7528-gpus}

\subsection{Supported devices }\label{ux652fux6301ux7684ux8bbeux5907}

On a typical system there are multiple computing devices. In TensorFlow, the supported device types are CPU and GPU. They are represented as strings. For example:

\begin{itemize}
\tightlist
\item
  \texttt{"/cpu:0"}: The CPU of your machine.
\item
  \texttt{"/gpu:0"}: The GPU of your machine, if you have one.
\item
  \texttt{"/gpu:1"}: The second GPU of your machine, and so on.
\end{itemize}

If a TensorFlow operation has both CPU and GPU implementations, the GPU device is given priority when the operation is assigned to a device. For example, \texttt{matmul} has both CPU and GPU kernels. On a system with devices \texttt{cpu:0} and \texttt{gpu:0}, \texttt{gpu:0} will be selected to run \texttt{matmul}.

\subsection{Logging device placement }\label{ux8bb0ux5f55ux8bbeux5907ux6307ux6d3eux60c5ux51b5}

To find out which devices your operations and tensors are assigned to, create the \texttt{session} with the \texttt{log\_device\_placement} configuration option set to \texttt{True}.
\begin{Shaded}
\begin{Highlighting}[]
\ImportTok{import} \NormalTok{tensorflow }\ImportTok{as} \NormalTok{tf}

\CommentTok{# Creates a graph.}
\NormalTok{a }\OperatorTok{=} \NormalTok{tf.constant([}\FloatTok{1.0}\NormalTok{, }\FloatTok{2.0}\NormalTok{, }\FloatTok{3.0}\NormalTok{, }\FloatTok{4.0}\NormalTok{, }\FloatTok{5.0}\NormalTok{, }\FloatTok{6.0}\NormalTok{], shape}\OperatorTok{=}\NormalTok{[}\DecValTok{2}\NormalTok{, }\DecValTok{3}\NormalTok{], name}\OperatorTok{=}\StringTok{'a'}\NormalTok{)}
\NormalTok{b }\OperatorTok{=} \NormalTok{tf.constant([}\FloatTok{1.0}\NormalTok{, }\FloatTok{2.0}\NormalTok{, }\FloatTok{3.0}\NormalTok{, }\FloatTok{4.0}\NormalTok{, }\FloatTok{5.0}\NormalTok{, }\FloatTok{6.0}\NormalTok{], shape}\OperatorTok{=}\NormalTok{[}\DecValTok{3}\NormalTok{, }\DecValTok{2}\NormalTok{], name}\OperatorTok{=}\StringTok{'b'}\NormalTok{)}
\NormalTok{c }\OperatorTok{=} \NormalTok{tf.matmul(a, b)}
\CommentTok{# Creates a session with log_device_placement set to True.}
\NormalTok{sess }\OperatorTok{=} \NormalTok{tf.Session(config}\OperatorTok{=}\NormalTok{tf.ConfigProto(log_device_placement}\OperatorTok{=}\VariableTok{True}\NormalTok{))}
\CommentTok{# Runs the op.}
\BuiltInTok{print}\NormalTok{(sess.run(c))}
\end{Highlighting}
\end{Shaded}

You should see the following output:

\begin{verbatim}
Device mapping:
/job:localhost/replica:0/task:0/gpu:0 -> device: 0, name: Tesla K40c, pci bus
id: 0000:05:00.0
b: /job:localhost/replica:0/task:0/gpu:0
a: /job:localhost/replica:0/task:0/gpu:0
MatMul: /job:localhost/replica:0/task:0/gpu:0
[[ 22.  28.]
 [ 49.  64.]]
\end{verbatim}

\subsection{Manual device placement }\label{ux624bux5de5ux6307ux6d3eux8bbeux5907}

If you would like a particular operation to run on a device of your choice instead of the one selected automatically, you can use \texttt{with\ tf.device} to create a device context; all operations within that context are assigned to the same device.
\begin{Shaded} \begin{Highlighting}[] \CommentTok{# 新建一个graph.} \ControlFlowTok{with} \NormalTok{tf.device(}\StringTok{'/cpu:0'}\NormalTok{):} \NormalTok{a }\OperatorTok{=} \NormalTok{tf.constant([}\FloatTok{1.0}\NormalTok{, }\FloatTok{2.0}\NormalTok{, }\FloatTok{3.0}\NormalTok{, }\FloatTok{4.0}\NormalTok{, }\FloatTok{5.0}\NormalTok{, }\FloatTok{6.0}\NormalTok{], shape}\OperatorTok{=}\NormalTok{[}\DecValTok{2}\NormalTok{, }\DecValTok{3}\NormalTok{], name}\OperatorTok{=}\StringTok{'a'}\NormalTok{)} \NormalTok{b }\OperatorTok{=} \NormalTok{tf.constant([}\FloatTok{1.0}\NormalTok{, }\FloatTok{2.0}\NormalTok{, }\FloatTok{3.0}\NormalTok{, }\FloatTok{4.0}\NormalTok{, }\FloatTok{5.0}\NormalTok{, }\FloatTok{6.0}\NormalTok{], shape}\OperatorTok{=}\NormalTok{[}\DecValTok{3}\NormalTok{, }\DecValTok{2}\NormalTok{], name}\OperatorTok{=}\StringTok{'b'}\NormalTok{)} \NormalTok{c }\OperatorTok{=} \NormalTok{tf.matmul(a, b)} \CommentTok{# 新建session with log_device_placement并设置为True.} \NormalTok{sess }\OperatorTok{=} \NormalTok{tf.Session(config}\OperatorTok{=}\NormalTok{tf.ConfigProto(log_device_placement}\OperatorTok{=}\VariableTok{True}\NormalTok{))} \CommentTok{# 运行这个op.} \BuiltInTok{print} \NormalTok{sess.run(c)} \end{Highlighting} \end{Shaded} 你会发现现在 \texttt{a} 和 \texttt{b} 操作都被指派给了 \texttt{cpu:0}. \begin{verbatim} Device mapping: /job:localhost/replica:0/task:0/gpu:0 -> device: 0, name: Tesla K40c, pci bus id: 0000:05:00.0 b: /job:localhost/replica:0/task:0/cpu:0 a: /job:localhost/replica:0/task:0/cpu:0 MatMul: /job:localhost/replica:0/task:0/gpu:0 [[ 22. 28.] [ 49. 64.]] \end{verbatim} \subsection{在多GPU系统里使用单一GPU}\label{ux5728ux591agpuux7cfbux7edfux91ccux4f7fux7528ux5355ux4e00gpu} 如果你的系统里有多个 GPU, 那么 ID 最小的 GPU 会默认使用. 
如果你想用别的 GPU, 可以用下面的方法显式的声明你的偏好: \begin{Shaded} \begin{Highlighting}[] \CommentTok{# 新建一个 graph.} \ControlFlowTok{with} \NormalTok{tf.device(}\StringTok{'/gpu:2'}\NormalTok{):} \NormalTok{a }\OperatorTok{=} \NormalTok{tf.constant([}\FloatTok{1.0}\NormalTok{, }\FloatTok{2.0}\NormalTok{, }\FloatTok{3.0}\NormalTok{, }\FloatTok{4.0}\NormalTok{, }\FloatTok{5.0}\NormalTok{, }\FloatTok{6.0}\NormalTok{], shape}\OperatorTok{=}\NormalTok{[}\DecValTok{2}\NormalTok{, }\DecValTok{3}\NormalTok{], name}\OperatorTok{=}\StringTok{'a'}\NormalTok{)} \NormalTok{b }\OperatorTok{=} \NormalTok{tf.constant([}\FloatTok{1.0}\NormalTok{, }\FloatTok{2.0}\NormalTok{, }\FloatTok{3.0}\NormalTok{, }\FloatTok{4.0}\NormalTok{, }\FloatTok{5.0}\NormalTok{, }\FloatTok{6.0}\NormalTok{], shape}\OperatorTok{=}\NormalTok{[}\DecValTok{3}\NormalTok{, }\DecValTok{2}\NormalTok{], name}\OperatorTok{=}\StringTok{'b'}\NormalTok{)} \NormalTok{c }\OperatorTok{=} \NormalTok{tf.matmul(a, b)} \CommentTok{# 新建 session with log_device_placement 并设置为 True.} \NormalTok{sess }\OperatorTok{=} \NormalTok{tf.Session(config}\OperatorTok{=}\NormalTok{tf.ConfigProto(log_device_placement}\OperatorTok{=}\VariableTok{True}\NormalTok{))} \CommentTok{# 运行这个 op.} \BuiltInTok{print} \NormalTok{sess.run(c)} \end{Highlighting} \end{Shaded} 如果你指定的设备不存在, 你会收到 \texttt{InvalidArgumentError} 错误提示: \begin{verbatim} InvalidArgumentError: Invalid argument: Cannot assign a device to node 'b': Could not satisfy explicit device specification '/gpu:2' [[Node: b = Const[dtype=DT_FLOAT, value=Tensor, _device="/gpu:2"]()]] \end{verbatim} 为了避免出现你指定的设备不存在这种情况, 你可以在创建的 \texttt{session} 里把参数 \texttt{allow\_soft\_placement} 设置为 \texttt{True}, 这样 tensorFlow 会自动选择一个存在并且支持的设备来运行 operation. 
\begin{Shaded} \begin{Highlighting}[] \CommentTok{# 新建一个 graph.} \ControlFlowTok{with} \NormalTok{tf.device(}\StringTok{'/gpu:2'}\NormalTok{):} \NormalTok{a }\OperatorTok{=} \NormalTok{tf.constant([}\FloatTok{1.0}\NormalTok{, }\FloatTok{2.0}\NormalTok{, }\FloatTok{3.0}\NormalTok{, }\FloatTok{4.0}\NormalTok{, }\FloatTok{5.0}\NormalTok{, }\FloatTok{6.0}\NormalTok{], shape}\OperatorTok{=}\NormalTok{[}\DecValTok{2}\NormalTok{, }\DecValTok{3}\NormalTok{], name}\OperatorTok{=}\StringTok{'a'}\NormalTok{)} \NormalTok{b }\OperatorTok{=} \NormalTok{tf.constant([}\FloatTok{1.0}\NormalTok{, }\FloatTok{2.0}\NormalTok{, }\FloatTok{3.0}\NormalTok{, }\FloatTok{4.0}\NormalTok{, }\FloatTok{5.0}\NormalTok{, }\FloatTok{6.0}\NormalTok{], shape}\OperatorTok{=}\NormalTok{[}\DecValTok{3}\NormalTok{, }\DecValTok{2}\NormalTok{], name}\OperatorTok{=}\StringTok{'b'}\NormalTok{)} \NormalTok{c }\OperatorTok{=} \NormalTok{tf.matmul(a, b)} \CommentTok{# 新建 session with log_device_placement 并设置为 True.} \NormalTok{sess }\OperatorTok{=} \NormalTok{tf.Session(config}\OperatorTok{=}\NormalTok{tf.ConfigProto(} \NormalTok{allow_soft_placement}\OperatorTok{=}\VariableTok{True}\NormalTok{, log_device_placement}\OperatorTok{=}\VariableTok{True}\NormalTok{))} \CommentTok{# 运行这个 op.} \BuiltInTok{print} \NormalTok{sess.run(c)} \end{Highlighting} \end{Shaded} \subsection{使用多个 GPU }\label{ux4f7fux7528ux591aux4e2a-gpu} 如果你想让 TensorFlow 在多个 GPU 上运行, 你可以建立 multi-tower 结构, 在这个结构 里每个 tower 分别被指配给不同的 GPU 运行. 比如: \begin{verbatim} # 新建一个 graph. c = [] for d in ['/gpu:2', '/gpu:3']: with tf.device(d): a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3]) b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2]) c.append(tf.matmul(a, b)) with tf.device('/cpu:0'): sum = tf.add_n(c) # 新建session with log_device_placement并设置为True. sess = tf.Session(config=tf.ConfigProto(log_device_placement=True)) # 运行这个op. 
print sess.run(sum)
\end{verbatim}

You will see the following output:

\begin{verbatim}
Device mapping:
/job:localhost/replica:0/task:0/gpu:0 -> device: 0, name: Tesla K20m, pci bus id: 0000:02:00.0
/job:localhost/replica:0/task:0/gpu:1 -> device: 1, name: Tesla K20m, pci bus id: 0000:03:00.0
/job:localhost/replica:0/task:0/gpu:2 -> device: 2, name: Tesla K20m, pci bus id: 0000:83:00.0
/job:localhost/replica:0/task:0/gpu:3 -> device: 3, name: Tesla K20m, pci bus id: 0000:84:00.0
Const_3: /job:localhost/replica:0/task:0/gpu:3
Const_2: /job:localhost/replica:0/task:0/gpu:3
MatMul_1: /job:localhost/replica:0/task:0/gpu:3
Const_1: /job:localhost/replica:0/task:0/gpu:2
Const: /job:localhost/replica:0/task:0/gpu:2
MatMul: /job:localhost/replica:0/task:0/gpu:2
AddN: /job:localhost/replica:0/task:0/cpu:0
[[  44.   56.]
 [  98.  128.]]
\end{verbatim}

The \href{tensorflow-zh/SOURCE/tutorials/deep_cnn/index.md}{cifar10 tutorial} is a good example of how to train with multiple GPUs.

\begin{quote}
Original: \href{http://tensorflow.org/how_tos/using_gpu/index.md}{using\_gpu}
Translation: \href{https://github.com/lianghyv}{@lianghyv}
Proofreading: \href{https://github.com/jikexueyuanwiki}{Wiki}
\end{quote}

================================================
FILE: tex_pdf/tensorflow_manual_cn.idx
================================================
\indexentry{graph|hyperpage}{27}
\indexentry{session|hyperpage}{27}
\indexentry{tensor|hyperpage}{27}
\indexentry{variable|hyperpage}{27}
\indexentry{MNIST 数据集|hyperpage}{39}
\indexentry{Softmax regression|hyperpage}{42}
\indexentry{梯度下降法|hyperpage}{49}
\indexentry{Softmax regression|hyperpage}{54}
\indexentry{卷积神经网络|hyperpage}{57}

================================================
FILE: tex_pdf/tensorflow_manual_cn.ilg
================================================
This is makeindex, version 2.15 [TeX Live 2015/W32TeX] (kpathsea + Thai support).
Scanning input file tensorflow_manual_cn.idx....done (9 entries accepted, 0 rejected).
Sorting entries....done (33 comparisons).
Generating output file tensorflow_manual_cn.ind....done (30 lines written, 0 warnings). Output written in tensorflow_manual_cn.ind. Transcript written in tensorflow_manual_cn.ilg. ================================================ FILE: tex_pdf/tensorflow_manual_cn.ind ================================================ \begin{theindex} \item graph, \hyperpage{26} \indexspace \item MNIST 数据集, \hyperpage{37} \indexspace \item session, \hyperpage{26} \item Softmax regression, \hyperpage{40}, \hyperpage{52} \indexspace \item tensor, \hyperpage{26} \indexspace \item variable, \hyperpage{26} \indexspace \item 卷积神经网络, \hyperpage{55} \indexspace \item 梯度下降法, \hyperpage{47} \end{theindex} ================================================ FILE: tex_pdf/tensorflow_manual_cn.tex ================================================ %!TEX program = xelatex % Encoding: UTF8 % SEIKA 2015 %\documentclass[a4paper,11pt,twoside]{book} \documentclass[a4paper,11pt,twoside]{ctexbook} \usepackage{makeidx} \makeindex \usepackage{geometry} \geometry{left=3.5cm, right=3cm, top=3cm, bottom=3cm} %控制页眉页脚页码 \pagestyle{headings} %罗马字符页码 %\pagenumbering{roman} % \usepackage{ctex} % \usepackage{xeCJK} % \CJKsetecglue{} % 禁用汉字与其他内容之间空格(空隙) % 支持西文字体 \usepackage{fourier} \usepackage{courier} % \usepackage{fontspec} \newfontfamily\CodeFont{Consolas} % \newfontfamily\CodeFont{Ubuntu Mono} % \newfontfamily\CodeFont{Menlo} % \newfontfamily\CodeFont{Lucida Console} % \setmonofont{Lucida Console} \usepackage{graphicx} % 支持插入eps图形文件 % \usepackage{epsfig} % 支持代码框插入 \usepackage{xcolor} \definecolor{mygreen}{rgb}{0,0.6,0} \definecolor{mygray}{rgb}{0.5,0.5,0.5} \definecolor{mymauve}{rgb}{0.58,0,0.82} \definecolor{codeback}{rgb}{0.8,0.87,0.95} \definecolor{etc}{rgb}{0.24,0.07,0.18} \definecolor{etc}{rgb}{0.54,0.37,0.48} \usepackage{amsmath} % 支持超链接 \usepackage[colorlinks]{hyperref} \usepackage{listings} \lstset{ % backgroundcolor=\color{codeback}, % choose the background color; you must add \usepackage{color} or 
\usepackage{xcolor} basicstyle=\linespread{0.95}\footnotesize\CodeFont, % the size of the fonts that are used for the code breakatwhitespace=false, % sets if automatic breaks should only happen at whitespace breaklines=true, % sets automatic line breaking captionpos=bl, % sets the caption-position to bottom commentstyle=\color{mygreen}, % comment style deletekeywords={...}, % if you want to delete keywords from the given language escapeinside={\%*}{*)}, % if you want to add LaTeX within your code extendedchars=true, % lets you use non-ASCII characters; for 8-bits encodings only, does not work with UTF-8 frame=single, % adds a frame around the code frameround=tttt, keepspaces=true, % keeps spaces in text, useful for keeping indentation of code (possibly needs columns=flexible) keywordstyle=\color{blue}, % keyword style language=Python, % the language of the code morekeywords={*,...}, % if you want to add more keywords to the set numbers=left, % where to put the line-numbers; possible values are (none, left, right) numbersep=4pt, % how far the line-numbers are from the code numberstyle=\tiny\CodeFont\color{mygray}, % the style that is used for the line-numbers rulecolor=\color{mygray}, % if not set, the frame-color may be changed on line-breaks within not-black text (e.g. comments (green here)) showspaces=false, % show spaces everywhere adding particular underscores; it overrides 'showstringspaces' showstringspaces=true, % underline spaces within strings only showtabs=true, % show tabs within strings adding particular underscores stepnumber=1, % the step between two line-numbers. 
If it's 1, each line will be numbered stringstyle=\color{orange}, % string literal style tabsize=2, % sets default tabsize to 2 spaces %title=myPython.py % show the filename of files included with \lstinputlisting; also try caption instead of title xleftmargin = 2em, xrightmargin = 2em, aboveskip = 0.5 em } % \setCJKmainfont[BoldFont={SimSun},ItalicFont={KaiTi}] %{SimSun} %%%%%%%%%%%% \title{TensorFlow 指南} \author{} \date{\today} % \thanks{} \begin{document} \maketitle \tableofcontents %%%% 第一章 \newpage \chapter{起步} % \section{Introduction} \include{get_started/c1s01_introduction} \include{get_started/c1s02_os_setup} \include{get_started/c1s03_basic_usage} %%%% 第二章 \newpage \chapter{基础教程} \include{tutorials/c2s00_overview} \include{tutorials/c2s01_minist_beginners} \include{tutorials/c2s02_minist_pros} \include{tutorials/c2s03_} \include{tutorials/c2s04_} \include{tutorials/c2s05_word2vec} \include{tutorials/c2s06_recurrent} \include{tutorials/c2s07_seq2seq} \include{tutorials/c2s08_mandelbrot} \include{tutorials/c2s09_pdes} \include{tutorials/c2s10_mnist_download} \newpage % Chapter 3 How to... 
% 第三章 运作方式 \chapter{运作方式} \include{how_tos/c3s00_overview} \include{how_tos/c3s01_variables} \include{how_tos/c3s02_variable_scope} \include{how_tos/c3s03_viz_learning} \include{how_tos/c3s04_graph_viz} \include{how_tos/c3s05_reading_data} \include{how_tos/c3s06_threading_and_queues} \include{how_tos/c3s07_adding_an_op} \include{how_tos/c3s08_new_data_formats} \include{how_tos/c3s09_using_gpu} 🍁 % English parts \newpage % Chapter 4 API (Python) \chapter{Python API} \include{api/c4s00} \include{api/python/c4s01_framework} \include{api/python/c4s02_constant_op} \include{api/python/c4s03_state_ops} \include{api/python/c4s04_array_ops} \include{api/python/c4s05_math_ops} \include{api/python/c4s06_control_flow_ops} \include{api/python/c4s07_image} \include{api/python/c4s08_sparse_ops} \include{api/python/c4s09_io_ops} \include{api/python/c4s10_python_io} \include{api/python/c4s11_nn} \include{api/python/c4s12_client} \include{api/python/c4s13_train} \newpage % Chapter 5 API (C++) \chapter{C++ API} \newpage \chapter{资源} \newpage \chapter{其他} \printindex \addcontentsline{toc}{chapter}{索引} \end{document} ================================================ FILE: tex_pdf/tutorials/c2s00_overview.tex ================================================ %!TEX program = xelatex % Encoding: UTF8 % SEIKA 2015 % Chapter 2 TutorialsHow to ... % Section 2.1 % \section{综述} \textbf{综述} \hyperref[MINIST_beginner]{\textcolor{etc}{\textbf{MNIST For ML Beginners}} || \textbf{MNIST 机器学习入门}} \textcolor{etc}{If you're new to machine learning, we recommend starting here. You'll learn about a classic problem, handwritten digit classification (MNIST), and get a gentle introduction to multiclass classification.} 如果你是机器学习领域的新手, 我们推荐你从本文开始阅读. 本文通过讲述一个经典的问题, 手写数字识别 (MNIST), 让你对多类分类 (multiclass classification) 问题有直观的了解. 
\textcolor{etc}{\textbf{Deep MNIST for Experts}} || \textbf{深入MNIST}

\textcolor{etc}{If you're already familiar with other deep learning software packages, and are already familiar with MNIST, this tutorial will give you a very brief primer on TensorFlow.}

如果你已经对其它深度学习软件比较熟悉, 并且也对 MNIST 很熟悉, 这篇教程能够引导你对 TensorFlow 有初步了解.

\hyperref[MINIST_pros]{View Tutorial | 阅读该教程}

\textcolor{etc}{\textbf{TensorFlow Mechanics 101}} || \textbf{TensorFlow 运作方式入门}

\textcolor{etc}{This is a technical tutorial, where we walk you through the details of using TensorFlow infrastructure to train models at scale. We again use MNIST as the example.}

这是一篇技术教程, 详细介绍了如何使用 TensorFlow 架构训练大规模模型. 本文继续使用 MNIST 作为例子.

\hyperref[tf_mech101]{View Tutorial | 阅读该教程}

\textbf{Convolutional Neural Networks}

An introduction to convolutional neural networks using the CIFAR-10 data set. Convolutional neural nets are particularly tailored to images, since they exploit translation invariance to yield more compact and effective representations of visual content.

这篇文章介绍了如何使用 TensorFlow 在 CIFAR-10 数据集上训练卷积神经网络. 卷积神经网络是为图像识别量身定做的一个模型. 相比其它模型, 该模型利用了平移不变性 (translation invariance), 从而能够更简洁有效地表示视觉内容.

View Tutorial

\textbf{Vector Representations of Words}

This tutorial motivates why it is useful to learn to represent words as vectors (called word embeddings). It introduces the word2vec model as an efficient method for learning embeddings. It also covers the high-level details behind noise-contrastive training methods (the biggest recent advance in training embeddings).

本文让你了解为什么学会使用向量来表示单词, 即词嵌入 (word embedding), 是一件很有用的事情. 文章中介绍的 word2vec 模型, 是一种高效学习词嵌入的方法. 本文还涉及了噪声对比 (noise-contrastive) 训练方法背后的总体思路, 该方法是近期词嵌入训练领域最大的进展.

View Tutorial

\textbf{Recurrent Neural Networks}

An introduction to RNNs, wherein we train an LSTM network to predict the next word in an English sentence. (A task sometimes called language modeling.)

一篇 RNN 的介绍文章, 文章中训练了一个 LSTM 网络来预测一个英文句子的下一个单词 (该任务有时候被称作语言建模).
View Tutorial

\textbf{Sequence-to-Sequence Models}

A follow-on to the RNN tutorial, where we assemble a sequence-to-sequence model for machine translation. You will learn to build your own English-to-French translator, entirely machine learned, end-to-end.

RNN 教程的后续, 该教程采用序列到序列模型进行机器翻译. 你将学会构建一个完全基于机器学习, 端到端的\emph{英语-法语}翻译器.

View Tutorial

\textbf{Mandelbrot Set}

TensorFlow can be used for computation that has nothing to do with machine learning. Here's a naive implementation of Mandelbrot set visualization.

TensorFlow 可以用于与机器学习完全无关的其它计算领域. 这里给出了一个朴素的 Mandelbrot 集合可视化实现.

View Tutorial

\textbf{Partial Differential Equations}

As another example of non-machine learning computation, we offer an example of a naive PDE simulation of raindrops landing on a pond.

这是另外一个非机器学习计算的例子, 我们用一个朴素的偏微分方程仿真, 模拟雨滴落在池塘上的过程.

View Tutorial

\textbf{MNIST Data Download}

Details about downloading the MNIST handwritten digits data set. Exciting stuff.

一篇关于下载 MNIST 手写识别数据集的详细教程.

View Tutorial

\textbf{Image Recognition}

How to run object recognition using a convolutional neural network trained on ImageNet Challenge data and label set.

如何使用在 ImageNet 挑战赛数据和标签集上训练好的卷积神经网络进行物体识别.

View Tutorial

We will soon be releasing code for training a state-of-the-art Inception model.

我们也将公布训练最先进的 Inception 模型所用的代码.

\textbf{Deep Dream Visual Hallucinations}

Building on the Inception recognition model, we will release a TensorFlow version of the Deep Dream neural network visual hallucination software.

COMING SOON

================================================
FILE: tex_pdf/tutorials/c2s01_minist_beginners.tex
================================================
%!TEX program = xelatex
% Encoding: UTF8
% SEIKA 2015
% Chapter 2 TutorialsHow to ...
% Section 2.2

\newpage

\section {MNIST机器学习入门}\label{MINIST_beginner}

Ⓔ \textcolor{etc}{This tutorial is intended for readers who are new to both machine learning and TensorFlow.
If you already know what MNIST is, and what softmax (multinomial logistic) regression is, you might prefer this faster paced tutorial. Be sure to install TensorFlow before starting either tutorial.} Ⓒ 本教程的目标读者是对机器学习和TensorFlow都不太了解的新手.如果你已经了解MNIST和softmax回归(softmax regression)的相关知识,你可以阅读这个快速上手教程. Ⓔ \textcolor{etc}{When one learns how to program, there's a tradition that the first thing you do is print "Hello World." Just like programming has Hello World, machine learning has MNIST.} Ⓒ 当我们开始学习编程的时候,第一件事往往是学习打印“Hello World”.就好比编程入门有Hello World,机器学习入门有MNIST.\index{MNIST 数据集} Ⓔ \textcolor{etc}{MNIST is a simple computer vision dataset. It consists of images of handwritten digits like these:} Ⓒ MNIST是一个入门级的计算机视觉数据集,它包含各种手写数字图片: \begin{figure}[htbp] \centering \includegraphics[width=.55\textwidth]{../SOURCE/images/MNIST.png} \caption{} \end{figure} Ⓔ \textcolor{etc}{It also includes labels for each image, telling us which digit it is. For example, the labels for the above images are 5, 0, 4, and 1.} Ⓒ 它也包含每一张图片对应的标签,告诉我们这个是数字几.比如,上面这四张图片的标签分别是5,0,4,1. Ⓔ \textcolor{etc}{In this tutorial, we're going to train a model to look at images and predict what digits they are. Our goal isn't to train a really elaborate model that achieves state-of-the-art performance -- although we'll give you code to do that later! -- but rather to dip a toe into using TensorFlow. As such, we're going to start with a very simple model, called a Softmax Regression.} Ⓒ 在此教程中,我们将训练一个机器学习模型用于预测图片里面的数字.我们的目的不是要设计一个世界一流的复杂模型---尽管我们会在之后给你源代码去实现一流的预测模型---而是要介绍下如何使用TensorFlow.所以,我们这里会从一个很简单的数学模型开始,它叫做Softmax Regression. Ⓔ \textcolor{etc}{The actual code for this tutorial is very short, and all the interesting stuff happens in just three lines. However, it is very important to understand the ideas behind it: both how TensorFlow works and the core machine learning concepts. 
Because of this, we are going to very carefully work through the code.} Ⓒ 对应这个教程的实现代码很短,而且真正有意思的内容只包含在三行代码里面.但是,去理解包含在这些代码里面的设计思想是非常重要的:TensorFlow工作流程和机器学习的基本概念.因此,这个教程会很详细地介绍这些代码的实现原理. \subsection {The MNIST Data | MNIST数据集} Ⓔ \textcolor{etc}{The MNIST data is hosted on Yann LeCun's website. For your convenience, we've included some python code to download and install the data automatically. You can either download the code and import it as below, or simply copy and paste it in.} Ⓒ MNIST数据集的官网是\href{http://yann.lecun.com/exdb/mnist/}{Yann LeCun's website}.在这里,我们提供了一份python源代码用于自动下载和安装这个数据集.你可以下载这段\href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/examples/tutorials/mnist/input_data.py}{代码},然后用下面的代码导入到你的项目里面,也可以直接复制粘贴到你的代码文件里面. %\begin{lstlisting}[language={[ANSI]Python}] \begin{lstlisting} import input_data mnist = input_data.read_data_sets("MNIST_data/", one_hot=True) \end{lstlisting} Ⓔ \textcolor{etc}{The downloaded data is split into three parts, 55,000 data points of training data \lstinline{(mnist.train)}, 10,000 points of test data \lstinline{(mnist.test)}, and 5,000 points of validation data \lstinline{(mnist.validation)}. This split is very important: it's essential in machine learning that we have separate data which we don't learn from so that we can make sure that what we've learned actually generalizes!} Ⓒ 下载下来的数据集可被分为三部分:55000 行训练用点数据集(\lstinline{mnist.train}),10000 行测试数据集(\lstinline{mnist.test}),以及5000行验证数据集(\lstinline{mnist.validation}).这样的切分很重要:在机器学习模型设计时必须有一个单独的测试数据集不用于训练而是用来评估这个模型的性能,从而更加容易把设计的模型推广到其他数据集上(泛化). Ⓔ \textcolor{etc}{As mentioned earlier, every MNIST data point has two parts: an image of a handwritten digit and a corresponding label. We will call the images "xs" and the labels "ys". 
Both the training set and test set contain xs and ys, for example the training images are \lstinline{mnist.train.images} and the train labels are \lstinline{mnist.train.labels}.} Ⓒ 正如前面提到的一样,每一个MNIST数据单元有两部分组成:一张包含手写数字的图片和一个对应的标签.我们把这些图片设为“xs”,把这些标签设为“ys”.训练数据集和测试数据集都包含xs和ys,比如训练数据集的图片是\lstinline{mnist.train.images} ,训练数据集的标签是\lstinline{mnist.train.labels}. Ⓔ \textcolor{etc}{Each image is 28 pixels by 28 pixels. We can interpret this as a big array of numbers:} Ⓒ 每一张图片包含$ 28 \times 28$像素.我们可以用一个数字数组来表示这张图片: \begin{figure}[htbp] \centering \includegraphics[width=.8\textwidth]{../SOURCE/images/MNIST-Matrix.png} \caption{} \end{figure} Ⓔ \textcolor{etc}{We can flatten this array into a vector of $ 28 \times 28 = 784$ numbers. It doesn't matter how we flatten the array, as long as we're consistent between images. From this perspective, the MNIST images are just a bunch of points in a 784-dimensional vector space, with a \href{http://colah.github.io/posts/2014-10-Visualizing-MNIST/}{very rich structure} (warning: computationally intensive visualizations).} Ⓒ 我们把这个数组展开成一个向量,长度是 $ 28 \times 28 = 784$.如何展开这个数组(数字间的顺序)不重要,只要保持各个图片采用相同的方式展开.从这个角度来看,MNIST数据集的图片就是在784维向量空间里面的点, 并且拥有比较\href{http://colah.github.io/posts/2014-10-Visualizing-MNIST/}{复杂的结构} (注意: 此类数据的可视化是计算密集型的). Ⓔ \textcolor{etc}{Flattening the data throws away information about the 2D structure of the image. Isn't that bad? Well, the best computer vision methods do exploit this structure, and we will in later tutorials. But the simple method we will be using here, a softmax regression, won't.} Ⓒ 展平图片的数字数组会丢失图片的二维结构信息.这显然是不理想的,最优秀的计算机视觉方法会挖掘并利用这些结构信息,我们会在后续教程中介绍.但是在这个教程中我们忽略这些结构,所介绍的简单数学模型,softmax回归(softmax regression),不会利用这些结构信息. Ⓔ \textcolor{etc}{The result is that \lstinline{mnist.train.images} is a tensor (an n-dimensional array) with a shape of \lstinline{[55000, 784]}. The first dimension indexes the images and the second dimension indexes the pixels in each image. 
Each entry in the tensor is the pixel intensity between 0 and 1, for a particular pixel in a particular image.} Ⓒ 因此,在MNIST训练数据集中,\lstinline{mnist.train.images}是一个形状为 \lstinline{[55000, 784]} 的张量,第一个维度数字用来索引图片,第二个维度数字用来索引每张图片中的像素点.在此张量里的每一个元素,都表示某张图片里的某个像素的强度值,值介于0和1之间. \begin{figure}[htbp] \centering \includegraphics[width=.65\textwidth]{../SOURCE/images/mnist-train-xs.png} \caption{} \end{figure} Ⓔ \textcolor{etc}{The corresponding labels in MNIST are numbers between 0 and 9, describing which digit a given image is of. For the purposes of this tutorial, we're going to want our labels as "one-hot vectors". A one-hot vector is a vector which is 0 in most dimensions, and 1 in a single dimension. In this case, the $n$th digit will be represented as a vector which is 1 in the $n$th dimensions. For example, 3 would be $[0,0,0,1,0,0,0,0,0,0]$. Consequently, \lstinline{mnist.train.labels} is a \lstinline{[55000, 10]} array of floats.} Ⓒ 相对应的MNIST数据集的标签是介于0到9的数字,用来描述给定图片里表示的数字.为了用于这个教程,我们使标签数据是"one-hot vectors". 一个one-hot向量除了某一位的数字是1以外其余各维度数字都是0.所以在此教程中,数字n将表示成一个只有在第$n$维度(从0开始)数字为1的10维向量.比如,标签3将表示成(\lstinline{[0,0,0,1,0,0,0,0,0,0]}).因此,\lstinline{mnist.train.labels}是一个 \lstinline{[55000, 10]} 的数字矩阵. \begin{figure}[htbp] \centering \includegraphics[width=.7\textwidth]{../SOURCE/images/mnist-train-ys.png} \caption{} \end{figure} Ⓔ \textcolor{etc}{We're now ready to actually make our model!} Ⓒ 现在,我们准备开始真正的建模之旅! \subsection {Softmax回归介绍} Ⓔ We know that every image in MNIST is a digit, whether it's a zero or a nine. We want to be able to look at an image and give probabilities for it being each digit. For example, our model might look at a picture of a nine and be 80\% sure it's a nine, but give a 5\% chance to it being an eight (because of the top loop) and a bit of probability to all the others because it isn't sure. 
Ⓒ 我们知道MNIST数据集的每一张图片都表示一个(0到9的)数字.我们希望模型看到一张图片后,能够给出它属于各个数字的概率.比如,模型看到一张数字"9"的图片,可能判断它是数字"9"的概率为80\%,是数字"8"的概率为5\%(因为8和9都有上半部分的小圆),并把剩余的少量概率分给其他数字(因为模型并不完全确定).\index{Softmax regression}

Ⓔ This is a classic case where a softmax regression is a natural, simple model. If you want to assign probabilities to an object being one of several different things, softmax is the thing to do. Even later on, when we train more sophisticated models, the final step will be a layer of softmax.

Ⓒ 这是使用softmax回归的一个典型案例,它是一个自然而简单的模型.如果要为一个对象属于若干不同类别分别分配概率,softmax正是合适的工具.即使在后文训练更加复杂的模型时,最后一步也往往是一个softmax层.

Ⓔ A softmax regression has two steps: first we add up the evidence of our input being in certain classes, and then we convert that evidence into probabilities.

Ⓒ softmax回归(softmax regression)分两步:首先把输入属于某个类的“证据”相加求和,然后将这个“证据”的和转化为概率.

Ⓔ To tally up the evidence that a given image is in a particular class, we do a weighted sum of the pixel intensities. The weight is negative if that pixel having a high intensity is evidence against the image being in that class, and positive if it is evidence in favor.

Ⓒ 为了累计一张图片属于某个特定类的“证据”,我们对像素强度进行加权求和.如果某个像素强度高是这张图片不属于该类的证据,那么相应的权值为负;相反,如果它是支持这张图片属于该类的证据,那么权值为正.

Ⓔ The following diagram shows the weights one model learned for each of these classes. Red represents negative weights, while blue represents positive weights.

Ⓒ 下面的图片显示了一个模型学习到的图片上每个像素对于特定数字类的权值.红色代表负权值,蓝色代表正权值.

\begin{figure}[htbp]
\centering
\includegraphics[width=.65\textwidth]{../SOURCE/images/softmax-weights.png}
\caption{}
\end{figure}

Ⓔ We also add some extra evidence called a bias. Basically, we want to be able to say that some things are more likely independent of the input.
The result is that the evidence for a class $i$ given an input $x$ is:

Ⓒ 我们也需要引入额外的“证据”,称为偏置量(bias).总的来说,我们希望能够表示某些类别本身就更有可能出现,而与输入无关.因此对于给定的输入图片$x$,它代表数字$i$的证据可以表示为:

\begin{equation}
\text{evidence}_i = \sum_j W_{i,j} x_j + b_i
\end{equation}

Ⓔ where $W_i$ are the weights and $b_i$ is the bias for class $i$, and $j$ is an index for summing over the pixels in our input image $x$. We then convert the evidence tallies into our predicted probabilities $y$ using the "softmax" function:

Ⓒ 其中,$W_i$ 代表权重,$b_i$ 代表第 $i$ 类的偏置量,$j$ 是对输入图片 $x$ 的像素求和时使用的像素索引.然后用softmax函数可以把这些证据转换成概率$y$:

\begin{equation}
y = \text{softmax}(\text{evidence})
\end{equation}

Ⓔ Here softmax is serving as an "activation" or "link" function, shaping the output of our linear function into the form we want -- in this case, a probability distribution over 10 cases. You can think of it as converting tallies of evidence into probabilities of our input being in each class. It's defined as:

Ⓒ 这里的softmax可以看成是一个\emph{激励}(activation)函数或是\emph{链接}(link)函数,把我们定义的线性函数的输出转换成我们想要的格式,也就是关于10个数字类的概率分布.因此,给定一张图片,它对于每一个数字的吻合度可以被softmax函数转换成为一个概率值.softmax函数可以定义为:

\begin{equation}
\text{softmax}(x) = \text{normalize}(\exp(x))
\end{equation}

Ⓔ If you expand that equation out, you get:

Ⓒ 展开等式右边的子式,可以得到:

\begin{equation}
\text{softmax}(x)_i = \frac{\exp(x_i)}{\sum_j \exp(x_j)}
\end{equation}

Ⓔ But it's often more helpful to think of softmax the first way: exponentiating its inputs and then normalizing them. The exponentiation means that one more unit of evidence increases the weight given to any hypothesis multiplicatively. And conversely, having one less unit of evidence means that a hypothesis gets a fraction of its earlier weight. No hypothesis ever has zero or negative weight. Softmax then normalizes these weights, so that they add up to one, forming a valid probability distribution.
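
As a rough illustration of the exponentiate-then-normalize view, the following plain-Python sketch computes a softmax by hand. It is not TensorFlow code: the function name \lstinline{softmax} and the sample evidence values are chosen here only for illustration. Subtracting the maximum before exponentiating is a common numerical-stability step that is assumed here and is not part of the definition above; it cancels in the ratio and leaves the result unchanged.

\begin{lstlisting}
import math

def softmax(evidence):
    """Exponentiate each evidence value, then normalize so the results sum to 1."""
    # Shifting by the maximum cancels in the ratio below, but keeps
    # math.exp() from overflowing when the evidence values are large.
    shift = max(evidence)
    weights = [math.exp(e - shift) for e in evidence]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical evidence for three classes, one unit apart.
probs = softmax([1.0, 2.0, 3.0])
print(probs)
\end{lstlisting}

Because the evidence values are one unit apart, each probability should come out roughly $e \approx 2.718$ times the previous one, which is the multiplicative effect described above.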
(To get more intuition about the softmax function, check out the \href{http://neuralnetworksanddeeplearning.com/chap3.html#softmax}{section} on it in Michael Nielsen's book, complete with an interactive visualization.)

Ⓒ 但是更多的时候,把softmax理解为第一种形式更有帮助:先对输入值求指数,再对结果进行归一化.求指数意味着,每增加一个单位的证据,任何假设(hypothesis)的权重都会成倍增加;反之,减少一个单位的证据,假设的权重就只剩原来的一部分.任何假设的权重都不会是0或者负值.随后softmax对这些权重进行归一化,使它们的总和等于1,从而构成一个有效的概率分布.(想进一步直观地了解softmax函数,可以参考Michael Nielsen的书里面的\href{http://neuralnetworksanddeeplearning.com/chap3.html#softmax}{这个部分},其中有关于softmax的可交互式的可视化解释.)

Ⓔ You can picture our softmax regression as looking something like the following, although with a lot more $x$s. For each output, we compute a weighted sum of the $x$s, add a bias, and then apply softmax.

对于softmax回归模型可以用下面的图解释,对于输入的$x$加权求和,再分别加上一个偏置量,最后再输入到softmax函数中:

\begin{center}
\includegraphics[width=.65\textwidth]{../SOURCE/images/softmax-regression-scalargraph.png}
\end{center}

Ⓔ If we write that out as equations, we get:

如果把它写成一个方程,可以得到:

\begin{center}
\includegraphics[width=.68\textwidth]{../SOURCE/images/softmax-regression-scalarequation.png}
\end{center}

Ⓔ We can "vectorize" this procedure, turning it into a matrix multiplication and vector addition. This is helpful for computational efficiency. (It's also a useful way to think.)

我们也可以用向量表示这个计算过程:用矩阵乘法和向量相加.这有助于提高计算效率(也是一种更有效的思考方式).

\begin{center}
\includegraphics[width=.68\textwidth]{../SOURCE/images/softmax-regression-vectorequation.png}
\end{center}

Ⓔ More compactly, we can just write:

更进一步,可以写成更加紧凑的方式:

\begin{equation}
y = \text{softmax}(Wx + b)
\end{equation}

\subsection {实现回归模型}

Ⓔ To do efficient numerical computing in Python, we typically use libraries like NumPy that do expensive operations such as matrix multiplication outside Python, using highly efficient code implemented in another language. Unfortunately, there can still be a lot of overhead from switching back to Python after every operation.
This overhead is especially bad if you want to run computations on GPUs or in a distributed manner, where there can be a high cost to transferring data. 为了在python中高效的进行数值计算,我们通常会调用(如NumPy)外部函数库,把类似矩阵乘法这样的复杂运算使用其他外部语言实现.不幸的是,从外部计算切换回Python的每一个操作,仍然是一个很大的开销.如果你用GPU来进行外部计算,这样的开销会更大.用分布式的计算方式,也会花费更多的资源用来传输数据. Ⓔ TensorFlow also does its heavy lifting outside python, but it takes things a step further to avoid this overhead. Instead of running a single expensive operation independently from Python, TensorFlow lets us describe a graph of interacting operations that run entirely outside Python. (Approaches like this can be seen in a few machine learning libraries.) TensorFlow也把复杂的计算放在python之外完成,但是为了避免前面说的那些开销,它做了进一步完善.TensorFlow不单独地运行单一的复杂计算,而是让我们可以先用图描述一系列可交互的计算操作,然后全部一起在Python之外运行.(这样类似的运行方式,可以在不少的机器学习库中看到.) Ⓔ To use TensorFlow, we need to import it. 使用TensorFlow之前,首先导入它: \begin{lstlisting} import tensorflow as tf \end{lstlisting} Ⓔ We describe these interacting operations by manipulating symbolic variables. Let's create one: 我们通过操作符号变量来描述这些可交互的操作单元,可以用下面的方式创建一个: \begin{lstlisting} x = tf.placeholder("float", [None, 784]) \end{lstlisting}\\ Ⓔ \lstinline{x} isn't a specific value. It's a \lstinline{placeholder}, a value that we'll input when we ask TensorFlow to run a computation. We want to be able to input any number of MNIST images, each flattened into a 784-dimensional vector. We represent this as a 2-D tensor of floating-point numbers, with a shape \lstinline{[None, 784]}. (Here None means that a dimension can be of any length.)\\ \lstinline{x} 不是一个特定的值,而是一个占位符\lstinline{placeholder},我们在TensorFlow运行计算时输入这个值.我们希望能够输入任意数量的MNIST图像,每一张图展平成784维的向量.我们用2维的浮点数张量来表示这些图,这个张量的形状是 [None,784].(这里的\lstinline{None}表示此张量的第一个维度可以是任何长度的.) Ⓔ We also need the weights and biases for our model. We could imagine treating these like additional inputs, but TensorFlow has an even better way to handle it: \lstinline{Variable}. 
A \lstinline{Variable} is a modifiable tensor that lives in TensorFlow's graph of interacting operations. It can be used and even modified by the computation. For machine learning applications, one generally has the model parameters be \lstinline{Variables}.
我们的模型也需要权重值和偏置量,当然我们可以把它们当做是另外的输入(使用占位符),但TensorFlow有一个更好的方法来表示它们:\lstinline{Variable}. 一个\lstinline{Variable}代表一个可修改的张量,存在在TensorFlow的用于描述交互性操作的图中.它们可以用于计算输入值,也可以在计算中被修改.对于各种机器学习应用,一般都会有模型参数,可以用\lstinline{Variable}表示.
\begin{lstlisting}
W = tf.Variable(tf.zeros([784,10]))
b = tf.Variable(tf.zeros([10]))
\end{lstlisting}

Ⓔ We create these \lstinline{Variables} by giving \lstinline{tf.Variable} the initial value of the \lstinline{Variable}: in this case, we initialize both \lstinline{W} and \lstinline{b} as tensors full of zeros. Since we are going to learn \lstinline{W} and \lstinline{b}, it doesn't matter very much what they initially are.
我们赋予\lstinline{tf.Variable} 不同的初值来创建不同的\lstinline{Variable}:在这里,我们都用全为零的张量来初始化\lstinline{W}和\lstinline{b}.因为我们要学习\lstinline{W}和\lstinline{b}的值,它们的初值可以随意设置.

Ⓔ Notice that \lstinline{W} has a shape of \lstinline{[784, 10]} because we want to multiply the 784-dimensional image vectors by it to produce 10-dimensional vectors of evidence for the different classes. \lstinline{b} has a shape of \lstinline{[10]} so we can add it to the output.
注意,\lstinline{W}的维度是\lstinline{[784,10]},因为我们想要用784维的图片向量乘以它以得到一个10维的证据值向量,每一位对应不同数字类.\lstinline{b}的形状是\lstinline{[10]},所以我们可以直接把它加到输出上面.

Ⓔ We can now implement our model. It only takes one line!
现在,可以实现我们的模型了,只需以下一行代码:
\begin{lstlisting}
y = tf.nn.softmax(tf.matmul(x,W) + b)
\end{lstlisting}

Ⓔ First, we multiply $x$ by $W$ with the expression \lstinline{tf.matmul(x, W)}. This is flipped from when we multiplied them in our equation, where we had $Wx$, as a small trick to deal with \lstinline{x} being a 2D tensor with multiple inputs. We then add \lstinline{b}, and finally apply \lstinline{tf.nn.softmax}.
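As a rough sketch of why the order is flipped (the batch size $n$ is a symbol assumed here for illustration; it does not appear in the code): when $n$ flattened images are stacked as the rows of \lstinline{x}, the shapes work out as
\begin{equation}
\underbrace{y}_{n\times10} = \mathrm{softmax}\bigl(\underbrace{x}_{n\times784}\,\underbrace{W}_{784\times10} + \underbrace{b}_{10}\bigr),
\end{equation}
where \lstinline{b} is added to every row and the softmax is taken over each row separately. For a single image this is the transpose of the column-vector form $y = \mathrm{softmax}(Wx + b)$, in which the weight matrix would have shape $10\times784$ instead.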
首先,我们用\lstinline{tf.matmul(x,W)}表示$x$乘以$W$,对应之前等式里面的$Wx$,这里$x$是一个2维张量拥有多个输入.然后再加上$b$,把和输入到\lstinline{tf.nn.softmax}函数里面.

Ⓔ That's it. It only took us one line to define our model, after a couple short lines of setup. That isn't because TensorFlow is designed to make a softmax regression particularly easy: it's just a very flexible way to describe many kinds of numerical computations, from machine learning models to physics simulations. And once defined, our model can be run on different devices: your computer's CPU, GPUs, and even phones!
至此,我们先用了几行简短的代码来设置变量,然后只用了一行代码来定义我们的模型.TensorFlow不仅仅可以使softmax回归模型计算变得特别简单,它也用这种非常灵活的方式来描述其他各种数值计算,从机器学习模型到物理学模拟仿真模型.一旦被定义好之后,我们的模型就可以在不同的设备上运行:计算机的CPU,GPU,甚至是手机!

\subsection{Training || 训练模型}

Ⓔ In order to train our model, we need to define what it means for the model to be good. Well, actually, in machine learning we typically define what it means for a model to be bad, called the cost or loss, and then try to minimize how bad it is. But the two are equivalent.
为了训练我们的模型,我们首先需要定义一个指标来评估这个模型是好的.其实,在机器学习,我们通常定义指标来表示一个模型是坏的,这个指标称为成本(cost)或损失(loss),然后尽量最小化这个指标.但是,这两种方式是相同的.

Ⓔ One very common, very nice cost function is "cross-entropy." Surprisingly, cross-entropy arises from thinking about information compressing codes in information theory but it winds up being an important idea in lots of areas, from gambling to machine learning. It's defined:\\
一个非常常见的,非常漂亮的成本函数是“交叉熵”(cross-entropy).交叉熵产生于信息论里面的信息压缩编码技术,但是它后来演变成为从赌博到机器学习等其他领域里的重要技术手段.它的定义如下: \\
\begin{equation}
H_{y'}(y) = -\sum_i y'_i \log(y_i)
\end{equation} \\
Ⓔ where $y$ is our predicted probability distribution, and $y'$ is the true distribution (the one-hot vector we'll input). In some rough sense, the cross-entropy is measuring how inefficient our predictions are for describing the truth. Going into more detail about cross-entropy is beyond the scope of this tutorial, but it's well worth \href{http://colah.github.io/posts/2015-09-Visual-Information/}{understanding}.
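As a small worked example (the three-class vectors below are made up for illustration; MNIST has ten classes), take a one-hot truth $y' = (0, 1, 0)$ and a prediction $y = (0.1, 0.7, 0.2)$. Only the term for the true class survives the sum:
\begin{equation}
H_{y'}(y) = -\bigl(0\cdot\log 0.1 + 1\cdot\log 0.7 + 0\cdot\log 0.2\bigr) = -\log 0.7 \approx 0.357
\end{equation}
using the natural logarithm. A worse prediction such as $y = (0.1, 0.2, 0.7)$ gives $-\log 0.2 \approx 1.609$, so the cost grows as less probability is placed on the correct class.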
\\ $y$是我们预测的概率分布,$y'$是实际的分布(我们输入的one-hot vector).比较粗糙的理解是,交叉熵是用来衡量我们的预测用于描述真相的低效性.更详细的关于交叉熵的解释超出本教程的范畴,但是你很有必要好好\href{http://colah.github.io/posts/2015-09-Visual-Information/}{理解}它.

Ⓔ To implement cross-entropy we need to first add a new placeholder to input the correct answers:\\
为了计算交叉熵,我们首先需要添加一个新的占位符用于输入正确值: \\
\begin{lstlisting}
y_ = tf.placeholder("float", [None,10])
\end{lstlisting}

Ⓔ Then we can implement the cross-entropy $-\sum y'\log(y)$:\\
然后我们可以用 $-\sum y'\log(y)$ 计算交叉熵:
\begin{lstlisting}
cross_entropy = -tf.reduce_sum(y_*tf.log(y))
\end{lstlisting}

Ⓔ First, \lstinline{tf.log} computes the logarithm of each element of \lstinline{y}. Next, we multiply each element of \lstinline{y_} with the corresponding element of \lstinline{tf.log(y)}. Finally, \lstinline{tf.reduce_sum} adds all the elements of the tensor.
首先,用 \lstinline{tf.log} 计算\lstinline{y}的每个元素的对数.接下来,我们把\lstinline{y_}的每一个元素和\lstinline{tf.log(y)}的对应元素相乘.最后,用\lstinline{tf.reduce_sum}计算张量的所有元素的总和.

Ⓔ Note that this isn't just the cross-entropy of the truth with a single prediction, but the sum of the cross-entropies for all the images we looked at. In this example, we have 100 images in each batch: how well we are doing on 100 data points is a much better description of how good our model is than a single data point.
值得注意的是,这里的交叉熵不仅仅用来衡量单一的一对预测和真实值,而是所有100幅图片的交叉熵的总和.对于100个数据点的预测表现比单一数据点的表现能更好地描述我们的模型的性能.

Ⓔ Now that we know what we want our model to do, it's very easy to have TensorFlow train it to do so. Because TensorFlow knows the entire graph of your computations, it can automatically use the \href{http://colah.github.io/posts/2015-08-Backprop/}{backpropagation algorithm} to efficiently determine how your variables affect the cost you ask it to minimize. Then it can apply your choice of optimization algorithm to modify the variables and reduce the cost.
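To make ``how your variables affect the cost'' concrete, here is a sketch for this particular model, for a single image written as a row vector $x$ with $y = \mathrm{softmax}(xW + b)$ (the step size $\eta$ is a symbol assumed here; TensorFlow derives the gradients automatically, so none of this has to be written by hand). For softmax followed by cross-entropy the derivatives take a simple form:
\begin{equation}
\frac{\partial H}{\partial b} = y - y', \qquad \frac{\partial H}{\partial W} = x^{\top}(y - y')
\end{equation}
A gradient descent step then moves each variable against its gradient:
\begin{equation}
W \leftarrow W - \eta\,\frac{\partial H}{\partial W}, \qquad b \leftarrow b - \eta\,\frac{\partial H}{\partial b}
\end{equation}
For a batch, the per-image gradients are summed, matching the sum in \lstinline{cross_entropy}.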
现在我们知道我们需要我们的模型做什么啦,用TensorFlow来训练它是非常容易的.因为TensorFlow拥有一张描述你各个计算单元的图,它可以自动地使用\href{http://colah.github.io/posts/2015-08-Backprop/}{反向传播算法(backpropagation algorithm)}来有效地确定你的变量是如何影响你想要最小化的那个成本值的.然后,TensorFlow会用你选择的优化算法来不断地修改变量以降低成本. \begin{lstlisting} train_step = tf.train.GradientDescentOptimizer(0.01).minimize(cross_entropy) \end{lstlisting} Ⓔ In this case, we ask TensorFlow to minimize \lstinline{cross_entropy} using the gradient descent algorithm with a learning rate of $0.01$. Gradient descent is a simple procedure, where TensorFlow simply shifts each variable a little bit in the direction that reduces the cost. But TensorFlow also provides \href{https://www.tensorflow.org/versions/master/api_docs/python/train.html#optimizers}{many other optimization algorithms}: using one is as simple as tweaking one line. 在这里,我们要求TensorFlow用梯度下降算法(gradient descent algorithm)以0.01的学习速率最小化交叉熵.梯度下降算法(gradient descent algorithm)是一个简单的学习过程,TensorFlow只需将每个变量一点点地往使成本不断降低的方向移动.当然TensorFlow也提供了\href{https://www.tensorflow.org/versions/master/api_docs/python/train.html#optimizers}{其他许多优化算法}:只要简单地调整一行代码就可以使用其他的算法.\index{梯度下降法} Ⓔ What TensorFlow actually does here, behind the scenes, is it adds new operations to your graph which implement backpropagation and gradient descent. Then it gives you back a single operation which, when run, will do a step of gradient descent training, slightly tweaking your variables to reduce the cost. TensorFlow在这里实际上所做的是,它会在后台给描述你的计算的那张图里面增加一系列新的计算操作单元用于实现反向传播算法和梯度下降算法.然后,它返回给你的只是一个单一的操作,当运行这个操作时,它用梯度下降算法训练你的模型,微调你的变量,不断减少成本. Ⓔ Now we have our model set up to train. 
One last thing before we launch it: we have to add an operation to initialize the variables we created:
现在,我们已经设置好了我们的模型.在运行计算之前,我们需要添加一个操作来初始化我们创建的变量:
\begin{lstlisting}
init = tf.initialize_all_variables()
\end{lstlisting}

Ⓔ We can now launch the model in a \lstinline{Session}, and run the operation that initializes the variables:\\
现在我们可以在一个 \lstinline{Session} 里面启动我们的模型,并且初始化变量:
\begin{lstlisting}
sess = tf.Session()
sess.run(init)
\end{lstlisting}

Ⓔ Let's train -- we'll run the training step 1000 times!\\
然后开始训练模型,这里我们让模型循环训练1000次!
\begin{lstlisting}
for i in range(1000):
  batch_xs, batch_ys = mnist.train.next_batch(100)
  sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})
\end{lstlisting}

Ⓔ Each step of the loop, we get a "batch" of one hundred random data points from our training set. We run \lstinline{train_step}, feeding in the batch data to replace the placeholders.
该循环的每个步骤中,我们都会随机抓取训练数据中的100个批处理数据点,然后我们用这些数据点作为参数替换之前的占位符来运行\lstinline{train_step}.

Ⓔ Using small batches of random data is called stochastic training -- in this case, stochastic gradient descent. Ideally, we'd like to use all our data for every step of training because that would give us a better sense of what we should be doing, but that's expensive. So, instead, we use a different subset every time. Doing this is cheap and has much of the same benefit.
使用一小部分的随机数据来进行训练被称为随机训练(stochastic training)---在这里更确切的说是随机梯度下降训练.理想情况下,我们希望用我们所有的数据来进行每一步的训练,因为这能给我们更好的训练结果,但显然这需要很大的计算开销.所以,每一次训练我们可以使用不同的数据子集,这样做既可以减少计算开销,又可以最大化地学习到数据集的总体特性.

\subsection{Evaluating Our Model || 评估我们的模型}

Ⓔ How well does our model do?
那么我们的模型性能如何呢?

Ⓔ Well, first let's figure out where we predicted the correct label. \lstinline{tf.argmax} is an extremely useful function which gives you the index of the highest entry in a tensor along some axis. For example, \lstinline{tf.argmax(y,1)} is the label our model thinks is most likely for each input, while \lstinline{tf.argmax(y_,1)} is the correct label.
We can use \lstinline{tf.equal} to check if our prediction matches the truth.
首先让我们找出那些预测正确的标签.\lstinline{tf.argmax()}是一个非常有用的函数,它能给你在一个张量里沿着某条轴的最高条目的索引值.比如,\lstinline{tf.argmax(y,1)}是模型认为每个输入最有可能对应的那些标签,而\lstinline{tf.argmax(y_,1)}代表正确的标签.我们可以用\lstinline{tf.equal} 来检测我们的预测是否与真实标签匹配.
\begin{lstlisting}
correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))
\end{lstlisting}

Ⓔ That gives us a list of booleans. To determine what fraction are correct, we cast to floating point numbers and then take the mean. For example, \lstinline{[True, False, True, True]} would become \lstinline{[1,0,1,1]} which would become $0.75$.
这行代码会给我们一组布尔值.为了确定正确预测项的比例,我们可以把布尔值转换成浮点数,然后取平均值.例如,\lstinline{[True, False, True, True]}会变成\lstinline{[1,0,1,1]},取平均值后得到 $0.75$ .
\begin{lstlisting}
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
\end{lstlisting}

Ⓔ Finally, we ask for our accuracy on our test data.
最后,我们计算所学习到的模型在测试数据集上面的正确率.
\begin{lstlisting}
print sess.run(accuracy, feed_dict={x: mnist.test.images, y_: mnist.test.labels})
\end{lstlisting}

Ⓔ This should be about 91\%.
最终结果值应该大约是91\%.

Ⓔ Is that good? Well, not really. In fact, it's pretty bad. This is because we're using a very simple model. With some small changes, we can get to 97\%. The best models can get to over 99.7\% accuracy! (For more information, have a look at this \href{http://rodrigob.github.io/are_we_there_yet/build/classification_datasets_results.html}{list of results}.)
这个结果好吗?嗯,并不太好.事实上,这个结果是很差的.这是因为我们仅仅使用了一个非常简单的模型.不过,做一些小小的改进,我们就可以得到97\%的正确率.最好的模型甚至可以获得超过99.7\%的准确率!(想了解更多信息,请参考这个\href{http://rodrigob.github.io/are_we_there_yet/build/classification_datasets_results.html}{结果对比列表}.)

Ⓔ What matters is what we learned from this model. Still, if you're feeling a bit down about these results, check out the next tutorial where we do a lot better, and learn how to build more sophisticated models using TensorFlow!
比结果更重要的是,我们从这个模型中学习到的设计思想.不过,如果你仍然对这里的结果有点失望,可以查看下个教程,在那里你将学到如何用TensorFlow构建更加复杂的模型以获得更好的性能! 原文地址:\href{http://tensorflow.org/tutorials/mnist/beginners/index.md}{MNIST For ML Beginners} ================================================ FILE: tex_pdf/tutorials/c2s02_minist_pros.tex ================================================ %!TEX program = xelatex % Encoding: UTF8 % SEIKA 2015 % Chapter 2 Tutorials % Section 2.3 minist_pros \newpage \section {\textcolor{etc}{Deep MNIST for Experts} || 深入MNIST} \label{MINIST_pros} Ⓔ \textcolor{etc}{TensorFlow is a powerful library for doing large-scale numerical computation. One of the tasks at which it excels is implementing and training deep neural networks. In this tutorial we will learn the basic building blocks of a TensorFlow model while constructing a deep convolutional MNIST classifier.} TensorFlow是一个善于大规模数值计算的强大库件。它的一个强项就是训练并实现深度神经网络(deep neural networks)。在本小节中,我们将会学习TensorFlow模型构建的基本方法,并以此构建一个深度卷积MNIST分类器。 Ⓔ \textcolor{etc}{This introduction assumes familiarity with neural networks and the MNIST dataset. If you don't have a background with them, check out the \hyperref[MINIST_beginner]{introduction for beginners}. Be sure to \hyperref[download_install]{install TensorFlow} before starting.} 本教程假设您已经熟悉神经网络和MNIST数据集。如果你尚未了解,请查看\hyperref[MINIST_beginner]{新手指南}。再开始学习前请确保您已\hyperref[download_install]{安装TensorFlow}。 % %% \subsection {Setup | 安装} Ⓔ \textcolor{etc}{Before we create our model, we will first load the MNIST dataset, and start a TensorFlow session.} 在创建模型之前,我们会先加载MNIST数据集,然后启动一个TensorFlow会话。 \subsubsection {Load MNIST Data | 加载MINIST数据} \textcolor{etc}{For your convenience, we've included \href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/examples/tutorials/mnist/input_data.py}{a script} which automatically downloads and imports the MNIST dataset. 
It will create a directory \lstinline{'MNIST_data'} in which to store the data files.} 为了方便起见,我们已经准备了一个\href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/examples/tutorials/mnist/input_data.py}{脚本}来自动下载和导入MNIST数据集。它会自动创建一个\lstinline{'MNIST_data'}的目录来存储数据。 \begin{lstlisting} import input_data mnist = input_data.read_data_sets('MNIST_data', one_hot=True) \end{lstlisting} \textcolor{etc}{Here \lstinline{mnist} is a lightweight class which stores the training, validation, and testing sets as NumPy arrays. It also provides a function for iterating through data minibatches, which we will use below.} 此处的 \lstinline{mnist} 是一个以NumPy数组形式存储训练、验证和测试数据的轻量级类。我们将在之后使用到它提供的一个函数功能,用于迭代按批处理数据。 \subsubsection {Start TensorFlow InteractiveSession | 开始TensorFlow交互会话} Ⓔ \textcolor{etc}{Tensorflow relies on a highly efficient C++ backend to do its computation. The connection to this backend is called a session. The common usage for TensorFlow programs is to first create a graph and then launch it in a session.} Tensorflow基于一个高效的C++后台模块进行运算。与这个后台模块的连接叫做\emph{会话}(session)。TensorFlow编程的常规流程是先创建一个图,然后在session中加载它。 Ⓔ \textcolor{etc}{Here we instead use the convenient InteractiveSession class, which makes TensorFlow more flexible about how you structure your code. It allows you to interleave operations which build a \hyperref[computation_graph]{computation graph} with ones that run the graph. This is particularly convenient when working in interactive contexts like IPython. 
If you are not using an InteractiveSession, then you should build the entire computation graph before starting a session and \hyperref[launching_graph]{launching} the graph.} 这里,我们使用更加方便的\emph{交互会话}(InteractiveSession)类,它可以让您更加灵活地构建代码。交互会话能让你在运行图的时候,插入一些构建计算图的操作。这能给使用交互式文本shell如iPython带来便利。如果你没有使用InteractiveSession的话,你需要在开始session和加载图之前,构建整个计算图。 \begin{lstlisting} import tensorflow as tf sess = tf.InteractiveSession() \end{lstlisting} \subsubsection {Computation Graph | 计算图} Ⓔ \textcolor{etc}{To do efficient numerical computing in Python, we typically use libraries like NumPy that do expensive operations such as matrix multiplication outside Python, using highly efficient code implemented in another language. Unfortunately, there can still be a lot of overhead from switching back to Python every operation. This overhead is especially bad if you want to run computations on GPUs or in a distributed manner, where there can be a high cost to transferring data.} 为了高效地在Python里进行数值计算,我们一般会使用像NumPy这样用其他语言编写的库件,在Python外用其它执行效率高的语言完成这些高运算开销操作(如矩阵运算)。但是,每一步操作依然会需要切换回Python带来很大开销。特别的,这种开销会在GPU运算或是分布式集群运算这类高数据传输需求的运算形式上非常高昂。 Ⓔ \textcolor{etc}{TensorFlow also does its heavy lifting outside Python, but it takes things a step further to avoid this overhead. Instead of running a single expensive operation independently from Python, TensorFlow lets us describe a graph of interacting operations that run entirely outside Python. This approach is similar to that used in Theano or Torch.} TensorFlow将高运算量计算放在Python外进行,同时更进一步设法避免上述的额外运算开销。不同于在Python中独立运行运算开销昂贵的操作,TensorFlow让我们可以独立于Python以外以图的形式描述交互式操作。这与Theano、Torch的做法很相似。 Ⓔ \textcolor{etc}{The role of the Python code is therefore to build this external computation graph, and to dictate which parts of the computation graph should be run. 
See the \hyperref[computation_graph]{Computation Graph} section of \hyperref[basic_usage]{Basic Usage} for more detail.}
因此,这里Python代码的角色是构建其外部将运行的\emph{计算图},并决定计算图的哪一部分将被运行。更多的细节和\hyperref[basic_usage]{基本使用方法}请参阅\hyperref[computation_graph]{计算图}章节。

%
%%
\subsection{Build a Softmax Regression Model || 构建 Softmax 回归模型}

Ⓔ \textcolor{etc}{In this section we will build a softmax regression model with a single linear layer. In the next section, we will extend this to the case of softmax regression with a multilayer convolutional network.}
在这小节里,我们将会构建一个包含单个线性层的 softmax 回归模型。我们将在下一小节把它扩展成多层卷积网络 softmax回归模型。
\index{Softmax regression}
%
\subsubsection{Placeholder | 占位符}

Ⓔ \textcolor{etc}{We start building the computation graph by creating nodes for the input images and target output classes.}
我们先从创建输入图像和输出类别的节点来创建计算图。
\begin{lstlisting}
x = tf.placeholder("float", shape=[None, 784])
y_ = tf.placeholder("float", shape=[None, 10])
\end{lstlisting}

Ⓔ \textcolor{etc}{Here \lstinline{x} and \lstinline{y_} aren't specific values. Rather, they are each a \lstinline{placeholder} --- a value that we'll input when we ask TensorFlow to run a computation.}
这里的\lstinline{x}和\lstinline{y_}并不代表具体值,它们各是一个\emph{占位符}(\lstinline{placeholder}) --- 当TensorFlow运行时需要赋值的变量。

Ⓔ \textcolor{etc}{The input images \lstinline{x} will consist of a 2d tensor of floating point numbers. Here we assign it a \lstinline{shape} of \lstinline{[None, 784]}, where \lstinline{784} is the dimensionality of a single flattened MNIST image, and \lstinline{None} indicates that the first dimension, corresponding to the batch size, can be of any size.
The target output classes \lstinline{y_} will also consist of a 2d tensor, where each row is a one-hot 10-dimensional vector indicating which digit class the corresponding MNIST image belongs to.} 输入图片\lstinline{x}是由浮点数值组成的2维张量(tensor)。这里,我们定义它为\lstinline{[None, 784]}的\lstinline{shape},其中\lstinline{784}是单张展开的MNIST图片的维度数。\lstinline{None}对应\lstinline{shape}的第一个维度,代表了这批输入图像的数量,可能是任意值。目标输出类\lstinline{y_}也是一个2维张量,其中每一行为一个10维向量代表对应MNIST图片的所属数字的类别。 Ⓔ \textcolor{etc}{The shape argument to placeholder is optional, but it allows TensorFlow to automatically catch bugs stemming from inconsistent tensor shapes.} 虽然\lstinline{placeholder}的\lstinline{shape}参数是可选的,但有了它,TensorFlow能够自动捕捉因数据维度不一致导致的错误。 %%%%% % I, Seika, have revised to here % \fcolorbox{gray}{yellow}{test 张量 \emph{矢量} 1234 \lstinline{shape}} %%%%% \subsubsection{Variables | 变量} Ⓔ \textcolor{etc}{We now define the weights \lstinline{W} and biases \lstinline{b} for our model. We could imagine treating these like additional inputs, but TensorFlow has an even better way to handle them: \lstinline{Variable}. A \lstinline{Variable} is a value that lives in TensorFlow's computation graph. It can be used and even modified by the computation. In machine learning applications, one generally has the model parameters be \lstinline{Variables}.} 我们现在为模型定义权重\lstinline{W}和偏置\lstinline{b}。它们可以被视作是额外的输入量,但是TensorFlow有一个更好的方式来处理:\lstinline{Variable}。一个\lstinline{Variable}代表着在TensorFlow计算图中的一个值,它是能在计算过程中被读取和修改的。在机器学习的应用过程中,模型参数一般用\lstinline{Variable}来表示。 \begin{lstlisting} W = tf.Variable(tf.zeros([784,10])) b = tf.Variable(tf.zeros([10])) \end{lstlisting} Ⓔ \textcolor{etc}{We pass the initial value for each parameter in the call to \lstinline{tf.Variable}. In this case, we initialize both \lstinline{W} and \lstinline{b} as tensors full of zeros. 
\lstinline{W} is a $784\times10$ matrix (because we have $784$ input features and $10$ outputs) and \lstinline{b} is a 10-dimensional vector (because we have $10$ classes).} 我们在调用\lstinline{tf.Variable}的时候传入初始值。在这个例子里,我们把\lstinline{W}和\lstinline{b}都初始化为零向量。\lstinline{W}是一个$784\times10$的矩阵(因为我们有784个特征和10个输出值)。\lstinline{b}是一个10维的向量(因为我们有10个分类)。 Ⓔ \textcolor{etc}{Before \lstinline{Variables} can be used within a session, they must be initialized using that session. This step takes the initial values (in this case tensors full of zeros) that have already been specified, and assigns them to each \lstinline{Variable}. This can be done for all \lstinline{Variables} at once.} \lstinline{Variable}需要在\lstinline{session}之前初始化,才能在\lstinline{session}中使用。初始化需要初始值(本例当中是全为零)传入并赋值给每一个\lstinline{Variable}。这个操作可以一次性完成。 \begin{lstlisting} sess.run(tf.initialize_all_variables()) \end{lstlisting} % %% \subsubsection{Predicted Class and Cost Function | 预测分类与损失函数} Ⓔ \textcolor{etc}{We can now implement our regression model. It only takes one line! We multiply the vectorized input images \lstinline{x} by the weight matrix \lstinline{W}, add the bias \lstinline{b}, and compute the softmax probabilities that are assigned to each class.} 现在我们可以实现我们的regression模型了。这只需要一行!我们把图片\lstinline{x}和权重矩阵\lstinline{W}相乘,加上偏置\lstinline{b},然后计算每个分类的softmax概率值。 \begin{lstlisting} y = tf.nn.softmax(tf.matmul(x,W) + b) \end{lstlisting} Ⓔ \textcolor{etc}{The cost function to be minimized during training can be specified just as easily. Our cost function will be the cross-entropy between the target and the model's prediction.} 在训练中最小化损失函数同样很简单。我们这里的损失函数用目标分类和模型预测分类之间的交叉熵。 \begin{lstlisting} cross_entropy = -tf.reduce_sum(y_*tf.log(y)) \end{lstlisting} Ⓔ \textcolor{etc}{Note that \lstinline{tf.reduce_sum} sums across all images in the minibatch, as well as all classes. 
We are computing the cross entropy for the entire minibatch.}
注意,\lstinline{tf.reduce_sum}把\lstinline{minibatch}里的每张图片的交叉熵值都加起来了。我们计算的交叉熵是指整个\lstinline{minibatch}的。

%
%%
\subsection{Train the Model | 训练模型}

Ⓔ \textcolor{etc}{Now that we have defined our model and training cost function, it is straightforward to train using TensorFlow. Because TensorFlow knows the entire computation graph, it can use automatic differentiation to find the gradients of the cost with respect to each of the variables. TensorFlow has a variety of \hyperref[optimizers]{builtin optimization algorithms}. For this example, we will use steepest gradient descent, with a step length of 0.01, to descend the cross entropy.}
我们已经定义好了模型和训练的时候用的损失函数,接下来使用TensorFlow来训练。因为TensorFlow知道整个计算图,它会用自动微分法来找到损失函数对于各个变量的梯度。TensorFlow有大量\hyperref[optimizers]{内置优化算法},这个例子中,我们用最速梯度下降法让交叉熵下降,步长为0.01。
\begin{lstlisting}
train_step = tf.train.GradientDescentOptimizer(0.01).minimize(cross_entropy)
\end{lstlisting}

Ⓔ \textcolor{etc}{What TensorFlow actually did in that single line was to add new operations to the computation graph. These operations included ones to compute gradients, compute parameter update steps, and apply update steps to the parameters.}
这一行代码实际上是用来往计算图上添加一个新操作,其中包括计算梯度,计算每个参数的步长变化,并且计算出新的参数值。

Ⓔ \textcolor{etc}{The returned operation \lstinline{train_step}, when run, will apply the gradient descent updates to the parameters. Training the model can therefore be accomplished by repeatedly running \lstinline{train_step}.}
\lstinline{train_step}这个操作,用梯度下降来更新权值。因此,整个模型的训练可以通过反复地运行\lstinline{train_step}来完成。
\begin{lstlisting}
for i in range(1000):
  batch = mnist.train.next_batch(50)
  train_step.run(feed_dict={x: batch[0], y_: batch[1]})
\end{lstlisting}

Ⓔ \textcolor{etc}{Each training iteration we load 50 training examples. We then run the \lstinline{train_step} operation, using \lstinline{feed_dict} to replace the \lstinline{placeholder} tensors \lstinline{x} and \lstinline{y_} with the training examples.
Note that you can replace any tensor in your computation graph using \lstinline{feed_dict} --- it's not restricted to just \lstinline{placeholders}.}
每一步迭代,我们都会加载50个训练样本,然后执行一次\lstinline{train_step},使用\lstinline{feed_dict},用训练数据替换\lstinline{placeholder}张量\lstinline{x}和\lstinline{y_}。注意,在计算图中,你可以用\lstinline{feed_dict}来替代任何张量,并不仅限于替换\lstinline{placeholder}。

\subsubsection{Evaluate the Model | 评估模型}

Ⓔ \textcolor{etc}{How well did our model do?}
我们的模型效果怎样?

Ⓔ \textcolor{etc}{First we'll figure out where we predicted the correct label. \lstinline{tf.argmax} is an extremely useful function which gives you the index of the highest entry in a tensor along some axis. For example, \lstinline{tf.argmax(y,1)} is the label our model thinks is most likely for each input, while \lstinline{tf.argmax(y_,1)} is the true label. We can use \lstinline{tf.equal} to check if our prediction matches the truth.}
首先,要先知道我们哪些label是预测正确了。\lstinline{tf.argmax}是一个非常有用的函数。它会返回一个张量某个维度中的最大值的索引。例如,\lstinline{tf.argmax(y,1)}表示我们模型对每个输入的最大概率分类的分类值。而 \lstinline{tf.argmax(y_,1)}表示真实分类值。我们可以用\lstinline{tf.equal}来判断我们的预测是否与真实分类一致。
\begin{lstlisting}
correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))
\end{lstlisting}

Ⓔ \textcolor{etc}{That gives us a list of booleans. To determine what fraction are correct, we cast to floating point numbers and then take the mean. For example, \lstinline{[True, False, True, True]} would become \lstinline{[1,0,1,1]} which would become \lstinline{0.75}.}
这里返回一个布尔数组。为了计算我们分类的准确率,我们将布尔值转换为浮点数来代表对、错,然后取平均值。例如:\lstinline{[True, False, True, True]}变为\lstinline{[1,0,1,1]},计算出平均值为\lstinline{0.75}。
\begin{lstlisting}
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
\end{lstlisting}

Ⓔ \textcolor{etc}{Finally, we can evaluate our accuracy on the test data.
This should be about 91\% correct.}
最后,我们可以计算出在测试数据上的准确率,大概是91\%。
\begin{lstlisting}
print accuracy.eval(feed_dict={x: mnist.test.images, y_: mnist.test.labels})
\end{lstlisting}

%
%%
\subsection{Build a Multilayer Convolutional Network | 构建多层卷积网络模型}

Ⓔ \textcolor{etc}{Getting 91\% accuracy on MNIST is bad. It's almost embarrassingly bad. In this section, we'll fix that, jumping from a very simple model to something moderately sophisticated: a small convolutional neural network. This will get us to around 99.2\% accuracy --- not state of the art, but respectable.}
在MNIST上只有91\%正确率,实在太糟糕。在这个小节里,我们用一个稍微复杂的模型:卷积神经网络来改善效果。这会达到大概99.2\%的准确率。虽然不是最高,但是还是比较让人满意。
\index{卷积神经网络}
%
\subsubsection{Weight Initialization | 权重初始化}

Ⓔ \textcolor{etc}{To create this model, we're going to need to create a lot of weights and biases. One should generally initialize weights with a small amount of noise for symmetry breaking, and to prevent 0 gradients. Since we're using ReLU neurons, it is also good practice to initialize them with a slightly positive initial bias to avoid "dead neurons." Instead of doing this repeatedly while we build the model, let's create two handy functions to do it for us.}
在创建模型之前,我们先来创建权重和偏置。一般来说,初始化时应加入轻微噪声,来打破对称性,防止零梯度的问题。因为我们用的是ReLU,所以用稍大于0的值来初始化偏置能够避免节点输出恒为0的问题(dead neurons)。为了不在建立模型的时候反复做初始化操作,我们定义两个函数用于初始化。
\begin{lstlisting}
def weight_variable(shape):
  initial = tf.truncated_normal(shape, stddev=0.1)
  return tf.Variable(initial)

def bias_variable(shape):
  initial = tf.constant(0.1, shape=shape)
  return tf.Variable(initial)
\end{lstlisting}

\subsubsection{Convolution and Pooling | 卷积和池化}

Ⓔ \textcolor{etc}{TensorFlow also gives us a lot of flexibility in convolution and pooling operations. How do we handle the boundaries? What is our stride size? In this example, we're always going to choose the vanilla version. Our convolutions use a stride of one and are zero padded so that the output is the same size as the input.
Our pooling is plain old max pooling over $2\times2$ blocks. To keep our code cleaner, let's also abstract those operations into functions.}
TensorFlow在卷积和池化上有很强的灵活性。我们怎么处理边界?步长应该设多大?在这个实例里,我们会一直使用vanilla版本。我们的卷积使用1步长(stride size),并用0填充边界(zero padding),保证输出和输入是同一个大小。我们的池化用简单传统的$2\times2$大小的模板做max pooling。为了代码更简洁,我们把这部分抽象成函数。
\begin{lstlisting}
def conv2d(x, W):
  return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

def max_pool_2x2(x):
  return tf.nn.max_pool(x, ksize=[1, 2, 2, 1],
                        strides=[1, 2, 2, 1], padding='SAME')
\end{lstlisting}

\subsubsection{First Convolutional Layer | 第一层卷积}

Ⓔ \textcolor{etc}{We can now implement our first layer. It will consist of convolution, followed by max pooling. The convolution will compute 32 features for each $5\times5$ patch. Its weight tensor will have a shape of \lstinline{[5, 5, 1, 32]}. The first two dimensions are the patch size, the next is the number of input channels, and the last is the number of output channels. We will also have a bias vector with a component for each output channel.}
现在我们可以开始实现第一层了。它由一个卷积接一个max pooling完成。卷积在每个$5\times5$的patch中算出32个特征。权重是一个\lstinline{[5, 5, 1, 32]}的张量,前两个维度是patch的大小,接着是输入的通道数目,最后是输出的通道数目。每个输出通道对应偏置向量中的一个分量。
\begin{lstlisting}
W_conv1 = weight_variable([5, 5, 1, 32])
b_conv1 = bias_variable([32])
\end{lstlisting}

Ⓔ \textcolor{etc}{To apply the layer, we first reshape \lstinline{x} to a 4d tensor, with the second and third dimensions corresponding to image width and height, and the final dimension corresponding to the number of color channels.}
为了用这一层,我们把\lstinline{x}变成一个4维张量,第2、3维对应图片的宽高,最后一维代表颜色通道。
\begin{lstlisting}
x_image = tf.reshape(x, [-1,28,28,1])
\end{lstlisting}

Ⓔ \textcolor{etc}{We then convolve \lstinline{x_image} with the weight tensor, add the bias, apply the ReLU function, and finally max pool.}
我们把\lstinline{x_image}和权值张量进行卷积,加上偏置,使用ReLU激活函数,最后max pooling。
\begin{lstlisting}
h_conv1 = tf.nn.relu(conv2d(x_image, W_conv1) + b_conv1)
h_pool1 = max_pool_2x2(h_conv1)
\end{lstlisting}
%
\subsubsection{Second Convolutional Layer | 第二层卷积}

Ⓔ \textcolor{etc}{In order to build a deep network, we stack several layers of this type. The second layer will have 64 features for each $5\times5$ patch.}
为了构建一个更深的网络,我们会把几个类似的层堆叠起来。第二层中,每个$5\times5$的patch会得到64个特征。
\begin{lstlisting}
W_conv2 = weight_variable([5, 5, 32, 64])
b_conv2 = bias_variable([64])

h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
h_pool2 = max_pool_2x2(h_conv2)
\end{lstlisting}
%
\subsubsection{Densely Connected Layer | 密集连接层}

Ⓔ \textcolor{etc}{Now that the image size has been reduced to $7\times7$, we add a fully-connected layer with 1024 neurons to allow processing on the entire image. We reshape the tensor from the pooling layer into a batch of vectors, multiply by a weight matrix, add a bias, and apply a ReLU.}
现在,图片降维到$7\times7$,我们加入一个有1024个神经元的全连接层,用于处理整个图片。我们把池化层输出的张量reshape成一些向量,乘上权重矩阵,加上偏置,使用ReLU激活。
\begin{lstlisting}
W_fc1 = weight_variable([7 * 7 * 64, 1024])
b_fc1 = bias_variable([1024])

h_pool2_flat = tf.reshape(h_pool2, [-1, 7*7*64])
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)
\end{lstlisting}

\textbf{Dropout}

Ⓔ \textcolor{etc}{To reduce overfitting, we will apply dropout before the readout layer. We create a placeholder for the probability that a neuron's output is kept during dropout. This allows us to turn dropout on during training, and turn it off during testing.
TensorFlow's tf.nn.dropout op automatically handles scaling neuron outputs in addition to masking them, so dropout just works without any additional scaling.} 为了减少过拟合,我们在输出层之前加入dropout。我们用一个placeholder来代表一个神经元在dropout中被保留的概率。这样我们可以在训练过程中启用dropout,在测试过程中关闭dropout。 TensorFlow的\lstinline{tf.nn.dropout}操作会自动处理神经元输出值的scale。所以用dropout的时候可以不用考虑scale。 \begin{lstlisting} keep_prob = tf.placeholder("float") h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob) \end{lstlisting} % \subsubsection{Readout Layer | 输出层} Ⓔ \textcolor{etc}{Finally, we add a softmax layer, just like for the one layer softmax regression above.} 最后,我们添加一个softmax层,就像前面的单层softmax regression一样。 \begin{lstlisting} W_fc2 = weight_variable([1024, 10]) b_fc2 = bias_variable([10]) y_conv=tf.nn.softmax(tf.matmul(h_fc1_drop, W_fc2) + b_fc2) \end{lstlisting} % \subsubsection{Train and Evaluate the Model | 训练和评估模型} Ⓔ \textcolor{etc}{How well does this model do? To train and evaluate it we will use code that is nearly identical to that for the simple one layer SoftMax network above. 
The differences are that: we will replace the steepest gradient descent optimizer with the more sophisticated ADAM optimizer; we will include the additional parameter \lstinline{keep_prob} in \lstinline{feed_dict} to control the dropout rate; and we will add logging to every 100th iteration in the training process.}
这次效果又有多好呢?我们用前面几乎一样的代码来测测看。只是我们会用更加复杂的ADAM优化器来代替最速梯度下降优化器,在feed\_dict中加入额外的参数keep\_prob来控制dropout比例。然后每100次迭代输出一次日志。
\begin{lstlisting}
cross_entropy = -tf.reduce_sum(y_*tf.log(y_conv))
train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
correct_prediction = tf.equal(tf.argmax(y_conv,1), tf.argmax(y_,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
sess.run(tf.initialize_all_variables())
for i in range(20000):
  batch = mnist.train.next_batch(50)
  if i%100 == 0:
    train_accuracy = accuracy.eval(feed_dict={
        x:batch[0], y_: batch[1], keep_prob: 1.0})
    print "step %d, training accuracy %g"%(i, train_accuracy)
  train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})

print "test accuracy %g"%accuracy.eval(feed_dict={
    x: mnist.test.images, y_: mnist.test.labels, keep_prob: 1.0})
\end{lstlisting}

Ⓔ \textcolor{etc}{The final test set accuracy after running this code should be approximately 99.2\%.}
以上代码,在最终测试集上的准确率大概是99.2\%。

Ⓔ \textcolor{etc}{We have learned how to quickly and easily build, train, and evaluate a fairly sophisticated deep learning model using TensorFlow.}
目前为止,我们已经学会了用TensorFlow来快速和简易地搭建、训练和评估一个复杂一点儿的深度学习模型。

原文地址:\href{https://www.tensorflow.org/versions/master/tutorials/mnist/pros/index.html#deep-mnist-for-experts}{Deep MNIST for Experts}

================================================
FILE: tex_pdf/tutorials/c2s03_.tex
================================================
%!TEX program = xelatex
% Encoding: UTF8
% SEIKA 2015
% Chapter 2 Tutorials
% Section 2.4 TensorFlow Mechanics 101
\newpage
\section {TensorFlow Mechanics 101}
\label{tf_mech101}

Ⓔ \textcolor{etc}{\textbf{Code}:
\href{https://tensorflow/g3doc/tutorials/mnist/}{tensorflow/examples/tutorials/mnist/}}

Ⓔ \textcolor{etc}{The goal of this tutorial is to show how to use TensorFlow to train and evaluate a simple feed-forward neural network for handwritten digit classification using the (classic) MNIST data set. The intended audience for this tutorial is experienced machine learning users interested in using TensorFlow.}

Ⓔ \textcolor{etc}{These tutorials are not intended for teaching Machine Learning in general.}

Ⓔ \textcolor{etc}{Please ensure you have followed the instructions to \href{https://www.tensorflow.org/versions/master/get_started/os_setup.html}{install TensorFlow}.}

\subsection {Tutorial Files} \label{minist_tf}

This tutorial references the following files:
% add table here

To start training, run the \lstinline{fully_connected_feed.py} file directly:

\lstinline{python fully_connected_feed.py}

\subsection {Prepare the Data}

MNIST is a classic problem in machine learning. The task is to look at greyscale $28\times28$ pixel images of handwritten digits and determine which digit, from 0--9, each image represents.

\begin{figure}[htbp]
\centering
\includegraphics[width=.55\textwidth]{../SOURCE/images/MNIST.png}
\caption{}
\end{figure}

For more information, refer to the MNIST page on Yann LeCun's website or Chris Olah's visualizations of MNIST.

\subsubsection{Download}

At the top of the \lstinline{run_training()} method, the \lstinline{input_data.read_data_sets()} function ensures that the correct data has been downloaded to your local training folder, then unpacks that data and returns a dictionary of \lstinline{DataSet} instances.

\begin{lstlisting}
data_sets = input_data.read_data_sets(FLAGS.train_dir, FLAGS.fake_data)
\end{lstlisting}

\textbf{Note}: the \lstinline{fake_data} flag is used for unit testing and may be safely ignored by the reader.

% Dataset | Purpose
% --- | ---
% `data_sets.train` | 55000 images and labels, for primary training.
% `data_sets.validation` | 5000 images and labels, for iterative validation of training accuracy.
% `data_sets.test` | 10000 images and labels, for final testing of trained accuracy.
Add table
here\footnote{For more information about the data, see the MNIST Data Download tutorial in this series.}

\subsubsection{Inputs and Placeholders}

The \lstinline{placeholder_inputs()} function creates two \lstinline{tf.placeholder} ops that define the shape of the inputs, including the \lstinline{batch_size}, for the rest of the graph. The actual training examples are later fed into these placeholders.

\begin{lstlisting}
images_placeholder = tf.placeholder(tf.float32, shape=(batch_size,
                                                       IMAGE_PIXELS))
labels_placeholder = tf.placeholder(tf.int32, shape=(batch_size))
\end{lstlisting}

Further down, in the training loop, the full image and label datasets are sliced to fit the \lstinline{batch_size} for each step and matched with these placeholder ops. The data is then passed into the \lstinline{sess.run()} function using the \lstinline{feed_dict} parameter.

\subsection {Build the Graph}

After creating placeholders for the data, the graph is built from the \lstinline{mnist.py} file according to a 3-stage pattern: \lstinline{inference()}, \lstinline{loss()}, and \lstinline{training()}.

\begin{enumerate}
\item \lstinline{inference()}: builds the graph as far as is required for running the network forward to make predictions.
\item \lstinline{loss()}: adds to the inference graph the ops required to generate the loss.
\item \lstinline{training()}: adds to the loss graph the ops required to compute and apply gradients.
\end{enumerate}

\begin{figure}[htbp]
\centering
\includegraphics[width=.95\textwidth]{../SOURCE/images/mnist_subgraph.png}
\caption{}
\end{figure}

\subsubsection{Inference}

The \lstinline{inference()} function builds the graph as far as needed to return the tensor that contains the output predictions.

It takes the images placeholder as input and builds on top of it a pair of fully connected layers with ReLu (Rectified Linear Units) activation, followed by a ten-node linear layer that produces the output logits.

Each layer is created beneath a unique \lstinline{tf.name_scope}, which acts as a prefix for the items created within that scope.

\begin{lstlisting}
with tf.name_scope('hidden1') as scope:
\end{lstlisting}

Within the defined scope, the weights and biases used by each of these layers are generated into \lstinline{tf.Variable} instances with their desired shapes.

\begin{lstlisting}
weights = tf.Variable(
    tf.truncated_normal([IMAGE_PIXELS, hidden1_units],
                        stddev=1.0 / math.sqrt(float(IMAGE_PIXELS))),
    name='weights')
biases = tf.Variable(tf.zeros([hidden1_units]),
                     name='biases')
\end{lstlisting}

When, for instance, these are created under the \lstinline{hidden1} scope, the unique name given to the weights variable is "\lstinline{hidden1/weights}".

Each variable is given initializer ops as part of its construction.

In this most common case, the weights are initialized with \lstinline{tf.truncated_normal} and given the shape of a 2-D tensor. The first dimension is the number of units in the layer the weights connect from, and the second dimension is the number of units in the layer the weights connect to. For the first layer, named \lstinline{hidden1}, the dimensions are \lstinline{[IMAGE_PIXELS, hidden1_units]} because the weights connect the image inputs to the \lstinline{hidden1} layer. The \lstinline{tf.truncated_normal} initializer generates a random distribution with a given mean and standard deviation.

The biases are then initialized with \lstinline{tf.zeros} so that they all start at zero. Their shape is simply the number of units in the layer to which they connect.

The graph's three primary ops are two \lstinline{tf.nn.relu} ops wrapping \lstinline{tf.matmul} for the hidden layers, and one extra \lstinline{tf.matmul} for the logits. They are created in turn, with their \lstinline{tf.Variable} instances connected to the input placeholder or to the output tensor of the layer beneath.

\begin{lstlisting}
hidden1 = tf.nn.relu(tf.matmul(images, weights) + biases)
\end{lstlisting}

\begin{lstlisting}
hidden2 = tf.nn.relu(tf.matmul(hidden1, weights) + biases)
\end{lstlisting}

\begin{lstlisting}
logits = tf.matmul(hidden2, weights) + biases
\end{lstlisting}

Finally, the \lstinline{logits} tensor that contains the output is returned.

\subsubsection{Loss}

The \lstinline{loss()} function further builds the graph by adding the required loss ops.

First, the values from the \lstinline{labels_placeholder} are encoded as a tensor of 1-hot values. For example, if the class identifier is ``3'', the value is converted to:

\lstinline{[0, 0, 0, 1, 0, 0, 0, 0, 0, 0]}

\begin{lstlisting}
batch_size = tf.size(labels)
labels = tf.expand_dims(labels, 1)
indices = tf.expand_dims(tf.range(0, batch_size, 1), 1)
concated = tf.concat(1, [indices, labels])
onehot_labels = tf.sparse_to_dense(
    concated, tf.pack([batch_size, NUM_CLASSES]), 1.0, 0.0)
\end{lstlisting}

A \lstinline{tf.nn.softmax_cross_entropy_with_logits} op\footnote{Cross-entropy is a concept from information theory that describes how bad it is to believe the predictions of the neural network, given what is actually true. For more details, see the blog post Visual Information Theory (http://colah.github.io/posts/2015-09-Visual-Information/).} is then added to compare the output logits from the \lstinline{inference()} function with the 1-hot labels.

\begin{lstlisting}
cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits,
                                                        onehot_labels,
                                                        name='xentropy')
\end{lstlisting}

It then uses \lstinline{tf.reduce_mean} to average the cross entropy values across the batch dimension (the first dimension) and takes that average as the total loss.

\begin{lstlisting}
loss = tf.reduce_mean(cross_entropy, name='xentropy_mean')
\end{lstlisting}

Finally, the tensor that contains the loss value is returned.

\subsubsection{Training}

The \lstinline{training()} function adds the ops needed to minimize the loss via gradient descent.

First, it takes the loss tensor from the \lstinline{loss()} function and hands it to \lstinline{tf.scalar_summary}, an op that generates summary values into the events file when used with a \lstinline{SummaryWriter} (see below). In this tutorial, it emits the snapshot value of the loss tensor every time the summaries are written out.

\begin{lstlisting}
tf.scalar_summary(loss.op.name, loss)
\end{lstlisting}

Next, we instantiate a \lstinline{tf.train.GradientDescentOptimizer}, which is responsible for applying gradients with the requested learning rate.

\begin{lstlisting}
optimizer = tf.train.GradientDescentOptimizer(FLAGS.learning_rate)
\end{lstlisting}

We then create a single variable to hold a counter for the global training step, and use the \lstinline{minimize()} op both to update the trainable weights in the system and to increment the global step. By convention, this op is known as the \lstinline{train_op}. It is what a TensorFlow session must run in order to induce one full step of training (see below).

\begin{lstlisting}
global_step = tf.Variable(0, name='global_step', trainable=False)
train_op = optimizer.minimize(loss, global_step=global_step)
\end{lstlisting}

Finally, the tensor containing the outputs of the training op is returned.

\subsection{Train the Model}

Once the graph is built, it can be iteratively trained and evaluated in a loop controlled by the user code in \lstinline{fully_connected_feed.py}.

\subsubsection{The Graph}

At the top of the \lstinline{run_training()} function is a Python \lstinline{with} command indicating that all of the built ops are to be associated with the default global \lstinline{tf.Graph} instance.
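\begin{lstlisting}
with tf.Graph().as_default():
\end{lstlisting}

A \lstinline{tf.Graph} is a collection of ops that may be executed together as a group. Most TensorFlow uses only need to rely on the single default graph.

More complicated uses with multiple graphs are possible, but they are beyond the scope of this tutorial.

\subsubsection{The Session}

Once all of the build preparation has been completed and all of the necessary ops generated, a \lstinline{tf.Session} is created for running the graph.

\begin{lstlisting}
sess = tf.Session()
\end{lstlisting}

Alternatively, a \lstinline{Session} may be generated in a \lstinline{with} block for scoping:

\begin{lstlisting}
with tf.Session() as sess:
\end{lstlisting}

The empty parameter list to \lstinline{Session} indicates that this code will attach to the default local session, creating it if it does not exist yet.

Immediately after creating the session, all of the \lstinline{tf.Variable} instances are initialized by calling \lstinline{sess.run()} on their initialization op.

\begin{lstlisting}
init = tf.initialize_all_variables()
sess.run(init)
\end{lstlisting}

The \lstinline{sess.run()} method runs the complete subset of the graph that corresponds to the op(s) passed as parameters. In this first call, the \lstinline{init} op is a \lstinline{tf.group} that contains only the initializers for the variables. None of the rest of the graph is run here; that happens in the training loop below.

\subsubsection{Train Loop}

After initializing the variables with the session, training may begin.

The user code controls the training per step, and the simplest loop that can do useful training is:

\begin{lstlisting}
for step in xrange(max_steps):
    sess.run(train_op)
\end{lstlisting}

However, the example in this tutorial is slightly more complicated, because it must also slice up the input data for each step to match the previously generated placeholders.

\paragraph{Feed the Graph}

For each step, the code generates a feed dictionary that contains the set of examples on which to train for that step, keyed by the placeholder ops they represent.

In the \lstinline{fill_feed_dict} function, the given \lstinline{DataSet} is queried for its next \lstinline{batch_size} set of images and labels, and tensors matching the placeholders are filled with those images and labels.

\begin{lstlisting}
images_feed, labels_feed = data_set.next_batch(FLAGS.batch_size)
\end{lstlisting}

A Python dictionary object is then generated with the placeholders as keys and the corresponding feed tensors as values.

\begin{lstlisting}
feed_dict = {
    images_placeholder: images_feed,
    labels_placeholder: labels_feed,
}
\end{lstlisting}

This dictionary is passed into the \lstinline{sess.run()} function's \lstinline{feed_dict} parameter to provide the input examples for this step of training.

\paragraph{Check the Status}

The code specifies two values to fetch in its \lstinline{sess.run()} call: \lstinline{[train_op, loss]}.

\begin{lstlisting}
for step in xrange(FLAGS.max_steps):
    feed_dict = fill_feed_dict(data_sets.train,
                               images_placeholder,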
                               labels_placeholder)
    _, loss_value = sess.run([train_op, loss],
                             feed_dict=feed_dict)
\end{lstlisting}

Because there are two values to fetch, \lstinline{sess.run()} returns a tuple with two items. Each \lstinline{Tensor} in the list of values to fetch corresponds to a numpy array in the returned tuple, filled with the value of that tensor during this step of training. Since \lstinline{train_op} produces no output value, the corresponding element in the returned tuple is \lstinline{None} and is discarded. The value of the \lstinline{loss} tensor, however, may become \lstinline{NaN} if the model diverges during training, so we capture this value for logging.

Assuming that the training runs fine without \lstinline{NaN} values, the training loop also prints a simple status line every 100 steps to let the user know the state of training.

\begin{lstlisting}
if step % 100 == 0:
    print 'Step %d: loss = %.2f (%.3f sec)' % (step, loss_value, duration)
\end{lstlisting}

\paragraph{Visualize the Status}

In order to emit the events files used by TensorBoard, all of the summaries (in this case, only one) are collected into a single op during the graph building phase.

\begin{lstlisting}
summary_op = tf.merge_all_summaries()
\end{lstlisting}

After the session is created, a \lstinline{tf.train.SummaryWriter} may be instantiated to write the events files, which contain both the graph itself and the values of the summaries.

\begin{lstlisting}
summary_writer = tf.train.SummaryWriter(FLAGS.train_dir,
                                        graph_def=sess.graph_def)
\end{lstlisting}

Lastly, the events file is updated with new summary values every time the \lstinline{summary_op} is run and its output is passed to the writer's \lstinline{add_summary()} function.

\begin{lstlisting}
summary_str = sess.run(summary_op, feed_dict=feed_dict)
summary_writer.add_summary(summary_str, step)
\end{lstlisting}

Once the events files are written, TensorBoard may be run against the training folder to display the values from the summaries.

\begin{figure}[htbp]
\centering
\includegraphics[width=.85\textwidth]{../SOURCE/images/mnist_tensorboard.png}
\end{figure}

\textbf{Note}: For more information about how to build and run TensorBoard, see the accompanying tutorial \hyperref[vis_learning]{TensorBoard: Visualizing Your Training}.

\paragraph{Save a Checkpoint}

In order to emit a checkpoint file that may later be used to restore a model for further training or evaluation, we instantiate a \lstinline{tf.train.Saver}.

\begin{lstlisting}
saver = tf.train.Saver()
\end{lstlisting}

In the training loop, the \lstinline{saver.save()} method is called periodically to write a checkpoint file to the training directory with the current values of all the trainable variables.

\begin{lstlisting}
saver.save(sess,
           FLAGS.train_dir, global_step=step)
\end{lstlisting}

At some later point, training can be resumed by using the \lstinline{saver.restore()} method to reload the model parameters.

\begin{lstlisting}
saver.restore(sess, FLAGS.train_dir)
\end{lstlisting}

\subsection{Evaluate the Model}

Every thousand steps, the code attempts to evaluate the model against both the training and test datasets. The \lstinline{do_eval} function is called three times, for the training, validation, and test datasets.

\begin{lstlisting}
print 'Training Data Eval:'
do_eval(sess,
        eval_correct,
        images_placeholder,
        labels_placeholder,
        data_sets.train)
print 'Validation Data Eval:'
do_eval(sess,
        eval_correct,
        images_placeholder,
        labels_placeholder,
        data_sets.validation)
print 'Test Data Eval:'
do_eval(sess,
        eval_correct,
        images_placeholder,
        labels_placeholder,
        data_sets.test)
\end{lstlisting}

\textbf{Note}: More complicated usage would usually sequester \lstinline{data_sets.test} so that it is only checked after a significant amount of hyperparameter tuning. Because MNIST is a simple problem, however, we evaluate against all of the data here.

\subsubsection {Build the Eval Graph}

Before opening the default Graph, the test data should have been fetched by calling the \lstinline{get_data(train=False)} function.

\begin{lstlisting}
test_all_images, test_all_labels = get_data(train=False)
\end{lstlisting}

Before entering the training loop, the Eval op should have been built by calling the \lstinline{evaluation} function from \lstinline{mnist.py} with the same logits and labels parameters as the \lstinline{loss} function.

\begin{lstlisting}
eval_correct = mnist.evaluation(logits, labels_placeholder)
\end{lstlisting}

The \lstinline{evaluation} function generates a \lstinline{tf.nn.in_top_k} op, which scores a model output as correct if the true label can be found among the $k$ most likely predictions. Here we set $k$ to 1, so a prediction is considered correct only if it is the true label.

\begin{lstlisting}
eval_correct = tf.nn.in_top_k(logits, labels, 1)
\end{lstlisting}

\subsubsection {Eval Output}

We can then create a loop that fills a \lstinline{feed_dict} and calls \lstinline{sess.run()} against the \lstinline{eval_correct} op to evaluate the model on the given dataset.

\begin{lstlisting}
for step in xrange(steps_per_epoch):
    feed_dict = fill_feed_dict(data_set,
                               images_placeholder,
                               labels_placeholder)
    true_count += sess.run(eval_correct, feed_dict=feed_dict)
\end{lstlisting}

The \lstinline{true_count} variable accumulates all of the predictions that the \lstinline{in_top_k} op has determined to be correct. From there, the precision is calculated by dividing the number of correct predictions by the total number of examples.
\begin{lstlisting}
precision = float(true_count) / float(num_examples)
print ' Num examples: %d Num correct: %d Precision @ 1: %0.02f' % (
    num_examples, true_count, precision)
\end{lstlisting}

Original: \href{http://www.tensorflow.org/tutorials/mnist/tf/index.md}{TensorFlow Mechanics 101} Translation: \href{https://github.com/bingjin}{bingjin} Proofreading: \href{https://github.com/LichAmnesia}{LichAmnesia}

================================================
FILE: tex_pdf/tutorials/c2s04_.tex
================================================
%!TEX program = xelatex
% Encoding: UTF8
% SEIKA 2015
% Chapter 2 Tutorials
% Section 2.5
\newpage
\section {Convolutional Neural Networks} \label{cnn}

\subsection {Overview}

CIFAR-10 classification is a common benchmark problem in machine learning. The problem is to classify RGB 32x32 pixel images across 10 categories: \lstinline{airplane}, \lstinline{automobile}, \lstinline{bird}, \lstinline{cat}, \lstinline{deer}, \lstinline{dog}, \lstinline{frog}, \lstinline{horse}, \lstinline{ship}, and \lstinline{truck}.\footnote{This tutorial is intended for advanced users of TensorFlow and assumes expertise and experience in machine learning}

\begin{figure}[htbp]
\centering
\includegraphics[width=.65\textwidth]{../SOURCE/images/cifar_samples.png}
\caption{}
\end{figure}

For more details, refer to the \href{http://www.cs.toronto.edu/~kriz/cifar.html}{CIFAR-10 page} and a \href{http://www.cs.toronto.edu/~kriz/learning-features-2009-TR.pdf}{Tech Report} by Alex Krizhevsky.

\subsubsection {Goals}

The goal of this tutorial is to build a relatively small convolutional neural network for recognizing images. In the process, this tutorial:

\begin{itemize}
\item Highlights a canonical organization for network architecture, training and evaluation;
\item Provides a template for constructing larger and more sophisticated models.
\end{itemize}

CIFAR-10 was selected because it is complex enough to exercise much of TensorFlow's ability to scale to large models. At the same time, the model is small enough to train fast, which makes it well suited for trying out new ideas and experimenting with new techniques.

\subsubsection {Highlights of the Tutorial}

The CIFAR-10 tutorial demonstrates several important constructs for designing larger and more sophisticated models in TensorFlow:

\begin{itemize}
\item Core mathematical components, including convolution, rectified linear activations, max pooling and local response normalization;
\item
Visualization of network activities during training, including input images, losses, and distributions of activations and gradients;
\item Routines for calculating the moving average of learned parameters and using these averages during evaluation to boost predictive performance;
\item Implementation of a learning rate schedule that decreases over time;
\item Prefetching queues for input data, to isolate the model from disk latency and expensive image pre-processing;
\end{itemize}

We also provide a multi-GPU version of the model, which demonstrates:

\begin{itemize}
\item Configuring a model to train across multiple GPU cards in parallel
\item Sharing and updating variables among multiple GPUs
\end{itemize}

We hope that this tutorial provides a launch point for building larger CNNs for vision tasks on TensorFlow.

\subsubsection {Model Architecture}

The model in this tutorial is a multi-layer architecture consisting of alternating convolutions and nonlinearities. These layers are followed by fully connected layers leading into a softmax classifier. The model follows the architecture described by Alex Krizhevsky, with a few differences in the top few layers.

This model achieves a peak performance of about 86\% accuracy within a few hours of training time on a GPU. See the description below and the code for details. It consists of 1,068,298 learnable parameters and requires about 19.5M multiply-add operations to classify a single image.

\subsection {Code Organization}

The code for this tutorial resides in \href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/models/image/cifar10/}{tensorflow/models/image/cifar10/}.

% insert table here
\begin{longtable}[c]{@{}ll@{}}
\toprule
\begin{minipage}[b]{0.05\columnwidth}\raggedright\strut File \strut\end{minipage} &
\begin{minipage}[b]{0.05\columnwidth}\raggedright\strut Purpose \strut\end{minipage}\tabularnewline
\midrule
\endhead
\begin{minipage}[t]{0.05\columnwidth}\raggedright\strut \href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/models/image/cifar10/cifar10_input.py}{\lstinline{cifar10\_input.py}} \strut\end{minipage} &
\begin{minipage}[t]{0.05\columnwidth}\raggedright\strut Reads the native CIFAR-10 binary file format. \strut\end{minipage}\tabularnewline
\begin{minipage}[t]{0.05\columnwidth}\raggedright\strut \href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/models/image/cifar10/cifar10.py}{\lstinline{cifar10.py}} \strut\end{minipage} &
\begin{minipage}[t]{0.05\columnwidth}\raggedright\strut Builds the CIFAR-10 model. \strut\end{minipage}\tabularnewline
\begin{minipage}[t]{0.05\columnwidth}\raggedright\strut \href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/models/image/cifar10/cifar10_train.py}{\lstinline{cifar10\_train.py}} \strut\end{minipage} &
\begin{minipage}[t]{0.05\columnwidth}\raggedright\strut Trains a CIFAR-10 model on a CPU or GPU. \strut\end{minipage}\tabularnewline
\begin{minipage}[t]{0.05\columnwidth}\raggedright\strut \href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/models/image/cifar10/cifar10_multi_gpu_train.py}{\lstinline{cifar10\_multi\_gpu\_train.py}} \strut\end{minipage} &
\begin{minipage}[t]{0.05\columnwidth}\raggedright\strut Trains a CIFAR-10 model on multiple GPUs. \strut\end{minipage}\tabularnewline
\begin{minipage}[t]{0.05\columnwidth}\raggedright\strut \href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/models/image/cifar10/cifar10_eval.py}{\lstinline{cifar10\_eval.py}} \strut\end{minipage} &
\begin{minipage}[t]{0.05\columnwidth}\raggedright\strut Evaluates the predictive performance of a CIFAR-10 model. \strut\end{minipage}\tabularnewline
\bottomrule
\end{longtable}

\subsection{CIFAR-10 Model }\label{cifar-10-ux6a21ux578b}

The CIFAR-10 network is largely contained in \href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/models/image/cifar10/cifar10.py}{\lstinline{cifar10.py}}. The complete training graph contains roughly 765 operations. The code is most reusable when the graph is constructed from the following modules:

\begin{enumerate}
\def\labelenumi{\arabic{enumi}.}
\tightlist
\item \protect\hyperlink{model-inputs}{\textbf{Model inputs:}} \lstinline{inputs()} and \lstinline{distorted\_inputs()} add operations that read and preprocess CIFAR images for evaluation and training, respectively;
\item \protect\hyperlink{model-prediction}{\textbf{Model prediction:}} \lstinline{inference()} adds operations that perform inference, i.e.~classification, on supplied images;
\item \protect\hyperlink{model-training}{\textbf{Model training:}} \lstinline{loss()} and \lstinline{train()} add operations that compute the loss, gradients, variable updates and visualization summaries.
\end{enumerate}

\subsubsection{Model Inputs }\label{ux6a21ux578bux8f93ux5165}

The input part of the model is built by the functions \lstinline{inputs()} and \lstinline{distorted\_inputs()}, which read images from the CIFAR-10 binary data files. These files contain fixed byte length records, so we use \href{https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/io_ops.md\#FixedLengthRecordReader}{\lstinline{tf.FixedLengthRecordReader}}. See \href{https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/how_tos/reading_data/index.md\#reading-from-files}{Reading Data} to learn more about how the \lstinline{Reader} class works.

The images are processed as follows:

\begin{itemize}
\tightlist
\item They are cropped to 24x24 pixels, centrally for evaluation or \href{https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/image.md\#random_crop}{randomly} for training;
\item They are \href{https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/image.md\#per_image_whitening}{approximately whitened} to make the model insensitive to dynamic range.
\end{itemize}

For training, we additionally apply a series of random distortions to artificially increase the data set size:

\begin{itemize}
\tightlist
\item \href{https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/image.md\#random_flip_left_right}{Randomly flip} the image from left to right;
\item Randomly distort the \href{https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/image.md\#random_brightness}{image brightness};
\item Randomly distort the \href{https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/image.md\#tf_image_random_contrast}{image contrast};
\end{itemize}

See the \href{https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/image.md}{Images} page for the list of available distortions. We also attach an \href{https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/train.md\#image_summary}{\lstinline{image\_summary}} to the images so that we can visualize them in TensorBoard. This is a good way to verify that the inputs are built correctly.
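
The preprocessing steps listed above can be illustrated without TensorFlow. The following is a minimal pure-Python sketch, not the code from \lstinline{cifar10_input.py}: the helper names \lstinline{central_crop}, \lstinline{random_flip_left_right} and \lstinline{whiten} are hypothetical, the image is a single-channel list of rows rather than an RGB tensor, and the whitening formula (subtract the mean, divide by the standard deviation with a lower bound of $1/\sqrt{n}$) is an assumption about how the approximate whitening is defined.

```python
import math
import random


def central_crop(image, size):
    """Return the central size x size window of a 2-D list of pixel rows."""
    top = (len(image) - size) // 2
    left = (len(image[0]) - size) // 2
    return [row[left:left + size] for row in image[top:top + size]]


def random_flip_left_right(image, rng):
    """Mirror the image horizontally with probability 0.5."""
    if rng.random() < 0.5:
        return [row[::-1] for row in image]
    return image


def whiten(image):
    """Shift pixels to zero mean and scale them to roughly unit variance."""
    pixels = [p for row in image for p in row]
    n = len(pixels)
    mean = sum(pixels) / n
    variance = sum((p - mean) ** 2 for p in pixels) / n
    # The lower bound keeps uniform images from causing a division by zero.
    adjusted_std = max(math.sqrt(variance), 1.0 / math.sqrt(n))
    return [[(p - mean) / adjusted_std for p in row] for row in image]


# Hypothetical 32x32 single-channel image whose pixel value encodes its position.
image = [[float(r * 32 + c) for c in range(32)] for r in range(32)]
rng = random.Random(0)

cropped = central_crop(image, 24)               # evaluation-style crop
flipped = random_flip_left_right(cropped, rng)  # training-only distortion
whitened = whiten(flipped)
```

Whitening is applied last so that each example reaches the model with zero mean and unit variance regardless of which distortions were applied before it.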
Reading images from disk and distorting them can take a non-trivial amount of processing time. To prevent these operations from slowing down training, we run them inside 16 separate threads, which continuously fill a TensorFlow \href{https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/io_ops.md\#shuffle_batch}{queue}.

\subsubsection{Model Prediction }\label{ux6a21ux578bux9884ux6d4b}

The prediction part of the model is constructed by the \lstinline{inference()} function, which adds the operations needed to compute the \emph{logits} of the predictions. That part of the model is organized as follows:

\begin{longtable}[c]{@{}ll@{}}
\toprule
\begin{minipage}[b]{0.05\columnwidth}\raggedright\strut Layer Name \strut\end{minipage} &
\begin{minipage}[b]{0.05\columnwidth}\raggedright\strut Description \strut\end{minipage}\tabularnewline
\midrule
\endhead
\begin{minipage}[t]{0.05\columnwidth}\raggedright\strut \lstinline{conv1} \strut\end{minipage} &
\begin{minipage}[t]{0.05\columnwidth}\raggedright\strut \href{https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/nn.md\#conv2d}{convolution} and \href{https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/nn.md\#relu}{rectified linear} activation. \strut\end{minipage}\tabularnewline
\begin{minipage}[t]{0.05\columnwidth}\raggedright\strut \lstinline{pool1} \strut\end{minipage} &
\begin{minipage}[t]{0.05\columnwidth}\raggedright\strut \href{https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/nn.md\#max_pool}{max pooling}. \strut\end{minipage}\tabularnewline
\begin{minipage}[t]{0.05\columnwidth}\raggedright\strut \lstinline{norm1} \strut\end{minipage} &
\begin{minipage}[t]{0.05\columnwidth}\raggedright\strut \href{https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/nn.md\#local_response_normalization}{local response normalization}.
\strut\end{minipage}\tabularnewline
\begin{minipage}[t]{0.05\columnwidth}\raggedright\strut \lstinline{conv2} \strut\end{minipage} &
\begin{minipage}[t]{0.05\columnwidth}\raggedright\strut \href{https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/nn.md\#conv2d}{convolution} and \href{https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/nn.md\#relu}{rectified linear} activation. \strut\end{minipage}\tabularnewline
\begin{minipage}[t]{0.05\columnwidth}\raggedright\strut \lstinline{norm2} \strut\end{minipage} &
\begin{minipage}[t]{0.05\columnwidth}\raggedright\strut \href{https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/nn.md\#local_response_normalization}{local response normalization}. \strut\end{minipage}\tabularnewline
\begin{minipage}[t]{0.05\columnwidth}\raggedright\strut \lstinline{pool2} \strut\end{minipage} &
\begin{minipage}[t]{0.05\columnwidth}\raggedright\strut \href{https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/nn.md\#max_pool}{max pooling}. \strut\end{minipage}\tabularnewline
\begin{minipage}[t]{0.05\columnwidth}\raggedright\strut \lstinline{local3} \strut\end{minipage} &
\begin{minipage}[t]{0.05\columnwidth}\raggedright\strut \href{https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/nn.md}{fully connected layer with rectified linear activation}. \strut\end{minipage}\tabularnewline
\begin{minipage}[t]{0.05\columnwidth}\raggedright\strut \lstinline{local4} \strut\end{minipage} &
\begin{minipage}[t]{0.05\columnwidth}\raggedright\strut \href{https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/nn.md}{fully connected layer with rectified linear activation}. \strut\end{minipage}\tabularnewline
\begin{minipage}[t]{0.05\columnwidth}\raggedright\strut \lstinline{softmax\_linear} \strut\end{minipage} &
\begin{minipage}[t]{0.05\columnwidth}\raggedright\strut linear transformation to produce logits.
\strut\end{minipage}\tabularnewline
\bottomrule
\end{longtable}

The original tutorial shows a graph, generated by TensorBoard, that describes the inference operation.

\begin{quote}
\textbf{Exercise}: The output of \lstinline{inference} is un-normalized logits. Try editing the network architecture to return normalized predictions using \href{tensorflow-zh/SOURCE/api_docs/python/nn.md\#softmax}{\lstinline{tf.nn.softmax()}}.
\end{quote}

The \lstinline{inputs()} and \lstinline{inference()} functions provide all the components necessary to evaluate a model. We now shift our focus from building a model to training one.

\begin{quote}
\textbf{Exercise:} The model architecture in \lstinline{inference()} differs slightly from the CIFAR-10 model specified in \href{https://code.google.com/p/cuda-convnet/}{cuda-convnet}. In particular, the top layers of the original model are locally connected rather than fully connected. Try editing the architecture to exactly reproduce the locally connected architecture in the top layers.
\end{quote}

\subsubsection{Model Training}\label{ux6a21ux578bux8badux7ec3}

The usual method for training a network to perform N-way classification is \href{https://en.wikipedia.org/wiki/Multinomial_logistic_regression}{multinomial logistic regression}, also known as \emph{softmax regression}. Softmax regression applies a \href{https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/nn.md\#softmax}{softmax} nonlinearity to the output of the network and calculates the \href{https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/nn.md\#softmax_cross_entropy_with_logits}{cross-entropy} between the normalized predictions and a \href{https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/sparse_ops.md\#sparse_to_dense}{1-hot encoding} of the label. For regularization, we also apply the usual \href{https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/nn.md\#l2_loss}{weight decay} losses to all learned variables. The objective function for the model is the sum of the cross-entropy loss and all of these weight decay terms, as returned by the \lstinline{loss()} function.

We visualize it in TensorBoard with a \href{https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/train.md\#scalar_summary}{scalar\_summary}:

\begin{figure}[htbp]
\centering
\includegraphics{../SOURCE/images/cifar_loss.png}
\caption{CIFAR-10 Loss}
\end{figure}

We train the model using the standard gradient descent algorithm (see \href{https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/train.md}{Training} for other methods) with a learning rate that decays exponentially over time.

\begin{figure}[htbp]
\centering
\includegraphics{../SOURCE/images/cifar_lr_decay.png}
\caption{CIFAR-10 Learning Rate Decay}
\end{figure}

\lstinline{train()}
adds the operations needed to minimize the objective by calculating the gradient and updating the learned variables (see \href{tensorflow-zh/SOURCE/api_docs/python/train.md\#GradientDescentOptimizer}{\lstinline{GradientDescentOptimizer}} for details). It returns an operation that executes all of the calculations needed to train and update the model for one batch of images.

\subsection{Launching and Training the Model }\label{ux5f00ux59cbux6267ux884cux5e76ux8badux7ec3ux6a21ux578b}

We have built the model. Now we launch the training operation with the script \lstinline{cifar10\_train.py}.

\begin{lstlisting}
python cifar10_train.py
\end{lstlisting}

\textbf{Note:} The first time you run any target in the CIFAR-10 tutorial, the CIFAR-10 dataset is downloaded automatically. The dataset is about 160MB, so you may want to grab a cup of coffee during the first run.

You should see output similar to the following:

\begin{lstlisting}
Filling queue with 20000 CIFAR images before starting to train. This will take a few minutes.
2015-11-04 11:45:45.927302: step 0, loss = 4.68 (2.0 examples/sec; 64.221 sec/batch)
2015-11-04 11:45:49.133065: step 10, loss = 4.66 (533.8 examples/sec; 0.240 sec/batch)
2015-11-04 11:45:51.397710: step 20, loss = 4.64 (597.4 examples/sec; 0.214 sec/batch)
2015-11-04 11:45:54.446850: step 30, loss = 4.62 (391.0 examples/sec; 0.327 sec/batch)
2015-11-04 11:45:57.152676: step 40, loss = 4.61 (430.2 examples/sec; 0.298 sec/batch)
2015-11-04 11:46:00.437717: step 50, loss = 4.59 (406.4 examples/sec; 0.315 sec/batch)
...
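\end{lstlisting}

The script reports the total loss every 10 steps, as well as the speed at which the last batch of data was processed. A few comments:

\begin{itemize}
\item The first batch of data can be very slow (several minutes), because the preprocessing threads have to fill the shuffling queue with 20,000 processed CIFAR images;
\item The reported loss is the average loss of the most recent batch. Remember that this loss is the sum of the cross entropy and all weight decay terms;
\item The batch processing speeds shown above were obtained on a Tesla K40c. If you are running on a CPU, expect slower performance;
\end{itemize}

\begin{quote}
\textbf{Exercise:} When experimenting, it is sometimes annoying that the first training step takes so long. Try decreasing the number of images that initially fill up the queue. Search for \lstinline{NUM\_EXAMPLES\_PER\_EPOCH\_FOR\_TRAIN} in \lstinline{cifar10.py} and modify it.
\end{quote}

\lstinline{cifar10\_train.py} periodically \href{https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/state_ops.md\#Saver}{saves} all model parameters in \href{https://github.com/jikexueyuanwiki/tensorflow-zh/tree/master/SOURCE/how_tos/variables/index.md\#saving-and-restoring}{checkpoint files}, but it does \emph{not} evaluate the model. The checkpoint file is used by \lstinline{cifar10\_eval.py} to measure the predictive performance (see \protect\hyperlink{ux8bc4ux4f30ux6a21ux578b}{Evaluating a Model} below).

If you followed the previous steps, you have now started training a CIFAR-10 model. \href{https://www.youtube.com/watch?v=9bZkp7q19f0}{Congratulations!}

The terminal text returned by \lstinline{cifar10\_train.py} provides minimal insight into how the model is training. We would like to know more about the model during training, for example:

\begin{itemize}
\item Is the loss \emph{really} decreasing, or is that just noise?
\item Is the model being provided with appropriate images?
\item Are the gradients, activations and weights reasonable?
\item What is the learning rate currently at?
\end{itemize}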
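\href{https://github.com/jikexueyuanwiki/tensorflow-zh/tree/master/SOURCE/how_tos/summaries_and_tensorboard/index.md}{TensorBoard} provides this functionality, displaying data exported periodically from \lstinline{cifar10\_train.py} via a \href{https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/train.md\#SummaryWriter}{\lstinline{SummaryWriter}}.

For instance, we can watch how the distribution of activations and the degree of sparsity in the \lstinline{local3} features evolve during training.

Individual loss functions, as well as the total loss, are particularly interesting to track over time. However, the loss exhibits a considerable amount of noise because of the small batch size used in training. In practice, we find it very useful to visualize the moving averages of the losses in addition to their raw values. See how the scripts use \href{https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/train.md\#ExponentialMovingAverage}{\lstinline{ExponentialMovingAverage}} for this purpose.

\hypertarget{ux8bc4ux4f30ux6a21ux578b}{\subsection{Evaluating a Model }\label{ux8bc4ux4f30ux6a21ux578b}}

Let us now evaluate how well the trained model performs on a hold-out data set. The model is evaluated by the script \lstinline{cifar10\_eval.py}. It constructs the model with the \lstinline{inference()} function and uses all 10,000 images in the evaluation set of CIFAR-10. It calculates the \emph{precision at 1}: how often the top prediction matches the true label of the image.

To monitor how the model improves during training, the evaluation script runs periodically on the latest checkpoint files created by \lstinline{cifar10\_train.py}.

\begin{lstlisting}
python cifar10_eval.py
\end{lstlisting}

\begin{quote}
Note: Do not run the evaluation and training programs on the same GPU at the same time, or you might run out of memory. Consider running the evaluation on a separate GPU if one is available, or suspending the training program while the evaluation runs on the same GPU.
\end{quote}

You should see output similar to the following:

\begin{lstlisting}
2015-11-06 08:30:44.391206: precision @ 1 = 0.860
...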
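\end{lstlisting}

The script merely returns the precision @ 1 periodically; in this case it returned 86\% accuracy. \lstinline{cifar10\_eval.py} also exports summaries that may be visualized in TensorBoard. These summaries provide additional insight into the model during evaluation.

The training script calculates the \href{https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/train.md\#ExponentialMovingAverage}{moving average} version of all learned variables. The evaluation script substitutes all learned model parameters with the moving average version. This substitution boosts model performance at evaluation time.

\begin{quote}
\textbf{Exercise:} Employing averaged parameters may boost predictive performance by about 3\% as measured by precision @ 1. Edit \lstinline{cifar10\_eval.py} so that it does not use the averaged parameters, and verify that the predictive performance drops.
\end{quote}

\subsection{Training a Model Using Multiple GPU Cards }\label{ux5728ux591aux4e2agpuux677fux5361ux4e0aux8badux7ec3ux6a21ux578b}

Modern workstations may contain multiple GPUs for scientific computation. TensorFlow can leverage this environment to run the training operation concurrently across multiple cards.

Training a model in a parallel, distributed fashion requires coordinating the training processes. In what follows, the term \emph{model replica} refers to one copy of a model training on a subset of the data.

Naively employing asynchronous updates of model parameters leads to sub-optimal training performance, because an individual model replica might be trained on a stale copy of the model parameters. Conversely, employing fully synchronous updates will be as slow as the slowest model replica.

In a workstation with multiple GPU cards, each GPU has similar speed and enough memory to run an entire CIFAR-10 model. We therefore design the training system in the following manner:

\begin{itemize}
\item Place an individual model replica on each GPU;
\item Update the model parameters synchronously by waiting for all GPUs to finish processing a batch of data;
\end{itemize}

The original tutorial includes a diagram of this setup.

Note that each GPU computes inference as well as the gradients for a unique batch of data. This setup effectively permits dividing a larger batch of data across the GPUs.

This setup requires that all GPUs share the model parameters. Transferring data to and from GPUs is well known to be quite slow. For this reason, we store and update all model parameters on the CPU (the green box in the diagram). A fresh set of model parameters is transferred to each GPU when a new batch of data is processed by all GPUs.

The GPUs are synchronized in operation. All gradients are accumulated from the GPUs and averaged (the green box in the diagram). The model parameters are updated with the gradients averaged across all model replicas.

\subsubsection{Placing Variables and Operations on Devices}\label{ux5728ux591aux4e2aux8bbeux5907ux4e2dux8bbeux7f6eux53d8ux91cfux548cux64cdux4f5c}

Placing operations and variables on devices requires some special abstractions.

The first abstraction we require is a function for computing inference and gradients for a single model replica. In the code we term this abstraction a ``tower''. We must set two attributes for each tower:

\begin{itemize}
\tightlist
\item A unique name for all operations within a tower. \href{https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/framework.md\#name_scope}{\lstinline{tf.name\_scope()}} provides this unique name by prepending a scope. For instance, all operations in the first tower are prepended with \lstinline{tower\_0}, e.g. \lstinline{tower\_0/conv1/Conv2D};
\item A preferred hardware device on which to run the operations within a tower.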
\href{https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/framework.md\#device}{\lstinline{tf.device()}} specifies this. For instance, all operations in the first tower reside within the \lstinline{device(\textquotesingle{}/gpu:0\textquotesingle{})} scope, indicating that they should run on the first GPU.
\end{itemize}

To share variables across multiple GPUs, all variables are pinned to the CPU and accessed via \href{https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/api_docs/python/state_ops.md\#get_variable}{\lstinline{tf.get\_variable()}}. See the how-to on \href{https://github.com/jikexueyuanwiki/tensorflow-zh/blob/master/SOURCE/how_tos/variables/index.md}{Sharing Variables} for details.

\subsubsection{Launching and Training the Model on Multiple GPU Cards}\label{ux542fux52a8ux5e76ux5728ux591aux4e2agpuux4e0aux8badux7ec3ux6a21ux578b}

If you have several GPU cards installed on your machine, you can use them to train the model faster with the \lstinline{cifar10\_multi\_gpu\_train.py} script. This variant of the training script trains model replicas in parallel across multiple GPU cards.

\begin{lstlisting}
python cifar10_multi_gpu_train.py --num_gpus=2
\end{lstlisting}

The training script should output:

\begin{lstlisting}
Filling queue with 20000 CIFAR images before starting to train. This will take a few minutes.
2015-11-04 11:45:45.927302: step 0, loss = 4.68 (2.0 examples/sec; 64.221 sec/batch)
2015-11-04 11:45:49.133065: step 10, loss = 4.66 (533.8 examples/sec; 0.240 sec/batch)
2015-11-04 11:45:51.397710: step 20, loss = 4.64 (597.4 examples/sec; 0.214 sec/batch)
2015-11-04 11:45:54.446850: step 30, loss = 4.62 (391.0 examples/sec; 0.327 sec/batch)
2015-11-04 11:45:57.152676: step 40, loss = 4.61 (430.2 examples/sec; 0.298 sec/batch)
2015-11-04 11:46:00.437717: step 50, loss = 4.59 (406.4 examples/sec; 0.315 sec/batch)
...
\end{lstlisting}

Note that the number of GPU cards used defaults to 1. Additionally, if only one GPU is available on your machine, all computations will be placed on it, even if you ask for more.

\begin{quote}
\textbf{Exercise:} The default batch size in \lstinline{cifar10\_train.py} is 128. Try running \lstinline{cifar10\_multi\_gpu\_train.py} on 2 GPUs with a batch size of 64 and compare the training speed of the two setups.
\end{quote}

\subsection{Next Steps}\label{ux4e0bux4e00ux6b65}

Congratulations!
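You have completed the CIFAR-10 tutorial.

If you are now interested in developing and training your own image classification system, we recommend forking this tutorial and replacing components to address your own image classification problem.

\begin{quote}
\textbf{Exercise:} Download the \href{http://ufldl.stanford.edu/housenumbers/}{Street View House Numbers (SVHN)} data set. Fork the CIFAR-10 tutorial and swap in SVHN as the input data. Try adapting the network architecture to improve predictive performance.
\end{quote}

\begin{quote}
Original: \href{http://tensorflow.org/tutorials/deep_cnn/index.md}{Convolutional Neural Networks} Translation: \href{https://github.com/oskycar}{oskycar} Proofreading: \href{https://github.com/zhyhooo}{KK4SBB}
\end{quote}

================================================
FILE: tex_pdf/tutorials/c2s05_word2vec.tex
================================================
%!TEX program = xelatex
% Encoding: UTF8
% SEIKA 2015

\section{Vector Representations of Words}\label{word2vec}

In this tutorial we look at the word2vec model by \href{http://papers.nips.cc/paper/5021-distributed-representations-of-words-and-phrases-and-their-compositionality.pdf}{Mikolov et al.} This model is used for learning vector representations of words, called ``word embeddings''.

\subsection{Highlights}\label{word2vec_hl}

This tutorial is meant to highlight the interesting, substantive parts of building a word2vec model in TensorFlow.

\begin{itemize}
\item We start with the motivation for representing words as vectors.
\item We look at the intuition behind the model and how it is trained (with a little math for good measure).
\item We also show a simple implementation of the model in TensorFlow.
\item Finally, we look at ways to make this naive version perform better.
\end{itemize}

We walk through the code later in the tutorial, but if you would prefer to dive straight in, have a look at the minimalistic implementation in \href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/g3doc/tutorials/word2vec/word2vec_basic.py}{tensorflow/g3doc/tutorials/word2vec/word2vec\_basic.py}. This basic example contains the code needed to download some data, train on it briefly, and visualize the result. Once you are comfortable with the basic version, you can move on to \href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/models/embedding/word2vec.py}{tensorflow/models/embedding/word2vec.py}, a more serious implementation that showcases some more advanced TensorFlow features, such as how to use threads efficiently to move data into a text model and how to checkpoint during training.

But first, let's look at why we would want to learn word embeddings in the first place. Feel free to skip this section if you are already an expert on word embeddings and just want to get your hands dirty with the details.

\subsection{Motivation: Why Learn Word Embeddings?}\label{ux52a8}

Image and audio processing systems work with rich, high-dimensional datasets encoded as vectors of the individual raw pixel intensities for image data, or of power spectral density coefficients for audio data. For tasks like object or speech recognition, we know that all the information required to perform the task is encoded in the data (because humans can perform these tasks from the raw data). However, natural language processing systems traditionally treat words as discrete atomic symbols, so ``cat'' may be represented as \lstinline{Id537} and ``dog'' as \lstinline{Id143}. These encodings are arbitrary and provide no useful information about the relationships that may exist between the individual symbols. This means that the model can leverage very little of what it has learned about ``cats'' when it is processing data about ``dogs'' (for example, that they are both animals, four-legged, pets, and so on). Representing words as unique, discrete IDs furthermore leads to data sparsity, and usually means that we need more data in order to train statistical models successfully. Using vector representations can overcome some of these obstacles.

\begin{figure}[htbp]
\centering
\includegraphics[width=.9\textwidth]{../SOURCE/images/audio-image-text.png}
\caption{}
\end{figure}

\href{https://en.wikipedia.org/wiki/Vector_space_model}{Vector space models (VSMs)} represent (embed) words in a continuous vector space where semantically similar words are mapped to nearby points. VSMs have a long, rich history in NLP, but all methods depend in some way on the \href{https://en.wikipedia.org/wiki/Distributional_semantics\#Distributional_Hypothesis}{Distributional Hypothesis}, which states that words that appear in the same contexts share semantic meaning. The approaches that leverage this principle fall into two categories: \emph{count-based methods} (e.g. \href{https://en.wikipedia.org/wiki/Latent_semantic_analysis}{Latent Semantic Analysis}) and \emph{predictive methods} (e.g. \href{http://www.scholarpedia.org/article/Neural_net_language_models}{neural probabilistic language models}).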
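This distinction is elaborated in much more detail by \href{http://clic.cimec.unitn.it/marco/publications/acl2014/baroni-etal-countpredict-acl2014.pdf}{Baroni et al.}, but in a nutshell: count-based methods compute statistics of how often a word co-occurs with its neighbor words in a large text corpus, and then map these count statistics down to a small, dense vector for each word. Predictive methods instead try to predict a word directly from its neighbors in terms of learned small, dense \emph{embedding vectors}.

Word2vec is a particularly computationally efficient predictive model for learning word embeddings. It comes in two flavors: the Continuous Bag-of-Words model (CBOW) and the Skip-Gram model. Algorithmically, these models are similar, except that CBOW predicts target words (e.g. ``mat'') from source context words (``the cat sits on the''), while Skip-Gram does the inverse and predicts source context words from the target words. This inversion might seem like an arbitrary choice, but statistically it has the effect that CBOW smooths over a lot of the distributional information (by treating an entire context as one observation). For the most part, this turns out to be useful for smaller datasets. Skip-Gram, by contrast, treats each context-target pair as a new observation, and this tends to do better on larger datasets. We will focus on the Skip-Gram model in the rest of this tutorial.

\subsection{Scaling up with Noise-Contrastive Training}\label{ux5904ux7406ux566aux58f0ux5bf9ux6bd4ux8badux7ec3}

Neural probabilistic language models are traditionally trained using the \href{https://en.wikipedia.org/wiki/Maximum_likelihood}{maximum likelihood (ML)} principle to maximize the probability of the next word $w_t$ (for ``target'') given the previous words $h$ (for ``history'') in terms of a \href{https://en.wikipedia.org/wiki/Softmax_function}{\emph{softmax} function},

\begin{align}
P(w_t|h) & = \text{softmax}(\text{score}(w_t,h)) \\
         & = \frac{\exp\{\text{score}(w_t,h)\}}{\sum_{\text{Word } w' \text{ in Vocab}}\exp\{\text{score}(w',h)\}}
\end{align}

where $\text{score}(w_t,h)$ computes the compatibility of word $w_t$ with the context $h$ (a dot product is commonly used). We train this model by maximizing its log-likelihood on the training set, i.e. by maximizing

\begin{align}
J_{ML} & = \log{P(w_t|h)} \\
       & = \text{score}(w_t,h) - \log\left(\sum_{\text{Word } w' \text{ in Vocab}}\exp\{\text{score}(w',h)\}\right)
\end{align}

This yields a properly normalized probabilistic model for language modeling. However, it is very expensive, because we need to compute and normalize each probability using the score for all other $V$ words $w'$ in the current context $h$, \emph{at every training step}.

\begin{figure}[htbp]
\centering
\includegraphics[width=.75\textwidth]{../SOURCE/images/softmax-nplm.png}
\caption{}
\end{figure}

On the other hand, for feature learning in word2vec we do not need a full probabilistic model. The CBOW and Skip-Gram models are instead trained using a binary classification objective (logistic regression) to discriminate the real target word \includegraphics{../SOURCE/images/rw4.png} from \textbf{k} imaginary (noise) words in the same context. We illustrate this below for a CBOW model; for Skip-Gram the direction is simply inverted.

\begin{figure}[htbp]
\centering
\includegraphics[width=.75\textwidth]{../SOURCE/images/nce-nplm.png}
\caption{}
\end{figure}

Mathematically, the objective (for each example) is to maximize

\begin{figure}[htbp]
\centering
\includegraphics{../SOURCE/images/rw6.png}
\caption{}
\end{figure}

where \includegraphics{../SOURCE/images/rw7.png} is the binary logistic regression probability, under the model, of seeing the word \textbf{w} in the context \textbf{h} in the dataset, calculated in terms of the learned embedding vectors \includegraphics{../SOURCE/images/theta.png}. In practice we approximate the expectation by drawing contrastive words from the noise distribution (i.e. we compute a \href{https://en.wikipedia.org/wiki/Monte_Carlo_integration}{Monte Carlo average}).

This objective is maximized when the model assigns high probabilities to the real words and low probabilities to the noise words. Technically, this is called \href{http://papers.nips.cc/paper/5021-distributed-representations-of-words-and-phrases-and-their-compositionality.pdf}{Negative Sampling}, and there is good mathematical motivation for using this loss function: the updates it proposes approximate the updates of the softmax function. Computationally it is especially appealing, because computing the loss now scales only with the number of \emph{noise words} that we select (\textbf{k}), and not with all words in the vocabulary (\textbf{V}). This makes training much faster. We will actually make use of the very similar \href{http://papers.nips.cc/paper/5165-learning-word-embeddings-efficiently-with-noise-contrastive-estimation.pdf}{noise-contrastive estimation (NCE)} loss, for which TensorFlow has a handy helper function, \lstinline{tf.nn.nce_loss()}.

Let's get an intuitive feel for how this works in practice!

\subsection{The Skip-gram Model}\label{skip-gram_model}

As an example, let's consider the dataset

\lstinline{"the quick brown fox jumped over the lazy dog"}

We first form a dataset of words and the contexts in which they appear. We could define ``context'' in any way that makes sense, and in fact people have looked at syntactic contexts (i.e. the syntactic dependents of the current target word, see e.g. \href{https://levyomer.files.wordpress.com/2014/04/dependency-based-word-embeddings-acl-2014.pdf}{Levy et al.}), words to the left of the target, words to the right of the target, and so on. For now, let's define ``context'' as the window of words to the left and to the right of a target word. Using a window size of 1, we then have the following dataset of \texttt{(context,\ target)} pairs:

\texttt{({[}the,\ brown{]},\ quick),\ ({[}quick,\ fox{]},\ brown),\ ({[}brown,\ jumped{]},\ fox),\ ...}

Recall that Skip-Gram inverts contexts and targets and tries to predict each context word from its target word, so the task becomes to predict ``the'' and ``brown'' from ``quick'', ``quick'' and ``fox'' from ``brown'', and so on. Our dataset therefore becomes the following set of \texttt{(input,\ output)} pairs:

\texttt{(quick,\ the),\ (quick,\ brown),\ (brown,\ quick),\ (brown,\ fox),\ ...}

The objective function is defined over the entire dataset, but we typically optimize it with \href{https://en.wikipedia.org/wiki/Stochastic_gradient_descent}{stochastic gradient descent} (SGD) using one example at a time (or a minibatch of \texttt{batch\_size} examples, where typically \texttt{16\ \textless{}=\ batch\_size\ \textless{}=\ 512}). So let's look at one step of this process.

Let's imagine that at training step \textbf{t} we observe the first training case above, where the goal is to predict \texttt{the} from \texttt{quick}. We select \texttt{num\_noise} noisy (contrastive) examples by drawing from some noise distribution, typically the unigram distribution, \textbf{P(w)}. For simplicity let's say \texttt{num\_noise=1} and we select \texttt{sheep} as the noisy example. Next we compute the loss for this pair of observed and noisy examples, i.e. the objective at this training step becomes:

\begin{figure}[htbp]
\centering
\includegraphics{../SOURCE/images/vr4.png}
\caption{}
\end{figure}

The goal is to update the embedding parameters \includegraphics{../SOURCE/images/theta.png} to improve (in this case, maximize) this objective function. We do this by deriving the gradient of the loss with respect to the embedding parameters \includegraphics{../SOURCE/images/theta.png}, i.e. \includegraphics{../SOURCE/images/vr5.png} (luckily TensorFlow provides easy helper functions for doing this!). We then update the embeddings by taking a small step in the direction of the gradient. When this process is repeated over the entire training set, it has the effect of moving the embedding vector of each word around until the model is successful at discriminating real words from noise words.

We can visualize the learned vectors by projecting them down to 2 dimensions, using for instance the \href{http://lvdmaaten.github.io/tsne/}{t-SNE dimensionality reduction technique}. When we inspect these visualizations, it becomes apparent that the vectors capture some general, and in fact quite useful, semantic information about words and their relationships to one another. It was very interesting when we first discovered that certain directions in the induced vector space specialize towards certain semantic relationships, e.g. \emph{male-female}, \emph{gender} and even \emph{country-capital} relationships between words, as illustrated in the figure below (see also for example \href{http://www.aclweb.org/anthology/N13-1090}{Mikolov et al., 2013}).

This explains why these vectors are also useful as features for many canonical NLP prediction tasks, such as part-of-speech tagging or named entity recognition (see for example \href{http://arxiv.org/pdf/1103.0398v1.pdf}{Collobert et al.} or \href{http://www.aclweb.org/anthology/P10-1040}{Turian et al.}).

But for now, let's just use them to draw pretty pictures!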
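
The \texttt{(input,\ output)} pairs described earlier in this section can be generated with a few lines of plain Python. The sketch below is only an illustration: the helper name \texttt{skipgram\_pairs} and its \texttt{window} parameter are assumptions made here, not part of the tutorial code. Unlike the abbreviated list in the text, it also emits pairs for the first and last words, which have a single neighbor each.

```python
def skipgram_pairs(tokens, window=1):
    """Return (input, output) pairs: each word paired with every
    neighbor at most `window` positions away (hypothetical helper)."""
    pairs = []
    for i, target in enumerate(tokens):
        lo = max(0, i - window)                # clip the window at the left edge
        hi = min(len(tokens), i + window + 1)  # clip the window at the right edge
        for j in range(lo, hi):
            if j != i:                         # a word is not its own context
                pairs.append((target, tokens[j]))
    return pairs


sentence = "the quick brown fox jumped over the lazy dog".split()
pairs = skipgram_pairs(sentence, window=1)
for pair in pairs[:5]:
    print(pair)
```

After the pair for the leading word, the output lists \texttt{(quick,\ the)}, \texttt{(quick,\ brown)}, \texttt{(brown,\ quick)}, \texttt{(brown,\ fox)}, matching the dataset above.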
\subsection{Building the Graph}\label{ux5efaux7acbux56feux5f62}

This is all about embeddings, so let's define our embedding matrix. It is just a big matrix that we initialize with values drawn uniformly at random between -1 and 1.

\begin{Shaded}
\begin{Highlighting}[]
\NormalTok{embeddings }\OperatorTok{=} \NormalTok{tf.Variable(}
    \NormalTok{tf.random_uniform([vocabulary_size, embedding_size], }\OperatorTok{-}\FloatTok{1.0}\NormalTok{, }\FloatTok{1.0}\NormalTok{))}
\end{Highlighting}
\end{Shaded}

The noise-contrastive estimation loss is defined in terms of a logistic regression model. For this, we need to define the weights and biases for each word in the vocabulary (also called the \texttt{output weights}, as opposed to the \texttt{input embeddings}). So let's define those.

\begin{Shaded}
\begin{Highlighting}[]
\NormalTok{nce_weights }\OperatorTok{=} \NormalTok{tf.Variable(}
    \NormalTok{tf.truncated_normal([vocabulary_size, embedding_size],}
                        \NormalTok{stddev}\OperatorTok{=}\FloatTok{1.0} \OperatorTok{/} \NormalTok{math.sqrt(embedding_size)))}
\NormalTok{nce_biases }\OperatorTok{=} \NormalTok{tf.Variable(tf.zeros([vocabulary_size]))}
\end{Highlighting}
\end{Shaded}

Now that we have the parameters in place, we can define our Skip-Gram model graph. For simplicity, let's suppose we have already integerized our text corpus with a vocabulary, so that each word is represented as an integer (see \href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/g3doc/tutorials/word2vec/word2vec_basic.py}{tensorflow/g3doc/tutorials/word2vec/word2vec\_basic.py} for the details). The Skip-Gram model takes two inputs. One is a batch of integers representing the source context words, the other is for the target words. Let's create placeholder nodes for these inputs, so that we can feed in data later.

\begin{Shaded}
\begin{Highlighting}[]
\CommentTok{# Placeholders for inputs}
\NormalTok{train_inputs }\OperatorTok{=} \NormalTok{tf.placeholder(tf.int32, shape}\OperatorTok{=}\NormalTok{[batch_size])}
\NormalTok{train_labels }\OperatorTok{=} \NormalTok{tf.placeholder(tf.int32, shape}\OperatorTok{=}\NormalTok{[batch_size, }\DecValTok{1}\NormalTok{])}
\end{Highlighting}
\end{Shaded}

Next we need to look up the embedding vector for each of the source words in the batch. TensorFlow has a handy helper that makes this easy.

\begin{Shaded}
\begin{Highlighting}[]
\NormalTok{embed }\OperatorTok{=} \NormalTok{tf.nn.embedding_lookup(embeddings, train_inputs)}
\end{Highlighting}
\end{Shaded}

Now that we have the embeddings for each word, we would like to predict the target word using the noise-contrastive training objective.

\begin{Shaded}
\begin{Highlighting}[]
\CommentTok{# Compute the NCE loss, using a sample of the negative labels each time.}
\NormalTok{loss }\OperatorTok{=} \NormalTok{tf.reduce_mean(}
  \NormalTok{tf.nn.nce_loss(nce_weights, nce_biases, embed,
train_labels,} \NormalTok{num_sampled, vocabulary_size))}
\end{Highlighting}
\end{Shaded}

Now that we have a loss node, we need to add the nodes required to compute gradients and update the parameters. For this we will use stochastic gradient descent, and TensorFlow has handy helpers to make this easy too.

\begin{Shaded}
\begin{Highlighting}[]
\CommentTok{# We use the SGD optimizer.}
\NormalTok{optimizer }\OperatorTok{=} \NormalTok{tf.train.GradientDescentOptimizer(learning_rate}\OperatorTok{=}\FloatTok{1.0}\NormalTok{).minimize(loss)}
\end{Highlighting}
\end{Shaded}

\subsection{Training the Model}\label{ux8badux7ec3ux6a21ux578b}

Training the model is then as simple as using a \texttt{feed\_dict} to push data into the placeholders and calling \href{tensorflow-zh/SOURCE/api_docs/python/client.md\#Session.run}{\texttt{session.run}} with this new data in a loop.

\begin{Shaded}
\begin{Highlighting}[]
\ControlFlowTok{for} \NormalTok{inputs, labels }\OperatorTok{in} \NormalTok{generate_batch(...):}
    \CommentTok{# The keys are the placeholders defined above.}
    \NormalTok{feed_dict }\OperatorTok{=} \NormalTok{\{train_inputs: inputs, train_labels: labels\}}
    \NormalTok{_, cur_loss }\OperatorTok{=} \NormalTok{session.run([optimizer, loss], feed_dict}\OperatorTok{=}\NormalTok{feed_dict)}
\end{Highlighting}
\end{Shaded}

See the full example code in \href{./word2vec_basic.py}{tensorflow/g3doc/tutorials/word2vec/word2vec\_basic.py}.

\subsection{Visualizing the Learned Embeddings}\label{ux5d4cux5957ux5b66ux4e60ux7ed3ux679cux53efux89c6ux5316}

After training has finished we can visualize the learned embeddings using t-SNE.

Et voila!
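As expected, words that are similar end up clustering near each other. For a more heavyweight implementation of word2vec that showcases more of the advanced features of TensorFlow, see \href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/models/embedding/word2vec.py}{tensorflow/models/embedding/word2vec.py}.

\subsection{Evaluating Embeddings: Analogical Reasoning}\label{ux5d4cux5957ux5b66ux4e60ux7684ux8bc4ux4f30-ux7c7bux6bd4ux63a8ux7406}

Embeddings are useful for a wide variety of prediction tasks in NLP. Short of training a full-blown part-of-speech model or named-entity model, one simple way to evaluate embeddings is to use them directly to predict syntactic and semantic relationships such as \texttt{king\ is\ to\ queen\ as\ father\ is\ to\ ?}. This is called \emph{analogical reasoning}; the task was introduced by \href{http://msr-waypoint.com/en-us/um/people/gzweig/Pubs/NAACL2013Regularities.pdf}{Mikolov and colleagues}, and the dataset can be downloaded from \url{https://word2vec.googlecode.com/svn/trunk/questions-words.txt}.

To see how we do this evaluation, have a look at the \texttt{build\_eval\_graph()} and \texttt{eval()} functions in \href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/models/embedding/word2vec.py}{tensorflow/models/embedding/word2vec.py}.

The choice of hyperparameters can strongly influence the accuracy on this task. Achieving state-of-the-art performance requires training over a very large dataset, carefully tuning the hyperparameters, and making use of tricks such as subsampling the data, all of which is beyond the scope of this tutorial.

\subsection{Optimizing the Implementation}\label{ux4f18ux5316ux5b9eux73b0}

Our simple implementation showcases the flexibility of TensorFlow. For example, changing the training objective is as simple as swapping out the call to \texttt{tf.nn.nce\_loss()} for an off-the-shelf alternative such as \texttt{tf.nn.sampled\_softmax\_loss()}. If you have a new idea for a loss function, you can manually write an expression for the new objective in TensorFlow and let the optimizer compute its derivatives. This flexibility is invaluable in the exploratory phase of machine learning model development, where we try out several different ideas and iterate quickly.

Once you have a model structure you are satisfied with, it may be worth optimizing your implementation to run more efficiently (and cover more data in less time). For example, the naive code used in this tutorial suffers in speed because it uses Python to read and feed data items, each of which requires very little work on the TensorFlow back end. If you find that your model is seriously bottlenecked on input data, you may want to implement a custom data reader for your problem, as described in \href{tensorflow-zh/SOURCE/how_tos/new_data_formats/index.md}{New Data Formats}. For the case of Skip-Gram modeling, we have already done this for you as an example in \href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/models/embedding/word2vec.py}{tensorflow/models/embedding/word2vec.py}.

If your model is no longer I/O bound but you still want more performance, you can take things further by writing your own TensorFlow Ops, as described in \href{tensorflow-zh/SOURCE/how_tos/adding_an_op/index.md}{Adding a New Op}. Again, we have provided an example of this for the Skip-Gram case in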
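\href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/models/embedding/word2vec_optimized.py}{tensorflow/models/embedding/word2vec\_optimized.py}. Feel free to benchmark these implementations against each other to measure the performance improvement at each stage.

\subsection{Conclusion}\label{ux603bux7ed3}

In this tutorial we covered the word2vec model, a computationally efficient model for learning word embeddings. We motivated why embeddings are useful, discussed efficient training techniques, and showed how to implement all of this in TensorFlow. Overall, we hope that this has showcased how TensorFlow affords you the flexibility you need for early experimentation, and the control you later need for a bespoke optimized implementation.

Original: \href{https://github.com/tensorflow/tensorflow/blob/master/tensorflow/g3doc/tutorials/word2vec/index.md}{Vector Representation of Words} Translation: \href{https://github.com/btpeter}{btpeter} Proofreading: waiwaizheng

================================================
FILE: tex_pdf/tutorials/c2s06_recurrent.tex
================================================
\section{Recurrent Neural Networks}\label{ux5faaux73afux795eux7ecfux7f51ux7edc}

\subsection{Introduction}\label{ux4ecbux7ecd}

Take a look at \href{http://colah.github.io/posts/2015-08-Understanding-LSTMs/}{this great article} for an introduction to recurrent neural networks (RNNs) and LSTMs in particular.

\subsection{Language Modeling}\label{ux8bedux8a00ux6a21ux578b}

In this tutorial we will show how to train a recurrent neural network on the challenging task of language modeling. The goal is to fit a probabilistic model that assigns probabilities to sentences. It does so by predicting the next words in a text given a history of previous words. For this purpose we will use the Penn Tree Bank (PTB) dataset, a popular benchmark for measuring the quality of these models that is also small and relatively fast to train on.

Language modeling is key to many interesting problems such as speech recognition, machine translation, and image captioning. It is also fun; take a look \href{http://karpathy.github.io/2015/05/21/rnn-effectiveness/}{here}.

For the purpose of this tutorial, we will reproduce the results from \href{http://arxiv.org/abs/1409.2329}{Zaremba et al., 2014}, which achieves very good results on the PTB dataset.

\subsection{Tutorial Files}\label{ux6559ux7a0bux6587ux4ef6}

This tutorial references the following files from \lstinline{models/rnn/ptb}:

\begin{longtable}[c]{@{}ll@{}}
\toprule
File & Purpose\tabularnewline
\midrule
\endhead
\lstinline{ptb\_word\_lm.py} & The code to train a language model on the PTB dataset.\tabularnewline
\lstinline{reader.py} & The code to read the dataset.\tabularnewline
\bottomrule
\end{longtable}

\subsection{Download and Prepare the Data}\label{ux4e0bux8f7dux53caux51c6ux5907ux6570ux636e}

The data required for this tutorial is in the data/ directory of the PTB dataset from Tomas Mikolov's webpage: \lstinline{http://www.fit.vutbr.cz/\textasciitilde{}imikolov/rnnlm/simple-examples.tgz}.

The dataset is already preprocessed and contains 10000 different words overall, including the end-of-sentence marker and a special symbol \lstinline{(\textless{}unk\textgreater{})} for rare words. In \lstinline{reader.py} we convert all of them to unique integer identifiers to make them easy for the neural network to process.

\subsection{The Model}\label{ux6a21ux578b}

\subsubsection{LSTM}\label{lstm}

The core of the model consists of an LSTM cell that processes one word at a time and computes probabilities of the possible continuations of the sentence. The memory state of the network is initialized with a vector of zeros and is updated after reading each word. Also, for computational reasons, we process data in mini-batches of size \lstinline{batch\_size}.

The basic pseudocode looks as follows:

\begin{Shaded}
\begin{Highlighting}[]
\NormalTok{lstm }\OperatorTok{=} \NormalTok{rnn_cell.BasicLSTMCell(lstm_size)}
\CommentTok{# Initial state of the LSTM memory.}
\NormalTok{state }\OperatorTok{=} \NormalTok{tf.zeros([batch_size, lstm.state_size])}
\NormalTok{loss }\OperatorTok{=} \FloatTok{0.0}
\ControlFlowTok{for} \NormalTok{current_batch_of_words }\OperatorTok{in} \NormalTok{words_in_dataset:}
    \CommentTok{# The value of state is updated after processing each batch of words.}
    \NormalTok{output, state }\OperatorTok{=} \NormalTok{lstm(current_batch_of_words, state)}
    \CommentTok{# The LSTM output can be used to make next-word predictions.}
    \NormalTok{logits }\OperatorTok{=} \NormalTok{tf.matmul(output, softmax_w) }\OperatorTok{+} \NormalTok{softmax_b}
    \NormalTok{probabilities }\OperatorTok{=} \NormalTok{tf.nn.softmax(logits)}
    \NormalTok{loss }\OperatorTok{+=} \NormalTok{loss_function(probabilities, target_words)}
\end{Highlighting}
\end{Shaded}

\subsubsection{Truncated Backpropagation}\label{ux622aux65adux53cdux5411ux4f20ux64ad}

To make the learning process tractable, it is common practice to truncate the gradients for backpropagation to a fixed number (\lstinline{num\_steps}) of unrolled steps. This is easy to implement by feeding inputs of length \lstinline{num\_steps} at a time and doing a backward pass after each iteration.

A simplified version of the graph-construction code for truncated backpropagation:

\begin{Shaded}
\begin{Highlighting}[]
\CommentTok{# Placeholder for the inputs in a given iteration.}
\NormalTok{words }\OperatorTok{=} \NormalTok{tf.placeholder(tf.int32, [batch_size, num_steps])}
\NormalTok{lstm }\OperatorTok{=} \NormalTok{rnn_cell.BasicLSTMCell(lstm_size)}
\CommentTok{# Initial state of the LSTM memory.}
\NormalTok{initial_state }\OperatorTok{=} \NormalTok{state }\OperatorTok{=} \NormalTok{tf.zeros([batch_size, lstm.state_size])}
\ControlFlowTok{for} \NormalTok{i }\OperatorTok{in} \BuiltInTok{range}\NormalTok{(num_steps):}
    \CommentTok{# The value of state is updated after processing each batch of words.}
    \NormalTok{output, state }\OperatorTok{=} \NormalTok{lstm(words[:, i], state)}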
    \CommentTok{# The rest of the code.}
    \CommentTok{# ...}
\NormalTok{final_state }\OperatorTok{=} \NormalTok{state}
\end{Highlighting}
\end{Shaded}

And this is how to implement an iteration over the whole dataset:

\begin{Shaded}
\begin{Highlighting}[]
\CommentTok{# A numpy array holding the state of the LSTM after each batch of words.}
\NormalTok{numpy_state }\OperatorTok{=} \NormalTok{initial_state.}\BuiltInTok{eval}\NormalTok{()}
\NormalTok{total_loss }\OperatorTok{=} \FloatTok{0.0}
\ControlFlowTok{for} \NormalTok{current_batch_of_words }\OperatorTok{in} \NormalTok{words_in_dataset:}
    \NormalTok{numpy_state, current_loss }\OperatorTok{=} \NormalTok{session.run([final_state, loss],}
        \CommentTok{# Initialize the LSTM state from the previous iteration.}
        \NormalTok{feed_dict}\OperatorTok{=}\NormalTok{\{initial_state: numpy_state, words: current_batch_of_words\})}
    \NormalTok{total_loss }\OperatorTok{+=} \NormalTok{current_loss}
\end{Highlighting}
\end{Shaded}

\subsubsection{Inputs}\label{ux8f93ux5165}

The word IDs are embedded into a dense representation (see the \href{tensorflow-zh/SOURCE/tutorials/word2vec/index.md}{Vector Representations Tutorial}) before being fed to the LSTM. This allows the model to represent knowledge about particular words efficiently. It is also easy to write:

\begin{Shaded}
\begin{Highlighting}[]
\CommentTok{# embedding_matrix is a tensor of shape [vocabulary_size, embedding_size]}
\NormalTok{word_embeddings }\OperatorTok{=} \NormalTok{tf.nn.embedding_lookup(embedding_matrix, word_ids)}
\end{Highlighting}
\end{Shaded}

The embedding matrix is initialized randomly, and the model learns to differentiate the meaning of words just by looking at the data.

\subsubsection{Loss Function}\label{ux635fux5931ux51fdux6570}

We want to minimize the average negative log probability of the target words:

\begin{figure}[htbp]
\centering
\includegraphics{../images/re.png}
\caption{}
\end{figure}

This is not very difficult to implement, but the function \lstinline{sequence\_loss\_by\_example} is already available, so we can just use it here.

The typical measure reported in the papers is the average per-word perplexity (often just called perplexity), which is equal to

\begin{figure}[htbp]
\centering
\includegraphics{../images/re1.png}
\caption{}
\end{figure}

and we will monitor its value throughout the training process.

\subsubsection{Stacking Multiple LSTMs}\label{ux591aux4e2a-lstm-ux5c42ux5806ux53e0}

To give the model more expressive power, we can add multiple layers of LSTMs to process the data. The output of the first layer becomes the input of the second, and so on.

The class \lstinline{MultiRNNCell} makes the implementation seamless:

\begin{Shaded}
\begin{Highlighting}[]
\NormalTok{lstm }\OperatorTok{=} \NormalTok{rnn_cell.BasicLSTMCell(lstm_size)}
\NormalTok{stacked_lstm }\OperatorTok{=}
\NormalTok{rnn_cell.MultiRNNCell([lstm] }\OperatorTok{*} \NormalTok{number_of_layers)}
\NormalTok{initial_state }\OperatorTok{=} \NormalTok{state }\OperatorTok{=} \NormalTok{stacked_lstm.zero_state(batch_size, tf.float32)}
\ControlFlowTok{for} \NormalTok{i }\OperatorTok{in} \BuiltInTok{range}\NormalTok{(num_steps):}
    \CommentTok{# The value of state is updated after processing each batch of words.}
    \NormalTok{output, state }\OperatorTok{=} \NormalTok{stacked_lstm(words[:, i], state)}
    \CommentTok{# The rest of the code.}
    \CommentTok{# ...}
\NormalTok{final_state }\OperatorTok{=} \NormalTok{state}
\end{Highlighting}
\end{Shaded}

\subsection{Compile and Run the Code}\label{ux7f16ux8bd1ux5e76ux8fd0ux884cux4ee3ux7801}

First, the library needs to be built. To compile it for CPU:

\begin{verbatim}
bazel build -c opt tensorflow/models/rnn/ptb:ptb_word_lm
\end{verbatim}

If you have a fast GPU, run the following instead:

\begin{verbatim}
bazel build -c opt --config=cuda tensorflow/models/rnn/ptb:ptb_word_lm
\end{verbatim}

Now we can run the model:

\begin{verbatim}
bazel-bin/tensorflow/models/rnn/ptb/ptb_word_lm \
  --data_path=/tmp/simple-examples/data/ --alsologtostderr --model small
\end{verbatim}

The tutorial code supports 3 model configurations: ``small'', ``medium'' and ``large''. They differ in the size of the LSTMs and in the set of hyperparameters used for training.

The larger the model, the better the results it should get. The \lstinline{small} model should be able to reach a perplexity below 120 on the test set and the \lstinline{large} one below 80, though it might take several hours to train.

\subsection{What Next?}\label{ux9664ux6b64ux4e4bux5916}

There are several tricks that we have not mentioned that make the model better, including:

\begin{itemize}
\tightlist
\item decreasing the learning rate over time,
\item dropout between the LSTM layers.
\end{itemize}

Study the code and modify it to improve the model even further.

Original: \href{http://tensorflow.org/tutorials/recurrent/index.md}{Recurrent Neural Networks} Translation: \href{https://github.com/Warln}{Warln} Proofreading: \href{https://github.com/wanghong-yang}{HongyangWang}

================================================
FILE: tex_pdf/tutorials/c2s07_seq2seq.tex
================================================
\section{Sequence-to-Sequence Models }\label{sequence-to-sequence-models}

Recurrent neural networks can learn to model language, as already discussed in the \href{tensorflow-zh/SOURCE/tutorials/recurrent/index.md}{RNN Tutorial} (if you did not read it, please go through it before proceeding with this one). This raises an interesting question: could we condition the generated words on some input and generate a meaningful response? For example, could we train a neural network to translate from English to French? It turns out that the answer is \emph{yes}.

This tutorial will show you how to build and train such a system end-to-end. You can start by running this binary.

\begin{verbatim}
bazel run -c opt <...>/models/rnn/translate/translate.py --data_dir [your_data_directory]
\end{verbatim}

It will download English-to-French translation data from the \href{http://www.statmt.org/wmt15/translation-task.html}{WMT'15 Website}, prepare it for training, and train. It takes about 20GB of disk space, and a while to download and prepare (see \protect\hyperlink{runux5fit}{later} for details), so you can start it and leave it running while reading this tutorial.

This tutorial references the following files from \texttt{models/rnn}.

\begin{longtable}[c]{@{}ll@{}}
\toprule
\begin{minipage}[b]{0.05\columnwidth}\raggedright\strut
File
\strut\end{minipage} &
\begin{minipage}[b]{0.05\columnwidth}\raggedright\strut
What's in it?
\strut\end{minipage}\tabularnewline
\midrule
\endhead
\begin{minipage}[t]{0.05\columnwidth}\raggedright\strut
\texttt{seq2seq.py}
\strut\end{minipage} &
\begin{minipage}[t]{0.05\columnwidth}\raggedright\strut
Library for building sequence-to-sequence models.
\strut\end{minipage}\tabularnewline
\begin{minipage}[t]{0.05\columnwidth}\raggedright\strut
\texttt{translate/seq2seq\_model.py}
\strut\end{minipage} &
\begin{minipage}[t]{0.05\columnwidth}\raggedright\strut
Neural translation sequence-to-sequence model.
\strut\end{minipage}\tabularnewline
\begin{minipage}[t]{0.05\columnwidth}\raggedright\strut
\texttt{translate/data\_utils.py}
\strut\end{minipage} &
\begin{minipage}[t]{0.05\columnwidth}\raggedright\strut
Helper functions for preparing translation data.
\strut\end{minipage}\tabularnewline
\begin{minipage}[t]{0.05\columnwidth}\raggedright\strut
\texttt{translate/translate.py}
\strut\end{minipage} &
\begin{minipage}[t]{0.05\columnwidth}\raggedright\strut
Binary that trains and runs the translation model.
\strut\end{minipage}\tabularnewline
\bottomrule
\end{longtable}

\subsection{Sequence-to-Sequence Basics }\label{sequence-to-sequence-basics}

A basic sequence-to-sequence model, as introduced in \href{http://arxiv.org/pdf/1406.1078v3.pdf}{Cho et al., 2014}, consists of two recurrent neural networks (RNNs): an \emph{encoder} that processes the input and a \emph{decoder} that generates the output. This basic architecture is depicted below.

Each box in the picture above represents a cell of the RNN, most commonly a GRU cell or an LSTM cell (see the \href{tensorflow-zh/SOURCE/tutorials/recurrent/index.md}{RNN Tutorial} for an explanation of those). Encoder and decoder can share weights or, as is more common, use a different set of parameters. Multi-layer cells have been successfully used in sequence-to-sequence models too, e.g.~for translation in \href{http://arxiv.org/abs/1409.3215}{Sutskever et al., 2014}.

In the basic model depicted above, every input has to be encoded into a fixed-size state vector, as that is the only thing passed to the decoder. To allow the decoder more direct access to the input, an \emph{attention} mechanism was introduced in \href{http://arxiv.org/abs/1409.0473}{Bahdanau et al., 2014}. We will not go into the details of the attention mechanism (see the paper); suffice it to say that it allows the decoder to peek into the input at every decoding step. A multi-layer sequence-to-sequence network with LSTM cells and an attention mechanism in the decoder looks like this.

\subsection{TensorFlow seq2seq Library }\label{tensorflow-seq2seq-library}

As you can see above, there are many different sequence-to-sequence models. Each of these models can use different RNN cells, but all of them accept encoder inputs and decoder inputs. This motivates the interfaces in the TensorFlow seq2seq library (\texttt{models/rnn/seq2seq.py}). The basic RNN encoder-decoder sequence-to-sequence model works as follows.

\begin{Shaded}
\begin{Highlighting}[]
\NormalTok{outputs, states }\OperatorTok{=} \NormalTok{basic_rnn_seq2seq(encoder_inputs, decoder_inputs, cell)}
\end{Highlighting}
\end{Shaded}

In the above call, \texttt{encoder\_inputs} is a list of tensors representing inputs to the encoder, i.e., corresponding to the letters \emph{A, B, C} in the first picture above. Similarly, \texttt{decoder\_inputs} is a list of tensors representing inputs to the decoder, \emph{GO, W, X, Y, Z} in the first picture.

The \texttt{cell} argument is an instance of the \texttt{models.rnn.rnn\_cell.RNNCell} class that determines which cell will be used inside the model. You can use an existing cell, such as \texttt{GRUCell} or \texttt{LSTMCell}, or you can write your own.
Moreover, \texttt{rnn\_cell} provides wrappers to construct multi-layer cells, add dropout to cell inputs or outputs, or do other transformations; see the \href{tensorflow-zh/SOURCE/tutorials/recurrent/index.md}{RNN Tutorial} for examples.

The call to \texttt{basic\_rnn\_seq2seq} returns two values: \texttt{outputs} and \texttt{states}. Both of them are lists of tensors of the same length as \texttt{decoder\_inputs}. Naturally, \texttt{outputs} correspond to the outputs of the decoder in each time-step; in the first picture above that would be \emph{W, X, Y, Z, EOS}. The returned \texttt{states} represent the internal state of the decoder at every time-step.

In many applications of sequence-to-sequence models, the output of the decoder at time t is fed back and becomes the input of the decoder at time t+1. At test time, when decoding a sequence, this is how the sequence is constructed. During training, on the other hand, it is common to provide the correct input to the decoder at every time-step, even if the decoder made a mistake before. Functions in \texttt{seq2seq.py} support both modes using the \texttt{feed\_previous} argument. For example, let's analyze the following use of an embedding RNN model.

\begin{Shaded}
\begin{Highlighting}[]
\NormalTok{outputs, states }\OperatorTok{=} \NormalTok{embedding_rnn_seq2seq(}
    \NormalTok{encoder_inputs, decoder_inputs, cell,}
    \NormalTok{num_encoder_symbols, num_decoder_symbols,}
    \NormalTok{output_projection}\OperatorTok{=}\VariableTok{None}\NormalTok{, feed_previous}\OperatorTok{=}\VariableTok{False}\NormalTok{)}
\end{Highlighting}
\end{Shaded}

In the \texttt{embedding\_rnn\_seq2seq} model, all inputs (both \texttt{encoder\_inputs} and \texttt{decoder\_inputs}) are integer-tensors that represent discrete values.
They will be embedded into a dense representation (see the \href{tensorflow-zh/SOURCE/tutorials/word2vec/index.md}{Vectors Representations Tutorial} for more details on embeddings), but to construct these embeddings we need to specify the maximum number of discrete symbols that will appear: \texttt{num\_encoder\_symbols} on the encoder side, and \texttt{num\_decoder\_symbols} on the decoder side.

In the above invocation, we set \texttt{feed\_previous} to False. This means that the decoder will use the \texttt{decoder\_inputs} tensors as provided. If we set \texttt{feed\_previous} to True, the decoder would only use the first element of \texttt{decoder\_inputs}. All other tensors from this list would be ignored, and instead the previous output of the decoder would be used. This is used for decoding translations in our translation model, but it can also be used during training, to make the model more robust to its own mistakes, similar to \href{http://arxiv.org/pdf/1506.03099v2.pdf}{Bengio et al., 2015}.

One more important argument used above is \texttt{output\_projection}. If not specified, the outputs of the embedding model will be tensors of shape batch-size by \texttt{num\_decoder\_symbols}, as they represent the logits for each generated symbol. When training models with large output vocabularies, i.e., when \texttt{num\_decoder\_symbols} is large, it is not practical to store these large tensors. Instead, it is better to return smaller output tensors, which will later be projected onto a large output tensor using \texttt{output\_projection}. This allows us to use our seq2seq models with a sampled softmax loss, as described in \href{http://arxiv.org/pdf/1412.2007v2.pdf}{Jean et al., 2015}.

In addition to \texttt{basic\_rnn\_seq2seq} and \texttt{embedding\_rnn\_seq2seq} there are a few more sequence-to-sequence models in \texttt{seq2seq.py}; take a look there. They all have similar interfaces, so we will not describe them in detail.
We will use \texttt{embedding\_attention\_seq2seq} for our translation model below.

\subsection{Neural Translation Model }\label{neural-translation-model}

While the core of the sequence-to-sequence model is constructed by the functions in \texttt{models/rnn/seq2seq.py}, a few tricks used in our translation model in \texttt{models/rnn/translate/seq2seq\_model.py} are still worth mentioning.

\subsubsection{Sampled softmax and output projection }\label{sampled-softmax-and-output-projection}

For one, as already mentioned above, we want to use sampled softmax to handle a large output vocabulary. To decode from it, we need to keep track of the output projection. Both the sampled softmax loss and the output projections are constructed by the following code in \texttt{seq2seq\_model.py}.

\begin{Shaded}
\begin{Highlighting}[]
\ControlFlowTok{if} \NormalTok{num_samples }\OperatorTok{>} \DecValTok{0} \OperatorTok{and} \NormalTok{num_samples }\OperatorTok{<} \VariableTok{self}\NormalTok{.target_vocab_size:}
    \NormalTok{w }\OperatorTok{=} \NormalTok{tf.get_variable(}\StringTok{"proj_w"}\NormalTok{, [size, }\VariableTok{self}\NormalTok{.target_vocab_size])}
    \NormalTok{w_t }\OperatorTok{=} \NormalTok{tf.transpose(w)}
    \NormalTok{b }\OperatorTok{=} \NormalTok{tf.get_variable(}\StringTok{"proj_b"}\NormalTok{, [}\VariableTok{self}\NormalTok{.target_vocab_size])}
    \NormalTok{output_projection }\OperatorTok{=} \NormalTok{(w, b)}

    \KeywordTok{def} \NormalTok{sampled_loss(inputs, labels):}
        \NormalTok{labels }\OperatorTok{=} \NormalTok{tf.reshape(labels, [}\OperatorTok{-}\DecValTok{1}\NormalTok{, }\DecValTok{1}\NormalTok{])}
        \ControlFlowTok{return} \NormalTok{tf.nn.sampled_softmax_loss(w_t, b, inputs, labels, num_samples,}
                                          \VariableTok{self}\NormalTok{.target_vocab_size)}
\end{Highlighting}
\end{Shaded}

First, note that we only construct a sampled softmax if the number of samples (512 by default) is smaller than the target vocabulary size.
For vocabularies smaller than 512, it might be a better idea to just use a standard softmax loss.

Then, as you can see, we construct an output projection. It is a pair, consisting of a weight matrix and a bias vector. If used, the rnn cell will return vectors of shape batch-size by \texttt{size}, rather than batch-size by \texttt{target\_vocab\_size}. To recover logits, we need to multiply by the weight matrix and add the biases, as is done in lines 124-126 in \texttt{seq2seq\_model.py}.

\begin{Shaded}
\begin{Highlighting}[]
\ControlFlowTok{if} \NormalTok{output_projection }\OperatorTok{is} \OperatorTok{not} \VariableTok{None}\NormalTok{:}
  \VariableTok{self}\NormalTok{.outputs[b] }\OperatorTok{=} \NormalTok{[tf.matmul(output, output_projection[}\DecValTok{0}\NormalTok{]) }\OperatorTok{+}
                     \NormalTok{output_projection[}\DecValTok{1}\NormalTok{] }\ControlFlowTok{for} \NormalTok{...]}
\end{Highlighting}
\end{Shaded}

\subsubsection{Bucketing and padding }\label{bucketing-and-padding}

In addition to sampled softmax, our translation model also makes use of \emph{bucketing}, which is a method to efficiently handle sentences of different lengths. Let us first clarify the problem. When translating English to French, we will have English sentences of different lengths L1 on input, and French sentences of different lengths L2 on output. Since the English sentence is passed as \texttt{encoder\_inputs}, and the French sentence comes as \texttt{decoder\_inputs} (prefixed by a GO symbol), we should in principle create a seq2seq model for every pair (L1, L2+1) of lengths of an English and French sentence. This would result in an enormous graph consisting of many very similar subgraphs. On the other hand, we could just pad every sentence with a special PAD symbol. Then we'd need only one seq2seq model, for the padded lengths. But on shorter sentences our model would be inefficient, encoding and decoding many PAD symbols that are useless.
As a compromise between constructing a graph for every pair of lengths and padding to a single length, we use a number of \emph{buckets} and pad each sentence to the length of the bucket above it. In \texttt{translate.py} we use the following default buckets.

\begin{Shaded}
\begin{Highlighting}[]
\NormalTok{buckets }\OperatorTok{=} \NormalTok{[(}\DecValTok{5}\NormalTok{, }\DecValTok{10}\NormalTok{), (}\DecValTok{10}\NormalTok{, }\DecValTok{15}\NormalTok{), (}\DecValTok{20}\NormalTok{, }\DecValTok{25}\NormalTok{), (}\DecValTok{40}\NormalTok{, }\DecValTok{50}\NormalTok{)]}
\end{Highlighting}
\end{Shaded}

This means that if the input is an English sentence with 3 tokens, and the corresponding output is a French sentence with 6 tokens, then they will be put in the first bucket and padded to length 5 for encoder inputs, and length 10 for decoder inputs. If we have an English sentence with 8 tokens and the corresponding French sentence has 18 tokens, then they will not fit into the (10, 15) bucket, and so the (20, 25) bucket will be used, i.e.~the English sentence will be padded to 20, and the French one to 25.

Remember that when constructing decoder inputs we prepend the special \texttt{GO} symbol to the input data. This is done in the \texttt{get\_batch()} function in \texttt{seq2seq\_model.py}, which also reverses the input English sentence. Reversing the inputs was shown to improve results for the neural translation model in \href{http://arxiv.org/abs/1409.3215}{Sutskever et al., 2014}. To put it all together, imagine we have the sentence ``I go.'', tokenized as \texttt{{[}"I",\ "go",\ "."{]}} as input and the sentence ``Je vais.'' as output, tokenized as \texttt{{[}"Je",\ "vais",\ "."{]}}. It will be put in the (5, 10) bucket, with encoder inputs representing \texttt{{[}PAD\ PAD\ "."\ "go"\ "I"{]}} and decoder inputs \texttt{{[}GO\ "Je"\ "vais"\ "."\ EOS\ PAD\ PAD\ PAD\ PAD\ PAD{]}}.
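
The bucketing and padding rules above can be sketched in plain Python. This is only an illustrative sketch, not the code from \texttt{translate.py} or \texttt{seq2seq\_model.py}: the helper names \texttt{pick\_bucket} and \texttt{pad\_pair} are hypothetical, the special symbols are plain placeholder strings, and the boundary test (the decoder side must leave room for both \texttt{GO} and \texttt{EOS}) is an assumption chosen to agree with the examples in the text.

```python
# Placeholder strings for the special symbols (assumed, for illustration only).
PAD, GO, EOS = "PAD", "GO", "EOS"
BUCKETS = [(5, 10), (10, 15), (20, 25), (40, 50)]


def pick_bucket(source_len, target_len, buckets=BUCKETS):
    """Return the index of the smallest bucket that fits the pair, or None."""
    for index, (encoder_size, decoder_size) in enumerate(buckets):
        # Assumption: the decoder needs two extra slots, one for GO and one for EOS.
        if source_len <= encoder_size and target_len + 2 <= decoder_size:
            return index
    return None


def pad_pair(source, target, buckets=BUCKETS):
    """Pad one tokenized sentence pair to the sizes of its bucket."""
    index = pick_bucket(len(source), len(target), buckets)
    if index is None:
        raise ValueError("sentence pair does not fit into any bucket")
    encoder_size, decoder_size = buckets[index]
    # Encoder inputs: padding first, then the source sentence reversed.
    encoder = [PAD] * (encoder_size - len(source)) + list(reversed(source))
    # Decoder inputs: GO, the target sentence, EOS, then padding.
    decoder = [GO] + list(target) + [EOS]
    decoder += [PAD] * (decoder_size - len(decoder))
    return index, encoder, decoder


bucket_id, encoder_inputs, decoder_inputs = pad_pair(
    ["I", "go", "."], ["Je", "vais", "."])
print(bucket_id, encoder_inputs, decoder_inputs)
```

For the ``I go.'' pair this selects bucket 0 and produces the same encoder and decoder inputs as listed above, and a pair with 8 source tokens and 18 target tokens lands in the (20, 25) bucket.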
\subsection{Let's Run It }\label{lets-run-it}

To train the model described above, we need a large English-French corpus. We will use the \emph{10\^{}9-French-English corpus} from the \href{http://www.statmt.org/wmt15/translation-task.html}{WMT'15 Website} for training, and the 2013 news test from the same site as the development set. Both data-sets will be downloaded to \texttt{data\_dir} and training will start, saving checkpoints in \texttt{train\_dir}, when this command is run.

\begin{verbatim}
bazel run -c opt <...>/models/rnn/translate:translate
  --data_dir [your_data_directory] --train_dir [checkpoints_directory]
  --en_vocab_size=40000 --fr_vocab_size=40000
\end{verbatim}

It takes about 18GB of disk space and several hours to prepare the training corpus. It is unpacked, vocabulary files are created in \texttt{data\_dir}, and then the corpus is tokenized and converted to integer ids. Note the parameters that determine vocabulary sizes. In the example above, all words outside the 40K most common ones will be converted to an \texttt{UNK} token representing unknown words. So if you change the vocabulary size, the binary will re-map the corpus to token-ids again.

After the data is prepared, training starts. Default parameters in \texttt{translate} are set to quite large values. Large models trained over a long time give good results, but training might take too long or use too much memory for your GPU. You can request to train a smaller model as in the following example.

\begin{verbatim}
bazel run -c opt <...>/models/rnn/translate:translate
  --data_dir [your_data_directory] --train_dir [checkpoints_directory]
  --size=256 --num_layers=2 --steps_per_checkpoint=50
\end{verbatim}

The above command will train a model with 2 layers (the default is 3), each layer with 256 units (default is 1024), and will save a checkpoint every 50 steps (the default is 200). You can play with these parameters to find out how large a model can be to fit into the memory of your GPU.
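
The vocabulary truncation mentioned in this section, where every word outside the most common ones is mapped to an unknown-word token, can be sketched as follows. This is a minimal sketch under stated assumptions, not the code in \texttt{data\_utils}: the function names \texttt{build\_vocab} and \texttt{to\_ids}, the \texttt{"UNK"} string, and reserving id 0 for it are all hypothetical choices made for illustration.

```python
from collections import Counter

UNK = "UNK"  # placeholder unknown-word token (assumed name and id)


def build_vocab(sentences, max_words):
    """Map the max_words most frequent tokens to ids; id 0 is reserved for UNK."""
    counts = Counter(token for sentence in sentences for token in sentence)
    # Sort by descending frequency, breaking ties alphabetically for determinism.
    kept = sorted(counts, key=lambda word: (-counts[word], word))[:max_words]
    vocab = {UNK: 0}
    for word in kept:
        vocab[word] = len(vocab)
    return vocab


def to_ids(sentence, vocab):
    """Convert tokens to integer ids, using the UNK id for unseen words."""
    return [vocab.get(token, vocab[UNK]) for token in sentence]


corpus = [["the", "cat"], ["the", "dog"], ["the", "cat", "sat"]]
vocab = build_vocab(corpus, max_words=2)
print(to_ids(["the", "dog", "sat", "cat"], vocab))  # [1, 0, 0, 2]
```

Because the ids depend on which words survive the cut, a different vocabulary size yields a different mapping, which is why the corpus has to be converted to token-ids again after such a change.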
During training, every \texttt{steps\_per\_checkpoint} steps the binary will print out statistics from recent steps. With the default parameters (3 layers of size 1024), the first messages look like this.

\begin{verbatim}
global step 200 learning rate 0.5000 step-time 1.39 perplexity 1720.62
  eval: bucket 0 perplexity 184.97
  eval: bucket 1 perplexity 248.81
  eval: bucket 2 perplexity 341.64
  eval: bucket 3 perplexity 469.04
global step 400 learning rate 0.5000 step-time 1.38 perplexity 379.89
  eval: bucket 0 perplexity 151.32
  eval: bucket 1 perplexity 190.36
  eval: bucket 2 perplexity 227.46
  eval: bucket 3 perplexity 238.66
\end{verbatim}

The output shows that each step takes just under 1.4 seconds, followed by the perplexity on the training set and the perplexities on the development set for each bucket. After about 30K steps, we see perplexities on short sentences (bucket 0 and 1) going into single digits. Since the training corpus contains \textasciitilde{}22M sentences, one epoch (going through the training data once) takes about 340K steps with a batch size of 64. At this point the model can be used for translating English sentences to French using the \texttt{-\/-decode} option.

\begin{verbatim}
bazel run -c opt <...>/models/rnn/translate:translate --decode
  --data_dir [your_data_directory] --train_dir [checkpoints_directory]

Reading model parameters from /tmp/translate.ckpt-340000
> Who is the president of the United States?
Qui est le président des États-Unis ?
\end{verbatim}

\subsection{What Next? }\label{what-next}

The example above shows how you can build your own English-to-French translator, end-to-end. Run it and see how the model performs for yourself. While it has reasonable quality, the default parameters will not give you the best translation model. Here are a few things you can improve.

First of all, we use a very primitive tokenizer, the \texttt{basic\_tokenizer} function in \texttt{data\_utils}.
A better tokenizer can be found on the \href{http://www.statmt.org/wmt15/translation-task.html}{WMT'15 Website}. Using that tokenizer, and a larger vocabulary, should improve your translations. Also, the default parameters of the translation model are not tuned. You can try changing the learning rate, decay, or initializing the weights of your model in a different way. You can also change the default \texttt{GradientDescentOptimizer} in \texttt{seq2seq\_model.py} to a more advanced one, such as \texttt{AdagradOptimizer}. Try these things and see how they improve your results! Finally, the model presented above can be used for any sequence-to-sequence task, not only for translation. Even if you want to transform a sequence to a tree, for example to generate a parsing tree, the same model as above can give state-of-the-art results, as demonstrated in \href{http://arxiv.org/abs/1412.7449}{Vinyals \& Kaiser et al., 2015}. So you can not only build your own translator, you can also build a parser, a chat-bot, or any program that comes to your mind. Experiment! 
================================================
FILE: tex_pdf/tutorials/c2s08_2_9_mandelbrot.tex
================================================
\section{Mandelbrot Set }\label{ux66fcux5fb7ux5e03ux6d1bux7279mandelbrotux96c6ux5408}

Visualizing the Mandelbrot set has nothing to do with machine learning, but it makes for a fun example of how TensorFlow can be used for general mathematics. This is actually a fairly naive implementation of the visualization. (We may eventually provide a more elaborate implementation that produces truly more beautiful images.)

Note: This tutorial was originally prepared as an IPython notebook.

\subsection{Basic Setup }\label{ux57faux672cux6b65ux9aa4}

First, we need to import some libraries.

\begin{Shaded}
\begin{Highlighting}[]
\CommentTok{# Import libraries for simulation}
\ImportTok{import} \NormalTok{tensorflow }\ImportTok{as} \NormalTok{tf}
\ImportTok{import} \NormalTok{numpy }\ImportTok{as} \NormalTok{np}

\CommentTok{# Imports for visualization}
\ImportTok{import} \NormalTok{PIL.Image}
\ImportTok{from} \NormalTok{cStringIO }\ImportTok{import} \NormalTok{StringIO}
\ImportTok{from} \NormalTok{IPython.display }\ImportTok{import} \NormalTok{clear_output, Image, display}
\ImportTok{import} \NormalTok{scipy.ndimage }\ImportTok{as} \NormalTok{nd}
\end{Highlighting}
\end{Shaded}

Now we'll define a function to display the image computed from the iteration counts.

\begin{Shaded}
\begin{Highlighting}[]
\KeywordTok{def} \NormalTok{DisplayFractal(a, fmt}\OperatorTok{=}\StringTok{'jpeg'}\NormalTok{):}
  \CommentTok{"""Display an array of iteration counts as a colorful picture of a fractal."""}
  \NormalTok{a_cyclic }\OperatorTok{=} \NormalTok{(}\FloatTok{6.28}\OperatorTok{*}\NormalTok{a}\OperatorTok{/}\FloatTok{20.0}\NormalTok{).reshape(}\BuiltInTok{list}\NormalTok{(a.shape)}\OperatorTok{+}\NormalTok{[}\DecValTok{1}\NormalTok{])}
  \NormalTok{img }\OperatorTok{=} \NormalTok{np.concatenate([}\DecValTok{10+20}\OperatorTok{*}\NormalTok{np.cos(a_cyclic),}
                        \DecValTok{30+50}\OperatorTok{*}\NormalTok{np.sin(a_cyclic),}
                        \DecValTok{155-80}\OperatorTok{*}\NormalTok{np.cos(a_cyclic)], }\DecValTok{2}\NormalTok{)}
  \NormalTok{img[a}\OperatorTok{==}\NormalTok{a.}\BuiltInTok{max}\NormalTok{()] }\OperatorTok{=} \DecValTok{0}
  \NormalTok{a }\OperatorTok{=} \NormalTok{img}
  \NormalTok{a }\OperatorTok{=} \NormalTok{np.uint8(np.clip(a, }\DecValTok{0}\NormalTok{, }\DecValTok{255}\NormalTok{))}
  \NormalTok{f }\OperatorTok{=} \NormalTok{StringIO()}
  \NormalTok{PIL.Image.fromarray(a).save(f, fmt)}
  \NormalTok{display(Image(data}\OperatorTok{=}\NormalTok{f.getvalue()))}
\end{Highlighting}
\end{Shaded}

\subsection{Session and Variable Initialization }\label{ux4f1aux8bddsessionux548cux53d8ux91cfvariableux521dux59cbux5316}

For playing around like this, we often use an interactive session, but a regular session would work as well.

\begin{Shaded}
\begin{Highlighting}[]
\NormalTok{sess }\OperatorTok{=} \NormalTok{tf.InteractiveSession()}
\end{Highlighting}
\end{Shaded}

It is handy that we can freely mix NumPy and TensorFlow.

\begin{Shaded}
\begin{Highlighting}[]
\CommentTok{# Use NumPy to create a 2D array of complex numbers on [-2,1]x[-1.3,1.3]}
\NormalTok{Y, X }\OperatorTok{=} \NormalTok{np.mgrid[}\OperatorTok{-}\FloatTok{1.3}\NormalTok{:}\FloatTok{1.3}\NormalTok{:}\FloatTok{0.005}\NormalTok{, }\OperatorTok{-}\DecValTok{2}\NormalTok{:}\DecValTok{1}\NormalTok{:}\FloatTok{0.005}\NormalTok{]}
\NormalTok{Z }\OperatorTok{=} \NormalTok{X}\OperatorTok{+}\NormalTok{1j}\OperatorTok{*}\NormalTok{Y}
\end{Highlighting}
\end{Shaded}

Now we define and initialize the TensorFlow tensors.

\begin{Shaded}
\begin{Highlighting}[]
\NormalTok{xs }\OperatorTok{=} \NormalTok{tf.constant(Z.astype(}\StringTok{"complex64"}\NormalTok{))}
\NormalTok{zs }\OperatorTok{=} \NormalTok{tf.Variable(xs)}
\NormalTok{ns }\OperatorTok{=} \NormalTok{tf.Variable(tf.zeros_like(xs, }\StringTok{"float32"}\NormalTok{))}
\end{Highlighting}
\end{Shaded}

TensorFlow requires that you explicitly initialize variables before using them.

\begin{Shaded}
\begin{Highlighting}[]
\NormalTok{tf.initialize_all_variables().run()}
\end{Highlighting}
\end{Shaded}

\subsection{Defining and Running the Computation }\label{ux5b9aux4e49ux5e76ux8fd0ux884cux8ba1ux7b97}

Now we specify more of the computation\ldots{}

\begin{Shaded}
\begin{Highlighting}[]
\CommentTok{# Compute the new values of z: z^2 + x}
\NormalTok{zs_ }\OperatorTok{=} \NormalTok{zs}\OperatorTok{*}\NormalTok{zs }\OperatorTok{+} \NormalTok{xs}

\CommentTok{# Have we diverged with this new value?}
\NormalTok{not_diverged }\OperatorTok{=} \NormalTok{tf.complex_abs(zs_) }\OperatorTok{<} \DecValTok{4}
\CommentTok{# Operation to update the zs and the iteration count.}
\CommentTok{#}
\CommentTok{# Note: We keep computing zs after they diverge! This is very wasteful!}
\CommentTok{#       There are better, if a little less simple, ways to do this.}
\CommentTok{#}
\NormalTok{step }\OperatorTok{=} \NormalTok{tf.group(}
  \NormalTok{zs.assign(zs_),}
  \NormalTok{ns.assign_add(tf.cast(not_diverged, }\StringTok{"float32"}\NormalTok{))}
  \NormalTok{)}
\end{Highlighting}
\end{Shaded}

\ldots{}and run it for a couple hundred steps.

\begin{Shaded}
\begin{Highlighting}[]
\ControlFlowTok{for} \NormalTok{i }\OperatorTok{in} \BuiltInTok{range}\NormalTok{(}\DecValTok{200}\NormalTok{): step.run()}
\end{Highlighting}
\end{Shaded}

Let's see what we've got.

\begin{Shaded}
\begin{Highlighting}[]
\NormalTok{DisplayFractal(ns.}\BuiltInTok{eval}\NormalTok{())}
\end{Highlighting}
\end{Shaded}

\begin{figure}[htbp]
\centering
\includegraphics{../images/mandelbrot_output.jpg}
\caption{jpeg}
\end{figure}

Not bad!

\begin{quote}
Original: \href{http://tensorflow.org/tutorials/mandelbrot/index.md}{Mandelbrot Set} Translation: \href{https://github.com/ericxk}{ericxk} Proofreading: \href{https://github.com/tensorfly}{tensorfly}
\end{quote}

================================================
FILE: tex_pdf/tutorials/c2s09_pdes.tex
================================================
\section{Partial Differential Equations }\label{ux504fux5faeux5206ux65b9ux7a0b}

\textbf{\emph{TensorFlow}} is not just for machine learning; it can also be used for simulation. Here we walk through the basics of simulating a partial differential equation with \textbf{\emph{TensorFlow}}, using the example of a few raindrops falling onto the surface of a square pond.

\begin{quote}
Note: This tutorial was originally prepared as an \textbf{IPython} notebook.

Translator's note: for background on partial differential equations, the translator recommends the course \href{http://open.163.com/special/opencourse/multivariable.html}{\textbf{MIT OpenCourseWare: Multivariable Calculus}} on \href{http://open.163.com/}{\textbf{NetEase Open Courses}}.
\end{quote}

\subsection{Basic Setup }\label{ux57faux672cux8bbeux7f6e}

First, we need a few imports.

\begin{Shaded}
\begin{Highlighting}[]
\CommentTok{# Import libraries for simulation}
\ImportTok{import} \NormalTok{tensorflow }\ImportTok{as} \NormalTok{tf}
\ImportTok{import} \NormalTok{numpy }\ImportTok{as} \NormalTok{np}

\CommentTok{# Imports for visualization}
\ImportTok{import} \NormalTok{PIL.Image}
\ImportTok{from} \NormalTok{cStringIO }\ImportTok{import} \NormalTok{StringIO}
\ImportTok{from} \NormalTok{IPython.display }\ImportTok{import} \NormalTok{clear_output, Image, display}
\end{Highlighting}
\end{Shaded}

Next, we need a function that displays the state of the pond's surface as an image.

\begin{Shaded}
\begin{Highlighting}[]
\KeywordTok{def} \NormalTok{DisplayArray(a, fmt}\OperatorTok{=}\StringTok{'jpeg'}\NormalTok{, rng}\OperatorTok{=}\NormalTok{[}\DecValTok{0}\NormalTok{,}\DecValTok{1}\NormalTok{]):}
  \CommentTok{"""Display an array as a picture."""}
  \NormalTok{a }\OperatorTok{=} \NormalTok{(a }\OperatorTok{-} \NormalTok{rng[}\DecValTok{0}\NormalTok{])}\OperatorTok{/}\BuiltInTok{float}\NormalTok{(rng[}\DecValTok{1}\NormalTok{] }\OperatorTok{-} \NormalTok{rng[}\DecValTok{0}\NormalTok{])}\OperatorTok{*}\DecValTok{255}
  \NormalTok{a }\OperatorTok{=} \NormalTok{np.uint8(np.clip(a, }\DecValTok{0}\NormalTok{, }\DecValTok{255}\NormalTok{))}
  \NormalTok{f }\OperatorTok{=} \NormalTok{StringIO()}
  \NormalTok{PIL.Image.fromarray(a).save(f, fmt)}
  \NormalTok{display(Image(data}\OperatorTok{=}\NormalTok{f.getvalue()))}
\end{Highlighting}
\end{Shaded}

Finally, for convenience while experimenting, we open an interactive \textbf{\emph{TensorFlow}} session. A regular session would work as well if the code were placed in an executable \textbf{\emph{Python}} file.

\begin{Shaded}
\begin{Highlighting}[]
\NormalTok{sess }\OperatorTok{=} \NormalTok{tf.InteractiveSession()}
\end{Highlighting}
\end{Shaded}

\subsection{Computational Convenience Functions }\label{ux5b9aux4e49ux8ba1ux7b97ux51fdux6570}

\begin{Shaded}
\begin{Highlighting}[]
\KeywordTok{def} \NormalTok{make_kernel(a):}
  \CommentTok{"""Transform a 2D array into a convolution kernel"""}
  \NormalTok{a }\OperatorTok{=} \NormalTok{np.asarray(a)}
  \NormalTok{a }\OperatorTok{=} \NormalTok{a.reshape(}\BuiltInTok{list}\NormalTok{(a.shape) }\OperatorTok{+} \NormalTok{[}\DecValTok{1}\NormalTok{,}\DecValTok{1}\NormalTok{])}
  \ControlFlowTok{return} \NormalTok{tf.constant(a, dtype}\OperatorTok{=}\NormalTok{tf.float32)}

\KeywordTok{def} \NormalTok{simple_conv(x, k):}
  \CommentTok{"""A simplified 2D convolution operation"""}
  \NormalTok{x }\OperatorTok{=} \NormalTok{tf.expand_dims(tf.expand_dims(x, }\DecValTok{0}\NormalTok{), }\OperatorTok{-}\DecValTok{1}\NormalTok{)}
  \NormalTok{y }\OperatorTok{=} \NormalTok{tf.nn.depthwise_conv2d(x, k, [}\DecValTok{1}\NormalTok{, }\DecValTok{1}\NormalTok{, }\DecValTok{1}\NormalTok{, }\DecValTok{1}\NormalTok{], padding}\OperatorTok{=}\StringTok{'SAME'}\NormalTok{)}
  \ControlFlowTok{return} \NormalTok{y[}\DecValTok{0}\NormalTok{, :, :, }\DecValTok{0}\NormalTok{]}

\KeywordTok{def} \NormalTok{laplace(x):}
  \CommentTok{"""Compute the 2D laplacian of an array"""}
  \NormalTok{laplace_k }\OperatorTok{=} \NormalTok{make_kernel([[}\FloatTok{0.5}\NormalTok{, }\FloatTok{1.0}\NormalTok{, }\FloatTok{0.5}\NormalTok{],}
                           \NormalTok{[}\FloatTok{1.0}\NormalTok{, }\OperatorTok{-}\DecValTok{6}\NormalTok{., }\FloatTok{1.0}\NormalTok{],}
                           \NormalTok{[}\FloatTok{0.5}\NormalTok{, }\FloatTok{1.0}\NormalTok{, }\FloatTok{0.5}\NormalTok{]])}
  \ControlFlowTok{return} \NormalTok{simple_conv(x, laplace_k)}
\end{Highlighting}
\end{Shaded}

\subsection{Define the PDE }\label{ux5b9aux4e49ux504fux5faeux5206ux65b9ux7a0b}

Our pond is a perfect 500 × 500 square, just like the ones we find in nature.

\begin{Shaded}
\begin{Highlighting}[]
\NormalTok{N }\OperatorTok{=} \DecValTok{500}
\end{Highlighting}
\end{Shaded}

Next, we create the pond and a few raindrops that are about to hit it.

\begin{Shaded}
\begin{Highlighting}[]
\CommentTok{# Initial Conditions -- some rain drops hit a pond}

\CommentTok{# Set everything to zero}
\NormalTok{u_init }\OperatorTok{=} \NormalTok{np.zeros([N, N], dtype}\OperatorTok{=}\StringTok{"float32"}\NormalTok{)}
\NormalTok{ut_init }\OperatorTok{=} \NormalTok{np.zeros([N, N], dtype}\OperatorTok{=}\StringTok{"float32"}\NormalTok{)}

\CommentTok{# Some rain drops hit a pond at random points}
\ControlFlowTok{for} \NormalTok{n }\OperatorTok{in} \BuiltInTok{range}\NormalTok{(}\DecValTok{40}\NormalTok{):}
  \NormalTok{a,b }\OperatorTok{=} \NormalTok{np.random.randint(}\DecValTok{0}\NormalTok{, N, }\DecValTok{2}\NormalTok{)}
  \NormalTok{u_init[a,b] }\OperatorTok{=} \NormalTok{np.random.uniform()}

\NormalTok{DisplayArray(u_init, rng}\OperatorTok{=}\NormalTok{[}\OperatorTok{-}\FloatTok{0.1}\NormalTok{, }\FloatTok{0.1}\NormalTok{])}
\end{Highlighting}
\end{Shaded}

\begin{figure}[htbp]
\centering
\includegraphics{../images/pde_output_1.jpg}
\caption{jpeg}
\end{figure}

Now let's specify the details of the differential equation.

\begin{Shaded}
\begin{Highlighting}[]
\CommentTok{# Parameters:}
\CommentTok{# eps -- time resolution}
\CommentTok{# damping -- wave damping}
\NormalTok{eps }\OperatorTok{=} \NormalTok{tf.placeholder(tf.float32, shape}\OperatorTok{=}\NormalTok{())}
\NormalTok{damping }\OperatorTok{=} \NormalTok{tf.placeholder(tf.float32, shape}\OperatorTok{=}\NormalTok{())}

\CommentTok{# Create variables for simulation state}
\NormalTok{U }\OperatorTok{=} \NormalTok{tf.Variable(u_init)}
\NormalTok{Ut }\OperatorTok{=} \NormalTok{tf.Variable(ut_init)}

\CommentTok{# Discretized PDE update rules}
\NormalTok{U_ }\OperatorTok{=} \NormalTok{U }\OperatorTok{+} \NormalTok{eps }\OperatorTok{*} \NormalTok{Ut}
\NormalTok{Ut_ }\OperatorTok{=} \NormalTok{Ut }\OperatorTok{+} \NormalTok{eps }\OperatorTok{*} \NormalTok{(laplace(U) }\OperatorTok{-} \NormalTok{damping }\OperatorTok{*} \NormalTok{Ut)}

\CommentTok{# Operation to update the state}
\NormalTok{step }\OperatorTok{=} \NormalTok{tf.group(}
  \NormalTok{U.assign(U_),}
  \NormalTok{Ut.assign(Ut_))}
\end{Highlighting}
\end{Shaded}

\subsection{Run the Simulation }\label{ux5f00ux59cbux4effux771f}

To watch the simulation unfold, we run it forward in time with a simple \textbf{for} loop.

\begin{Shaded}
\begin{Highlighting}[]
\CommentTok{# Initialize state to initial conditions}
\NormalTok{tf.initialize_all_variables().run()}

\CommentTok{# Run 1000 steps of PDE}
\ControlFlowTok{for} \NormalTok{i }\OperatorTok{in} \BuiltInTok{range}\NormalTok{(}\DecValTok{1000}\NormalTok{):}
  \CommentTok{# Step simulation}
  \NormalTok{step.run(\{eps: }\FloatTok{0.03}\NormalTok{, damping: }\FloatTok{0.04}\NormalTok{\})}
  \CommentTok{# Visualize every 50 steps}
  \ControlFlowTok{if} \NormalTok{i }\OperatorTok{%} \DecValTok{50} \OperatorTok{==} \DecValTok{0}\NormalTok{:}
    \NormalTok{clear_output()}
    \NormalTok{DisplayArray(U.}\BuiltInTok{eval}\NormalTok{(), rng}\OperatorTok{=}\NormalTok{[}\OperatorTok{-}\FloatTok{0.1}\NormalTok{, }\FloatTok{0.1}\NormalTok{])}
\end{Highlighting}
\end{Shaded}

\begin{figure}[htbp]
\centering
\includegraphics{../images/pde_output_2.jpg}
\caption{jpeg}
\end{figure}

Look! The raindrops land in the pond and send out ripples, just as they would in real life.

\begin{quote}
Original: \url{http://tensorflow.org/tutorials/pdes/index.md} Translation: \href{https://github.com/wangaicc}{@wangaicc} Proofreading: \href{https://github.com/tensorfly}{@tensorfly}
\end{quote}

================================================
FILE: tex_pdf/tutorials/c2s10_mnist_download.tex
================================================
\section{MNIST Data Download }\label{mnist-ux6570ux636eux4e0bux8f7d}

Code: \href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/g3doc/tutorials/mnist/}{tensorflow/g3doc/tutorials/mnist/}

The goal of this tutorial is to show how to download the (classic) MNIST data set used for handwritten digit classification.

\subsection{Tutorial Files }\label{ux6559ux7a0b-ux6587ux4ef6}

This tutorial uses the following files:

\begin{longtable}[c]{@{}ll@{}}
\toprule
\begin{minipage}[b]{0.05\columnwidth}\raggedright\strut
File
\strut\end{minipage} &
\begin{minipage}[b]{0.05\columnwidth}\raggedright\strut
Purpose
\strut\end{minipage}\tabularnewline
\midrule
\endhead
\begin{minipage}[t]{0.05\columnwidth}\raggedright\strut
\href{https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/g3doc/tutorials/mnist/input_data.py}{\texttt{input\_data.py}}
\strut\end{minipage} &
\begin{minipage}[t]{0.05\columnwidth}\raggedright\strut
The code that downloads the MNIST data set for training and testing
\strut\end{minipage}\tabularnewline
\bottomrule
\end{longtable}

\subsection{Prepare the Data }\label{ux51c6ux5907ux6570ux636e}

MNIST is a classic problem in machine learning. The task is to look at 28x28 pixel greyscale images of handwritten digits and determine which digit each image represents, for the digits 0 through 9.
\begin{figure}[htbp]
\centering
\includegraphics{../images/mnist_digits.png}
\caption{MNIST Digits}
\end{figure}

For more information, refer to \href{http://yann.lecun.com/exdb/mnist/}{Yann LeCun's MNIST page} or \href{http://colah.github.io/posts/2014-10-Visualizing-MNIST/}{Chris Olah's visualizations of MNIST}.

\subsubsection{Download }\label{ux4e0bux8f7d}

\href{http://yann.lecun.com/exdb/mnist/}{Yann LeCun's MNIST page} also hosts the training and test data for download.

\begin{longtable}[c]{@{}ll@{}}
\toprule
\begin{minipage}[b]{0.05\columnwidth}\raggedright\strut
File
\strut\end{minipage} &
\begin{minipage}[b]{0.05\columnwidth}\raggedright\strut
Contents
\strut\end{minipage}\tabularnewline
\midrule
\endhead
\begin{minipage}[t]{0.05\columnwidth}\raggedright\strut
\href{http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz}{\texttt{train-images-idx3-ubyte.gz}}
\strut\end{minipage} &
\begin{minipage}[t]{0.05\columnwidth}\raggedright\strut
Training set images: 55000 training images, 5000 validation images
\strut\end{minipage}\tabularnewline
\begin{minipage}[t]{0.05\columnwidth}\raggedright\strut
\href{http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz}{\texttt{train-labels-idx1-ubyte.gz}}
\strut\end{minipage} &
\begin{minipage}[t]{0.05\columnwidth}\raggedright\strut
Training set labels matching the images
\strut\end{minipage}\tabularnewline
\begin{minipage}[t]{0.05\columnwidth}\raggedright\strut
\href{http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz}{\texttt{t10k-images-idx3-ubyte.gz}}
\strut\end{minipage} &
\begin{minipage}[t]{0.05\columnwidth}\raggedright\strut
Test set images: 10000 images
\strut\end{minipage}\tabularnewline
\begin{minipage}[t]{0.05\columnwidth}\raggedright\strut
\href{http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz}{\texttt{t10k-labels-idx1-ubyte.gz}}
\strut\end{minipage} &
\begin{minipage}[t]{0.05\columnwidth}\raggedright\strut
Test set labels matching the images
\strut\end{minipage}\tabularnewline
\bottomrule
\end{longtable}

In the \texttt{input\_data.py} file, the \texttt{maybe\_download()} function ensures that these files are downloaded into a local data folder. The folder name is specified by a flag variable at the top of the \texttt{fully\_connected\_feed.py} file and may be changed to fit your needs.

\subsubsection{Unpack and Reshape }\label{unpack-and-reshape}

The files themselves are not stored in any standard image format. They are unpacked manually (following the instructions available on the website) by the \texttt{extract\_images()} and \texttt{extract\_labels()} functions in \texttt{input\_data.py}.

The image data is extracted into a 2D tensor of shape \texttt{{[}image\ index,\ pixel\ index{]}}, where each entry is the intensity value of a specific pixel in a specific image, rescaled from \texttt{{[}0,\ 255{]}} to \texttt{{[}-0.5,\ 0.5{]}}. The ``image index'' identifies an image in the data set, counting up from zero to the size of the data set. The ``pixel index'' identifies a specific pixel in that image, ranging from zero to the number of pixels in the image.

The files whose names start with \texttt{train-*} contain 60000 examples, which are split into 55000 examples for training and 5000 examples for validation. Since every 28x28 pixel greyscale image in the data sets has 784 pixels, the output tensor for the training set images has shape \texttt{{[}55000,\ 784{]}}.

The label data is extracted into a 1D tensor of shape \texttt{{[}image\ index{]}}, holding the class identifier of each example as its value. For the training set labels, this shape is \texttt{{[}55000{]}}.

\subsubsection{DataSet Object }\label{ux6570ux636eux96c6-ux5bf9ux8c61}

The underlying code downloads, unpacks, and reshapes the images and labels into the following data set objects:

\begin{longtable}[c]{@{}ll@{}}
\toprule
Dataset & Purpose\tabularnewline
\midrule
\endhead
\texttt{data\_sets.train} & 55000 images and labels, for primary training.\tabularnewline
\texttt{data\_sets.validation} & 5000 images and labels, for iterative validation of training accuracy.\tabularnewline
\texttt{data\_sets.test} & 10000 images and labels, for final testing of trained accuracy.\tabularnewline
\bottomrule
\end{longtable}

The \texttt{read\_data\_sets()} function returns an object holding a \texttt{DataSet} instance for each of these three data sets. The \texttt{DataSet.next\_batch()} method fetches a tuple of \texttt{batch\_size} images and their labels, to be fed into the running TensorFlow session.

\begin{Shaded}
\begin{Highlighting}[]
\NormalTok{images_feed, labels_feed }\OperatorTok{=} \NormalTok{data_set.next_batch(FLAGS.batch_size)}
\end{Highlighting}
\end{Shaded}

Original: \href{https://github.com/tensorflow/tensorflow/blob/master/tensorflow/g3doc/tutorials/mnist/download/index.md}{MNIST Data Download} Translation: \href{https://github.com/btpeter}{btpeter} Proofreading: waiwaizheng
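
The reshaping, rescaling, and train/validation split described in this tutorial can be sketched with NumPy. This is a minimal sketch under stated assumptions, not the code from \texttt{input\_data.py}: the function name \texttt{reshape\_and\_split} is hypothetical, random arrays stand in for the real images and labels, the arrays are ten times smaller than MNIST to keep the example light, and taking the validation examples from the front of the array is an assumption.

```python
import numpy as np


def reshape_and_split(images, labels, validation_size):
    """Flatten images to [image index, pixel index], rescale, and split."""
    num_images = images.shape[0]
    # Flatten each 28x28 image into a row of 784 pixel values.
    flat = images.reshape(num_images, -1).astype(np.float32)
    # Rescale intensities from [0, 255] to [-0.5, 0.5].
    flat = flat / 255.0 - 0.5
    # Assumption: the first validation_size examples form the validation set.
    validation = (flat[:validation_size], labels[:validation_size])
    train = (flat[validation_size:], labels[validation_size:])
    return train, validation


# Stand-in data: 600 random "images" instead of the 60000 real ones.
rng = np.random.RandomState(0)
images = rng.randint(0, 256, size=(600, 28, 28)).astype(np.uint8)
labels = rng.randint(0, 10, size=600)

(train_x, train_y), (val_x, val_y) = reshape_and_split(images, labels, 50)
print(train_x.shape, val_x.shape)  # (550, 784) (50, 784)
```

With the real 60000 training examples and a validation size of 5000, the same steps would give the \texttt{{[}55000,\ 784{]}} image tensor and \texttt{{[}55000{]}} label tensor mentioned above.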