Repository: jackzhenguo/python-small-examples
Branch: master
Commit: aa11da6ac252
Files: 238
Total size: 177.6 KB

Directory structure:
gitextract_72lj_okl/

├── .gitignore
├── README.md
├── dev/
│   └── python-dev.md
├── md/
│   ├── 1.md
│   ├── 10.md
│   ├── 100.md
│   ├── 101.md
│   ├── 102.md
│   ├── 103.md
│   ├── 104.md
│   ├── 105.md
│   ├── 106.md
│   ├── 107.md
│   ├── 108.md
│   ├── 109.md
│   ├── 11.md
│   ├── 110.md
│   ├── 111.md
│   ├── 112.md
│   ├── 113.md
│   ├── 114.md
│   ├── 115.md
│   ├── 116.md
│   ├── 117.md
│   ├── 118.md
│   ├── 119.md
│   ├── 12.md
│   ├── 120.md
│   ├── 121.md
│   ├── 122.md
│   ├── 123.md
│   ├── 124.md
│   ├── 125.md
│   ├── 126.md
│   ├── 127.md
│   ├── 128.md
│   ├── 129.md
│   ├── 13.md
│   ├── 130.md
│   ├── 131.md
│   ├── 132.md
│   ├── 133.md
│   ├── 134.md
│   ├── 135.md
│   ├── 136.md
│   ├── 137.md
│   ├── 138.md
│   ├── 139.md
│   ├── 14.md
│   ├── 140.md
│   ├── 141.md
│   ├── 142.md
│   ├── 143.md
│   ├── 144.md
│   ├── 145.md
│   ├── 146.md
│   ├── 147.md
│   ├── 148.md
│   ├── 149.md
│   ├── 15.md
│   ├── 150.md
│   ├── 151.md
│   ├── 152.md
│   ├── 153.md
│   ├── 154.md
│   ├── 155.md
│   ├── 156.md
│   ├── 157.md
│   ├── 158.md
│   ├── 159.md
│   ├── 16.md
│   ├── 160.md
│   ├── 161.md
│   ├── 162.md
│   ├── 163.md
│   ├── 164.md
│   ├── 165.md
│   ├── 166.md
│   ├── 167.md
│   ├── 168.md
│   ├── 169.md
│   ├── 17.md
│   ├── 170.md
│   ├── 171.md
│   ├── 172.md
│   ├── 173.md
│   ├── 174.md
│   ├── 175.md
│   ├── 176.md
│   ├── 177.md
│   ├── 178.md
│   ├── 179.md
│   ├── 18.md
│   ├── 180.md
│   ├── 181.md
│   ├── 182.md
│   ├── 183.md
│   ├── 184.md
│   ├── 185.md
│   ├── 186.md
│   ├── 187.md
│   ├── 188.md
│   ├── 189.md
│   ├── 19.md
│   ├── 190.md
│   ├── 191.md
│   ├── 192.md
│   ├── 193.md
│   ├── 194.md
│   ├── 195.md
│   ├── 196.md
│   ├── 197.md
│   ├── 198.md
│   ├── 199.md
│   ├── 2.md
│   ├── 20.md
│   ├── 200.md
│   ├── 201.md
│   ├── 202.md
│   ├── 203.md
│   ├── 204.md
│   ├── 205.md
│   ├── 206.md
│   ├── 207.md
│   ├── 208.md
│   ├── 209.md
│   ├── 21.md
│   ├── 210.md
│   ├── 211.md
│   ├── 212.md
│   ├── 213.md
│   ├── 214.md
│   ├── 215.md
│   ├── 216.md
│   ├── 217.md
│   ├── 218.md
│   ├── 219.md
│   ├── 22.md
│   ├── 220.md
│   ├── 221.md
│   ├── 222.md
│   ├── 223.md
│   ├── 224.md
│   ├── 225.md
│   ├── 226.md
│   ├── 227.md
│   ├── 228.md
│   ├── 229.md
│   ├── 23.md
│   ├── 230.md
│   ├── 231.md
│   ├── 232.md
│   ├── 233.md
│   ├── 24.md
│   ├── 25.md
│   ├── 26.md
│   ├── 27.md
│   ├── 28.md
│   ├── 29.md
│   ├── 3.md
│   ├── 30.md
│   ├── 31.md
│   ├── 32.md
│   ├── 33.md
│   ├── 34.md
│   ├── 35.md
│   ├── 36.md
│   ├── 37.md
│   ├── 38.md
│   ├── 39.md
│   ├── 4.md
│   ├── 40.md
│   ├── 41.md
│   ├── 42.md
│   ├── 43.md
│   ├── 44.md
│   ├── 45.md
│   ├── 46.md
│   ├── 47.md
│   ├── 48.md
│   ├── 49.md
│   ├── 5.md
│   ├── 50.md
│   ├── 51.md
│   ├── 52.md
│   ├── 53.md
│   ├── 54.md
│   ├── 55.md
│   ├── 56.md
│   ├── 57.md
│   ├── 58.md
│   ├── 59.md
│   ├── 6.md
│   ├── 60.md
│   ├── 61.md
│   ├── 62.md
│   ├── 63.md
│   ├── 64.md
│   ├── 65.md
│   ├── 66.md
│   ├── 67.md
│   ├── 68.md
│   ├── 69.md
│   ├── 7.md
│   ├── 70.md
│   ├── 71.md
│   ├── 72.md
│   ├── 73.md
│   ├── 74.md
│   ├── 75.md
│   ├── 76.md
│   ├── 77.md
│   ├── 78.md
│   ├── 79.md
│   ├── 8.md
│   ├── 80.md
│   ├── 81.md
│   ├── 82.md
│   ├── 83.md
│   ├── 84.md
│   ├── 85.md
│   ├── 86.md
│   ├── 87.md
│   ├── 88.md
│   ├── 89.md
│   ├── 9.md
│   ├── 90.md
│   ├── 91.md
│   ├── 92.md
│   ├── 93.md
│   ├── 94.md
│   ├── 95.md
│   ├── 96.md
│   ├── 97.md
│   ├── 98.md
│   ├── 99.md
│   └── batch.py
└── script/
    └── add_nav.py

================================================
FILE CONTENTS
================================================

================================================
FILE: .gitignore
================================================
.idea
.vscode
.github
img/*.html
notebook
venv


================================================
FILE: README.md
================================================

<div align="center">
<img src="https://img.shields.io/badge/-Python-brightgreen">
<img src="https://img.shields.io/badge/-%E6%95%B0%E6%8D%AE%E5%88%86%E6%9E%90-yellowgreen">
<img src="https://img.shields.io/badge/-%E7%AE%97%E6%B3%95-yellow">
<img src="https://img.shields.io/badge/-%E6%9C%BA%E5%99%A8%E5%AD%A6%E4%B9%A0-lightgrey">
<a href="https://static01.imgkr.com/temp/c6e10a16c4764dcdb32587760f6769ec.png" width="28%"><img src="https://img.shields.io/badge/%E5%85%AC%E4%BC%97%E5%8F%B7-Python%E5%B0%8F%E4%BE%8B%E5%AD%90-orange"></a>
</div>
<br>

<!-- <div align="center">
<img src="https://static01.imgkr.com/temp/f379139a2c5d463799c35c1aa68911d7.png" width="18%"/>
</div> -->

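## Introduction

Say goodbye to dull tutorials. This project is dedicated to building a collection of classic small Python examples and mini case studies.

## License

Reposting is permitted as long as the conditions here are followed, but any commercial use is prohibited. If you repost the examples or case studies from this repository, please credit the link below:

[Complete collection of Python small examples](https://ai-jupyter.com/python-small-examples/)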

### More tutorials

[AI updates](https://ai-jupyter.com/)

[AI news reports](https://ai-jupyter.com/ai-news-all/)

[Large AI models](https://ai-jupyter.com/ai-llm/)

[AI tool collection](https://ai-jupyter.com/ai-chatgpt/)

[Six ways to use ChatGPT-4o for free](https://ai-jupyter.com/ai-chatgpt/)

[Python tutorials](https://ai-jupyter.com/python-packages/)

[Data analysis tutorials](https://ai-jupyter.com/numpy-intro/)

[Algorithm tutorials](https://ai-jupyter.com/algorithm-basic/)

[AI tutorials](https://ai-jupyter.com/statistics/)

[Git tutorials](https://ai-jupyter.com/git/)

[Programmers](https://ai-jupyter.com/others/)

[Resource downloads](https://ai-jupyter.com/python-20/)


## Python small examples

### Basic operations

| No. | Link | Tags | Version | Difficulty |
| ---- | ---------------------------------- | ---- | ---- | ---- |
| 1 | [Common arithmetic operations](md/198.md) | arithmetic | v1 | ⭐⭐ |
| 2 | [Implementing relu](md/1.md) | max | V4.0 | ⭐⭐ |
| 3 | [Base conversion](md/2.md) | bin,oct,hex | V4.0 | ⭐⭐ |
| 4 | [Converting between integers and ASCII characters](md/3.md) | chr,ord | V1.0 | ⭐⭐ |
| 5 | [Checking that all elements are truthy](md/4.md) | all | V2.0 | ⭐⭐⭐ |
| 6 | [Checking that at least one element is truthy](md/5.md) | any | V2.0 | ⭐⭐⭐ |
| 7 | [Testing truthiness](md/6.md) | bool | V2.0 | ⭐⭐⭐ |
| 8 | [Creating complex numbers](md/7.md) | complex | V1.0 | ⭐⭐⭐ |
| 9 | [Quotient and remainder](md/8.md) | divmod | V1.0 | ⭐⭐ |
| 10 | [Converting to float](md/9.md) | float | V1.0 | ⭐⭐ |
| 11 | [Converting to int](md/10.md) | int | V1.0 | ⭐ |
| 12 | [Powers](md/11.md) | pow | V1.0 | ⭐ |
| 13 | [Rounding](md/12.md) | round | V1.0 | ⭐ |
| 14 | [Chained comparisons](md/13.md) | compare | V1.0 | ⭐⭐ |
| 15 | [String to bytes](md/14.md) | bytes,utf-8 | V1.0 | ⭐⭐ |
| 16 | [Converting any object to a string](md/15.md) | str | V1.0 | ⭐⭐ |
| 17 | [Executing code given as a string](md/16.md) | compile | V1.0 | ⭐⭐⭐ |
| 18 | [Evaluating expressions](md/17.md) | eval | V1.0 | ⭐⭐⭐⭐ |
| 19 | [String formatting](md/18.md) | format | V1.0 | ⭐⭐⭐⭐ |
| 20 | [Swapping two elements](md/23.md) | pack,unpack | V1.0 | ⭐⭐ |
| 21 | [Converting to dict](md/29.md) | dict | V1.0 | ⭐⭐ |
| 22 | [Frozen sets](md/30.md) | frozenset | V1.0 | ⭐⭐ |
| 23 | [Converting to set](md/31.md) | set | V1.0 | ⭐⭐ |
| 24 | [Converting to tuple](md/32.md) | tuple | V1.0 | ⭐⭐ |
| 25 | [Size of a variable in bytes](md/48.md) | getsizeof | V1.0 | ⭐⭐⭐ |
| 26 | [Single-element tuples](md/154.md) | tuple | V1.0 | ⭐⭐ |
| 27 | [Pitfalls of deleting from a list](md/159.md) | list | V1.0 | ⭐⭐ |
| 28 | [Pitfalls of quick list copying](md/160.md) | list | V1.0 | ⭐⭐⭐ |
| 29 | [Finding the 3 largest or smallest numbers in a list](md/195.md) | list heapq | v1.0 | ⭐⭐⭐⭐ |
| 30 | [String interning](md/161.md) | str | V1.0 | ⭐⭐⭐⭐⭐ |
| 31 | [The empty-set creation mistake](md/166.md) | set | V1.0 | ⭐⭐ |
| 32 | [Understanding for in depth](md/164.md) | for | V1.0 | ⭐⭐⭐ |
| 33 | [Understanding when execution happens](md/165.md) | generator | V1.0 | ⭐⭐⭐⭐⭐ |


### Common uses of functions and modules

| No. | Link | Tags | Version | Difficulty |
| ---- | ---------------------------------- | ---- | ---- | ---- |
| 1 | [Working with function objects](md/24.md) | operator | V2.0 | ⭐⭐⭐⭐ |
| 2 | [Creating a range sequence](md/55.md) | range | V1.0 | ⭐⭐ |
| 3 | [Generating a reversed sequence](md/25.md) | range | V1.0 | ⭐⭐ |
| 4 | [A ready-to-use sorting function](md/19.md) | sorted | V1.0 | ⭐⭐⭐ |
| 5 | [The sum function](md/20.md) | sum | V1.0 | ⭐⭐ |
| 6 | [Examples of the five kinds of function parameters](md/26.md) | variable parameter | V2.0 | ⭐⭐⭐⭐ |
| 7 | [Using slice objects](md/27.md) | slice | V2.0 | ⭐⭐⭐⭐⭐ |
| 8 | [lambda functions](md/28.md) | lambda | V3.0 | ⭐⭐⭐⭐ |
| 9 | [Enumerate objects](md/47.md) | enumerate | V1.0 | ⭐⭐⭐ |
| 10 | [Filtering with filter](md/49.md) | filter | V1.5 | ⭐⭐⭐ |
| 11 | [Getting an object's hash value](md/50.md) | hash | V1.0 | ⭐⭐ |
| 12 | [Named tuples](md/79.md) | namedtuple | V1.0 | ⭐⭐⭐ |
| 13 | [Reversing a list in one line](md/70.md) | reverse | V1.0 | ⭐⭐ |
| 14 | [Two ways to reverse a string](md/86.md) | reversed | V1.0 | ⭐⭐ |
| 15 | [Concatenating strings with join](md/87.md) | join | V1.0 | ⭐⭐ |
| 16 | [Byte length of a string](md/88.md) | encode | V1.0 | ⭐⭐ |
| 17 | [Grouping by a single field with groupby](md/129.md) | itertools, groupby,lambda | V1.0 | ⭐⭐⭐ |
| 18 | [Grouping by multiple fields with groupby](md/130.md) | itemgetter,itertools,groupby | V1.0 | ⭐⭐⭐⭐ |
| 19 | [itemgetter and key functions](md/131.md) | operator,itemgetter,itertools | V1.0 | ⭐⭐⭐⭐⭐ |
| 20 | [Computing and aggregating at once with sum](md/132.md) | sum,generator | V1.0 | ⭐⭐⭐⭐⭐ |
| 21 | [Setting a default argument to empty](md/155.md) | function | V1.0 | ⭐⭐⭐ |
| 22 | [Pitfalls of using the various parameter kinds](md/158.md) | function parameter | V1.0 | ⭐⭐⭐ |
| 23 | [Pitfalls of free variables in lambda](md/157.md) | lambda | V1.0 | ⭐⭐⭐ |
| 24 | [Sorting a list in ascending order with a heap](md/196.md) | sort heapq | v1.0 | ⭐⭐⭐⭐ |


### Object-oriented programming
| No. | Link | Tags | Version | Difficulty |
| ---- | ---------------------------------- | ---- | ---- | ---- |
| 1 | [The root of all objects](md/43.md) | object | V1.0 | ⭐ |
| 2 | [Is an object callable?](md/33.md) | callable | V2.5 | ⭐⭐⭐⭐ |
| 3 | [Displaying an object with ascii](md/34.md) | `__repr__` | V2.5 | ⭐⭐⭐ |
| 4 | [Class methods](md/35.md) | classmethod | V1.5 | ⭐⭐⭐ |
| 5 | [Deleting attributes dynamically](md/36.md) | delattr,hasattr | V1.5 | ⭐⭐ |
| 6 | [Listing all of an object's methods at once](md/37.md) | dir | V1.5 | ⭐⭐ |
| 7 | [Getting attributes dynamically](md/38.md) | getattr | V1.5 | ⭐⭐ |
| 8 | [Does an object have a given attribute?](md/39.md) | hasattr | V1.5 | ⭐⭐⭐ |
| 9 | [An object's identity (its "house number")](md/40.md) | id | V1.0 | ⭐ |
| 10 | [Checking instance relationships](md/41.md) | isinstance | V1.5 | ⭐⭐⭐ |
| 11 | [Checking parent-child class relationships with issubclass](md/42.md) | issubclass | V1.5 | ⭐⭐⭐ |
| 12 | [Two ways to create a property](md/44.md) | property | V2.5 | ⭐⭐⭐⭐⭐ |
| 13 | [Checking an object's type](md/45.md) | type | V1.0 | ⭐ |
| 14 | [Introduction to metaclasses](md/46.md) | type,`__class__` | V2.0 | ⭐⭐⭐⭐⭐ |
| 15 | [Immutable objects with equal values](md/162.md) | mutable | V1.0 | ⭐⭐⭐ |
| 16 | [Order of object destruction](md/163.md) | OOP del | V1.0 | ⭐⭐⭐⭐ |
| 17 | [Do subclasses inherit a parent's static methods?](md/171.md) | staticmethod | V1.0 | ⭐⭐⭐ |


### Regular expressions
| No. | Link | Tags | Version | Difficulty |
| ---- | ---------------------------------- | ---- | ---- | ---- |
| 1 | [What the `r` prefix does in regular expressions](md/89.md) | re,r | V3.0 | ⭐⭐⭐ |
| 2 | [Atomic regex operations](md/90.md) | re | V3.0 | ⭐⭐⭐ |
| 3 | [Escaping in regular expressions](md/91.md) | re,\ | V3.0 | ⭐⭐⭐ |
| 4 | [The most basic regex search](md/92.md) | re,findall | V3.0 | ⭐⭐⭐ |
| 5 | [Searching with character classes](md/93.md) | re,\s,\w,\d | V3.0 | ⭐⭐⭐ |
| 6 | [Searching with metacharacters](md/94.md) | re,+,* | V3.0 | ⭐⭐⭐ |
| 7 | [Capturing substrings](md/95.md) | () | V3.0 | ⭐⭐⭐⭐ |
| 8 | [Greedy and non-greedy capture](md/96.md) | re | V1.0 | ⭐⭐⭐⭐ |
| 9 | [Password strength checks with regex](md/97.md) | re | V1.0 | ⭐⭐⭐⭐⭐ |
| 10 | [Scraping the Baidu home page title](md/98.md) | re | V1.0 | ⭐⭐⭐⭐ |
| 11 | [Batch conversion to camel case](md/99.md) | re | V1.0 | ⭐⭐⭐⭐⭐ |
| 12 | [Checking for a positive float with regex](md/102.md) | str,re,float | V1.0 | ⭐⭐⭐⭐⭐ |
| 13 | [Extracting positive integers and floats greater than 0 with regex](md/197.md) | re findall | v2 | ⭐⭐⭐⭐ |

### Decorators, iterators, and generators
| No. | Link | Tags | Version | Difficulty |
| ---- | ---------------------------------- | ---- | ---- | ---- |
| 1 | [Decorators in plain terms](md/138.md) | decorator | V1.0 | ⭐⭐⭐ |
| 2 | [A decorator that times a function](md/136.md) | decorator | V1.0 | ⭐⭐⭐⭐ |
| 3 | [A decorator that counts exceptions](md/137.md) | decorator,nonlocal | V1.5 | ⭐⭐⭐⭐ |
| 4 | [A custom decreasing iterator](md/139.md) | Iterator | V3.0 | ⭐⭐⭐⭐ |
| 5 | [Creating an iterator](md/53.md) | iter,`__iter__` | V1.5 | ⭐⭐⭐ |
| 6 | [The reversed iterator](md/56.md) | reversed | V1.0 | ⭐⭐ |
| 7 | [The zip iterator](md/57.md) | zip | V1.5 | ⭐⭐⭐ |
| 8 | [Chunking a list (generator version)](md/134.md) | yield,generator | V1.0 | ⭐⭐⭐ |
| 9 | [Fully flattening a list (generator version)](md/135.md) | list,yield,generator | V1.0 | ⭐⭐⭐ |
| 10 | [Chaining small containers into one with chain](md/84.md) | itertools,chain | V1.0 | ⭐⭐⭐⭐⭐ |
| 11 | [Using product](md/85.md) | product | V1.0 | ⭐⭐⭐⭐⭐ |
| 12 | [First n terms of the Fibonacci sequence](md/126.md) | yield,range | V1.0 | ⭐⭐⭐ |


### Plotting
| No. | Link | Tags | Version | Difficulty |
| ---- | ---------------------------------- | ---- | ---- | ---- |
| 1 | [Drawing the Olympic rings with turtle](md/140.md) | turtle | V1.0 | ⭐⭐⭐ |
| 2 | [Drawing a sky full of snowflakes with turtle](md/141.md) | turtle | V1.0 | ⭐⭐⭐ |
| 3 | [Word clouds in Python](md/142.md) | WordCloud | V1.0 | ⭐⭐⭐ |
| 4 | [Bar and line charts with Plotly](md/143.md) | plotly | V1.0 | ⭐⭐ |
| 5 | [Heatmaps with seaborn](md/144.md) | seaborn | V1.0 | ⭐⭐ |
| 6 | [Pyecharts gauge chart](md/145.md) | pyecharts | V1.0 | ⭐⭐ |
| 7 | [Pyecharts funnel chart](md/146.md) | pyecharts | V1.0 | ⭐⭐ |
| 8 | [Pyecharts liquid (water ball) chart](md/147.md) | pyecharts | V1.0 | ⭐⭐ |
| 9 | [Pyecharts pie chart](md/148.md) | pyecharts | V1.0 | ⭐⭐ |
| 10 | [Pyecharts polar chart](md/149.md) | pyecharts | V1.0 | ⭐⭐ |
| 11 | [Pyecharts word cloud](md/150.md) | pyecharts | V1.0 | ⭐⭐ |
| 12 | [Pyecharts heatmap](md/151.md) | pyecharts | V1.0 | ⭐⭐ |
| 13 | [Animations with matplotlib](md/152.md) | matplotlib | V1.0 | ⭐⭐ |
| 14 | [seaborn pairplot](md/153.md) | seaborn | V1.0 | ⭐⭐⭐⭐ |
| 15 | [pyecharts fails to plot when given NumPy data](md/167.md) | numpy pyecharts | V1.0 | ⭐⭐⭐ |
| 16 | [The pillow image-processing package](md/169.md) | pillow | V1.0 | ⭐⭐⭐ |

### Data analysis
| No. | Link | Tags | Version | Difficulty |
| ---- | ---------------------------------- | ---- | ---- | ---- |
| 1 | [deepnote, a handy data analysis tool](./md/177.md) | deepnote | v1.0 | ⭐⭐⭐ |
| 2 | [NumPy's pad method](md/172.md) | NumPy pad | V1.0 | ⭐⭐⭐⭐ |
| 3 | [Creating a diagonal matrix with 1, 2, 3, 4 on the subdiagonal](md/173.md) | NumPy diag | V1.0 | ⭐⭐⭐ |
| 4 | [Binning data with cut](md/174.md) | Pandas cut | v1.0 | ⭐⭐⭐ |
| 5 | [Dropping and filling missing values](./md/175.md) | Pandas dropna fillna | v1.0 | ⭐⭐⭐ |
| 6 | [Removing special characters with apply](./md/178.md) | pandas apply | v1.0 | ⭐⭐⭐ |
| 7 | [Feature engineering on a column with map](./md/179.md) | pandas map | v1.0 | ⭐⭐⭐ |
| 8 | [Converting a category column to numeric](./md/180.md) | pandas category | v1.0 | ⭐⭐⭐ |
| 9 | [Ranking with rank](./md/181.md) | pandas rank | v1.0 | ⭐⭐⭐ |
| 10 | [Downsampling data from hourly to daily](./md/182.md) | pandas resample | v1.0 | ⭐⭐⭐ |
| 11 | [Quickly generating time series data with Pandas](./md/183.md) | pandas util | v1.0 | ⭐⭐⭐ |
| 12 | [Quickly counting null values in every DataFrame column](./md/184.md) | pandas isnull sum | v1.0 | ⭐⭐⭐ |
| 13 | [Reordering DataFrame columns](./md/185.md) | pandas dataframe | v1.0 | ⭐⭐⭐ |
| 14 | [Counting term occurrences with count](./md/186.md) | pandas count | v1.0 | ⭐⭐⭐ |
| 15 | [Minute difference between HH:mm times with split](./md/187.md) | pandas split | v1.0 | ⭐⭐⭐ |
| 16 | [Tips for reshaping data with melt](./md/188.md) | pandas melt | v1.0 | ⭐⭐⭐ |
| 17 | [Tips for reshaping data with pivot](./md/189.md) | pandas pivot | v1.0 | ⭐⭐⭐ |
| 18 | [Randomly reading K rows from a file, N times](./md/190.md) | pandas sample | v1.0 | ⭐⭐⭐ |
| 19 | [Formatting a Pandas time column](md/191.md) | pandas apply | v1.0 | ⭐⭐⭐⭐ |

### Other common uses
| No. | Link | Tags | Version | Difficulty |
| ---- | ---------------------------------- | ---- | ---- | ---- |
| 1 | [One-step help with help](md/51.md) | help | V1.0 | ⭐ |
| 2 | [Getting user input](md/52.md) | input | V1.0 | ⭐ |
| 3 | [File reading and writing, with a table of mode values](md/54.md) | open,read,write,with,mode | V2.0 | ⭐⭐⭐ |
| 4 | [Examples of using operator](md/58.md) | operator | V1.0 | ⭐⭐⭐⭐ |
| 5 | [Transferring JSON objects](md/59.md) | json | V2.0 | ⭐⭐⭐⭐⭐ |
| 6 | [Getting a file's extension](md/103.md) | os,splitext | V1.0 | ⭐⭐ |
| 7 | [Getting the file name from a path](md/104.md) | os,split | V1.0 | ⭐⭐ |
| 8 | [Batch-changing file extensions](md/105.md) | argparse,listdir | V1.0 | ⭐⭐⭐⭐ |
| 9 | [Batch-converting xls to xlsx](md/106.md) | os,listdir,splitext | V1.0 | ⭐⭐⭐⭐ |
| 10 | [Finding files with a given extension](md/107.md) | os,listdir,splitext | V1.0 | ⭐⭐⭐⭐ |
| 11 | [Batch-compressing files](md/108.md) | zipfile | V1.0 | ⭐⭐⭐⭐ |
| 12 | [32-character encryption](md/109.md) | hashlib | V1.0 | ⭐⭐⭐⭐ |
| 13 | [Calendar for a year](md/110.md) | calendar | V1.0 | ⭐⭐ |
| 14 | [Checking for a leap year](md/111.md) | calendar | V1.0 | ⭐⭐⭐ |
| 15 | [Number of days in a month](md/112.md) | calendar,datetime | V1.0 | ⭐⭐⭐ |
| 16 | [First day of the month](md/113.md) | datetime | V1.0 | ⭐⭐ |
| 17 | [Last day of the month](md/114.md) | calendar,datetime | V1.0 | ⭐⭐ |
| 18 | [Getting the current time](md/115.md) | time,datetime | V1.0 | ⭐⭐ |
| 19 | [Parsing a time string into a time](md/116.md) | time,datetime | V1.0 | ⭐⭐ |
| 20 | [Formatting a time as a string](md/117.md) | time,datetime | V1.0 | ⭐⭐ |
| 21 | [Getting the 1 to n days after a given day](md/133.md) | Calendar,monthrange | V4.0 | ⭐⭐⭐ |
| 22 | [The main thread starts by default](md/118.md) | threading | V1.0 | ⭐⭐ |
| 23 | [Creating a thread](md/119.md) | threading | V1.0 | ⭐⭐ |
| 24 | [Taking turns with CPU time slices](md/120.md) | threading | V1.0 | ⭐⭐⭐ |
| 25 | [Multiple threads competing for one variable](md/121.md) | threading | V1.0 | ⭐⭐⭐ |
| 26 | [Problems caused by threads racing on a variable](md/122.md) | threading | V1.0 | ⭐⭐⭐ |
| 27 | [Locks for multithreading](md/123.md) | threading,lock | V1.0 | ⭐⭐⭐ |
| 28 | [Converting a time to a time tuple, and common formats](md/124.md) | time,datetime,format | V1.0 | ⭐⭐⭐ |
| 29 | [Using nonlocal in nested functions](md/21.md) | nonlocal | V2.0 | ⭐⭐⭐⭐⭐ |
| 30 | [Declaring global variables with global](md/22.md) | global | V2.0 | ⭐⭐⭐⭐⭐ |
| 31 | [The unbound shared variable pitfall](md/156.md) | global | V1.0 | ⭐⭐⭐ |
| 32 | [A package for nicer exception output](md/168.md) | debugger | V1.0 | ⭐⭐⭐ |
| 33 | [Detecting an encoding in one line](md/170.md) | chardet | V1.0 | ⭐⭐⭐ |
| 34 | [Creating a SQLite connection](md/192.md) | SQLite | v1.0 | ⭐⭐⭐⭐ |
| 35 | [Converting JSON to Python objects](md/193.md) | python json | v1.0 | ⭐⭐⭐⭐ |
| 36 | [Converting Python objects to JSON](md/194.md) | python json | v1.0 | ⭐⭐⭐⭐ |
| 37 | [Speeding up pip installs 100x with one line](md/176.md) | pip install | v1.0 | ⭐⭐⭐ |


### Common on-the-job examples
| No. | Link | Tags | Version | Difficulty |
| ---- | ---------------------------------- | ---- | ---- | ---- |
| 1 | [A calculator without if or else](md/60.md) | operator | V1.0 | ⭐⭐⭐ |
| 2 | [Averaging after dropping the extremes](md/61.md) | list,sort,round | V1.0 | ⭐⭐⭐⭐ |
| 3 | [Printing the 9x9 multiplication table](md/62.md) | for,range,format | V1.0 | ⭐⭐⭐ |
| 4 | [A recursive flatten function](md/63.md) | recursion,list,isinstance | V1.0 | ⭐⭐⭐⭐ |
| 5 | [Splitting a list into n equal parts](md/64.md) | list,ceil | V1.0 | ⭐⭐⭐ |
| 6 | [Compacting a list](md/65.md) | list,filter | V1.0 | ⭐⭐⭐⭐ |
| 7 | [Finding the longer list](md/66.md) | max,lambda | V1.0 | ⭐⭐⭐⭐⭐ |
| 8 | [Finding the mode of a list](md/67.md) | max,lambda,count | V1.0 | ⭐⭐⭐⭐ |
| 9 | [Maximum across multiple lists](md/68.md) | max,lambda | V1.0 | ⭐⭐⭐⭐ |
| 10 | [Checking a list for duplicates](md/69.md) | set | V1.0 | ⭐⭐⭐ |
| 11 | [Arithmetic sequences of floats](md/71.md) | range,float | V1.0 | ⭐⭐⭐⭐ |
| 12 | [Grouping by a condition](md/72.md) | lambda | V1.0 | ⭐⭐⭐⭐ |
| 13 | [Vector operations with map](md/73.md) | map,lambda | V1.0 | ⭐⭐⭐ |
| 14 | [The dict entry with the largest value](md/74.md) | max,lambda | V1.0 | ⭐⭐⭐⭐ |
| 15 | [Merging two dicts](md/75.md) | ** | V1.0 | ⭐⭐⭐ |
| 16 | [Top-n dict entries](md/76.md) | heapq,nlargest | V1.0 | ⭐⭐⭐ |
| 17 | [Checking for anagrams](md/77.md) | collections,Counter | V1.0 | ⭐⭐⭐ |
| 18 | [Merging dicts logically](md/78.md) | ChainMap | V1.0 | ⭐⭐⭐⭐⭐ |
| 19 | [Sampling with sample](md/80.md) | random,sample | V1.0 | ⭐⭐⭐ |
| 20 | [Shuffling a dataset](md/81.md) | shuffle | V1.0 | ⭐⭐⭐ |
| 21 | [10 uniformly distributed points](md/82.md) | random,uniform | V1.0 | ⭐⭐⭐ |
| 22 | [10 Gaussian-distributed points](md/83.md) | random,gauss | V1.0 | ⭐⭐⭐⭐ |
| 23 | [Are two words permutations of each other?](md/100.md) | collections,defaultdict | V1.0 | ⭐⭐⭐⭐ |
| 24 | [Is str1 a rotation of str2?](md/101.md) | str | V1.0 | ⭐⭐⭐ |
| 25 | [Finding the position of the n-th occurrence](md/125.md) | enumerate | V1.0 | ⭐⭐⭐ |
| 26 | [Finding all duplicate elements](md/127.md) | calendar,datetime | V1.0 | ⭐⭐⭐⭐ |
| 27 | [Combined counting](md/128.md) | Counter | V1.0 | ⭐⭐⭐⭐⭐ |
| 28 | [Great-circle distance between two points](md/199.md) | math asin | V1.0 | ⭐⭐⭐⭐⭐ |
| 29 | [Getting a file's encoding](md/200.md) | chardet | V1.0 | ⭐⭐⭐⭐⭐ |
| 30 | [Pretty-printing a JSON string](md/201.md) | json | V1.0 | ⭐⭐⭐⭐⭐ |


================================================
FILE: dev/python-dev.md
================================================
### Python in practice


#### 221 Sending bulk email automatically

Send the same email to a list of recipients automatically with Python.

```python
import smtplib
from email import (header)
from email.mime import (text, application, multipart)
import time

def sender_mail():
    smt_p = smtplib.SMTP()
    smt_p.connect(host='smtp.qq.com', port=25)
    # For QQ Mail, password is the SMTP authorization code, not the login password
    sender, password = '113097485@qq.com', "**************"
    smt_p.login(sender, password)
    receiver_addresses, count_num = [
        'guozhennianhua@163.com', 'xiaoxiazi99@163.com'], 1
    for email_address in receiver_addresses:
        try:
            msg = multipart.MIMEMultipart()
            msg['From'] = "zhenguo"
            msg['To'] = email_address
            msg['subject'] = header.Header('Notification: this is the email subject', 'utf-8')
            msg.attach(text.MIMEText(
                'This is a test email, please do not reply.', 'plain', 'utf-8'))
            smt_p.sendmail(sender, email_address, msg.as_string())
            time.sleep(10)  # pause between messages
            print('Send #%d: delivered to %s' % (count_num, email_address))
            count_num = count_num + 1
        except Exception as e:
            # Report the failure and move on to the next recipient
            print('Send #%d: failed for %s: %s' % (count_num, email_address, e))
    smt_p.quit()

sender_mail()
```


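Note:
The sending account is a QQ mailbox, so you must enable the SMTP service in the QQ Mail settings. Once it is enabled, QQ Mail generates an authorization code; assign that code to the `password` variable in the code above.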
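The per-recipient loop can also be split into a message builder and a sending loop, which makes the logic testable without a mail server. The following is only a sketch, not the author's method: `build_message` and `send_all` are hypothetical helper names, the `example.com` addresses are placeholders, and `smtp` is assumed to be any already logged-in object with a `send_message` method, such as an `smtplib.SMTP` instance.

```python
from email.message import EmailMessage


def build_message(sender, recipient, subject, body):
    """Build a plain-text email for a single recipient."""
    msg = EmailMessage()
    msg['From'] = sender
    msg['To'] = recipient
    msg['Subject'] = subject
    msg.set_content(body)  # stored as a text/plain part
    return msg


def send_all(smtp, sender, recipients, subject, body):
    """Send one message per recipient; return the addresses that failed."""
    failed = []
    for recipient in recipients:
        try:
            smtp.send_message(build_message(sender, recipient, subject, body))
        except Exception as exc:
            print('failed for %s: %s' % (recipient, exc))
            failed.append(recipient)
    return failed


# Building a message needs no network connection
msg = build_message('me@example.com', 'you@example.com', 'Hello', 'Test body')
print(msg['To'], msg['Subject'])
```

Returning the list of failed addresses, rather than only printing them, lets the caller retry those recipients later.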

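#### 222 Binary search

Binary search is an essential algorithm for every programmer; you should be able to write it fluently in any setting.

Example description:
In a **sorted array** `arr`, search for the element `x` within the index range `[left,right]`.
If it is not present, return `-1`.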

The main logic of the `binarySearch` implementation:

```python
def binarySearch(arr, left, right, x):
    while left <= right:

        # Find the midpoint. In languages with fixed-width integers, (left + right) / 2
        # can overflow, so this form is preferred; Python ints do not overflow.
        mid = left + (right - left) // 2

        # Check whether x is at position mid
        if arr[mid] == x:
            print('found %d at %d' % (x, mid))
            return mid

        # If x is larger, it cannot be in the left half
        elif arr[mid] < x:
            left = mid + 1  # the search interval becomes [mid+1, right]
            print('interval narrowed to [%d,%d]' % (left, right))

        # Likewise, if x is smaller, it cannot be in the right half
        else:
            right = mid - 1  # the search interval becomes [left, mid-1]
            print('interval narrowed to [%d,%d]' % (left, right))

    # Reaching this point means x does not occur in [left, right]
    return -1
```

A small demo that calls `binarySearch` in an `IPython` session:

```python
In [8]: binarySearch([4,5,6,7,10,20,100],0,6,5)
interval narrowed to [0,2]
found 5 at 1
Out[8]: 1

In [9]: binarySearch([4,5,6,7,10,20,100],0,6,4)
interval narrowed to [0,2]
interval narrowed to [0,0]
found 4 at 0
Out[9]: 0

In [10]: binarySearch([4,5,6,7,10,20,100],0,6,20)
interval narrowed to [4,6]
found 20 at 5
Out[10]: 5

In [11]: binarySearch([4,5,6,7,10,20,100],0,6,100)
interval narrowed to [4,6]
interval narrowed to [6,6]
found 100 at 6
Out[11]: 6
```
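For comparison, the standard library's `bisect` module performs the same halving search. The sketch below is an alternative rather than the implementation above: `binary_search` is a hypothetical wrapper name, and it keeps the same inclusive `[left, right]` convention and `-1` return value.

```python
from bisect import bisect_left


def binary_search(arr, left, right, x):
    """Return the index of x in sorted arr within [left, right], or -1 if absent."""
    # bisect_left takes a half-open range, so the inclusive right bound becomes right + 1
    i = bisect_left(arr, x, left, right + 1)
    if i <= right and arr[i] == x:
        return i
    return -1


data = [4, 5, 6, 7, 10, 20, 100]
print(binary_search(data, 0, 6, 20))
print(binary_search(data, 0, 6, 8))
```

`bisect_left` returns the insertion point for `x`, so the extra equality check is what turns "where would it go" into "is it there".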

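#### 223 Scraping weather data and parsing temperatures

Scrape weather data and parse the temperature values.

The material comes from my friend Yuan Shao, thanks!

Structure of the scraped HTML: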

<img src="./img/1.png" width="50%"/>

```python
import requests
from lxml import etree
import pandas as pd
import re

url = 'http://www.weather.com.cn/weather1d/101010100.shtml#input'
with requests.get(url) as res:
    content = res.content
    html = etree.HTML(content)
```


Extract the values with the lxml module.

In some situations lxml parses more efficiently than BeautifulSoup.

```python
location = html.xpath('//*[@id="around"]//a[@target="_blank"]/span/text()')
temperature = html.xpath('//*[@id="around"]/div/ul/li/a/i/text()')
```

Result:

```python
['香河', '涿州', '唐山', '沧州', '天津', '廊坊', '太原', '石家庄', '涿鹿', '张家口', '保定', '三河', '北京孔庙', '北京国子监', '中国地质博物馆', '月坛公园', '明城墙遗址公园', '北京市规划展览馆', '什刹海', '南锣鼓巷', '天坛公园', '北海公园', '景山公园', '北京海洋馆']

['11/-5°C', '14/-5°C', '12/-6°C', '12/-5°C', '11/-1°C', '11/-5°C', '8/-7°C', '13/-2°C', '8/-6°C', '5/-9°C', '14/-6°C', '11/-4°C', '13/-3°C', '13/-3°C', '12/-3°C', '12/-3°C', '13/-3°C', '12/-2°C', '12/-3°C', '13/-3°C', '12/-2°C', '12/-2°C', '12/-2°C', '12/-3°C']
```


Build the DataFrame object:

```python
df = pd.DataFrame({'location':location, 'temperature':temperature})
print('Temperature column')
print(df['temperature'])
```

Parse the temperature values with a regular expression:

```python
df['high'] = df['temperature'].apply(lambda x: int(re.match('(-?[0-9]*?)/-?[0-9]*?°C', x).group(1) ) )
df['low'] = df['temperature'].apply(lambda x: int(re.match('-?[0-9]*?/(-?[0-9]*?)°C', x).group(1) ) )
print(df)
```

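Capturing substrings in detail

Besides simply testing whether a string matches, regular expressions can also extract substrings. A group to extract is written with `()`. For example, `^(\d{3})-(\d{3,8})$` defines two groups, so the area code and the local number can be pulled directly out of a matching string: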

```python
m = re.match(r'^(\d{3})-(\d{3,8})$', '010-12345')
print(m.group(0))
print(m.group(1))
print(m.group(2))

# 010-12345
# 010
# 12345
```

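If the regular expression defines groups, you can extract the substrings by calling the `group()` method on the `Match` object.

Note that `group(0)` is always the entire matched text, while `group(1)`, `group(2)`, ... are the 1st, 2nd, ... captured substrings.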
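The same grouping idea applies to the temperature strings scraped above. As a sketch (the group names `high` and `low`, the `TEMP_PATTERN` constant, and the `parse_temperature` helper are illustrative choices, not part of the original code), named groups make each captured part self-describing:

```python
import re

# Named groups for the high and low temperature in strings like '11/-5°C'
TEMP_PATTERN = re.compile(r'(?P<high>-?\d+)/(?P<low>-?\d+)°C')


def parse_temperature(text):
    """Return (high, low) as ints, or None if text does not match."""
    m = TEMP_PATTERN.fullmatch(text)
    if m is None:
        return None
    return int(m.group('high')), int(m.group('low'))


print(parse_temperature('11/-5°C'))
print(parse_temperature('n/a'))
```

Returning `None` for non-matching input avoids the `AttributeError` that calling `.group()` on a failed match would raise.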


Final result:

```python
Name: temperature, dtype: object
    location temperature  high  low
0         香河     11/-5°C    11   -5
1         涿州     14/-5°C    14   -5
2         唐山     12/-6°C    12   -6
3         沧州     12/-5°C    12   -5
4         天津     11/-1°C    11   -1
5         廊坊     11/-5°C    11   -5
6         太原      8/-7°C     8   -7
7        石家庄     13/-2°C    13   -2
8         涿鹿      8/-6°C     8   -6
9        张家口      5/-9°C     5   -9
10        保定     14/-6°C    14   -6
11        三河     11/-4°C    11   -4
12      北京孔庙     13/-3°C    13   -3
13     北京国子监     13/-3°C    13   -3
14   中国地质博物馆     12/-3°C    12   -3
15      月坛公园     12/-3°C    12   -3
16   明城墙遗址公园     13/-3°C    13   -3
17  北京市规划展览馆     12/-2°C    12   -2
18       什刹海     12/-3°C    12   -3
19      南锣鼓巷     13/-3°C    13   -3
20      天坛公园     12/-2°C    12   -2
21      北海公园     12/-2°C    12   -2
22      景山公园     12/-2°C    12   -2
23     北京海洋馆     12/-3°C    12   -3
```

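### 10. Data analysis

This project is based on a Kaggle movie ratings dataset. Through this series you will learn how to carry out exploratory data analysis (EDA), how to use the data analysis workhorse `pandas` and the plotting package `pyecharts`, and how to handle practical problems that come up during EDA, along with some useful techniques.


Packages this project needs to import: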

```python
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from pyecharts.charts import Bar,Grid,Line
import pyecharts.options as opts
from pyecharts.globals import ThemeType
```

#### 1 Creating a DataFrame
A DataFrame instance in pandas:
```python
Out[89]:
        a  val
0  apple1  1.0
1  apple2  2.0
2  apple3  3.0
3  apple4  4.0
4  apple5  5.0
```

Our **goal** is to turn it into the following structure:
```python
a  apple1  apple2  apple3  apple4  apple5
0     1.0     2.0     3.0     4.0     5.0
```

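At first glance `pivot` looks usable, but it is hard to get there in one step.

So here is a different route that is simple and easy to understand: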

```python
In [113]: pd.DataFrame(index=[0],columns=df.a,data=dict(zip(df.a,df.val)))
Out[113]:
a  apple1  apple2  apple3  apple4  apple5
0     1.0     2.0     3.0     4.0     5.0
```
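The approach above builds a new DataFrame: every value of `df.a` becomes a column of the new DataFrame, and the index is set to `[0]`. Note that the index must be array-like (or an `Index`). Once both axes are fixed, `data` fills in the values.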

```python
In [116]: dict(zip(df.a,df.val))
Out[116]: {'apple1': 1.0, 'apple2': 2.0, 'apple3': 3.0, 'apple4': 4.0, 'apple5': 5.0}
```
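Another way to reach the same wide layout is to transpose. This is a sketch of an alternative, not the approach used above; it rebuilds the five-row example frame so that it runs on its own, and `wide` is an illustrative name.

```python
import pandas as pd

df = pd.DataFrame({'a': ['apple1', 'apple2', 'apple3', 'apple4', 'apple5'],
                   'val': [1.0, 2.0, 3.0, 4.0, 5.0]})

# Move 'a' into the index, transpose so its values become the columns,
# then replace the leftover 'val' row label with the default index 0
wide = df.set_index('a').T.reset_index(drop=True)
print(wide)
```

This relies on the values in column `a` being unique, since they become column labels.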


#### 2 Importing the data
The data comes from Kaggle and consists of three files:

1. movies.dat
2. ratings.dat
3. users.dat

`movies.dat` has three fields: ['Movie ID', 'Movie Title', 'Genre']

Import this file with pandas:

```python
import pandas as pd

movies = pd.read_csv('./data/movietweetings/movies.dat', delimiter='::', engine='python', header=None, names = ['Movie ID', 'Movie Title', 'Genre'])
```

After importing, display the first 10 rows:

```python
   Movie ID                                        Movie Title  \
0         8      Edison Kinetoscopic Record of a Sneeze (1894)   
1        10                La sortie des usines Lumière (1895)   
2        12                      The Arrival of a Train (1896)   
3        25  The Oxford and Cambridge University Boat Race ...   
4        91                         Le manoir du diable (1896)   
5       131                           Une nuit terrible (1896)   
6       417                      Le voyage dans la lune (1902)   
7       439                     The Great Train Robbery (1903)   
8       443        Hiawatha, the Messiah of the Ojibway (1903)   
9       628                    The Adventures of Dollie (1908)  
                                          Genre  
0                             Documentary|Short  
1                             Documentary|Short  
2                             Documentary|Short  
3                                           NaN  
4                                  Short|Horror  
5                           Short|Comedy|Horror  
6  Short|Action|Adventure|Comedy|Fantasy|Sci-Fi  
7                    Short|Action|Crime|Western  
8                                           NaN  
9                                  Action|Short  
```


Next, import the other two data files in turn.

`users.dat`:

```python
users = pd.read_csv('./data/movietweetings/users.dat', delimiter='::', engine='python', header=None, names = ['User ID', 'Twitter ID'])
print(users.head(10))
```

Result:

```python
   User ID  Twitter ID
0        1   397291295
1        2    40501255
2        3   417333257
3        4   138805259
4        5  2452094989
5        6   391774225
6        7    47317010
7        8    84541461
8        9  2445803544
9       10   995885060
```


`ratings.dat`:

```python
ratings = pd.read_csv('./data/movietweetings/ratings.dat', delimiter='::', engine='python', header=None, names = ['User ID', 'Movie ID', 'Rating', 'Rating Timestamp'])
print(ratings.head(10))
```

Result:

```python
   User ID  Movie ID  Rating  Rating Timestamp
0        1    111161      10        1373234211
1        1    117060       7        1373415231
2        1    120755       6        1373424360
3        1    317919       6        1373495763
4        1    454876      10        1373621125
5        1    790724       8        1374641320
6        1    882977       8        1372898763
7        1   1229238       9        1373506523
8        1   1288558       5        1373154354
9        1   1300854       8        1377165712
```

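**Notes on using read_csv**

These `dat` files are imported with the `pandas.read_csv` function.

The first positional argument, `./data/movietweetings/ratings.dat`, is the relative path of the file.

The second argument, the keyword `delimiter='::'`, sets the field separator to `::`.

The remaining keyword arguments select the parser engine (a multi-character separator such as `::` is handled by the Python engine) and, because the file has no header row, set `header` to `None`.

The column names of the resulting DataFrame are set with the `names` keyword; this parameter is worth remembering because it is very useful.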
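To try these parameters without the data files, the same call can read from an in-memory string. This is a minimal sketch: the two rows are copied from the ratings output shown above, `io.StringIO` stands in for the file path, and `sep` is used as the equivalent of `delimiter`.

```python
import io
import pandas as pd

# Two rows in the same '::'-separated layout as ratings.dat
raw = "1::111161::10::1373234211\n1::117060::7::1373415231\n"

ratings = pd.read_csv(io.StringIO(raw), sep='::', engine='python', header=None,
                      names=['User ID', 'Movie ID', 'Rating', 'Rating Timestamp'])
print(ratings)
```

Without `header=None`, the first data row would be consumed as column names, which is why it is paired with `names` here.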


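In the first part of this Kaggle movie dataset walkthrough, we used the data-processing workhorse `pandas` and its `read_csv` function to import the three data files: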

```python
import pandas as pd

movies = pd.read_csv('./data/movietweetings/movies.dat', delimiter='::', engine='python', header=None, names = ['Movie ID', 'Movie Title', 'Genre'])
users = pd.read_csv('./data/movietweetings/users.dat', delimiter='::', engine='python', header=None, names = ['User ID', 'Twitter ID'])
ratings = pd.read_csv('./data/movietweetings/ratings.dat', delimiter='::', engine='python', header=None, names = ['User ID', 'Movie ID', 'Rating', 'Rating Timestamp'])
```

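The important `read_csv` parameters and how to use them were covered in the previous part. Now the exploratory data analysis (EDA) begins.

> Find the 10 highest-rated comedies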


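#### 3 Handling combined values

The `Genre` field of the `movies` table gives a movie's genres. It may hold several values separated by `|`, and it may also be missing (`NaN`).

For fields like this, use the `str` accessor that pandas provides on a Series. **Note that it is vectorized**: it operates on every element at once. Then, much like the `in` test on a native Python `str`, use `contains` to check whether each value contains the string `comedy`: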

```python
mask = movies.Genre.str.contains('comedy',case=False,na=False)
```

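Note the two parameters used here, `case` and `na`.

`case=False` makes the match case-insensitive.

`na` is the fill value used when a cell of the Genre column is `NaN`; here it is `False`.

The returned `mask` is a one-dimensional `Series` with the same structure as `movies.Genre`, holding True or False.

Inspect the result: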

```python
0    False
1    False
2    False
3    False
4    False
5     True
6     True
7    False
8    False
9    False
Name: Genre, dtype: bool

```
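The effect of `case` and `na` is easy to check on a tiny Series. This is an illustrative sketch with three made-up genre values (one of them missing), not the real `movies` table:

```python
import pandas as pd

genre = pd.Series(['Documentary|Short', None, 'Short|Comedy|Horror'])

# case=False ignores letter case; na=False treats a missing genre as "no match"
mask = genre.str.contains('comedy', case=False, na=False)
print(mask.tolist())
print(genre[mask].tolist())
```

Because `na=False` guarantees a purely boolean mask, the mask can be used directly for row selection even when some genres are missing.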


#### 4 Accessing a column

With the mask in hand, pandas makes it easy to extract the target rows:

```python
comedy = movies[mask]
comedy_ids = comedy['Movie ID']

```

This pattern is among the most frequently used in pandas, so it needs no further explanation. Here is the result of `comedy_ids.head(10)`:

```python
5      131
6      417
15    2354
18    3863
19    4099
20    4100
21    4101
22    4210
23    4395
25    4518
Name: Movie ID, dtype: int64

```


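Steps 1 to 4 covered some of the most commonly used pandas operations, `reading data`, `handling combined values`, and `indexing data`, on a real Kaggle movie ratings dataset, and ended with the IDs of all `comedies`: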

```python
5      131
6      417
15    2354
18    3863
19    4099
20    4100
21    4101
22    4210
23    4395
25    4518
Name: Movie ID, dtype: int64

```

Let's continue the data exploration.

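#### 5 Joining two tables

With all the comedy IDs in hand, finding the 10 comedies with the highest average rating requires joining another table, `ratings`.

Recall the structure of the ratings table: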

```python
   User ID  Movie ID  Rating  Rating Timestamp
0        1    111161      10        1373234211
1        1    117060       7        1373415231
2        1    120755       6        1373424360
3        1    317919       6        1373495763
4        1    454876      10        1373621125
5        1    790724       8        1374641320
6        1    882977       8        1372898763
7        1   1229238       9        1373506523
8        1   1288558       5        1373154354
9        1   1300854       8        1377165712

```


pandas uses `join` to combine two tables. The join field is `Movie ID`, so the natural first attempt with `join` is:

```python
combine = ratings.join(comedy, on='Movie ID', rsuffix='2')

```

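Try this version yourself: on closer inspection, the result is very strange.

The reason is a pitfall of the pandas `join` function, described in the official documentation: the right table, here `comedy`, is matched on its `index`, so its index must be the join field, `Movie ID`.

The left table's index does not matter, but its join column must be given in the `on` parameter.

**That is the point to watch out for.**

Change it to: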

```python
combine = ratings.join(comedy.set_index('Movie ID'), on='Movie ID')
print(combine.head(10))

```

This version works.

The result:

```python
   User ID  Movie ID  Rating  Rating Timestamp Movie Title Genre
0        1    111161      10        1373234211         NaN   NaN
1        1    117060       7        1373415231         NaN   NaN
2        1    120755       6        1373424360         NaN   NaN
3        1    317919       6        1373495763         NaN   NaN
4        1    454876      10        1373621125         NaN   NaN
5        1    790724       8        1374641320         NaN   NaN
6        1    882977       8        1372898763         NaN   NaN
7        1   1229238       9        1373506523         NaN   NaN
8        1   1288558       5        1373154354         NaN   NaN
9        1   1300854       8        1377165712         NaN   NaN

```

A `NaN` in the `Genre` column means the movie is not a comedy, so the next step keeps only the rows where this column is not `NaN`.
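As a side note, the join-then-filter sequence can also be expressed in one step with `merge`. The sketch below uses two invented miniature tables (`ratings_demo`, `comedy_demo`), not the real dataset, and simply contrasts `join` on the index with a column-to-column inner `merge`.

```python
import pandas as pd

# Hypothetical miniature versions of the two tables
ratings_demo = pd.DataFrame({
    'User ID': [1, 1, 2],
    'Movie ID': [10, 20, 20],
    'Rating': [7, 9, 8],
})
comedy_demo = pd.DataFrame({
    'Movie ID': [20, 30],
    'Movie Title': ['Some Comedy (2000)', 'Another Comedy (2001)'],
})

# join matches the left key column against the right table's index
joined = ratings_demo.join(comedy_demo.set_index('Movie ID'), on='Movie ID')

# merge matches column to column; how='inner' keeps only rows present in both tables
merged = ratings_demo.merge(comedy_demo, on='Movie ID', how='inner')

print(joined)
print(merged)
```

In `joined`, the rating of movie 10 gets a missing title; in `merged`, that row is dropped, which is the same outcome as filtering out the `NaN` rows afterwards.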

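#### 6 Filtering rows by a column

The most convenient thing about pandas is vectorized computation, which removes most of the need for nested `for` loops.

A common task such as filtering on a column is therefore easy.

For readers new to pandas, it is written in two steps: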

```python
mask = pd.notnull(combine['Genre'])

```

The result is a Series containing only `True` or `False`.

```python
result = combine[mask]
print(result.head())

```

Every `Genre` value in the result contains the string `Comedy` at least once, which confirms that the steps above worked.

```python
    User ID  Movie ID  Rating  Rating Timestamp             Movie Title  \
12        1   1588173       9        1372821281      Warm Bodies (2013)   
13        1   1711425       3        1372604878        21 & Over (2013)   
14        1   2024432       8        1372703553   Identity Thief (2013)   
17        1   2101441       1        1372633473  Spring Breakers (2012)   
28        2   1431045       7        1457733508         Deadpool (2016)   

                             Genre  
12           Comedy|Horror|Romance  
13                          Comedy  
14    Adventure|Comedy|Crime|Drama  
17              Comedy|Crime|Drama  
28  Action|Adventure|Comedy|Sci-Fi  


```


At this point `result` holds the ratings of all comedy movies.


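#### 7 Grouping by Movie ID

`result` contains many viewers' ratings of the same movie, so to rank comedies by score, first group by `Movie ID` and then take the mean of each group. `numeric_only=True` skips the text columns `Movie Title` and `Genre`, which recent pandas versions refuse to average:
```python
score_as_movie = result.groupby('Movie ID').mean(numeric_only=True)
```

The first 5 rows: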
```python
               User ID  Rating  Rating Timestamp
Movie ID                                        
131       34861.000000     7.0      1.540639e+09
417       34121.409091     8.5      1.458680e+09
2354       6264.000000     8.0      1.456343e+09
3863      43803.000000    10.0      1.430439e+09
4099      25084.500000     7.0      1.450323e+09
```

#### 8 Sorting by rating

```python
score_as_movie.sort_values(by='Rating', ascending = False,inplace=True)
score_as_movie
```
The first 5 rows:
```python
	User ID	Rating	Rating Timestamp
Movie ID			
7134690	30110.0	10.0	1.524974e+09
416889	1319.0	10.0	1.543320e+09
57840	23589.0	10.0	1.396802e+09
5693562	50266.0	10.0	1.511024e+09
5074	43803.0	10.0	1.428352e+09
```
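All perfect scores? That is suspicious. Perhaps these movies were rated by only a few people, or even just one. With so few samples, the average is clearly not convincing.

So the next step is to count the ratings per movie.

#### 9 Aggregating after grouping

After grouping by `Movie ID`, use `count` to get the size of each group and keep only the count of the `Rating` column, giving `watchs2`: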

```python
watchs = result.groupby('Movie ID').agg(['count'])
watchs2 = watchs['Rating']['count']
```
Print the first 20 rows:
```python
print(watchs2.head(20))
```
Result:
```python
Movie ID
131      1
417     22
2354     1
3863     1
4099     2
4100     1
4101     1
4210     1
4395     1
4518     1
4546     2
4936     2
5074     1
5571     1
6177     1
6414     3
6684     1
6689     1
7145     1
7162     2
Name: count, dtype: int64
```
As suspected, many movies have been rated only once. With so few samples, the average rating means little.

Some key statistics of `watchs2`:
```python
watchs2.describe()
```
Result:
```python
count    10740.000000
mean        20.192086
std         86.251411
min          1.000000
25%          1.000000
50%          2.000000
75%          7.000000
max       1843.000000
Name: count, dtype: float64
```
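In total 10,740 **comedy** movies were rated. The mean number of ratings is about 20 with a standard deviation of 86; 75% of the movies have at most 7 ratings, the minimum is 1, and the maximum is 1,843.

#### 10 Frequency histogram

A histogram of the rating counts shows the distribution more directly. Since 75% of the movies have at most 7 ratings, the histogram is limited to movies with fewer than 20 ratings: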

```python
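import matplotlib.pyplot as plt

fig = plt.figure(figsize=(12, 8))
# 19 bins for rating counts 1..19; histn[0] holds the per-bin counts
histn = plt.hist(watchs2[watchs2 <= 19], 19, histtype='step')
plt.scatter([i + 1 for i in range(len(histn[0]))], histn[0])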
```

![](./img/20200131094927.jpg)

`histn` is a tuple holding the counts and the corresponding bin edges. Inspect `histn[0]`:
```python
array([4383., 1507.,  787.,  541.,  356.,  279.,  209.,  163.,  158.,
        118.,  114.,   90.,  104.,   81.,   80.,   73.,   62.,   65.,
         52.])
```
```python
sum(histn[0]) # 9222
```
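So 9,222 comedies have between 1 and 19 ratings. Out of 10,740 comedies, about `86%` have `fewer than 20` ratings, and `1518` have at least 20.

We want movies with as many ratings as possible, because paid reviewers and careless ratings inevitably produce `abnormal ratings`. So how can the minimum sample size be quantified?


#### 11 Minimum sample size

From statistics, the minimum sample size depends on the Z value, the sample variance, and the margin of error. The calculation is given below.

The formula is:

$$ n = \frac{Z^2\sigma^2}{E^2} $$


Here $Z$ is the value for a 95% confidence level, 1.96; $\sigma$ is the standard deviation of a movie's ratings; and the margin of error $E$ is taken to be 2.5% of the mean rating.

The following code implements the formula: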

```python
# Select the numeric Rating column first so mean/std are not applied to text columns
n3 = result.groupby('Movie ID')['Rating'].agg(['count', 'mean', 'std'])
n3r = n3[n3['count'] >= 20]
```
Only movies with at least 20 ratings are kept as candidates for the minimum-sample-size check. The first 5 rows of `n3r`:
```python
	count	mean	std
Movie ID			
417	22	8.500000	1.263027
12349	68	8.485294	1.227698
15324	20	8.350000	1.039990
15864	51	8.431373	1.374844
17925	44	8.636364	1.259216
```
Then compute the minimum sample size:
```python
nmin = (1.96**2*n3r['std']**2) / ( (n3r['mean']*0.025)**2 )
```
The first 5 rows of `nmin`:
```python
Movie ID
417         135.712480
12349       128.671290
15324        95.349276
15864       163.434005
17925       130.668350
```
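The formula can be checked by hand for a single movie. The sketch below recomputes the first row of `nmin` (movie 417: 22 ratings, mean 8.5, standard deviation 1.263027) in plain Python; the helper name `min_sample_size` is introduced here only for illustration.

```python
def min_sample_size(std: float, mean: float, z: float = 1.96, rel_error: float = 0.025) -> float:
    """n = Z^2 * sigma^2 / E^2, with the margin of error E = rel_error * mean."""
    margin = rel_error * mean
    return (z ** 2) * (std ** 2) / (margin ** 2)


n_417 = min_sample_size(std=1.263027, mean=8.5)
print(round(n_417, 2))   # about 135.71, matching the first row of nmin
print(22 >= n_417)       # False: 22 ratings fall short of the required sample size
```

Movie 417 therefore passes the "at least 20 ratings" filter but not the minimum-sample-size check.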

Keep the comedies that satisfy the minimum sample size:

```python
n3s = n3r[ n3r['count'] >= nmin ]
```
As the result below shows, `173` movies satisfy the minimum sample size.

```python

count	mean	std
Movie ID			
53604	129	8.635659	1.230714
57012	207	8.449275	1.537899
70735	224	8.839286	1.190799
75686	209	8.095694	1.358885
88763	296	8.945946	1.026984
...	...	...	...
6320628	860	7.966279	1.469924
6412452	276	7.510870	1.389529
6662050	22	10.000000	0.000000
6966692	907	8.673649	1.286455
7131622	1102	7.851180	1.751500
173 rows × 3 columns
```

#### 12 Deduplicating and joining

Sort by average rating in descending order:
```python
n3s_sort = n3s.sort_values(by='mean',ascending=False)
```
Result:
```python
	count	mean	std
Movie ID			
6662050	22	10.000000	0.000000
4921860	48	10.000000	0.000000
5262972	28	10.000000	0.000000
5512872	353	9.985836	0.266123
3863552	199	9.010050	1.163372
...	...	...	...
1291150	647	6.327666	1.785968
2557490	546	6.307692	1.858434
1478839	120	6.200000	0.728761
2177771	485	6.150515	1.523922
1951261	1091	6.083410	1.736127
173 rows × 3 columns
```
A `Movie ID` alone does not say which movie it is, so join with the `movies` table:
```python
ms = movies.drop_duplicates(subset=['Movie ID'])
ms = ms.set_index('Movie ID')
n3s_final = n3s_sort.join(ms, on='Movie ID')
```

#### 13 Analyzing the results

The top 50 of the comedy ranking:
```python
Movie Title
Five Minutes (2017)
MSG 2 the Messenger (2015)
Avengers: Age of Ultron Parody (2015)
Be Somebody (2016)
Bajrangi Bhaijaan (2015)
Back to the Future (1985)
La vita è bella (1997)
The Intouchables (2011)
The Sting (1973)
Coco (2017)
Toy Story 3 (2010)
3 Idiots (2009)
Green Book (2018)
Dead Poets Society (1989)
The Apartment (1960)
P.K. (2014)
The Truman Show (1998)
Amélie (2001)
Inside Out (2015)
Toy Story 4 (2019)
Toy Story (1995)
Finding Nemo (2003)
Dr. Strangelove or: How I Learned to Stop Worrying and Love the Bomb (1964)
Home Alone (1990)
Zootopia (2016)
Up (2009)
Monsters, Inc. (2001)
La La Land (2016)
Relatos salvajes (2014)
En man som heter Ove (2015)
Snatch (2000)
Lock, Stock and Two Smoking Barrels (1998)
How to Train Your Dragon 2 (2014)
As Good as It Gets (1997)
Guardians of the Galaxy (2014)
The Grand Budapest Hotel (2014)
Fantastic Mr. Fox (2009)
Silver Linings Playbook (2012)
Sing Street (2016)
Deadpool (2016)
Annie Hall (1977)
Pride (2014)
In Bruges (2008)
Big Hero 6 (2014)
Groundhog Day (1993)
The Breakfast Club (1985)
Little Miss Sunshine (2006)
Deadpool 2 (2018)
The Terminal (2004)
```

Number of ratings for the top 10:

![](./img/2020013109495711.jpg)

Code:
```python
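# Imports needed by the chart code (pyecharts 1.x or later)
from pyecharts import options as opts
from pyecharts.charts import Bar, Grid
from pyecharts.globals import ThemeType

x = n3s_final['Movie Title'][:10].tolist()[::-1]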
y = n3s_final['count'][:10].tolist()[::-1]
bar = (
    Bar()
    .add_xaxis(x)
    .add_yaxis('评论数',y,category_gap='50%')
    .reversal_axis()
    .set_global_opts(title_opts=opts.TitleOpts(title="喜剧电影被评论次数"),
                    toolbox_opts=opts.ToolboxOpts(),)
)
grid = (
    Grid(init_opts=opts.InitOpts(theme=ThemeType.LIGHT))
    .add(bar, grid_opts=opts.GridOpts(pos_left="30%"))
)
grid.render_notebook()
```

Average rating of the top 10:

![](./img/2020013109500812.jpg)

Code:
```python
x = n3s_final['Movie Title'][:10].tolist()[::-1]
y = n3s_final['mean'][:10].round(3).tolist()[::-1]
bar = (
    Bar()
    .add_xaxis(x)
    .add_yaxis('平均得分',y,category_gap='50%')
    .reversal_axis()
    .set_global_opts(title_opts=opts.TitleOpts(title="喜剧电影平均得分"),
                    xaxis_opts=opts.AxisOpts(min_=8.0,name='平均得分'),
                    toolbox_opts=opts.ToolboxOpts(),)
)
grid = (
    Grid(init_opts=opts.InitOpts(theme=ThemeType.MACARONS))
    .add(bar, grid_opts=opts.GridOpts(pos_left="30%"))
)
grid.render_notebook()
```


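#### 14 Creating dummy variables

Converting a categorical variable to numbers means turning an enumerated variable into indicator variables, also called dummy variables.

What is an `indicator variable`? In the example below, A is encoded as `[1,0,0]`, B as `[0,1,0]`, and C as `[0,0,1]`: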
```python
In [8]: s = pd.Series(list('ABCA'))
In [9]: pd.get_dummies(s)
Out[9]:
   A  B  C
0  1  0  0
1  0  1  0
2  0  0  1
3  1  0  0
```

If the input has 4 unique values, the character `a` is encoded as [1,0,0,0], a vector of length 4.

```python
In [5]: s = pd.Series(list('abaccd'))
In [6]: pd.get_dummies(s)
Out[6]:
   a  b  c  d
0  1  0  0  0
1  0  1  0  0
2  1  0  0  0
3  0  0  1  0
4  0  0  1  0
5  0  0  0  1
```

In other words, the length of each dummy vector equals the number of unique characters in the input.
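One practical caveat: the 0/1 output above comes from an older pandas. Since pandas 2.0, `get_dummies` returns `True`/`False` columns by default. A minimal sketch, assuming integer indicators are wanted, passes `dtype=int` explicitly:

```python
import pandas as pd

s = pd.Series(list('ABCA'))
# dtype=int asks for 0/1 columns regardless of the pandas version's default
dummies = pd.get_dummies(s, dtype=int)
print(dummies)
print(dummies.shape)  # (4, 3): 4 rows, one column per unique value
```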

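#### 15 The annoying SettingWithCopyWarning

pandas is a pleasure to use for data processing.

Almost everyone who uses it, though, runs into one warning:

*SettingWithCopyWarning*

It is a real nuisance.

Newcomers to pandas in particular have no idea why a message like this pops up: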

```python
d:\source\test\settingwithcopy.py:9: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy 
```

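The root cause is a `chained operation` in the code: a subset is taken first, and the assignment is then made on that subset.

So what is a `chained operation`?

Something like this: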

```python
tmp = df[df.a<4]
tmp['c'] = 200
```

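Remembering this most typical case is enough.

Should the warning be taken seriously?

If the result is wrong, certainly. If the result is what you intended, it can be ignored.

An example: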

```python
import pandas as  pd

df = pd.DataFrame({'a':[1,3,5],'b':[4,2,7]},index=['a','b','c'])
df.loc[df.a<4,'c'] = 100
print(df)
print('it\'s ok')

tmp = df[df.a<4]
tmp['c'] = 200
print('-----tmp------')
print(tmp)
print('-----df-------')
print(df)
```

Output:
```python
   a  b      c
a  1  4  100.0
b  3  2  100.0
c  5  7    NaN
it's ok
d:\source\test\settingwithcopy.py:9: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy     
  tmp['c'] = 200
-----tmp------
   a  b    c
a  1  4  200
b  3  2  200
-----df-------
   a  b      c
a  1  4  100.0
b  3  2  100.0
c  5  7    NaN
```

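The code after the `it's ok` line performs a chained assignment and gives the wrong result: `tmp` changed, but `df` never received the value, so here the warning must be heeded.

The code before the `it's ok` line is the correct approach.

In short, avoid chained operations. How? Use `.loc[row_indexer,col_indexer]`, exactly as the warning suggests.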
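The two intents behind the chained pattern can be written unambiguously. The following sketch reuses the small `df` from the example; which variant applies depends on whether the original frame or an independent subset is supposed to change.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 3, 5], 'b': [4, 2, 7]}, index=['a', 'b', 'c'])

# Intent 1: work on an independent subset -> take an explicit copy
tmp = df[df.a < 4].copy()
tmp['c'] = 200              # changes tmp only
print('c' in df.columns)    # False: df is untouched

# Intent 2: change the original frame -> a single .loc call on df itself
df.loc[df.a < 4, 'c'] = 200
print(df)
```

With an explicit `.copy()` pandas knows the subset is a separate object, so the assignment on `tmp` is no longer ambiguous.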

#### 16 NumPy normalization and distribution plots

Download the data and normalize it with `NumPy`, then show the distribution with `seaborn`.

**Download the data**

```python
import numpy as np

url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'
wid = np.genfromtxt(url, delimiter=',', dtype='float', usecols=[1])
```
Only the second column of the `iris` dataset is extracted, via `usecols = [1]`.

**Inspect the data**

```python
array([3.5, 3. , 3.2, 3.1, 3.6, 3.9, 3.4, 3.4, 2.9, 3.1, 3.7, 3.4, 3. ,
       3. , 4. , 4.4, 3.9, 3.5, 3.8, 3.8, 3.4, 3.7, 3.6, 3.3, 3.4, 3. ,
       3.4, 3.5, 3.4, 3.2, 3.1, 3.4, 4.1, 4.2, 3.1, 3.2, 3.5, 3.1, 3. ,
       3.4, 3.5, 2.3, 3.2, 3.5, 3.8, 3. , 3.8, 3.2, 3.7, 3.3, 3.2, 3.2,
       3.1, 2.3, 2.8, 2.8, 3.3, 2.4, 2.9, 2.7, 2. , 3. , 2.2, 2.9, 2.9,
       3.1, 3. , 2.7, 2.2, 2.5, 3.2, 2.8, 2.5, 2.8, 2.9, 3. , 2.8, 3. ,
       2.9, 2.6, 2.4, 2.4, 2.7, 2.7, 3. , 3.4, 3.1, 2.3, 3. , 2.5, 2.6,
       3. , 2.6, 2.3, 2.7, 3. , 2.9, 2.9, 2.5, 2.8, 3.3, 2.7, 3. , 2.9,
       3. , 3. , 2.5, 2.9, 2.5, 3.6, 3.2, 2.7, 3. , 2.5, 2.8, 3.2, 3. ,
       3.8, 2.6, 2.2, 3.2, 2.8, 2.8, 2.7, 3.3, 3.2, 2.8, 3. , 2.8, 3. ,
       2.8, 3.8, 2.8, 2.8, 2.6, 3. , 3.4, 3.1, 3. , 3.1, 3.1, 3.1, 2.7,
       3.2, 3.3, 3. , 2.5, 3. , 3.4, 3. ])
      
```

This is a univariate, one-dimensional NumPy array of length 150.

**Normalization**

Find the maximum and minimum:
```python
smax = np.max(wid)
smin = np.min(wid)

In [51]: smax,smin
Out[51]: (4.4, 2.0)
```
The min-max normalization formula:
```python
s = (wid - smin) / (smax - smin)
```
Print only three decimal places:
```python
np.set_printoptions(precision=3)  
```

Normalized result:
```markdown
array([0.625, 0.417, 0.5  , 0.458, 0.667, 0.792, 0.583, 0.583, 0.375,
       0.458, 0.708, 0.583, 0.417, 0.417, 0.833, 1.   , 0.792, 0.625,
       0.75 , 0.75 , 0.583, 0.708, 0.667, 0.542, 0.583, 0.417, 0.583,
       0.625, 0.583, 0.5  , 0.458, 0.583, 0.875, 0.917, 0.458, 0.5  ,
       0.625, 0.458, 0.417, 0.583, 0.625, 0.125, 0.5  , 0.625, 0.75 ,
       0.417, 0.75 , 0.5  , 0.708, 0.542, 0.5  , 0.5  , 0.458, 0.125,
       0.333, 0.333, 0.542, 0.167, 0.375, 0.292, 0.   , 0.417, 0.083,
       0.375, 0.375, 0.458, 0.417, 0.292, 0.083, 0.208, 0.5  , 0.333,
       0.208, 0.333, 0.375, 0.417, 0.333, 0.417, 0.375, 0.25 , 0.167,
       0.167, 0.292, 0.292, 0.417, 0.583, 0.458, 0.125, 0.417, 0.208,
       0.25 , 0.417, 0.25 , 0.125, 0.292, 0.417, 0.375, 0.375, 0.208,
       0.333, 0.542, 0.292, 0.417, 0.375, 0.417, 0.417, 0.208, 0.375,
       0.208, 0.667, 0.5  , 0.292, 0.417, 0.208, 0.333, 0.5  , 0.417,
       0.75 , 0.25 , 0.083, 0.5  , 0.333, 0.333, 0.292, 0.542, 0.5  ,
       0.333, 0.417, 0.333, 0.417, 0.333, 0.75 , 0.333, 0.333, 0.25 ,
       0.417, 0.583, 0.458, 0.417, 0.458, 0.458, 0.458, 0.292, 0.5  ,
       0.542, 0.417, 0.208, 0.417, 0.583, 0.417])
```

**Visualizing the distribution**

```python
import seaborn as sns
sns.distplot(s,kde=False,rug=True)
```
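Frequency histogram (note that `distplot` has been deprecated since seaborn 0.11 in favor of `histplot` and `displot`):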


![](https://imgkr.cn-bj.ufileos.com/49bf5190-429c-4172-a53c-e3f6b66d4e64.png)


```python
sns.distplot(s,hist=True,kde=True,rug=True)
```
Histogram with a Gaussian kernel density estimate:

![](https://imgkr.cn-bj.ufileos.com/4e4a72a5-8f59-4893-b435-e4b57e22a18e.png)


**Fitting a distribution**

Fit a `gamma` distribution:
```python
from scipy import stats
sns.distplot(s, kde=False, fit = stats.gamma)
```


![](https://imgkr.cn-bj.ufileos.com/89446755-7420-4f96-97fe-c4e45d0d3dec.png)


Fit a double `gamma` distribution:
```python
from scipy import stats
sns.distplot(s, kde=False, fit = stats.dgamma)
```

![](https://imgkr.cn-bj.ufileos.com/f2c2a660-5433-4b4f-ad7b-d01da4121319.png)

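#### 17 pandas tips

Data files are often tens or hundreds of gigabytes. When reading data that large, is there a way to pick a small random subset, load only that into memory, and quickly get a feel for the data and start EDA?

With the `skiprows` parameter of pandas and a little probability, yes.

Here is how.

Suppose `big_data.csv` is 100 GB and is read as shown below.

1) Pass a function to the `skiprows` parameter; it is called with each row number, and the row is skipped when it returns `True`.

2) `x > 0` ensures the first row (the header) is always read.

3) `np.random.rand() > 0.01` is true about 99% of the time, so about 99% of the rows are randomly filtered out.

In other words, each data row has only a 1% chance of being loaded into memory.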

```python
import pandas as pd
import numpy as np

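# skiprows receives each row number; returning True skips that row.
# Row 0 (the header) is always kept; other rows are kept with probability 1%.
df = pd.read_csv("big_data.csv",
                 skiprows=lambda x: x > 0 and np.random.rand() > 0.01)

print("The shape of the df is {}. "
      "It has been reduced 100 times!".format(df.shape))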
```

This reduces the amount of data read to roughly 1% of the original, which helps get an analysis started quickly.
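The same idea can be tried without a 100 GB file. The sketch below builds a small CSV in memory (an invented two-column table) and samples about 10% of its rows; it uses a seeded `random.Random` instead of `np.random.rand` purely so that the run is reproducible.

```python
import io
import random

import pandas as pd

# Build a small in-memory CSV: a header plus 1000 data rows (illustrative data)
csv_text = "id,value\n" + "".join(f"{i},{i * i}\n" for i in range(1000))

rng = random.Random(0)   # fixed seed so the sample is reproducible
keep_prob = 0.1          # keep about 10% of the data rows

sample = pd.read_csv(
    io.StringIO(csv_text),
    # Row 0 is the header and is never skipped; other rows are skipped with probability 0.9
    skiprows=lambda row: row > 0 and rng.random() > keep_prob,
)
print(sample.shape)
```

The sample size varies around `keep_prob` times the number of rows, since every row is kept or skipped independently.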

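### 11. Flask web development step by step

#### 1 Flask hello world

Flask is a lightweight Python web framework that is easy to pick up and popular with Python developers.

We start from hello world and learn Flask web development step by step. The author is new to Flask too and will be learning this framework together with the reader; suggestions are welcome.

First install Flask with `pip install Flask`, then import `Flask` and create an `app`: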
```python
from flask import Flask

App = Flask(__name__)
```

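Write the entry function for the index page, returning hello world.

The decorator `@App.route('/')` registers the route (URL) of the index page; a single `/` denotes the index page, that is, the home page.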

```python
@App.route('/')
def index():
    return "hello world"
```

Start the development server (Flask calls `index` whenever a request for `/` arrives):
```python
if __name__ == "__main__":
    App.run(debug=True)
```

After starting the script, the console shows the following startup messages, which means `the server started successfully`.
```python
* Debug mode: on
 * Restarting with stat
 * Debugger is active!
 * Debugger PIN: 663-788-611
 * Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)
```

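Next, open a browser (the client) and enter `http://127.0.0.1:5000/` in the address bar. The page prints `hello world`, which shows the server is reachable.

The server console also shows the following line, meaning it handled one `GET` request from the client.
```python
127.0.0.1 - - [03/Feb/2020 21:26:50] "GET / HTTP/1.1" 200 -
```

That is the Flask version of hello world.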

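#### 2 Flask: writing data to a database

Data persistence means writing data to a database for storage.

This example uses the `sqlite3` database.

1) Import `sqlite3`. It is part of the Python standard library, so nothing needs to be installed with `pip`.

Create a `py` file named `sqlite3_started.py` and write the first line: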
```python
import sqlite3
```
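2) Choose a name for the database file, here `test.db`. `sqlite3.connect` creates the file if it does not exist yet.

3) Open a connection to the database `test.db`: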
```python
conn = sqlite3.connect("test.db")
```

4) Get a cursor from the connection `conn`
```python
c = conn.cursor()
```

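5) Create the first table, `books`

It has four columns, `id`, `sort`, `name`, and `price`, of types `int`, `int`, `text`, and `real`. `id` is the `primary key`; primary-key values must be unique (`unique`), otherwise an error is raised.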


```python
c.execute('''CREATE TABLE books
      (id int primary key,
       sort int,
       name text,
       price real)''')
```
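The first execution of the statement above creates the `books` table. Executing it again raises a `table already exists` error. The script is improved by adding `IF NOT EXISTS`, so the table is created only when it is missing: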
```python
c.execute('''CREATE TABLE IF NOT EXISTS books
      (id int primary key,
       sort int,
       name text,
       price real)''')
```

6) Insert one row

All 4 columns are given values:

```python
c.execute('''INSERT INTO books VALUES
       (1, 
       1, 
       'computer science',
       39.0)''')
```

7) Insert several rows at once

First build a list `books`, then insert all rows with a single `executemany` call.
```python
books = [(2, 2, 'Cook book', 68),
         (3, 2, 'Python intro', 89),
         (4, 3, 'machine learning', 59),
         ]


c.executemany('INSERT INTO books VALUES (?, ?, ?, ?)', books)
```

8) Commit

Changes take effect and are written to the database only after a commit:

```python
conn.commit()
```

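9) Close the connection `conn` opened at the start

Remember to close it explicitly; otherwise the connection and its file handle stay open until the process exits.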
```python
conn.close()
print('Done')
```

10) Check the result

The author uses `vs code` with the `SQLite` extension installed from the extension marketplace.

![image-20200208211721377](./img/image-20200208211721377.png)

Create a `sql` file named `a.sql` with the following content:

```sql
SELECT * from books 
```
Right-click and choose `run query` to see the 4 rows inserted into the `books` table:

![image-20200208211806853](./img/image-20200208211806853.png)

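These ten steps are the main steps of writing to a database with sqlite3. As the second part of the Flask series, they lay the groundwork for the front-end material that follows.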
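The ten steps can be condensed into one self-contained script. The sketch below is a variation, not the author's file: it uses an in-memory database (`":memory:"`) so it leaves nothing on disk, lets `with conn:` handle the commit, and reads the rows back with a `SELECT` instead of the VS Code extension.

```python
import sqlite3

# ":memory:" creates a throwaway database in RAM
conn = sqlite3.connect(":memory:")
try:
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute("""CREATE TABLE IF NOT EXISTS books
                        (id int primary key, sort int, name text, price real)""")
        conn.executemany(
            "INSERT INTO books VALUES (?, ?, ?, ?)",
            [(1, 1, 'computer science', 39.0),
             (2, 2, 'Cook book', 68),
             (3, 2, 'Python intro', 89),
             (4, 3, 'machine learning', 59)],
        )
    rows = conn.execute("SELECT id, name, price FROM books ORDER BY id").fetchall()
finally:
    conn.close()  # always release the connection

for row in rows:
    print(row)
```

The `?` placeholders are the same parameter style used in step 7 and keep values separate from the SQL text.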

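#### 3 How the Flask layers call each other

This part introduces Flask and the B/S (browser/server) model. It is the key theory for quickly understanding Flask code: **how the views, the models, and the template-rendering layer call one another**.

1) Send a request

Typing an address into the browser's address bar and pressing Enter completes the first step.

2) The views layer receives the request sent in step 1). Flask uses a decorator to bind the URL to the function that handles the request, as in the excerpt below; the handler typically calls the models layer and the template layer.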

```python
@main_blue.route('/', methods=['GET', 'POST'])
def index():
    form = TestForm()
    print('test')
```

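3) The models layer defines the data models and performs CRUD operations

4) The template layer processes the HTML templates

5) The data and the template are combined into HTML

6) The combined result is returned to the views layer

7) Finally the views layer sends the response, the browser renders it, and we see the requested page.

The whole process is shown below: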

![image-20200211152007983](./img/web1.png)

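If, like the author, you are new to Flask programming, take the time to understand the flow above. It gives useful theoretical guidance and direction for the coding that follows.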

### Python Q&A

#### How to generate a QR code in Python?


## qrcode

First up: how to generate a QR code. Python's `qrcode` package supports this.

Usage is simple:

```python
import qrcode

# QR code content
data = "http://www.zglg.work/wp-content/uploads/2020/10/image-3.png"
# Generate the QR code
img = qrcode.make(data=data)
# Show the QR code directly
img.show()
# Save the QR code to a file
img.save("我的微信.jpg")
```

The generated QR code:

![](https://imgkr2.cn-bj.ufileos.com/f0b08c53-0107-483b-bbe5-072bebc58e8d.png?UCloudPublicKey=TOKEN_8d8b72be-579a-4e83-bfd0-5f6ce1546f13&Signature=rVtaeBWhzLPPq%252BFCVtiOv6rS0tI%253D&Expires=1603544615)


Scanning it with WeChat brings up my QR code.

The color and other style options of the QR code can also be set:

```python
import qrcode

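# Create the QR code generator
qr = qrcode.QRCode(border=2)
# Set the QR code data
data = "http://www.zglg.work/wp-content/uploads/2020/10/image-3.png"
qr.add_data(data=data)
# Build the QR matrix; fit=True picks the smallest version that fits the data
qr.make(fit=True)
# The colors are set when the image is rendered
img = qr.make_image(fill_color="orange", back_color="white")

# Show the QR code
img.show()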
```

This produces an orange QR code:

![](https://imgkr2.cn-bj.ufileos.com/cbd26fd8-27cf-4630-935f-6896822ce483.png?UCloudPublicKey=TOKEN_8d8b72be-579a-4e83-bfd0-5f6ce1546f13&Signature=uy1r24x%252Fp5QpI5Wy10Ebdaz%252BpLM%253D&Expires=1603544681)

There are more styles to explore on your own.

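## Python mini project: KWIC display of sentences

Key Word In Context (KWIC) is the most common format for displaying concordance lines.

Project description: the input is a list of sentences and a given word, and each sentence contains the word at least once. The goal is to display the word in KWIC format. The basic requirements of a KWIC display: the queried word is centered, the preceding `pre` sequence is right-aligned, the following `post` sequence is left-aligned, and the context before and after the queried word has the same length, padded with spaces when a sentence is too short.

Parameters: the input sentences `sentences`, the queried word `selword`, and the sliding-window length `window_len`.

For example, given six sentences and the word `secure`, the output is the following string: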

```python
               pre keyword    post 

     welfare , and secure  the blessings of
     nations , and secured immortal glory with 
       , and shall secure  to you the 
    cherished . To secure  us against these 
     defense as to secure  our cities and 
          I can to secure  economy and fidelity 
```

Complete the following function:

```python
from typing import List


def kwic(sentences: List[str], selword: str, window_len: int) -> str:
    """
    :param sentences: input sentences
    :param selword: selected word
    :param window_len: window length
    """
```
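For orientation, here is one possible sketch of the exercise, not the author's published solution. It makes several assumptions that the task statement leaves open: tokens are split on whitespace, `window_len` counts words on each side, only the first occurrence per sentence is shown, and a token matches when it starts with `selword` (so `secured` matches `secure`, as in the sample output). The name `kwic_sketch` is hypothetical.

```python
from typing import List


def kwic_sketch(sentences: List[str], selword: str, window_len: int) -> str:
    rows = []
    for sentence in sentences:
        tokens = sentence.split()
        for i, token in enumerate(tokens):
            # Assumption: prefix match, case-insensitive, first occurrence only
            if token.lower().startswith(selword.lower()):
                pre = " ".join(tokens[max(0, i - window_len):i])
                post = " ".join(tokens[i + 1:i + 1 + window_len])
                rows.append((pre, token, post))
                break
    if not rows:
        return ""
    pre_width = max(len(pre) for pre, _, _ in rows)
    key_width = max(len(key) for _, key, _ in rows)
    # Right-align pre, pad the keyword to a common width, leave post left-aligned
    return "\n".join(f"{pre:>{pre_width}} {key:<{key_width}} {post}"
                     for pre, key, post in rows)


demo_output = kwic_sketch(
    ["promote the general welfare , and secure the blessings of liberty",
     "our nations , and secured immortal glory with them"],
    "secure",
    3,
)
print(demo_output)
```

The two demo sentences are made up to reproduce the first two lines of the sample output; a header row and padding of `post` would be straightforward additions.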

More on KWIC display:

http://dep.chs.nihon-u.ac.jp/english_lang/tukamoto/kwic_e.html

The complete code is published at: http://www.zglg.work/Python-20-topics/python-project1-kwic/
![image](https://user-images.githubusercontent.com/20391209/123213609-c494dc00-d4f8-11eb-84d6-4d8caabb44f7.png)
![image](https://user-images.githubusercontent.com/20391209/123213901-26eddc80-d4f9-11eb-96cd-d3518005c4df.png)


================================================
FILE: md/1.md
================================================
```markdown
@author jackzhenguo
@desc 实现 relu
@date 2019/2/10
```

#### 1 Implementing relu

In neural networks, `relu` is used as a neuron's activation function:

```python
def relu(x):
    """
    x: input value
    return: the relu of x
    """
    return max(0, x)
```

Test:

```python
relu(5) # 5

relu(-1) # 0
```


<center>[下一个例子](2.md)</center>


================================================
FILE: md/10.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/2/15
```

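#### 10 Converting to an integer

`int(x, base=10)` converts `x`, which may be a string or a number, to an integer.

The `base` parameter gives the base in which the string `x` is written; the common values 2, 8, 10, and 16 stand for binary, octal, decimal, and hexadecimal. `base` may only be passed when `x` is a string.

If the argument is a string, it must represent an integer; a floating-point string raises an exception.

If `x` is a float, `int` truncates toward zero and keeps only the integer part.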

```python
In [2]: int('0110',2)

Out[2]: 6

In [3]: int('0732',8)
Out[3]: 474

In [4]: int('12',16)
Out[4]: 18

In [5]: int('12',10)
Out[5]: 12

In [6]: int(1.45)
Out[6]: 1

In [7]: int('1.45')
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-7-6cf4b951408f> in <module>
----> 1 int('1.45')

ValueError: invalid literal for int() with base 10: '1.45'
```
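A few more cases that follow from the rules above, written as a small runnable sketch (the variable names are only for illustration):

```python
# A float-like string must go through float() first
from_float_text = int(float('1.45'))   # 1

# int() truncates toward zero, it does not round
truncated = int(-1.9)                  # -1

# Prefixed literals are accepted when the base matches, or when base=0
hex_value = int('0x1f', 16)            # 31
auto_base = int('0b110', 0)            # 6

# Surrounding whitespace is ignored
padded = int('  42\n')                 # 42

print(from_float_text, truncated, hex_value, auto_base, padded)
```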


<center>[上一个例子](9.md)    [下一个例子](11.md)</center>


================================================
FILE: md/100.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/9/3
```

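#### 100 Are two strings permutations of each other

Permutation: two strings contain the same characters with the same counts, possibly in a different order.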

```python
from collections import defaultdict


def is_permutation(str1, str2):
    if str1 is None or str2 is None:
        return False
    if len(str1) != len(str2):
        return False
    unq_s1 = defaultdict(int)
    unq_s2 = defaultdict(int)
    for c1 in str1:
        unq_s1[c1] += 1
    for c2 in str2:
        unq_s2[c2] += 1

    return unq_s1 == unq_s2
```

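This example uses `defaultdict` from the standard-library `collections` module with `int` as the default factory, so every count starts at 0. The approach is essentially a `hash map lookup`.

If the two resulting defaultdicts, `unq_s1` and `unq_s2`, are equal, then `str1` and `str2` are permutations of each other.

Tests: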

```python
r = is_permutation('nice', 'cine')
print(r)  # True

r = is_permutation('', '')
print(r)  # True

r = is_permutation('', None)
print(r)  # False

r = is_permutation('work', 'woo')
print(r)  # False

```
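The counting loop can also be delegated to `collections.Counter`, which builds the same character-to-count mapping. A minimal sketch with the same `None` handling (the function name `is_permutation_counter` is introduced here for illustration):

```python
from collections import Counter


def is_permutation_counter(str1, str2):
    if str1 is None or str2 is None:
        return False
    # Two Counters are equal when every character has the same count in both
    return Counter(str1) == Counter(str2)


print(is_permutation_counter('nice', 'cine'))  # True
print(is_permutation_counter('work', 'woo'))   # False
```

The explicit length check is no longer needed, because strings of different lengths cannot produce equal counters.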

<center>[上一个例子](99.md)    [下一个例子](101.md)</center>

================================================
FILE: md/101.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/9/4
```

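#### 101 Is str1 a rotation of str2

Rotating `stringbook` gives `bookstring`. Write code that checks whether `str1` can be obtained by rotating `str2`.

**Idea**

For strings of equal length, this reduces to checking whether `str1` is a substring of `str2+str2`.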

```python
def is_rotation(s1: str, s2: str) -> bool:
    if s1 is None or s2 is None:
        return False
    if len(s1) != len(s2):
        return False

    def is_substring(s1: str, s2: str) -> bool:
        return s1 in s2
    return is_substring(s1, s2 + s2)
```

**Test**

```python
r = is_rotation('stringbook', 'bookstring')
print(r)  # True

r = is_rotation('greatman', 'maneatgr')
print(r)  # False
```

<center>[上一个例子](100.md)    [下一个例子](102.md)</center>

================================================
FILE: md/102.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/9/9
```

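#### 102 Matching positive floats with a regular expression

Pick out all positive floating-point numbers from a list of strings.

How?

A regular expression can do it.

The question is how to write it.

Here is a start:

`^[1-9]\d*\.\d*$`

`^` matches the start of the string

`[1-9]` matches one of the digits 1,2,3,4,5,6,7,8,9

`^[1-9]` together means the string starts with a digit from `1-9`

`\d` matches one digit `0-9`

`*` means the preceding element occurs 0, 1, or more times

`\d*` means 0, 1, or more digits

`\.` matches the decimal point

`$` matches the end of the string

Together, `^[1-9]\d*\.\d*$` matches all positive floats greater than or equal to 1.0.

What about the positive floats between 0.0 and 1.0? Why not simply fold them into the expression above?

Would this not work: `^[0-9]\d*\.\d*$`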

```python
In [85]: import re

In [87]: recom = re.compile(r'^[0-9]\d*\.\d*$')

In [88]: recom.match('000.2')
Out[88]: <re.Match object; span=(0, 5), match='000.2'>
```

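The result shows that `^[0-9]\d*\.\d*$` matches `000.2` and treats it as a positive float.

That is why floats of 1.0 and above are matched first.

Once that expression is written, the other part is not hard.

Floats between 0.0 and 1.0: `^0\.\d*[1-9]\d*$`

Joining the two gives the final result. Because `|` has the lowest precedence, the alternatives are grouped so that `^` and `$` apply to both:

`^(?:[1-9]\d*\.\d*|0\.\d*[1-9]\d*)$`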
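When two anchored patterns are joined with `|`, operator precedence matters: alternation binds more loosely than `^` and `$`. The sketch below groups the two alternatives and uses `fullmatch`, then shows what an ungrouped alternation accepts. The helper name `is_positive_float` is introduced for illustration.

```python
import re

# Grouped alternatives; fullmatch requires the whole string to match
positive_float = re.compile(r'(?:[1-9]\d*\.\d*|0\.\d*[1-9]\d*)')


def is_positive_float(text):
    return positive_float.fullmatch(text) is not None


for text in ['3.14', '0.05', '000.2', '0.0', '12', '1.5abc']:
    print(text, is_positive_float(text))

# Without a group, ^ applies only to the first alternative and $ only to the second
ungrouped = re.compile(r'^[1-9]\d*\.\d*|0\.\d*[1-9]\d*$')
print(ungrouped.match('1.5abc') is not None)  # True: trailing junk is accepted
```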

<center>[上一个例子](101.md)    [下一个例子](103.md)</center>

================================================
FILE: md/103.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/9/10
```

#### 103 Getting a file extension

```python
import os
file_ext = os.path.splitext('./data/py/test.py')
front,ext = file_ext
In [5]: front
Out[5]: './data/py/test'

In [6]: ext
Out[6]: '.py'
```

<center>[上一个例子](102.md)    [下一个例子](104.md)</center>

================================================
FILE: md/104.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/10/3
```

#### 104 Getting the file name from a path

```python
In [11]: import os
    ...: file_ext = os.path.split('./data/py/test.py')
    ...: ipath,ifile = file_ext
    ...:

In [12]: ipath
Out[12]: './data/py'

In [13]: ifile
Out[13]: 'test.py'
```

<center>[上一个例子](103.md)    [下一个例子](105.md)</center>

================================================
FILE: md/105.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/10/4
```

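#### 105 Batch-changing file extensions

This example uses Python's `os` and `argparse` modules to rename every file in the working directory `work_dir` whose extension is `old_ext` so that it ends in `new_ext`.

It also gives a rough idea of the main usage of the `argparse` module.

Import the modules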

```python
import argparse
import os
```

Define the script's arguments

```python
def get_parser():
    parser = argparse.ArgumentParser(
        description='Rename file extensions in a working directory')
    parser.add_argument('work_dir', metavar='WORK_DIR', type=str, nargs=1,
                        help='directory containing the files to rename')
    parser.add_argument('old_ext', metavar='OLD_EXT',
                        type=str, nargs=1, help='old extension')
    parser.add_argument('new_ext', metavar='NEW_EXT',
                        type=str, nargs=1, help='new extension')
    return parser
```

Batch-rename the extensions

```python
def batch_rename(work_dir, old_ext, new_ext):
    """
    传递当前目录，原来后缀名，新的后缀名后，批量重命名后缀
    """
    for filename in os.listdir(work_dir):
        # 获取得到文件后缀
        split_file = os.path.splitext(filename)
        file_ext = split_file[1]
        # 定位后缀名为old_ext 的文件
        if old_ext == file_ext:
            # 修改后文件的完整名称
            newfile = split_file[0] + new_ext
            # 实现重命名操作
            os.rename(
                os.path.join(work_dir, filename),
                os.path.join(work_dir, newfile)
            )
    print("完成重命名")
    print(os.listdir(work_dir))
```

Implement main

```python
def main():
    """
    main函数
    """
    # 命令行参数
    parser = get_parser()
    args = vars(parser.parse_args())
    # 从命令行参数中依次解析出参数
    work_dir = args['work_dir'][0]
    old_ext = args['old_ext'][0]
    if old_ext[0] != '.':
        old_ext = '.' + old_ext
    new_ext = args['new_ext'][0]
    if new_ext[0] != '.':
        new_ext = '.' + new_ext

    batch_rename(work_dir, old_ext, new_ext)
```
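The `[0]` indexing in `main` is a consequence of `nargs=1`. The short sketch below, with a throwaway parser defined only for this demonstration, shows the difference and also how `parse_args` can be given a list so that the parser is testable without a real command line.

```python
import argparse

parser = argparse.ArgumentParser(description='demo')
parser.add_argument('work_dir', nargs=1)   # nargs=1 -> a one-element list
parser.add_argument('old_ext')             # no nargs -> a plain string

# Passing a list to parse_args avoids touching the real command line
args = parser.parse_args(['./data', 'txt'])
print(args.work_dir)   # ['./data']
print(args.old_ext)    # txt
```

Dropping `nargs=1` from the positional arguments would therefore remove the need for the `[0]` lookups.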

<center>[上一个例子](104.md)    [下一个例子](106.md)</center>

================================================
FILE: md/106.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/10/5
```

#### 106 Batch-renaming xls files to xlsx

```python
import os


def xls_to_xlsx(work_dir):
    """
    传递当前目录，原来后缀名，新的后缀名后，批量重命名后缀
    """
    old_ext, new_ext = '.xls', '.xlsx'
    for filename in os.listdir(work_dir):
        # 获取得到文件后缀
        split_file = os.path.splitext(filename)
        file_ext = split_file[1]
        # 定位后缀名为old_ext 的文件
        if old_ext == file_ext:
            # 修改后文件的完整名称
            newfile = split_file[0] + new_ext
            # 实现重命名操作
            os.rename(
                os.path.join(work_dir, filename),
                os.path.join(work_dir, newfile)
            )
    print("完成重命名")
    print(os.listdir(work_dir))


xls_to_xlsx('./data')

# 输出结果：
# ['cut_words.csv', 'email_list.xlsx', 'email_test.docx', 'email_test.jpg', 'email_test.xlsx', 'geo_data.png', 'geo_data.xlsx',
'iotest.txt', 'pyside2.md', 'PySimpleGUI-4.7.1-py3-none-any.whl', 'test.txt', 'test_excel.xlsx', 'ziptest', 'ziptest.zip']
```

<center>[上一个例子](105.md)    [下一个例子](107.md)</center>

================================================
FILE: md/107.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/10/6
```

#### 107 Finding files with a given extension

```python
import os

def find_file(work_dir,extension='jpg'):
    lst = []
    for filename in os.listdir(work_dir):
        print(filename)
        splits = os.path.splitext(filename)
        ext = splits[1]  # the extension, including the dot
        if ext == '.' + extension:
            lst.append(filename)
    return lst

r = find_file('.', 'md')
print(r)  # the md files directly inside the current directory (sub-folders are not searched)
```
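`os.listdir` looks at one directory level only. If files in sub-folders should be found as well, `os.walk` can be used instead. The following is a sketch under that assumption; `find_files_recursive` is a name made up here, and the demo runs on a temporary directory so it does not depend on any real files.

```python
import os
import tempfile


def find_files_recursive(work_dir, extension='jpg'):
    """Return paths (relative to work_dir) of files ending in .extension, including sub-folders."""
    suffix = '.' + extension
    found = []
    for dir_path, _dir_names, file_names in os.walk(work_dir):
        for filename in file_names:
            if os.path.splitext(filename)[1] == suffix:
                full_path = os.path.join(dir_path, filename)
                found.append(os.path.relpath(full_path, work_dir))
    return sorted(found)


# Demo on a temporary directory tree
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, 'sub'))
    for name in ['a.md', 'b.txt', os.path.join('sub', 'c.md')]:
        open(os.path.join(root, name), 'w').close()
    demo_result = find_files_recursive(root, 'md')

print(demo_result)
```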

<center>[上一个例子](106.md)    [下一个例子](108.md)</center>

================================================
FILE: md/108.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/10/8
```

#### 108 Compressing a folder into a zip file


```python
import zipfile  # standard-library module for creating and extracting zip archives
import os


def batch_zip(start_dir):
    file_news = start_dir + '.zip'  # name of the resulting archive

    z = zipfile.ZipFile(file_news, 'w', zipfile.ZIP_DEFLATED)
    for dir_path, dir_names, file_names in os.walk(start_dir):
        # Important: strip the start_dir prefix so paths inside the archive are relative to it
        f_path = dir_path.replace(start_dir, '')
        f_path = f_path and f_path + os.sep  # keeps the sub-folder structure inside the archive
        for filename in file_names:
            z.write(os.path.join(dir_path, filename), f_path + filename)
    z.close()
    return file_news


batch_zip('./data/ziptest')


```

<center>[上一个例子](107.md)    [下一个例子](109.md)</center>

================================================
FILE: md/109.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/10/10
```

#### 109 32-character MD5 digest

```python
import hashlib
# Compute the MD5 digest of s as 32 hexadecimal characters (a one-way hash, not reversible encryption)


def hash_cry32(s):
    m = hashlib.md5()
    m.update((str(s).encode('utf-8')))
    return m.hexdigest()


print(hash_cry32(1))  # c4ca4238a0b923820dcc509a6f75849b
print(hash_cry32('hello'))  # 5d41402abc4b2a76b9719d911017c592
```
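The "32" refers to the length of the hexadecimal digest: MD5 produces 128 bits, written as 32 hex characters. Other algorithms in `hashlib` are used the same way; the sketch below compares MD5 with SHA-256, chosen here only as a familiar example.

```python
import hashlib

text = 'hello'.encode('utf-8')
md5_hex = hashlib.md5(text).hexdigest()
sha256_hex = hashlib.sha256(text).hexdigest()

print(len(md5_hex), md5_hex)        # 32 hex characters = 128 bits
print(len(sha256_hex), sha256_hex)  # 64 hex characters = 256 bits
```

Both are one-way digests: the same input always gives the same output, but the input cannot be recovered from it.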

<center>[上一个例子](108.md)    [下一个例子](110.md)</center>

================================================
FILE: md/11.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/2/15
```

#### 11 Powers

`pow(base, exp, mod)` raises `base` to the power `exp`; if `mod` is given, the result is reduced modulo `mod`.

```python
In [1]: pow(2,1.5)                                                              
Out[1]: 2.8284271247461903

In [2]: pow(3, 2, 4) # 3 squared, then the remainder modulo 4
Out[1]: 1
```
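Two further properties of the three-argument form are worth a quick sketch: it agrees with computing the power first and reducing afterwards, and from Python 3.8 on a negative exponent yields the modular inverse.

```python
# Three-argument pow gives the same result as power-then-remainder
fast = pow(7, 128, 13)
slow = (7 ** 128) % 13
print(fast == slow)      # True

# Python 3.8+: a negative exponent with a modulus gives the modular inverse
inv = pow(3, -1, 7)
print(inv)               # 5, because 3 * 5 = 15 = 2 * 7 + 1
```

The three-argument form only accepts integers, unlike `pow(2, 1.5)` above.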


<center>[上一个例子](10.md)    [下一个例子](12.md)</center>


================================================
FILE: md/110.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/10/12
```

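#### 110 Calendar for a year

```python
import calendar
from datetime import date
mydate = date.today()
# Build the calendar for the current year so the label and the calendar agree
year_calendar_str = calendar.calendar(mydate.year)
print(f"Calendar for {mydate.year}:\n{year_calendar_str}\n")
```

The calendar part of the output (when run in 2019):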

```python
2019

      January                   February                   March
Mo Tu We Th Fr Sa Su      Mo Tu We Th Fr Sa Su      Mo Tu We Th Fr Sa Su
    1  2  3  4  5  6                   1  2  3                   1  2  3
 7  8  9 10 11 12 13       4  5  6  7  8  9 10       4  5  6  7  8  9 10
14 15 16 17 18 19 20      11 12 13 14 15 16 17      11 12 13 14 15 16 17
21 22 23 24 25 26 27      18 19 20 21 22 23 24      18 19 20 21 22 23 24
28 29 30 31               25 26 27 28               25 26 27 28 29 30 31

       April                      May                       June
Mo Tu We Th Fr Sa Su      Mo Tu We Th Fr Sa Su      Mo Tu We Th Fr Sa Su
 1  2  3  4  5  6  7             1  2  3  4  5                      1  2
 8  9 10 11 12 13 14       6  7  8  9 10 11 12       3  4  5  6  7  8  9
15 16 17 18 19 20 21      13 14 15 16 17 18 19      10 11 12 13 14 15 16
22 23 24 25 26 27 28      20 21 22 23 24 25 26      17 18 19 20 21 22 23
29 30                     27 28 29 30 31            24 25 26 27 28 29 30

        July                     August                  September
Mo Tu We Th Fr Sa Su      Mo Tu We Th Fr Sa Su      Mo Tu We Th Fr Sa Su
 1  2  3  4  5  6  7                1  2  3  4                         1
 8  9 10 11 12 13 14       5  6  7  8  9 10 11       2  3  4  5  6  7  8
15 16 17 18 19 20 21      12 13 14 15 16 17 18       9 10 11 12 13 14 15
22 23 24 25 26 27 28      19 20 21 22 23 24 25      16 17 18 19 20 21 22
29 30 31                  26 27 28 29 30 31         23 24 25 26 27 28 29
                                                    30

      October                   November                  December
Mo Tu We Th Fr Sa Su      Mo Tu We Th Fr Sa Su      Mo Tu We Th Fr Sa Su
    1  2  3  4  5  6                   1  2  3                         1
 7  8  9 10 11 12 13       4  5  6  7  8  9 10       2  3  4  5  6  7  8
14 15 16 17 18 19 20      11 12 13 14 15 16 17       9 10 11 12 13 14 15
21 22 23 24 25 26 27      18 19 20 21 22 23 24      16 17 18 19 20 21 22
28 29 30 31               25 26 27 28 29 30         23 24 25 26 27 28 29
                                                    30 31
```

<center>[Previous example](109.md)    [Next example](111.md)</center>

================================================
FILE: md/111.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/10/13
```

#### 111 Check for a leap year

```python
import calendar
from datetime import date

mydate = date.today()
is_leap = calendar.isleap(mydate.year)
print_leap_str = "%s is a leap year" if is_leap else "%s is not a leap year"
print(print_leap_str % mydate.year)
```

Output (when run in 2019):

```python
2019 is not a leap year
```

<center>[Previous example](110.md)    [Next example](112.md)</center>

================================================
FILE: md/112.md
================================================
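#### 112 Number of days in a month

```python
import calendar
from datetime import date

mydate = date.today()
# monthrange returns (weekday of the 1st, number of days); weekday 0 is Monday, 6 is Sunday
weekday, days = calendar.monthrange(mydate.year, mydate.month)
print(f'The first day of {mydate.year}-{mydate.month} falls on weekday {weekday}\n')
print(f'{mydate.year}-{mydate.month} has {days} days\n')
```

Output (when run in December 2019):

```python
The first day of 2019-12 falls on weekday 6

2019-12 has 31 days
```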
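
The weekday returned by `calendar.monthrange` is an index in which 0 is Monday and 6 is Sunday. As a small sketch, `calendar.day_name` can turn that index into a name (the name shown assumes an English locale):

```python
import calendar

# December 2019: the 1st was a Sunday and the month has 31 days
weekday, days = calendar.monthrange(2019, 12)
print(calendar.day_name[weekday], days)  # Sunday 31 (with an English locale)
```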

<center>[Previous example](111.md)    [Next example](113.md)</center>

================================================
FILE: md/113.md
================================================
#### 113 First day of the month

```python
from datetime import date
mydate = date.today()
month_first_day = date(mydate.year, mydate.month, 1)
print(f"First day of this month: {month_first_day}\n")
```

Output (when run in December 2019):

```python
# First day of this month: 2019-12-01
```

<center>[Previous example](112.md)    [Next example](114.md)</center>

================================================
FILE: md/114.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/10/14
```

#### 114 Last day of the month

```python
from datetime import date
import calendar
mydate = date.today()
_, days = calendar.monthrange(mydate.year, mydate.month)
month_last_day = date(mydate.year, mydate.month, days)
print(f"Last day of this month: {month_last_day}\n")
```

Output (when run in December 2019):

```python
Last day of this month: 2019-12-31
```

<center>[Previous example](113.md)    [Next example](115.md)</center>

================================================
FILE: md/115.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/10/15
```

#### 115 Get the current time

```python
from datetime import date, datetime
from time import localtime, strftime

today_date = date.today()
print(today_date)  # 2019-12-22

today_time = datetime.today()
print(today_time)  # 2019-12-22 18:02:33.398894

local_time = localtime()
print(strftime("%Y-%m-%d %H:%M:%S", local_time))  # custom format, e.g. 2019-12-22 18:13:41
```

<center>[Previous example](114.md)    [Next example](116.md)</center>

================================================
FILE: md/116.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/10/16
```

#### 116 Parse a time string into a time struct

```python
from time import strptime

# parse str time to struct time
struct_time = strptime('2019-12-22 10:10:08', "%Y-%m-%d %H:%M:%S")
print(struct_time) # struct_time is a class in the time module

# time.struct_time(tm_year=2019, tm_mon=12, tm_mday=22, tm_hour=10, tm_min=10, tm_sec=8, tm_wday=6, tm_yday=356, tm_isdst=-1)
```

<center>[Previous example](115.md)    [Next example](117.md)</center>

================================================
FILE: md/117.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/10/16
```

#### 117 Format a time as a string

```python
from time import strftime, localtime

print(localtime())  # the input time
# time.struct_time(tm_year=2019, tm_mon=12, tm_mday=22, tm_hour=18, tm_min=24, tm_sec=56, tm_wday=6, tm_yday=356, tm_isdst=0)

print(strftime("%m-%d-%Y %H:%M:%S", localtime()))  # convert to a custom format
# the time as a string: 12-22-2019 18:26:21
```
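
The `datetime` class offers the same two conversions as methods. A minimal sketch of a round trip from string to `datetime` and back, with a sample timestamp chosen only for illustration:

```python
from datetime import datetime

fmt = "%Y-%m-%d %H:%M:%S"
dt = datetime.strptime('2019-12-22 10:10:08', fmt)  # string -> datetime
text = dt.strftime(fmt)                              # datetime -> string
print(text)  # 2019-12-22 10:10:08
```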

<center>[Previous example](116.md)    [Next example](118.md)</center>

================================================
FILE: md/118.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/10/21
```

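#### 118 The main thread starts by default

By default a program runs in a single thread, called the main thread. The example below demonstrates this.

Import the threading module, `threading`:

```python
import threading
```

The module-level function `threading.current_thread()` returns the current thread:

```python
t = threading.current_thread()
print(t) # <_MainThread(MainThread, started 139908235814720)>
```

This confirms that the program runs in `MainThread` by default.

`t.name` gives the thread's name. Other commonly used members include `t.ident`, which gives the thread `id`, and `t.is_alive()`, which reports whether the thread is still alive. The older `getName()` is deprecated, and `isAlive()` was removed in Python 3.9.

```python
print(t.name) # MainThread
print(t.ident) # 139908235814720
print(t.is_alive()) # True
```

<center>[Previous example](117.md)    [Next example](119.md)</center>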


================================================
FILE: md/119.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/10/23
```

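#### 119 Create a thread

Create a thread:

```python
import threading

my_thread = threading.Thread()
```

Create a thread named `my_thread`:

```python
my_thread = threading.Thread(name='my_thread')
```

A thread is created to do some work for us. The work is passed in through the `target` parameter, which must be a `callable`; a function is callable:

```python
def print_i(i):
    print('print i: %d'%(i,))
my_thread = threading.Thread(target=print_i,args=(1,))
```

The thread `my_thread` is now fully equipped, but it only takes off once we press the launch button by calling `start()`.

```python
my_thread.start()
```

The output is shown below. `args` supplies the argument `i` that `print_i` needs, and must be a tuple.

```python
print i: 1
```

That covers the core of creating and starting a thread. Knowing this much on paper is not enough, though; it takes practice.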
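
As a minimal end-to-end sketch of create, start and wait, the snippet below also calls `join()` so the main thread blocks until the worker has finished. The names `record_square`, `results` and `worker` are hypothetical, chosen for illustration only:

```python
import threading

results = []


def record_square(i):
    # Runs in the worker thread and stores its result in a shared list
    results.append(i * i)


worker = threading.Thread(name='worker', target=record_square, args=(3,))
worker.start()
worker.join()  # block until the worker has finished
print(results)  # [9]
```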

<center>[Previous example](118.md)    [Next example](120.md)</center>

================================================
FILE: md/12.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/2/15
```

#### 12 Rounding

Round a number; `ndigits` is the number of digits to keep after the decimal point:

```python
In [11]: round(10.0222222, 3)
Out[11]: 10.022

In [12]: round(10.05,1)
Out[12]: 10.1
```


<center>[Previous example](11.md)    [Next example](13.md)</center>

================================================
FILE: md/120.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/10/24
```

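#### 120 Threads taking turns on CPU time slices

To keep the explanation simple, assume a single-core machine, although for `cpython` this assumption is largely redundant, since the global interpreter lock lets only one thread execute Python bytecode at a time.

Create 3 threads and collect them in `threads`:

```python
import time
from datetime import datetime
import threading


def print_time():
    for _ in range(5): # print 5 times in each thread
        time.sleep(0.1) # simulate the work done before printing
        print('thread %s finished printing at %s'%(threading.current_thread().name,datetime.today()))


threads = [threading.Thread(name='t%d'%(i,),target=print_time) for i in range(3)]
```

Start the 3 threads:

```python
for t in threads:
    t.start()
```

The output is shown below. The three threads `t0`, `t1` and `t2` take turns getting CPU time slices according to the operating system's scheduling. Note that the order is not fixed: `t2` can be scheduled twice in a row.

```markdown
thread t0 finished printing at 2020-01-12 02:27:15.705235
thread t1 finished printing at 2020-01-12 02:27:15.705402
thread t2 finished printing at 2020-01-12 02:27:15.705687
thread t0 finished printing at 2020-01-12 02:27:15.805767
thread t1 finished printing at 2020-01-12 02:27:15.805886
thread t2 finished printing at 2020-01-12 02:27:15.806044
thread t0 finished printing at 2020-01-12 02:27:15.906200
thread t2 finished printing at 2020-01-12 02:27:15.906320
thread t1 finished printing at 2020-01-12 02:27:15.906433
thread t0 finished printing at 2020-01-12 02:27:16.006581
thread t1 finished printing at 2020-01-12 02:27:16.006766
thread t2 finished printing at 2020-01-12 02:27:16.007006
thread t2 finished printing at 2020-01-12 02:27:16.107564
thread t0 finished printing at 2020-01-12 02:27:16.107290
thread t1 finished printing at 2020-01-12 02:27:16.107741
```

<center>[Previous example](119.md)    [Next example](121.md)</center>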

================================================
FILE: md/121.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/10/25
```

#### 121 Multiple threads competing for the same variable

In multithreaded programs, threads can end up competing for the same variable.

In the example below, 10 threads compete for the global variable `a`:


```python
import threading


a = 0
def add1():
    global a    
    a += 1
    print('%s  adds a to 1: %d'%(threading.current_thread().name,a))
    
threads = [threading.Thread(name='t%d'%(i,),target=add1) for i in range(10)]
[t.start() for t in threads]
```

Output:

```python
t0  adds a to 1: 1
t1  adds a to 1: 2
t2  adds a to 1: 3
t3  adds a to 1: 4
t4  adds a to 1: 5
t5  adds a to 1: 6
t6  adds a to 1: 7
t7  adds a to 1: 8
t8  adds a to 1: 9
t9  adds a to 1: 10
```

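The result looks fine: each thread runs once and adds 1 to `a`, so `a` ends up as 10.

Running the code a dozen more times also gives the expected result.

So can we conclude that this code is thread-safe?

NO!

In multithreaded code, whenever several threads read and modify the same global variable at the same time and nothing is done to synchronize them, the code is not thread-safe.

Even so, some races expose the problem only with `extremely low` probability:

In this example, the problem would show up if, after thread 0 has modified `a`, some other thread still worked with the value from before the modification.


But here the modification `a += 1` takes so little time that, as the threads take turns, each one almost always sees the latest value of `a`. The probability of exposing the problem is therefore tiny.

<center>[Previous example](120.md)    [Next example](122.md)</center>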

================================================
FILE: md/122.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/10/28
```

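#### 122 The problem caused by threads racing on a variable

Once the cause is understood, making the problem appear is not hard.

Think of a database write, which usually takes a noticeable amount of time.

To simulate such a write, and to keep things simple, we only need to stretch the time it takes to modify the variable `a`; the problem then reproduces easily.

```python
import threading
import time


a = 0
def add1():
    global a    
    tmp = a + 1
    time.sleep(0.2) # sleep 0.2 s to simulate the time a write takes
    a = tmp
    print('%s  adds a to 1: %d'%(threading.current_thread().name,a))
    
threads = [threading.Thread(name='t%d'%(i,),target=add1) for i in range(10)]
[t.start() for t in threads]
```

Run the code again; a single run is enough to expose the problem completely: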

```python
t0  adds a to 1: 1
t1  adds a to 1: 1
t2  adds a to 1: 1
t3  adds a to 1: 1
t4  adds a to 1: 1
t5  adds a to 1: 1
t7  adds a to 1: 1
t6  adds a to 1: 1
t8  adds a to 1: 1
t9  adds a to 1: 1
```

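As shown, after all 10 threads have run, the value of `a` is what a single thread would have produced.

Why does this happen?

The example is convincing because of the 0.2-second sleep before `a` is modified. When one thread goes to sleep, the CPU is immediately handed to another thread. As the result shows, every thread had been scheduled before the first 0.2-second sleep ran out, so each thread read `a` as 0 and wrote back 1.


The three key lines are:

```python
tmp = a + 1
time.sleep(0.2) # sleep 0.2 s to simulate the time a write takes
a = tmp
```

<center>[Previous example](121.md)    [Next example](123.md)</center>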

================================================
FILE: md/123.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/11/1
```

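#### 123 Thread locks

Once the cause is known, fixing the problem is not complicated either.

Python provides a lock mechanism: when a piece of code must run in only one thread at a time, the thread takes the lock and other threads wait; once the lock is released, the waiting threads compete for it, run the code, release the lock, and so on.

Create a lock `locka`:

```python
import threading
import time


locka = threading.Lock()
```

`locka.acquire()` takes the lock and `locka.release()` releases it; the code between them runs in only one thread at a time.

```python
a = 0
def add1():
    global a    
    locka.acquire() # take the lock (outside try, so release only runs if it was acquired)
    try:
        tmp = a + 1
        time.sleep(0.2) # sleep 0.2 s to simulate the time a write takes
        a = tmp
    finally:
        locka.release() # release the lock
    print('%s  adds a to 1: %d'%(threading.current_thread().name,a))
    
threads = [threading.Thread(name='t%d'%(i,),target=add1) for i in range(10)]
[t.start() for t in threads]
```

The output is: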

```python
t0  adds a to 1: 1
t1  adds a to 1: 2
t2  adds a to 1: 3
t3  adds a to 1: 4
t4  adds a to 1: 5
t5  adds a to 1: 6
t6  adds a to 1: 7
t7  adds a to 1: 8
t8  adds a to 1: 9
t9  adds a to 1: 10
```

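Everything works, but the threads now effectively run one after another. For this example multithreading has lost its value, and it even adds the overhead of creating threads.

The program has only one lock, and `try...finally` guarantees it is always released, so no deadlock occurs. With several locks in play, however, deadlocks become easy to run into.

Choosing where to use locks and avoiding deadlocks are things to watch for in multithreaded development.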
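
A lock can also be used as a context manager, which pairs the acquire and release automatically. The sketch below is a variant under that assumption; the names `counter`, `lock`, `add_one` and `workers` are hypothetical, and the simulated delay is left out so it finishes quickly:

```python
import threading

counter = 0
lock = threading.Lock()


def add_one():
    global counter
    with lock:  # acquired on entry, released on exit, even if an exception is raised
        tmp = counter + 1
        counter = tmp


workers = [threading.Thread(target=add_one) for _ in range(10)]
for w in workers:
    w.start()
for w in workers:
    w.join()  # wait for every worker before reading the result
print(counter)  # 10
```

Joining the workers before printing makes sure the final value is read only after all updates are done.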

<center>[Previous example](122.md)    [Next example](124.md)</center>


================================================
FILE: md/124.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/11/16
```

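**124 Time string to struct and common format codes**

```python
In [66]: import time

In [67]: format_time = '2020-02-22 11:19:19'

In [68]: str_to_struct = time.strptime(format_time,'%Y-%m-%d %H:%M:%S')

In [69]: str_to_struct
Out[69]: time.struct_time(tm_year=2020, tm_mon=2, tm_mday=22, tm_hour=11, tm_min=19, tm_sec=19, tm_wday=5, tm_yday=53, tm_isdst=-1)
```

Finally, keep the common format codes in mind.

**Common format codes**

%m: month

%M: minute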

```markdown
    %Y  Year with century as a decimal number.
    %m  Month as a decimal number [01,12].
    %d  Day of the month as a decimal number [01,31].
    %H  Hour (24-hour clock) as a decimal number [00,23].
    %M  Minute as a decimal number [00,59].
    %S  Second as a decimal number [00,61].
    %z  Time zone offset from UTC.
    %a  Locale's abbreviated weekday name.
    %A  Locale's full weekday name.
    %b  Locale's abbreviated month name.
```

<center>[Previous example](123.md)    [Next example](125.md)</center>

================================================
FILE: md/125.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/11/14
```

#### 125 Find the position of the n-th occurrence

```python
def search_n(s, c, n):
    size = 0
    for i, x in enumerate(s):
        if x == c:
            size += 1
            # compare only after a match, so n = 0 does not return index 0
            if size == n:
                return i
    return -1


print(search_n("fdasadfadf", "a", 3))# 7
print(search_n("fdasadfadf", "a", 30))# -1
```

<center>[Previous example](124.md)    [Next example](126.md)</center>

================================================
FILE: md/126.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/11/23
```

#### 126 First n terms of the Fibonacci sequence

```python
def fibonacci(n):
    a, b = 1, 1
    for _ in range(n):
        yield a
        a, b = b, a + b


list(fibonacci(5))  # [1, 1, 2, 3, 5]
```

<center>[Previous example](125.md)    [Next example](127.md)</center>

================================================
FILE: md/127.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/11/30
```

#### 127 Find all duplicated elements

```python
from collections import Counter


def find_all_duplicates(lst):
    c = Counter(lst)
    return list(filter(lambda k: c[k] > 1, c))


find_all_duplicates([1, 2, 2, 3, 3, 3])  # [2,3]
```

<center>[Previous example](126.md)    [Next example](128.md)</center>

================================================
FILE: md/128.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/12/1
```

#### 128 Combined counting
Counter objects support arithmetic operations

```python
from collections import Counter
a = ['apple', 'orange', 'computer', 'orange']
b = ['computer', 'orange']

ca = Counter(a)
cb = Counter(b)
# Counter objects can be added together
ca + cb  # Counter({'orange': 3, 'computer': 2, 'apple': 1})


# Generalize: count the elements across any number of lists


def sumc(*c):
    if (len(c) < 1):
        return
    mapc = map(Counter, c)
    s = Counter([])
    for ic in mapc: # ic is a Counter object
        s += ic
    return s


#Counter({'orange': 3, 'computer': 3, 'apple': 1, 'abc': 1, 'face': 1})
sumc(a, b, ['abc'], ['face', 'computer'])

```
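
The same aggregation can be sketched with the built-in `sum`, assuming an empty `Counter` is passed as the start value so every addition is Counter plus Counter. The helper name `sum_counters` is hypothetical:

```python
from collections import Counter


def sum_counters(*lists):
    # Start from an empty Counter so each step adds two Counter objects
    return sum(map(Counter, lists), Counter())


total = sum_counters(['apple', 'orange'], ['orange'], ['computer'])
print(total['orange'])  # 2
```

Unlike `sumc`, this variant returns an empty `Counter` rather than `None` when called with no arguments.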

<center>[Previous example](127.md)    [Next example](129.md)</center>

================================================
FILE: md/129.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/12/3
```

#### 129 groupby on a single field

Weather records:

```python
a = [{'date': '2019-12-15', 'weather': 'cloud'},
 {'date': '2019-12-13', 'weather': 'sunny'},
 {'date': '2019-12-14', 'weather': 'cloud'}]
```

Group by the `weather` field:

```python
from itertools import groupby
for k, items in  groupby(a,key=lambda x:x['weather']):
     print(k)
```

The output shows that the grouping failed. The reason: `groupby` only groups consecutive items with equal keys, so the data must be `sorted` by the grouping field first. This is an easy trap to fall into.

```python
cloud
sunny
cloud
```

Revised code:

```python
a.sort(key=lambda x: x['weather'])
for k, items in  groupby(a,key=lambda x:x['weather']):
     print(k)
     for i in items:
         print(i)
```

Output:

```python
cloud
{'date': '2019-12-15', 'weather': 'cloud'}
{'date': '2019-12-14', 'weather': 'cloud'}
sunny
{'date': '2019-12-13', 'weather': 'sunny'}
```
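
When sorting the input is not wanted, one alternative sketch is to collect the groups in a `collections.defaultdict`, which does not depend on the order of the records. The names `records` and `groups` are hypothetical:

```python
from collections import defaultdict

records = [{'date': '2019-12-15', 'weather': 'cloud'},
           {'date': '2019-12-13', 'weather': 'sunny'},
           {'date': '2019-12-14', 'weather': 'cloud'}]

groups = defaultdict(list)
for rec in records:
    groups[rec['weather']].append(rec['date'])  # no sorting needed

print(dict(groups))
# {'cloud': ['2019-12-15', '2019-12-14'], 'sunny': ['2019-12-13']}
```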

<center>[Previous example](128.md)    [Next example](130.md)</center>

================================================
FILE: md/13.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/2/15
```

#### 13 Chained comparison

Python supports chaining comparison operators like this, which makes range checks more convenient to write

```python
i = 3
print(1 < i < 3)  # False
print(1 < i <= 3)  # True
```


<center>[Previous example](12.md)    [Next example](14.md)</center>


================================================
FILE: md/130.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/12/8
```

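#### 130 groupby with multiple sort fields

`itemgetter` is a class; `itemgetter('weather')` returns a callable object, and it accepts more than one argument:

```python
from operator import itemgetter
from itertools import groupby

# the weather records from example 129
a = [{'date': '2019-12-15', 'weather': 'cloud'},
 {'date': '2019-12-13', 'weather': 'sunny'},
 {'date': '2019-12-14', 'weather': 'cloud'}]

a.sort(key=itemgetter('weather', 'date'))
for k, items in groupby(a, key=itemgetter('weather')):
     print(k)
     for i in items:
         print(i)
```

The result is below; `a` is sorted by the two fields `weather` and `date`: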

```python
cloud
{'date': '2019-12-14', 'weather': 'cloud'}
{'date': '2019-12-15', 'weather': 'cloud'}
sunny
{'date': '2019-12-13', 'weather': 'sunny'}
```

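Note the subtle difference from the result of the previous example: within each group the records are now ordered by date, which is usually what we want.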
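
To see why two fields give a two-level sort, it helps to look at what the callable returns. A small sketch with a single hypothetical `record`:

```python
from operator import itemgetter

record = {'date': '2019-12-15', 'weather': 'cloud'}

single = itemgetter('weather')(record)         # one key: the value itself
pair = itemgetter('weather', 'date')(record)   # several keys: a tuple of values
print(single)  # cloud
print(pair)    # ('cloud', '2019-12-15')
```

Tuples compare element by element, so sorting by the tuple orders by `weather` first and by `date` within equal weather.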

<center>[Previous example](129.md)    [Next example](131.md)</center>

================================================
FILE: md/131.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/12/19
```

#### 131 itemgetter as a key function

Besides a `lambda`, the `key` function used by `sort` and `groupby` can be written in a shorter way with `itemgetter`:

```python
a = [{'date': '2019-12-15', 'weather': 'cloud'},
 {'date': '2019-12-13', 'weather': 'sunny'},
 {'date': '2019-12-14', 'weather': 'cloud'}]
from operator import itemgetter
from itertools import groupby

a.sort(key=itemgetter('weather'))
for k, items in groupby(a, key=itemgetter('weather')):
     print(k)
     for i in items:
         print(i)
```

Result:

```python
cloud
{'date': '2019-12-15', 'weather': 'cloud'}
{'date': '2019-12-14', 'weather': 'cloud'}
sunny
{'date': '2019-12-13', 'weather': 'sunny'}
```

<center>[Previous example](130.md)    [Next example](132.md)</center>

================================================
FILE: md/132.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/12/30
```

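#### 132 Compute and aggregate in one pass with sum

The first argument of Python's aggregate functions `sum`, `min` and `max` is an `iterable`. A typical use:

```python
a = [4,2,5,1]
sum([i+1 for i in a]) # 16
```

The list comprehension `[i+1 for i in a]` first builds a temporary list of the same length as `a`; only then does `sum` aggregate it.

If `a` held millions of elements, building such a temporary list would be wasteful. It is better to aggregate while computing, which takes only a small change:

```python
a = [4,2,5,1]
sum(i+1 for i in a) # 16
```

Here `i+1 for i in a` is shorthand for `(i+1 for i in a)`, which produces a generator (`generator`) object:

```python
In [8]: (i+1 for i in a)
Out[8]: <generator object <genexpr> at 0x000002AC7FFA8CF0>
```

On each step the generator yields (`yield`) one element, which is added to the running total before the next step, until the input is exhausted.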
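
The memory difference can be sketched with `sys.getsizeof`, which reports the size of the container object itself. The size `n` and the names `as_list` and `as_gen` are chosen only for illustration:

```python
import sys

n = 100_000
as_list = [i + 1 for i in range(n)]   # materializes n elements up front
as_gen = (i + 1 for i in range(n))    # produces one element at a time

smaller = sys.getsizeof(as_gen) < sys.getsizeof(as_list)
same_total = sum(as_gen) == sum(as_list)
print(smaller, same_total)  # True True
```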

<center>[Previous example](131.md)    [Next example](133.md)</center>

================================================
FILE: md/133.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2020/1/1
```

#### 133 Get the n days starting from a given date

```python
import calendar
from datetime import date

def getEverydaySince(year,month,day,n=10):
    i = 0
    _, days = calendar.monthrange(year, month)
    while i < n: 
        d = date(year,month,day)    
        if day == days:
            # last day of the month: roll over to the next month
            month,day = month+1,0
            if month == 13:
                # past December: roll over to January of the next year
                year,month = year+1,1
            _, days = calendar.monthrange(year, month)
        yield d
        day += 1
        i += 1
```

Test run:

```markdown
In [3]: for day in getEverydaySince(2020,2,1): 
   ...:     print(day)                                                                      
2020-02-01
2020-02-02
2020-02-03
2020-02-04
2020-02-05
2020-02-06
2020-02-07
2020-02-08
2020-02-09
2020-02-10
```
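
The month and year rollover can also be left to `datetime.timedelta`. A minimal alternative sketch, where the helper name `days_since` is hypothetical:

```python
from datetime import date, timedelta


def days_since(start, n=10):
    # Yield start and the following n - 1 days; date arithmetic handles rollover
    for offset in range(n):
        yield start + timedelta(days=offset)


span = list(days_since(date(2019, 12, 30), 3))
print(span[-1])  # 2020-01-01
```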


<center>[Previous example](132.md)    [Next example](134.md)</center>

================================================
FILE: md/134.md
================================================

```markdown
@author jackzhenguo
@desc
@tag yield generator
@version 
@date 2020/02/01
```

#### 134 Split a list into groups (generator version)

```python
from math import ceil

def divide_iter(lst, n):
    if n <= 0:
        yield lst
        return
    i, div = 0, ceil(len(lst) / n)
    while i < n:
        yield lst[i * div: (i + 1) * div]
        i += 1

list(divide_iter([1, 2, 3, 4, 5], 0))  # [[1, 2, 3, 4, 5]]
list(divide_iter([1, 2, 3, 4, 5], 2))  # [[1, 2, 3], [4, 5]]
```

<center>[Previous example](133.md)    [Next example](135.md)</center>

================================================
FILE: md/135.md
================================================

```markdown
@author jackzhenguo
@desc
@tag
@version 
@date 2020/02/02
```

#### 135 Fully flatten a nested list (generator version)
```python
# Flatten a multi-level nested list into a single-level list
a=[1,2,[3,4,[5,6],7],8,["python",6],9]
def function(lst):
    for i in lst:
        if isinstance(i, list):
            yield from function(i)
        else:
            yield i
print(list(function(a))) # [1, 2, 3, 4, 5, 6, 7, 8, 'python', 6, 9]
```

<center>[Previous example](134.md)    [Next example](136.md)</center>

================================================
FILE: md/136.md
================================================

```markdown
@author jackzhenguo
@desc
@tag decorator
@version 
@date 2020/02/03
```

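#### 136 Decorator for timing a function
```python
# Example decorator that measures a function's running time
import time
def timing_func(fn):
    def wrapper():
        start=time.time()
        fn()   # call the function passed in as fn
        stop=time.time()
        return (stop-start)
    return wrapper
```

Using the decorator: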

```python
@timing_func
def test_list_append():
    lst=[]
    for i in range(0,100000):
        lst.append(i)  
@timing_func
def test_list_compre():
    [i for i in range(0,100000)]  # list comprehension
    
a=test_list_append()
c=test_list_compre()

print("test list append time:",a)
print("test list comprehension time:",c)
print("append/compre:",round(a/c,3))

#test list append time: 0.0219423770904541
#test list comprehension time: 0.007980823516845703
#append/compre: 2.749
```
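
`timing_func` only fits functions without arguments and discards their return value. A more general sketch, assuming we want both the result and the elapsed time back, is shown below. The names `timed` and `add` are hypothetical, and `time.perf_counter` is swapped in as the clock:

```python
import functools
import time


def timed(fn):
    # Forwards arguments and returns (result, elapsed seconds)
    @functools.wraps(fn)  # keep the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = fn(*args, **kwargs)
        elapsed = time.perf_counter() - start
        return result, elapsed
    return wrapper


@timed
def add(x, y):
    return x + y


result, elapsed = add(1, 2)
print(result)        # 3
print(add.__name__)  # add
```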


<center>[Previous example](135.md)    [Next example](137.md)</center>

================================================
FILE: md/137.md
================================================

```markdown
@author jackzhenguo
@desc
@tag
@version 
@date 2020/02/04
```

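#### 137 Decorator that counts exceptions


Write a decorator that measures how long it takes for an exception to occur a given number of times.

```python
import time


def excepter(f):
    i = 0
    t1 = time.time()
    def wrapper(): 
        try:
            f()
        except Exception as e:
            nonlocal i
            i += 1
            print(f'{e.args[0]}: {i}')
            t2 = time.time()
            if i == n:  # n is a module-level variable that must exist before the first call
                print(f'spending time:{round(t2-t1,2)}')
    return wrapper

```

The keyword `nonlocal` is used in nested functions to declare that `i` is not a local variable.

Without the declaration, the assignment in `i += 1` makes `i` a local variable of `wrapper`; since `i += 1` reads `i` before any local value has been assigned, Python raises `UnboundLocalError`.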
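
The effect of `nonlocal` can be sketched in isolation with two small closures; `make_counter` and `make_broken_counter` are hypothetical names used only for this illustration:

```python
def make_counter():
    count = 0

    def increment():
        nonlocal count  # rebind the variable of the enclosing function
        count += 1
        return count

    return increment


def make_broken_counter():
    count = 0

    def increment():
        count += 1  # no nonlocal: count is local here and is read before assignment
        return count

    return increment


counter = make_counter()
print(counter(), counter())  # 1 2

try:
    make_broken_counter()()
except UnboundLocalError as e:
    print(type(e).__name__)  # UnboundLocalError
```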

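Now use the decorator `excepter`; `n` is the number of times the exception should occur.

Two common exceptions are tested: `division by zero` and `list index out of range`.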

```python
n = 10 # except count

@excepter
def divide_zero_except():
    time.sleep(0.1)
    j = 1/(40-20*2)

# test the division-by-zero exception
for _ in range(n):
    divide_zero_except()


@excepter
def outof_range_except():
    a = [1,3,5]
    time.sleep(0.1)
    print(a[3])
# test out of range except
for _ in range(n):
    outof_range_except()

```

打印出来的结果如下：

```python
division by zero: 1
division by zero: 2
division by zero: 3
division by zero: 4
division by zero: 5
division by zero: 6
division by zero: 7
division by zero: 8
division by zero: 9
division by zero: 10
spending time:1.01
list index out of range: 1
list index out of range: 2
list index out of range: 3
list index out of range: 4
list index out of range: 5
list index out of range: 6
list index out of range: 7
list index out of range: 8
list index out of range: 9
list index out of range: 10
spending time:1.01
```



<center>[Previous example](136.md)    [Next example](138.md)</center>

================================================
FILE: md/138.md
================================================

```markdown
@author jackzhenguo
@desc
@tag
@version 
@date 2020/02/05
```

#### 138 Decorators in plain terms

Here is another decorator:

```python
def call_print(f):
  def g():
    print('you\'re calling %s function'%(f.__name__,))
  return g
```

Use the `call_print` decorator:

```python
@call_print
def myfun():
  pass
 
@call_print
def myfun2():
  pass
```

Calling the decorated functions gives:

```python
In [27]: myfun()
you're calling myfun function

In [28]: myfun2()
you're calling myfun2 function
```

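**Using call_print**

As you can see, placing `@call_print` above any newly defined function makes every call print a line naming the function being called.

Why is that? Look closely at the `call_print` function (with the @ in front, it acts as a decorator):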

```python
def call_print(f):
  def g():
    print('you\'re calling %s function'%(f.__name__,))
  return g
```

It must accept a function `f` and return another function `g`.

**What a decorator really is**

In essence, it is equivalent to the following way of calling:

```
def myfun():
  pass

def myfun2():
  pass
  
def call_print(f):
  def g():
    print('you\'re calling %s function'%(f.__name__,))
  return g
```

The most important code is this:

```
myfun = call_print(myfun)
myfun2 = call_print(myfun2)
```

See what happens? `call_print(myfun)` returns a function, which is then assigned back to `myfun`.

Calling myfun and myfun2 again gives:

```python
In [32]: myfun()
you're calling myfun function

In [33]: myfun2()
you're calling myfun2 function
```

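As you can see, this has exactly the same effect as the decorator. The decorator syntax is more direct, since it saves writing the explicit assignments `myfun = call_print(myfun)` and `myfun2 = call_print(myfun2)`, but at first glance this wrapping can be hard to grasp.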
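
Note that `g` in `call_print` only prints and never calls `f`, so the body of the decorated function does not run. A minimal sketch of a variant that prints and then runs the wrapped function is shown below; `call_print_and_run` and `double` are hypothetical names:

```python
import functools


def call_print_and_run(f):
    # Print the message, then call the wrapped function and pass its result on
    @functools.wraps(f)
    def g(*args, **kwargs):
        print("you're calling %s function" % (f.__name__,))
        return f(*args, **kwargs)
    return g


@call_print_and_run
def double(x):
    return 2 * x


value = double(21)  # prints: you're calling double function
print(value)        # 42
```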

<center>[Previous example](137.md)    [Next example](139.md)</center>

================================================
FILE: md/139.md
================================================

```markdown
@author jackzhenguo
@desc
@tag
@version 
@date 2020/02/06
```
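#### 139 Custom decrementing iterator

```python
from collections.abc import Iterator

# An iterator that counts down from a positive integer, one step at a time, until 0.
class Descend(Iterator):
    def __init__(self,N):
        self.N=N
        self.a=0
    def __iter__(self):
        return self 
    def __next__(self):
        if self.a<self.N:
            self.N-=1
            return self.N
        raise StopIteration
    
descend_iter=Descend(10)
print(list(descend_iter))
# [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
```

Key points:

1 The method name `__next__` cannot be changed; it holds the custom iteration logic

2 `raise StopIteration`: raising this exception signals the end of iteration, as the iterator protocol requires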
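
For comparison, the same countdown can be sketched as a generator function, where Python supplies `__iter__`, `__next__` and the final `StopIteration` automatically. The name `descend` is hypothetical:

```python
def descend(n):
    # Generator version: yields n - 1, n - 2, ..., 0
    while n > 0:
        n -= 1
        yield n


print(list(descend(10)))  # [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
```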
    

<center>[Previous example](138.md)    [Next example](140.md)</center>

================================================
FILE: md/14.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/2/15
```

#### 14 String to bytes

Convert a string to the bytes type

```python
In [12]: s = "apple"                                                            

In [13]: a = bytes(s,encoding='utf-8')   
In [14]: a 
Out[14]: b'apple'

# After the conversion, a is a byte sequence of type bytes,
# and each character has become a number, as shown below
In [15]: for i in a: 
    ...:     print(i)                                                                       
97
112
112
108
101
```


<center>[Previous example](13.md)    [Next example](15.md)</center>


================================================
FILE: md/140.md
================================================

```markdown
@author jackzhenguo
@desc
@tag
@version 
@date 2020/02/07
```
		     
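#### 140 Draw the Olympic rings with turtle

The turtle drawing functions are easy to use; the name of a function usually tells you what it does. Below, turtle draws the Olympic rings in only about 15 lines of code.

1 Import the library

```python
import turtle as p
```

2 Define the circle-drawing function

```python
def drawCircle(x,y,c='red'):
    p.pu()# lift the pen
    p.goto(x,y) # starting position of the circle
    p.pd()# put the pen down
    p.color(c)# draw the ring in color c
    p.circle(30,360) # draw a circle: radius, angle
```

3 Basic pen settings

```python
p.pensize(3) # set the pen size to 3
```

4 Draw the five rings

Call the circle-drawing function

```python
drawCircle(0,0,'blue')
drawCircle(60,0,'black')
drawCircle(120,0,'red')
drawCircle(90,-30,'green')
drawCircle(30,-30,'yellow')    

p.done()
```

<center>[Previous example](139.md)    [Next example](141.md)</center>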

================================================
FILE: md/141.md
================================================

```markdown
@author jackzhenguo
@desc
@tag
@version 
@date 2020/02/08
```
		     
#### 141 Draw a sky full of snowflakes with turtle


Import the modules

Import the `turtle` library and Python's `random`

```python
import turtle as p
import random
```

Draw the snowflakes

```python
def snow(snow_count):
    p.hideturtle()
    p.speed(500)
    p.pensize(2)
    for i in range(snow_count):
        r = random.random()
        g = random.random()
        b = random.random()
        p.pencolor(r, g, b)
        p.pu()
        p.goto(random.randint(-350, 350), random.randint(1, 270))
        p.pd()
        dens = random.randint(8, 12)
        snowsize = random.randint(10, 14)
        for _ in range(dens):
            p.forward(snowsize)  # move snowsize pixels in the current heading
            p.backward(snowsize)  # move snowsize pixels in the opposite direction
            p.right(360 / dens)  # turn clockwise by 360 / dens degrees

```

Draw the ground

```python
def ground(ground_line_count):
    p.hideturtle()
    p.speed(500)
    for i in range(ground_line_count):
        p.pensize(random.randint(5, 10))
        x = random.randint(-400, 350)
        y = random.randint(-280, -1)
        r = -y / 280
        g = -y / 280
        b = -y / 280
        p.pencolor(r, g, b)
        p.penup()  # lift the pen
        p.goto(x, y)  # move the pen to this position
        p.pendown()  # put the pen down
        p.forward(random.randint(40, 100))  # move forward 40 to 100 units in the current heading
```

Main function

```python
def main():
    p.setup(800, 600, 0, 0)
    # p.tracer(False)
    p.bgcolor("black")
    snow(30)
    ground(30)
    # p.tracer(True)
    p.mainloop()

main()
```

<center>[Previous example](140.md)    [Next example](142.md)</center>

================================================
FILE: md/142.md
================================================

```markdown
@author jackzhenguo
@desc
@tag
@version 
@date 2020/02/09
```

```python
import hashlib
import pandas as pd
from wordcloud import WordCloud
```

```python
geo_data=pd.read_excel(r"../data/geo_data.xlsx")
print(geo_data)
# 0     深圳
# 1     深圳
# 2     深圳
# 3     深圳
# 4     深圳
# 5     深圳
# 6     深圳
# 7     广州
# 8     广州
# 9     广州
```

```python
# keep only the non-empty values
words = ','.join(x for x in geo_data['city'] if x != []) 
wc = WordCloud(
    background_color="green", # background color: green
    max_words=100, # maximum number of words to show
    font_path='./fonts/simhei.ttf', # font that can render Chinese characters
    min_font_size=5,
    max_font_size=100,
    width=500  # image width
    )
```

```python
x = wc.generate(words)
x.to_file('../data/geo_data.png') 
```

<center>[Previous example](141.md)    [Next example](143.md)</center>

================================================
FILE: md/143.md
================================================

```markdown
@author jackzhenguo
@desc
@tag
@version 
@date 2020/02/10
```

```python
# bar chart + line chart
import plotly.graph_objects as go
fig = go.Figure()
fig.add_trace(
    go.Scatter(
        x=[0, 1, 2, 3, 4, 5],
        y=[1.5, 1, 1.3, 0.7, 0.8, 0.9]
    ))
fig.add_trace(
    go.Bar(
        x=[0, 1, 2, 3, 4, 5],
        y=[2, 0.5, 0.7, -1.2, 0.3, 0.4]
    ))
fig.show()
```    

<center>[Previous example](142.md)    [Next example](144.md)</center>

================================================
FILE: md/144.md
================================================

```markdown
@author jackzhenguo
@desc
@tag
@version 
@date 2020/02/11
```

```python
# import the libraries
import seaborn as sns
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# generate the data set
data = np.random.random((6,6))
np.fill_diagonal(data,np.ones(6))
features = ["prop1","prop2","prop3","prop4","prop5", "prop6"]
data = pd.DataFrame(data, index = features, columns=features)
print(data)
# draw the heat map
heatmap_plot = sns.heatmap(data, center=0, cmap='gist_rainbow')
plt.show()
```    

<center>[Previous example](143.md)    [Next example](145.md)</center>

================================================
FILE: md/145.md
================================================

```markdown
@author jackzhenguo
@desc
@tag
@version 
@date 2020/02/12
```

```python
from pyecharts import charts

# gauge chart
gauge = charts.Gauge()
gauge.add('Python small examples', [('Python machine learning', 30), ('Python basics', 70.),
                        ('Python regex', 90)])
gauge.render(path="./data/仪表盘.html")
print('ok')
```  

The gauge holds three items, at 30%, 70% and 90%. By default it displays the first item: Python machine learning, 30% complete.

<center>[Previous example](144.md)    [Next example](146.md)</center>

================================================
FILE: md/146.md
================================================

```markdown
@author jackzhenguo
@desc
@tag
@version 
@date 2020/02/13
```

```python
from pyecharts import options as opts
from pyecharts.charts import Funnel, Page
from random import randint

def funnel_base() -> Funnel:
  c = (
    Funnel()
    .add("豪车", [list(z) for z in zip(['宝马', '法拉利', '奔驰', '奥迪', '大众', '丰田', '特斯拉'],
                 [randint(1, 20) for _ in range(7)])])
    .set_global_opts(title_opts=opts.TitleOpts(title="豪车漏斗图"))
  )
  return c
funnel_base().render('./img/car_fnnel.html')  
```

<center>[Previous example](145.md)    [Next example](147.md)</center>

================================================
FILE: md/147.md
================================================

```markdown
@author jackzhenguo
@desc
@tag
@version 
@date 2020/02/14
```

```python
from pyecharts import options as opts
from pyecharts.charts import Liquid, Page
from pyecharts.globals import SymbolType

def liquid() -> Liquid:
    c = (
        Liquid()
        .add("lq", [0.67, 0.30, 0.15])
        .set_global_opts(title_opts=opts.TitleOpts(title="Liquid"))
    )
    return c

liquid().render('./img/liquid.html')
```    

<center>[Previous example](146.md)    [Next example](148.md)</center>

================================================
FILE: md/148.md
================================================

```markdown
@author jackzhenguo
@desc
@tag
@version 
@date 2020/02/15
```

```python
from pyecharts import options as opts
from pyecharts.charts import Pie
from random import randint

def pie_base() -> Pie:
    c = (
        Pie()
        .add("", [list(z) for z in zip(['宝马', '法拉利', '奔驰', '奥迪', '大众', '丰田', '特斯拉'],
                                       [randint(1, 20) for _ in range(7)])])
        .set_global_opts(title_opts=opts.TitleOpts(title="Pie-基本示例"))
        .set_series_opts(label_opts=opts.LabelOpts(formatter="{b}: {c}"))
    )
    return c

pie_base().render('./img/pie_pyecharts.html')   
```

<center>[Previous example](147.md)    [Next example](149.md)</center>

================================================
FILE: md/149.md
================================================

```markdown
@author jackzhenguo
@desc
@tag
@version 
@date 2020/02/16
```

```python
import random
from pyecharts import options as opts
from pyecharts.charts import Page, Polar

def polar_scatter0() -> Polar:
    data = [(alpha, random.randint(1, 100)) for alpha in range(101)] # r = random.randint(1, 100)
    print(data)
    c = (
        Polar()
        .add("", data, type_="bar", label_opts=opts.LabelOpts(is_show=False))
        .set_global_opts(title_opts=opts.TitleOpts(title="Polar"))
    )
    return c

polar_scatter0().render('./img/polar.html')
```    

<center>[Previous example](148.md)    [Next example](150.md)</center>

================================================
FILE: md/15.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/2/15
```

#### 15 Convert any object to a string

```python
                                                              

In [1]: str(100)                                                                 
Out[1]: '100'

In [2]: str([3,2,10])                                                           
Out[2]: '[3, 2, 10]'

In [3]: str({'a':1, 'b':10})                                                    
Out[3]: "{'a': 1, 'b': 10}"

In [11]: from collections import defaultdict                                    
In [12]: dd = defaultdict(int)                                                  

In [14]: for i in [1,3,2,2,3,3]: 
    ...:     dd[i] += 1 
    ...:                                                                        

In [15]: dd                                                                     
Out[15]: defaultdict(int, {1: 1, 3: 3, 2: 2})

In [16]: str(dd)                                                                
Out[16]: "defaultdict(<class 'int'>, {1: 1, 3: 3, 2: 2})"

```
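
For user-defined classes, what `str()` returns depends on the special methods the class defines. The sketch below uses a hypothetical `Point` class (not part of the original example) to show that `str()` prefers `__str__`, while containers render their items with `__repr__`.

```python
class Point:
    """Hypothetical class used only for this illustration."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        return f"Point({self.x}, {self.y})"

    def __str__(self):
        return f"({self.x}, {self.y})"


p = Point(1, 2)
as_text = str(p)      # uses __str__
as_repr = repr(p)     # uses __repr__
in_list = str([p])    # containers show their items with __repr__
print(as_text, as_repr, in_list)
```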

<center>[Previous example](14.md)    [Next example](16.md)</center>


================================================
FILE: md/150.md
================================================

```markdown
@author jackzhenguo
@desc
@tag
@version 
@date 2020/02/17
```

```python
from pyecharts import options as opts
from pyecharts.charts import Page, WordCloud
from pyecharts.globals import SymbolType

words = [
    ("Python", 100),
    ("C++", 80),
    ("Java", 95),
    ("R", 50),
    ("JavaScript", 79),
    ("C", 65)
]

def wordcloud() -> WordCloud:
    c = (
        WordCloud()
        # word_size_range: range of word font sizes
        .add("", words, word_size_range=[20, 100], shape='cardioid')
        .set_global_opts(title_opts=opts.TitleOpts(title="WordCloud"))
    )
    return c

wordcloud().render('./img/wordcloud.html')     
```

<center>[Previous example](149.md)    [Next example](151.md)</center>

================================================
FILE: md/151.md
================================================

```markdown
@author jackzhenguo
@desc
@tag
@version 
@date 2020/02/18
```

```python
import random
from pyecharts import options as opts
from pyecharts.charts import HeatMap

def heatmap_car() -> HeatMap:
    x = ['宝马', '法拉利', '奔驰', '奥迪', '大众', '丰田', '特斯拉']
    y = ['中国','日本','南非','澳大利亚','阿根廷','阿尔及利亚','法国','意大利','加拿大']
    value = [[i, j, random.randint(0, 100)]
             for i in range(len(x)) for j in range(len(y))]
    c = (
        HeatMap()
        .add_xaxis(x)
        .add_yaxis("销量", y, value)
        .set_global_opts(
            title_opts=opts.TitleOpts(title="HeatMap"),
            visualmap_opts=opts.VisualMapOpts(),
        )
    )
    return c

heatmap_car().render('./img/heatmap_pyecharts.html') 
```

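A heat map really describes a three-dimensional relationship: the x axis is the car model, the y axis is the country, and the color of each cell encodes the sales figure. The color scale is shown in the lower-left corner, and the redder the color, the higher the sales.

<center>[Previous example](150.md)    [Next example](152.md)</center>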

================================================
FILE: md/152.md
================================================

```markdown
@author jackzhenguo
@desc
@tag
@version 
@date 2020/02/19
```

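### Drawing animations with matplotlib

matplotlib is the most classic plotting package in Python, and its animation module can draw animations.

First, import the modules this example uses: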

```python
from matplotlib import pyplot as plt
from matplotlib import animation
from random import randint, random
```

Generate the data: `frames_count` is the number of frames and `data_count` is the number of bars in each frame.

```python 
class Data:
    data_count = 32
    frames_count = 2

    def __init__(self, value):
        self.value = value
        self.color = (0.5, random(), random()) #rgb

    # Build the data
    @classmethod
    def create(cls):
        return [[Data(randint(1, cls.data_count)) for _ in range(cls.data_count)]
                for frame_i in range(cls.frames_count)]

```

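Draw the animation: the argument `fi` of the callback passed to `animation.FuncAnimation` is the index of the current frame. Remember to call `axs.cla()` to clear the previous frame.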

```python
def draw_chart():
    fig = plt.figure(1, figsize=(16, 9))
    axs = fig.add_subplot(111)
    axs.set_xticks([])
    axs.set_yticks([])

    # Generate the data
    frames = Data.create()

    def animate(fi):
        axs.cla()  # clear last frame
        axs.set_xticks([])
        axs.set_yticks([])
        return axs.bar(list(range(Data.data_count)),        # X
                       [d.value for d in frames[fi]],       # Y
                       1,                                   # width
                       color=[d.color for d in frames[fi]]  # color
                       )
    # Show the animation
    anim = animation.FuncAnimation(fig, animate, frames=len(frames))
    plt.show()
```

```python
draw_chart()
```

<center>[Previous example](151.md)    [Next example](153.md)</center>

================================================
FILE: md/153.md
================================================

```markdown
@author jackzhenguo
@desc
@tag
@version 
@date 2020/02/20
```

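### Learn to draw a pairplot
seaborn is a plotting library built on matplotlib that offers a higher-level plotting interface.

This example uses seaborn to draw a pairplot.

A pairplot shows the relationships between every pair of features at a glance. It helps build a first impression of a data set and supports classification and clustering tasks.

Load the classic Iris data set with sklearn. It has 150 records, 4 features, and a target with three distinct values, as shown below: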

```markdown
     sepal_length  sepal_width  petal_length  petal_width    species
0             5.1          3.5           1.4          0.2     setosa
1             4.9          3.0           1.4          0.2     setosa
2             4.7          3.2           1.3          0.2     setosa
3             4.6          3.1           1.5          0.2     setosa
4             5.0          3.6           1.4          0.2     setosa
..            ...          ...           ...          ...        ...
145           6.7          3.0           5.2          2.3  virginica
146           6.3          2.5           5.0          1.9  virginica
147           6.5          3.0           5.2          2.0  virginica
148           6.2          3.4           5.4          2.3  virginica
149           5.9          3.0           5.1          1.8  virginica
```

Use seaborn to draw the relationship matrix between the two features sepal_length and petal_length:

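```python
from sklearn.datasets import load_iris
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

sns.set(style="ticks")

# Build the DataFrame shown above; the original snippet used `df` without defining it
iris = load_iris()
df = pd.DataFrame(iris.data, columns=["sepal_length", "sepal_width",
                                      "petal_length", "petal_width"])
df["species"] = pd.Categorical.from_codes(iris.target, iris.target_names)

df02 = df.iloc[:,[0,2,4]] # select one pair of features plus the species column
sns.pairplot(df02)
plt.show()
```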


Color the points by species:
```python
sns.pairplot(df02, hue="species")
plt.show()
```

Draw the scatter matrix of all features:
```python
sns.pairplot(df, hue="species")
plt.show()
```
   

<center>[Previous example](152.md)    [Next example](154.md)</center>

================================================
FILE: md/154.md
================================================

```markdown
@author jackzhenguo
@desc
@tag
@version 
@date 2020/02/21
```

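### Tuple with a single element
Some Python functions take a tuple as a parameter. When that tuple holds only one element, creating it like this is wrong:
```python
c = (5) # NO!
```

This actually creates the integer 5. A comma must follow the element:

```python
c = (5,) # YES!
```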
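
A quick check, as a minimal sketch, confirms the difference: the parentheses alone do not make a tuple, the comma does. The variable names below are illustrative only.

```python
not_a_tuple = (5)    # parentheses only group the expression
one_tuple = (5,)     # the trailing comma makes a tuple
also_tuple = 5,      # the parentheses are optional

print(type(not_a_tuple).__name__)
print(type(one_tuple).__name__)
print(len(one_tuple))
```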

<center>[Previous example](153.md)    [Next example](155.md)</center>

================================================
FILE: md/155.md
================================================

```markdown
@author jackzhenguo
@desc
@tag
@version 
@date 2020/02/22
```

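### Empty container as a default argument

A function has a default argument whose type is a container, and the default is set to an empty container: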

```python
def f(a,b=[]):  # NO!
    print(b)
    return b
```

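```python 
ret = f(1)
ret.append(1)
ret.append(2)

# Calling f(1) again, you would expect it to print []
f(1)
# but it prints [1, 2]
```

This is the mutable default argument pitfall. Always set such default arguments to None:

```python
def f(a,b=None): # YES!
    pass
```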
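
The `None` default is usually paired with a check inside the function body. The following is a minimal sketch of that idiom; the function name `append_item` and its parameters are hypothetical and only serve the illustration.

```python
def append_item(item, bucket=None):
    """Append item to bucket, creating a fresh list when none is given."""
    if bucket is None:
        bucket = []          # a new list on every call
    bucket.append(item)
    return bucket


first = append_item(1)
second = append_item(2)
print(first, second)
```

Because the list is created inside the body, each call without `bucket` gets its own list and earlier calls no longer leak into later ones.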

<center>[Previous example](154.md)    [Next example](156.md)</center>

================================================
FILE: md/156.md
================================================

```markdown
@author jackzhenguo
@desc
@tag
@version 
@date 2020/02/23
```

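### The unbound shared variable pitfall
Sometimes several functions should share one global variable, but one of them tries to modify it, which makes Python treat it as a local variable: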

```python
i = 1
def f():
    i+=1 #NO!
    
def g():
    print(i)

```

Declare `i` as a global variable explicitly inside `f`:

```python 
i = 1
def f():
    global i # YES!
    i+=1
```
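
To see the failure concretely, the sketch below calls both variants. The names `counter`, `broken_increment`, and `working_increment` are made up for this illustration and the script is assumed to run at module level.

```python
counter = 1


def broken_increment():
    counter += 1          # the assignment makes `counter` local to this function


def working_increment():
    global counter        # rebind the module-level name instead
    counter += 1


try:
    broken_increment()
except UnboundLocalError as exc:
    error_name = type(exc).__name__
    print(error_name)

working_increment()
print(counter)
```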


<center>[Previous example](155.md)    [Next example](157.md)</center>

================================================
FILE: md/157.md
================================================

```markdown
@author jackzhenguo
@desc
@tag
@version 
@date 2020/02/24
```

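### The lambda free variable pitfall
The key functions for sorting and grouping are often written as lambdas because they are more concise, but there is a trap that beginners fall into easily:
```python
a = [lambda x: x+i for i in range(3)] # NO!
for f in a:
    print(f(1))
# You might expect the output: 1,2,3
```

The actual output is 3,3,3. The `i` used in the lambda is a free variable, and its value is only looked up when the lambda is called. By then the loop has finished and `i` is 2. It is the same reason the code below prints 2, which should come as no surprise:

```python 
a = 0
a = 1
a = 2
def f():
    print(a)

f() # prints 2
```

The correct approach is to turn the free variable into a default argument of the lambda: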
```python
a = [lambda x,i=i: x+i for i in range(3)] # YES!    
```
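
The three behaviors can be compared side by side. This is a sketch only; `functools.partial` is shown as one alternative way to bind the value early and is not part of the original example.

```python
from functools import partial
from operator import add

late = [lambda x: x + i for i in range(3)]         # all share the final i
bound = [lambda x, i=i: x + i for i in range(3)]   # i captured per lambda
partials = [partial(add, i) for i in range(3)]     # same effect without a lambda

late_results = [f(1) for f in late]
bound_results = [f(1) for f in bound]
partial_results = [f(1) for f in partials]
print(late_results, bound_results, partial_results)
```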

<center>[Previous example](156.md)    [Next example](158.md)</center>

================================================
FILE: md/158.md
================================================

```markdown
@author jackzhenguo
@desc
@tag
@version 
@date 2020/02/25
```

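### Pitfalls of the various argument types

Python is powerful and flexible, partly thanks to the variety of function parameter types. That convenience also brings more rules for the user. Without knowing these rules, a function call may raise errors such as the following: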

*(1) SyntaxError: positional argument follows keyword argument*


*(2) TypeError: f() missing 1 required keyword-only argument: 'b'*


*(3) SyntaxError: keyword argument repeated*

*(4) TypeError: f() missing 1 required positional argument: 'b'*

*(5) TypeError: f() got an unexpected keyword argument 'a'*

*(6) TypeError: f() takes 0 positional arguments but 1 was given*


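Here is a summary of the main rules for using arguments.

Positional arguments

Definition of `positional arguments`: in a `function call`, arguments are passed according to the positions of the parameters in the function definition. This is the most common kind of argument.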

```python
def f(a):
  return a

f(1) # positional argument
```
Positional arguments cannot be omitted:
```python
def f(a,b):
  pass

f(1) # TypeError: f() missing 1 required positional argument: 'b'
```

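**Rule 1: positional arguments must match the parameters one to one, and none may be missing**

Keyword arguments

In a function call, values are passed to parameters as key-value pairs, so they need not follow the parameter positions.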

```python
def f(a):
  print(f'a:{a}')
```
Called like this, `a` is a keyword argument:
```python
f(a=1)
```
But the following call is not OK:
```python
f(a=1,20.) # SyntaxError: positional argument follows keyword argument
```

**Rule 2: keyword arguments must come to the right of positional arguments**


The following call is not OK either:
```python
f(1,width=20.,width=30.) #SyntaxError: keyword argument repeated

```

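**Rule 3: the same parameter cannot be given a value more than once**


Default arguments

When defining a function, parameters can be given default values. For a parameter with a default value, the value passed in the call is used if there is one, otherwise the default is used. Below, `b` is a default parameter: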
```python
def f(a,b=1):
  print(f'a:{a}, b:{b}')

```


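**Rule 4: in both the function definition and the call, default parameters must come to the right of positional parameters**

The default value is assigned only once, at definition time, so default arguments should usually be immutable types


Variable positional arguments

The parameter `a` defined below is a variable positional parameter: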
```python
def f(*a):
  print(a)
```
How to call it:
```python
f(1) # prints a tuple: (1,)
f(1,2,3) # prints: (1, 2, 3)
```

However, it cannot be called like this:
```python
f(a=1) # TypeError: f() got an unexpected keyword argument 'a'
```


Variable keyword arguments

Below, `a` is a variable keyword parameter:
```python
def f(**a):
  print(a)
```
How to call it:
```python
f(a=1) # prints a dict: {'a': 1}
f(a=1,b=2,width=3) # prints: {'a': 1, 'b': 2, 'width': 3}
```

However, it cannot be called like this:
```python
f(1) # TypeError: f() takes 0 positional arguments but 1 was given
```
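
Error (2) in the list at the top, the missing keyword-only argument, is the one case the sections above do not demonstrate. As a minimal sketch, a parameter placed after a bare `*` can only be passed by keyword. The function `area` and its parameters are hypothetical names chosen for this illustration.

```python
def area(width, *, height):
    """`height` follows a bare `*`, so it can only be passed by keyword."""
    return width * height


ok = area(2, height=3)

try:
    area(2)                  # height is missing
except TypeError as exc:
    missing_msg = str(exc)

try:
    area(2, 3)               # height passed by position
except TypeError as exc:
    positional_msg = str(exc)

print(ok)
print(missing_msg)
print(positional_msg)
```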


<center>[Previous example](157.md)    [Next example](159.md)</center>

================================================
FILE: md/159.md
================================================

```markdown
@author jackzhenguo
@desc
@tag
@version 
@date 2020/02/26
```

### The list deletion pitfall

Delete an element from a list, where the element may occur several times:

```python
def del_item(lst,e):
    return [lst.remove(i) for i in lst if i==e] # NO!
```

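Take deleting the element 3 from the sequence [1,3,3,3,5]. Only two of the three occurrences are deleted, because each removal shifts the remaining elements left and the iterator skips the element that follows:

```python
lst = [1,3,3,3,5]
del_item(lst,3) # lst is now [1,3,5]
```

The correct approach:

```python
def del_item(lst,e):
    d = dict(zip(range(len(lst)),lst)) # YES! build a dict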
    return [v for k,v in d.items() if v!=e]

```  
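
A plain list comprehension gives the same result without the intermediate dict. The sketch below shows a variant that returns a new list and a variant that updates the list in place; `del_item_in_place` is a hypothetical helper name introduced here.

```python
def del_item(lst, e):
    """Return a new list without any occurrence of e."""
    return [item for item in lst if item != e]


def del_item_in_place(lst, e):
    """Remove every occurrence of e from lst itself."""
    lst[:] = [item for item in lst if item != e]


values = [1, 3, 3, 3, 5]
filtered = del_item(values, 3)
del_item_in_place(values, 3)
print(filtered, values)
```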

<center>[Previous example](158.md)    [Next example](160.md)</center>


================================================
FILE: md/16.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/2/15
```

#### 16 Execute code held in a string

Compile a string into code that Python can recognize or execute. Text can also be read into a string and then compiled.

```python
In [1]: s  = "print('helloworld')"
    
In [2]: r = compile(s,"<string>", "exec")
    
In [3]: r
Out[3]: <code object <module> at 0x0000000005DE75D0, file "<string>", line 1>
    
In [4]: exec(r)
helloworld

s  = """
def f():
    a = 100 % 52
    print(a)
f()
"""
r = compile(s,"<string>", "exec")
exec(r)
```

Output:
48
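
The third argument of `compile` selects the mode. As a small sketch, `"eval"` compiles a single expression whose value `eval` returns, while `"exec"` compiles statements. Passing a dict to `exec` (the name `namespace` is just illustrative) keeps the names the code defines out of the current scope.

```python
expr_code = compile("3 * 7 + 1", "<string>", "eval")               # a single expression
result = eval(expr_code)

namespace = {}
stmt_code = compile("total = sum(range(5))", "<string>", "exec")   # statements
exec(stmt_code, namespace)                                         # names land in the dict
print(result, namespace["total"])
```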

<center>[Previous example](15.md)    [Next example](17.md)</center>


================================================
FILE: md/160.md
================================================

```markdown
@author jackzhenguo
@desc
@tag
@version 
@date 2020/02/27
```
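### The quick list copy pitfall

In Python, applying `*` to a list quickly repeats its elements:

```python
a = [1,3,5] * 3 # [1,3,5,1,3,5,1,3,5]
a[0] = 10 # [10, 3, 5, 1, 3, 5, 1, 3, 5]
```

If the list elements are compound types such as lists or dicts: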

```python
a = [[1,3,5],[2,4]] * 3 # [[1, 3, 5], [2, 4], [1, 3, 5], [2, 4], [1, 3, 5], [2, 4]]

a[0][0] = 10 #  
```

The result may surprise you: `a[2][0]` and `a[4][0]` are changed to 10 as well

```python
[[10, 3, 5], [2, 4], [10, 3, 5], [2, 4], [10, 3, 5], [2, 4]]
```

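This is because `*` copies only references to the compound objects (a shallow copy), so `id(a[0])` and `id(a[2])` are equal: both refer to the same object. To get the effect of a deep copy, build each element separately: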

```python
a = [[] for _ in range(3)]
```  
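
When the repeated elements are not empty, `copy.deepcopy` inside a comprehension gives independent copies. This is a sketch under that assumption; the names `template`, `shared`, and `independent` are illustrative.

```python
import copy

template = [[1, 3, 5], [2, 4]]

shared = template * 3    # repeats references to the same inner lists
independent = [copy.deepcopy(row) for _ in range(3) for row in template]

shared[0][0] = 10
independent[0][0] = 10
print(shared)
print(independent)
```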

<center>[Previous example](159.md)    [Next example](161.md)</center>

================================================
FILE: md/161.md
================================================

```markdown
@author jackzhenguo
@desc
@tag
@version 
@date 2020/02/28
```

### String interning

```python
In [1]: a = 'something'
    ...: b = 'some'+'thing'
    ...: id(a)==id(b)
Out[1]: True
```
The example above returns `True`, so why does the one below return `False`?
```python
In [1]: a = '@zglg.com'

In [2]: b = '@zglg'+'.com'

In [3]: id(a)==id(b)
Out[3]: False
```
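This is related to CPython's compile-time optimization, a behavior called `string interning`. Only strings made up solely of letters, digits, or underscores are interned automatically. This is an implementation detail, so the result can differ between Python versions.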
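
Since automatic interning is an implementation detail, code should compare strings with `==`. When identity really matters, `sys.intern` can intern a string explicitly. The sketch below builds the strings at run time so that the result does not depend on compiler optimizations.

```python
import sys

# Build the strings at run time so the compiler cannot merge them
a = "".join(["@zglg", ".com"])
b = "".join(["@zglg", ".com"])

equal_values = a == b                                  # compares contents
same_after_intern = sys.intern(a) is sys.intern(b)     # interned strings share one object
print(equal_values, same_after_intern)
```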

<center>[Previous example](160.md)    [Next example](162.md)</center>

================================================
FILE: md/162.md
================================================

```markdown
@author jackzhenguo
@desc
@tag
@version 
@date 2020/02/29
```

### Immutable objects with equal values
```python
In [5]: d = {}
    ...: d[1] = 'java'
    ...: d[1.0] = 'python'

In [6]: d
Out[6]: {1: 'python'}

# the pair key=1, value='java' has mysteriously vanished
In [7]: d[1]
Out[7]: 'python'
In [8]: d[1.0]
Out[8]: 'python'
```
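This is because immutable objects that compare equal always have the `same hash value` in Python. `1 == 1.0` is `True` and `hash(1) == hash(1.0)`, so the dict treats them as the same key and the second assignment overwrites the value.

Because of `hash collisions`, objects with different values may also share a hash value.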
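
The claim can be checked directly. In this sketch `True` is added as a third key that is equal to `1`; it is not part of the original example.

```python
same_hash = hash(1) == hash(1.0) == hash(True)
same_value = 1 == 1.0 == True

d = {}
d[1] = 'java'
d[1.0] = 'python'
d[True] = 'go'

print(same_hash, same_value)
print(d)   # one key survives: the first one inserted, with the last value
```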

<center>[Previous example](161.md)    [Next example](163.md)</center>

================================================
FILE: md/163.md
================================================

```markdown
@author jackzhenguo
@desc
@tag
@version 
@date 2020/03/01
```

### Object destruction order
Create a class `SE`:
```python
class SE(object):
  def __init__(self):
    print('init')
  def __del__(self):
    print('del')
```
Create two SE instances and compare them with `is`:
```python
In [63]: SE() is SE()
init
init
del
del
Out[63]: False

```
Create two SE instances and compare them with `id`:
```python
In [64]: id(SE()) == id(SE())
init
del
init
del
Out[64]: True
```

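When `id` is called, Python creates an instance of the SE class, obtains its memory address through `id`, and then destroys the object and frees its memory.

When this happens twice in a row, CPython may assign the same memory address to the second object, so the two ids are equal.


The behavior of `is` is different, as the printing order shows: both instances must exist at the same time to be compared, so they cannot share an address.

<center>[Previous example](162.md)    [Next example](164.md)</center>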

================================================
FILE: md/164.md
================================================

```markdown
@author jackzhenguo
@desc
@tag
@version 
@date 2020/03/02
```

### Understand `for` thoroughly

```python
In [65]: for i in range(5):
    ...:   print(i)
    ...:   i = 10
0
1
2
3
4
```
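Why doesn't the loop exit after one pass?

Because of how `for` works in Python, `i = 10` does not affect the loop. On each iteration the next element produced by `range(5)` is unpacked and assigned to the target variable `i`.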
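
Roughly speaking, a `for` loop behaves like the `while` loop below, which makes the rebinding of `i` visible. This is a sketch of the idea, not the interpreter's actual implementation; the list `seen` is added only to record the values.

```python
iterator = iter(range(5))
seen = []
while True:
    try:
        i = next(iterator)   # the loop target is rebound from the iterator
    except StopIteration:
        break
    seen.append(i)
    i = 10                   # overwritten by the next call to next()
print(seen)
```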

<center>[Previous example](163.md)    [Next example](165.md)</center>

================================================
FILE: md/165.md
================================================

```markdown
@author jackzhenguo
@desc
@tag
@version 
@date 2020/03/03
```

### Know when code is evaluated

```python
array = [1, 3, 5]
g = (x for x in array if array.count(x) > 0)
```
`g` is a generator. `list(g)` returns `[1,3,5]`, because every element occurs at least once, so this result is no surprise. But look at the next example:
```python
array = [1, 3, 5]
g = (x for x in array if array.count(x) > 0)
array = [5, 7, 9]
```
What does list(g) return? Isn't this the same as the example above, so the result should also be `[1,3,5]`? But:
```python
In [74]: list(g)
Out[74]: [5]
```

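This is a little surprising. The reason is:

In a generator expression, the `in` clause is evaluated at declaration time, while the condition clause is evaluated at run time.


So the code: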
```python
array = [1, 3, 5]
g = (x for x in array if array.count(x) > 0)
array = [5, 7, 9]
```

is equivalent to:
```python
g = (x for x in [1,3,5] if [5,7,9].count(x) > 0)
``` 
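
One way to avoid the surprise, sketched below, is to freeze the data the condition depends on before the name is rebound. The name `snapshot` is introduced here for illustration.

```python
array = [1, 3, 5]
g = (x for x in array if array.count(x) > 0)      # iter(array) is taken here
array = [5, 7, 9]                                 # the condition now sees this list
lazy_result = list(g)

array = [1, 3, 5]
snapshot = list(array)                            # freeze the list used by the condition
g = (x for x in array if snapshot.count(x) > 0)
array = [5, 7, 9]
frozen_result = list(g)

print(lazy_result, frozen_result)
```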

<center>[Previous example](164.md)    [Next example](166.md)</center>

================================================
FILE: md/166.md
================================================

```markdown
@author jackzhenguo
@desc
@tag
@version 
@date 2020/03/04
```

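### Wrong way to create an empty set

This is a Python set: `{1,3,5}`. It holds no duplicate elements and is widely used for tasks such as deduplication. Creating an empty set like this is wrong: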

```python
empty = {} #NO!
```

Python interprets it as a dict.

Use the built-in function `set()` to create an empty set:

```python
empty = set() #YES!
``` 
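
A short check of the types, as a sketch, together with the deduplication use mentioned above. The variable names are illustrative.

```python
empty_dict = {}
empty_set = set()
deduplicated = set([1, 3, 3, 5, 5])

print(type(empty_dict).__name__)
print(type(empty_set).__name__)
print(sorted(deduplicated))
```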

<center>[Previous example](165.md)    [Next example](167.md)</center>

================================================
FILE: md/167.md
================================================

```markdown
@author jackzhenguo
@desc
@tag
@version 
@date 2020/03/05
```

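### pyecharts fails to plot NumPy data

echarts is widely used, and pyecharts, the package that combines echarts with Python, is just as handy. However, plotting fails when NumPy data is passed in like this: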

```python
from pyecharts.charts import Bar
import pyecharts.options as opts
import numpy as np
c = (
    Bar()
    .add_xaxis([1, 2, 3, 4, 5])
    # Passing NumPy data makes the plot fail!
    .add_yaxis("商家A", np.array([0.1, 0.2, 0.3, 0.4, 0.5]))
)

c.render()
```

<img src="./img/image-20200129164119080.png" width="50%"/>

This shows that pyecharts does not support plotting NumPy data directly. Pass a native Python list instead:

```python
from pyecharts.charts import Bar
import pyecharts.options as opts
import numpy as np
c = (
    Bar()
    .add_xaxis([1, 2, 3, 4, 5])
    # Pass a native Python list
    .add_yaxis("商家A", np.array([0.1, 0.2, 0.3, 0.4, 0.5]).tolist())
)

c.render()
```

<img src="./img/image-20200129164339971.png" width="50%"/>    

<center>[Previous example](166.md)    [Next example](168.md)</center>

================================================
FILE: md/168.md
================================================

```markdown
@author jackzhenguo
@desc
@tag
@version 
@date 2020/03/06
```

### A package for cleaner exception output

One line of code to tidy up exception output
```python
pip install pretty-errors
```

Write a function to test it:

```python
def divided_zero():
    for i in range(10, -1, -1):
        print(10/i)


divided_zero()
```

Before importing `pretty_errors`, the error output is rather verbose:

```python
Traceback (most recent call last):
  File "c:\Users\HUAWEI\.vscode\extensions\ms-python.python-2019.11.50794\pythonFiles\ptvsd_launcher.py", line 43, in <module>
    main(ptvsdArgs)
  File "c:\Users\HUAWEI\.vscode\extensions\ms-python.python-2019.11.50794\pythonFiles\lib\python\old_ptvsd\ptvsd\__main__.py",
line 432, in main
    run()
  File "c:\Users\HUAWEI\.vscode\extensions\ms-python.python-2019.11.50794\pythonFiles\lib\python\old_ptvsd\ptvsd\__main__.py",
line 316, in run_file
    runpy.run_path(target, run_name='__main__')
  File "D:\anaconda3\lib\runpy.py", line 263, in run_path
    pkg_name=pkg_name, script_name=fname)
  File "D:\anaconda3\lib\runpy.py", line 96, in _run_module_code
    mod_name, mod_spec, pkg_name, script_name)
  File "D:\anaconda3\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "d:\source\sorting-visualizer-master\sorting\debug_test.py", line 6, in <module>
    divided_zero()
  File "d:\source\sorting-visualizer-master\sorting\debug_test.py", line 3, in divided_zero
    print(10/i)
ZeroDivisionError: division by zero
```

Now `import` the freshly installed `pretty_errors`:

```python
import pretty_errors

def divided_zero():
    for i in range(10, -1, -1):
        print(10/i)

divided_zero()
```

Look at the error output now: it is very compact, only 2 lines, with the redundant information removed:

```python
ZeroDivisionError:
division by zero
```

The complete output is shown in the image below:

<img src="./img/errors.png" width="50%"/>    

<center>[Previous example](167.md)    [Next example](169.md)</center>

================================================
FILE: md/169.md
================================================

```markdown
@author jackzhenguo
@desc
@tag
@version 
@date 2020/03/07
```

#### The image processing package pillow

Rotate and scale an image in two lines of code

First install pillow:

```python
pip install pillow
```

Rotate the image below by 45 degrees:

<img src="./img/plotly2.png" width="40%"/>

```python
In [1]: from PIL import Image
In [2]: im = Image.open('./img/plotly2.png')
In [4]: im.rotate(45).show()
```

The result after rotating by 45 degrees:

<img src="./img/image-20200105085120611.png" width="40%"/>

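Scale the image proportionally:

```python
im.thumbnail((128,72),Image.LANCZOS)  # Image.ANTIALIAS was removed in Pillow 10; LANCZOS is the same filter
```

The scaled result: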

![](./img/pillow_suofang.png)


The result after applying a filter to the image:

```python
from PIL import ImageFilter
im.filter(ImageFilter.CONTOUR).show()
```

<img src="./img/pillow_filter.png" width="40%"/>
  

<center>[Previous example](168.md)    [Next example](170.md)</center>

================================================
FILE: md/17.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/2/15
```

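#### 17 Evaluate an expression

Evaluate the string `str` as a valid expression and return the result, which extracts the content held in the string.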

```python
In [1]: s = "1 + 3 +5"

In [2]: eval(s)
Out[2]: 9

s = ["{'小汽车':10, '面包车':8}", "{'面包车':5}"]
from collections import defaultdict
d = defaultdict(int)

for item in s:
    my_dict = eval(item)
    print(type(my_dict))
    for key in my_dict:
        d[key] += my_dict[key]
print(d)

<class 'dict'>
<class 'dict'>
defaultdict(<class 'int'>, {'小汽车': 10, '面包车': 13})
```
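
`eval` runs whatever code the string contains, so it is risky on untrusted input. When the strings only hold literals, as the dicts above do, `ast.literal_eval` from the standard library is a safer choice. The sketch below repeats the counting example with it; the flag `rejected` is introduced only to show that a function call is refused.

```python
import ast
from collections import defaultdict

s = ["{'小汽车':10, '面包车':8}", "{'面包车':5}"]
d = defaultdict(int)

for item in s:
    my_dict = ast.literal_eval(item)   # accepts literals only, never runs code
    for key, count in my_dict.items():
        d[key] += count

rejected = False
try:
    ast.literal_eval("__import__('os').getcwd()")
except ValueError:
    rejected = True                    # a function call is not a literal

print(dict(d), rejected)
```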

<center>[Previous example](16.md)    [Next example](18.md)</center>


================================================
FILE: md/170.md
================================================

```markdown
@author jackzhenguo
@desc
@tag
@version 
@date 2020/03/08
```

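### Find the encoding with one line of code

You happily scrape a piece of `content` from a web page.

But once you `print` it, the joy fades, because you see a string like this: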

```markdown
b'\xc8\xcb\xc9\xfa\xbf\xe0\xb6\xcc\xa3\xac\xce\xd2\xd3\xc3Python'
```

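What is this? All those x's and c's!

On a closer look, it is a byte string (`bytes`) shown in hexadecimal, where `\x` marks a hexadecimal byte.

Next you will want to turn it into something a human can read, so you think of `decode`: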

```python
In [3]: b'\xc8\xcb\xc9\xfa\xbf\xe0\xb6\xcc\xa3\xac\xce\xd2\xd3\xc3Python'.decode()
UnicodeDecodeError                        Traceback (most recent call last)
<ipython-input-3-7d0ea6148880> in <module>
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc8 in position 0: invalid continuation byte
```

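Right away, cold water is poured on you: an exception is raised.

According to the message, `UnicodeDecodeError`, this is a Unicode decoding error.

It turns out that the default encoding of `decode` is `utf-8`.

So `utf-8` can be ruled out as the encoding of b'\xc8\xcb\xc9\xfa\xbf\xe0\xb6\xcc\xa3\xac\xce\xd2\xd3\xc3Python'.

But this is not a four-option multiple-choice question where wrong answers can be eliminated one by one!

There are dozens of encodings, so eliminating them one at a time is not practical.

So let's guess!

**Life is short, I use Python**

**Python would never let you toil like that**

Solve the problem in about three lines of code.

**Step 1: install chardet** with `pip install chardet`. The name is short for char detect.

**Step 2: import chardet**

**Step 3: get the result**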

```python
In [5]: import chardet
In [6]: chardet.detect(b'\xc8\xcb\xc9\xfa\xbf\xe0\xb6\xcc\xa3\xac\xce\xd2\xd3\xc3Python')
Out[6]: {'encoding': 'GB2312', 'confidence': 0.99, 'language': 'Chinese'}
```

The encoding is gb2312.

Decode the byte string:

```python
In [7]: b'\xc8\xcb\xc9\xfa\xbf\xe0\xb6\xcc\xa3\xac\xce\xd2\xd3\xc3Python'.decode('gb2312')
Out[7]: '人生苦短，我用Python'
```

At present the `chardet` package can detect dozens of encodings.
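
When installing a package is not an option, a rough fallback is to try a short list of likely encodings with the standard library alone. This is only a sketch: the helper `try_decode` and its candidate list are assumptions made for illustration, and a successful decode does not prove the guess is right, which is why a detector such as `chardet` is usually the better tool.

```python
raw = b'\xc8\xcb\xc9\xfa\xbf\xe0\xb6\xcc\xa3\xac\xce\xd2\xd3\xc3Python'


def try_decode(data, candidates=("utf-8", "gb2312", "big5")):
    """Return (encoding, text) for the first candidate that decodes cleanly."""
    for encoding in candidates:
        try:
            return encoding, data.decode(encoding)
        except UnicodeDecodeError:
            continue
    return None, None


encoding, text = try_decode(raw)
print(encoding, text)
```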

<center>[Previous example](169.md)    [Next example](171.md)</center>

================================================
FILE: md/171.md
================================================

```markdown
@author jackzhenguo
@desc
@tag
@version 
@date 2020/03/09
```

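Q: Does a subclass inherit the methods of its parent class?
 
A: An instance of the subclass inherits the parent's `static_method`. Calling it still uses the parent's method and class attributes, because the method refers to `Foo` by name. `class_method`, in contrast, receives the subclass as `cls`, so it uses the subclass's override and attributes.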

```python
# coding:utf-8


class Foo(object):
    X = 1
    Y = 2

    @staticmethod
    def averag(*mixes):
        return sum(mixes) / len(mixes)

    @staticmethod
    def static_method():
        return Foo.averag(Foo.X, Foo.Y)

    @classmethod
    def class_method(cls):
        return cls.averag(cls.X, cls.Y)


class Son(Foo):
    X = 3
    Y = 5

    @staticmethod
    def averag(*mixes):
        return sum(mixes) / 3

p = Son()
print(p.static_method())
print(p.class_method())
# 1.5
# 2.6666666666666665
```

<center>[Previous example](170.md)    [Next example](172.md)</center>

================================================
FILE: md/172.md
================================================

```markdown
@author jackzhenguo
@desc NumPy's pad method
@tag NumPy 
@version v1.0
@date 2020/11/27
```

Today's example introduces `pad`, a handy NumPy function that extends an array outward with extra layers around it.

```python
In [1]: import numpy as np                                                      
In [2]: help(np.pad)                                                            
In [4]: a = np.ones((3,4))
In [5]: a
Out[5]: 
array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])
```

By default, np.pad extends the original array outward by pad_width layers filled with zeros:

```python
In [6]: np.pad(a,pad_width=2)                                                   
Out[6]: 
array([[0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 1., 1., 1., 1., 0., 0.],
       [0., 0., 1., 1., 1., 1., 0., 0.],
       [0., 0., 1., 1., 1., 1., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0.]])
```

This function is widely used for padding arrays with values, for example in convolution.

That was example 172 of 《python-small-examples》: NumPy's pad method.

<center>[Previous example](171.md)    [Next example](173.md)</center>

================================================
FILE: md/173.md
================================================

```markdown
@author jackzhenguo
@desc Create a matrix with 1, 2, 3, 4 on the diagonal just below the main diagonal
@tag
@version 
@date 2020/03/11
```

```python
In [1]: import numpy as np

In [2]: Z = np.diag(1+np.arange(4),k=-1)
   ...: print(Z)

[[0 0 0 0 0]
 [1 0 0 0 0]
 [0 2 0 0 0]
 [0 0 3 0 0]
 [0 0 0 4 0]]
 ```
 
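The `k` parameter selects the diagonal: a value greater than 0 shifts it k positions above the main diagonal, and a value less than 0 shifts it k positions below.

<center>[Previous example](172.md)    [Next example](174.md)</center>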

================================================
FILE: md/174.md
================================================

```markdown
@author jackzhenguo
@desc Binning data with cut
@tag
@version 
@date 2020/11/28
```

Example 174: binning data with cut

Convert percentage scores into the four grades A, B, C, D. The bins are [0,60,75,90,100] and the labels are ['D', 'C', 'B', 'A']:

```python
# Generate 20 random integers in [1, 100)
In [30]: a = np.random.randint(1,100,20)                   
In [31]: a                                    
Out[31]: 
array([48, 22, 46, 84, 13, 52, 36, 35, 27, 99, 31, 37, 15, 31,  5, 46, 98,99, 60, 43])

# Bin with cut
In [33]: pd.cut(a, [0,60,75,90,100], labels = ['D', 'C', 'B', 'A'])             
Out[33]: 
[D, D, D, B, D, ..., D, A, A, D, D]
Length: 20
Categories (4, object): [D < C < B < A]
```

After binning, 48 maps to D, 22 to D, 46 to D, 84 to B, ...

<center>[Previous example](173.md)    [Next example](175.md)</center>

================================================
FILE: md/175.md
================================================

```markdown
@author jackzhenguo
@desc Dropping and filling missing values
@tag
@version 
@date 2020/03/13
```

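Dropping missing values

np.nan is the common missing value in pandas. Use dropna to filter missing values: axis 0 means by row and 1 means by column. how defaults to any, which drops a row or column as soon as it contains one nan, while all drops it only when every value is nan.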

```python
# axis 0 means by row; all drops a row only when every value in it is nan
df.dropna(axis=0, how='all')
```

Filling missing values

Missing values are usually filled with a statistic such as the mean, mode, or median, using fillna:

```python
# Fill the missing values of column a with its mean and assign the result back
df["a"] = df["a"].fillna(df["a"].mean())
```


<center>[Previous example](174.md)    [Next example](176.md)</center>

================================================
FILE: md/176.md
================================================

```markdown
@author jackzhenguo
@desc Speed up pip installs 100x with one line
@tag
@version 
@date 2020/03/14
```

The usual way to install with pip:

```python
pip install scrapy 
```

This installation may crawl, or even fail outright with a timeout. You might then lengthen the socket timeout by setting the `--default-timeout` option:

```python
pip --default-timeout=600 install scrapy
```

But that does not speed up the installation. Add one option instead:

```python
-i https://pypi.tuna.tsinghua.edu.cn/simple 
```

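The complete install command:

```python
pip --default-timeout=600 install scrapy -i https://pypi.tuna.tsinghua.edu.cn/simple 
```

For later installs you can copy this command directly, and packages will install much faster.

<center>[Previous example](175.md)    [Next example](177.md)</center>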

================================================
FILE: md/177.md
================================================

```markdown
@author jackzhenguo
@desc A great tool for data analysis: deepnote
@tag
@version 
@date 2020/03/15
```

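A tool very similar to jupyter notebook: deepnote

jupyter notebook is one of the most convenient notebooks for running Python and is widely used in data analysis and data science. I recently found a notebook that is compatible with jupyter notebook and very pleasant to use: deepnote

It is also free to use! https://deepnote.com/

Its features: real-time collaboration, and it runs in the cloud

To get started, press shift+enter to run code

------

  Running code feels great:

Invite teammates straight into your notebook for multi-person collaboration and faster development:

It is one more option, and switching between the two works well!

<center>[Previous example](176.md)    [Next example](178.md)</center>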

================================================
FILE: md/178.md
================================================

```markdown
@author jackzhenguo
@desc Remove special characters with apply
@tag
@version 
@date 2020/03/16
```

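### Remove special characters with apply

When the cells of a column contain special characters such as punctuation, remove them with the element-wise method apply: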

```python
import string
exclude = set(string.punctuation)

def remove_punctuation(x):
    x = ''.join(ch for ch in x if ch not in exclude)
    return x
# original df
Out[26]: 
      a       b
0   c,d  edc.rc
1     3       3
2  d ef       4

# remove punctuation from column a
In [27]: df.a = df.a.apply(remove_punctuation) 
In [28]: df                
Out[28]: 
      a       b
0    cd  edc.rc
1     3       3
2  d ef       4
```

<center>[Previous example](177.md)    [Next example](179.md)</center>

================================================
FILE: md/179.md
================================================

```markdown
@author jackzhenguo
@desc Feature engineering on a column with map
@tag
@version 
@date 2020/03/17
```

**Feature engineering on a column with map**

First generate the data:

```python
d = {
"gender":["male", "female", "male","female"],
"color":["red", "green", "blue","green"],
"age":[25, 30, 15, 32]
}

df = pd.DataFrame(d)
df
```


On the `gender` column, use the map method to apply the following mapping quickly:

```python
d = {"male": 0, "female": 1}
df["gender2"] = df["gender"].map(d)
```


<center>[Previous example](178.md)    [Next example](180.md)</center>

================================================
FILE: md/18.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/2/16
```

#### 18 String formatting

Format a string for output. `format(value, format_spec)` actually calls `value.__format__(format_spec)`.

```
In [104]: print("i am {0},age{1}".format("tom",18))
i am tom,age18
```

| Number     | Format  | Output         | Description                                 |
| ---------- | ------- | -------------- | ------------------------------------------- |
| 3.1415926  | {:.2f}  | 3.14           | Keep two decimal places                     |
| 3.1415926  | {:+.2f} | +3.14          | Signed, keep two decimal places             |
| -1         | {:+.2f} | -1.00          | Signed, keep two decimal places             |
| 2.71828    | {:.0f}  | 3              | No decimal places                           |
| 5          | {:0>2d} | 05             | Pad with zeros (fill on the left, width 2)  |
| 5          | {:x<4d} | 5xxx           | Pad with x (fill on the right, width 4)     |
| 10         | {:x<4d} | 10xx           | Pad with x (fill on the right, width 4)     |
| 1000000    | {:,}    | 1,000,000      | Comma-separated number format               |
| 0.25       | {:.2%}  | 25.00%         | Percentage format                           |
| 1000000000 | {:.2e}  | 1.00e+09       | Exponent notation                           |
| 18         | {:>10d} | `'        18'` | Right-aligned (default, width 10)           |
| 18         | {:<10d} | `'18        '` | Left-aligned (width 10)                     |
| 18         | {:^10d} | `'    18    '` | Centered (width 10)                         |
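
The rows of the table can be reproduced with a short loop. This is a sketch; the names `examples` and `results` are introduced here for illustration.

```python
examples = [
    (3.1415926, "{:.2f}"),
    (3.1415926, "{:+.2f}"),
    (-1, "{:+.2f}"),
    (2.71828, "{:.0f}"),
    (5, "{:0>2d}"),
    (5, "{:x<4d}"),
    (10, "{:x<4d}"),
    (1000000, "{:,}"),
    (0.25, "{:.2%}"),
    (1000000000, "{:.2e}"),
    (18, "{:>10d}"),
    (18, "{:<10d}"),
    (18, "{:^10d}"),
]

results = [spec.format(value) for value, spec in examples]
for (value, spec), text in zip(examples, results):
    # repr() keeps the padding spaces visible
    print(f"{value!r:>12}  {spec:<8}  {text!r}")
```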

<center>[Previous example](17.md)    [Next example](19.md)</center>

================================================
FILE: md/180.md
================================================

```markdown
@author jackzhenguo
@desc Convert a categorical column to numeric values
@tag
@version 
@date 2020/03/18
```

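Example 180: **Convert a categorical column to numeric values**

A column that can only take a finite set of enumerated values often needs to be converted to numbers. Use `get_dummies` (one-hot encoding), or define your own function: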

```python
pd.get_dummies(df['a'])
```

A custom function, combined with `apply`:

```python
def c2n(x):
    if x=='A':
        return 95
    if x=='B':
        return 80

df['a'].apply(c2n)
```

<center>[Previous example](179.md)    [Next example](181.md)</center>

================================================
FILE: md/181.md
================================================

```markdown
@author jackzhenguo
@desc Ranking with rank
@tag
@version 
@date 2020/03/19
```

Example 181: **Ranking with rank**

The `rank` method produces numeric ranks. With `ascending=False`, the higher the exam score, the better (smaller) the rank:

```python
In [36]: df = pd.DataFrame({'a':[46, 98,99, 60, 43]})
In [53]: df['a'].rank(ascending=False)                   
Out[53]: 
0    4.0
1    2.0
2    1.0
3    3.0
4    5.0
```
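
Tied scores are a common follow-up question. A minimal sketch, assuming hypothetical scores with a tie at 98, comparing the default `method="average"` with `method="min"`:

```python
import pandas as pd

# Hypothetical scores with a tie at 98.
scores = pd.Series([46, 98, 98, 60, 43])

avg_rank = scores.rank(ascending=False)                # default method="average"
min_rank = scores.rank(ascending=False, method="min")  # tied scores share the best rank

print(avg_rank.tolist())
print(min_rank.tolist())
```

With `method="min"` both students get rank 1 and the next score gets rank 3, which matches how exam rankings are usually published.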

<center>[Previous example](180.md)    [Next example](182.md)</center>

================================================
FILE: md/182.md
================================================

```markdown
@author jackzhenguo
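@desc Downsample data: change the step from hours to days
@tag
@version 
@date 2020/03/20
```

Example 182: **Downsample data: change the step from hours to days**

Given time series data with an hourly step, is there a quick trick to downsample it to daily data? First, generate test data: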

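```python
import pandas as pd
import numpy as np

# Columns: product code, sales volume, stock
df = pd.DataFrame(np.random.randint(1, 10, size=(240, 3)),
                  columns=['商品编码', '商品销量', '商品库存'])
```

```python
# pd.util.testing.makeDateIndex was removed in pandas 2.0; pd.date_range builds the same hourly index.
# Hourly step: the alias is "h" in pandas 2.2 and later, "H" in earlier versions.
df.index = pd.date_range("2000-01-01", periods=240, freq="h")
df
```

The trick is the `resample` method, which merges the hourly rows into days (`D`):
```python
day_df = df.resample("D")["商品销量"].sum().to_frame()
day_df
```

The result has 10 rows, since 240 hours is exactly 10 days.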
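
The row count is easy to verify with deterministic data. The sketch below assumes a hypothetical `sales` column filled with ones, so every daily sum must be 24; the index is built with a `Timedelta` step so that it does not depend on the spelling of the hourly alias.

```python
import numpy as np
import pandas as pd

# 240 hourly timestamps starting at 2000-01-01 00:00.
index = pd.date_range("2000-01-01", periods=240, freq=pd.Timedelta(hours=1))
hourly = pd.DataFrame({"sales": np.ones(240, dtype=int)}, index=index)

daily = hourly.resample("D")["sales"].sum().to_frame()

print(len(daily))                # number of daily rows
print(daily["sales"].unique())   # every day aggregates 24 hourly values
```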


<center>[Previous example](181.md)    [Next example](183.md)</center>

================================================
FILE: md/183.md
================================================

```markdown
@author jackzhenguo
@desc How to quickly generate time series data with Pandas?
@tag
@version 
@date 2020/03/21
```

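### Example 183: How to quickly generate time series data with Pandas?

Problems involving time series come up fairly often.

Here is a small trick: use `pd.util.testing.makeTimeDataFrame`. Note that the `pd.util.testing` module was removed in pandas 2.0, so this requires an older version.

A single line of code generates a DataFrame whose index is a time series: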

```python
import pandas as pd

pd.util.testing.makeTimeDataFrame(10)
```

Result:

```markdown
A B C D
2000-01-03 0.932776 -1.509302 0.285825 0.941729
2000-01-04 0.565230 -1.598449 -0.786274 -0.221476
2000-01-05 -0.152743 -0.392053 -0.127415 0.841907
2000-01-06 1.321998 -0.927537 0.205666 -0.041110
2000-01-07 0.324359 1.512743 0.553633 0.392068
2000-01-10 -0.566780 0.201565 -0.801172 -1.165768
2000-01-11 -0.259348 -0.035893 -1.363496 0.475600
2000-01-12 -0.341700 -1.438874 -0.260598 -0.283653
2000-01-13 -1.085183 0.286239 2.475605 -1.068053
2000-01-14 -0.057128 -0.602625 0.461550 0.033472
```

The interval of the time series can be configured, and so can the four default columns A, B, C, and D.

```python
import numpy as np

df = pd.DataFrame(np.random.randint(1,1000,size=(10,3)),
                  columns = ['商品编码','商品销量','商品库存'])
df.index = pd.util.testing.makeDateIndex(10,freq='H')
```

Result:

```markdown
 商品编码 商品销量 商品库存
2000-01-01 00:00:00 99 264 98
2000-01-01 01:00:00 294 406 827
2000-01-01 02:00:00 89 221 931
2000-01-01 03:00:00 962 153 956
2000-01-01 04:00:00 538 46 374
2000-01-01 05:00:00 226 973 750
2000-01-01 06:00:00 193 866 7
2000-01-01 07:00:00 300 129 474
2000-01-01 08:00:00 966 372 835
2000-01-01 09:00:00 687 493 910
```
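
On pandas versions where `pd.util.testing` is no longer available, a similar frame can be built by hand. This is a sketch of an equivalent, not the helper's actual implementation: it assumes a business-day index (`freq="B"`) starting at 2000-01-03, which mirrors the dates in the first result above, and fills the cells with seeded random values.

```python
import numpy as np
import pandas as pd

# Business-day index: weekends are skipped.
index = pd.date_range("2000-01-03", periods=10, freq="B")

# Seeded generator so the sketch is reproducible; the values themselves are arbitrary.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((10, 4)), index=index, columns=list("ABCD"))

print(df.shape)
print(df.index[5])   # first date after the weekend of 2000-01-08/09
```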

<center>[Previous example](182.md)    [Next example](184.md)</center>

================================================
FILE: md/184.md
================================================

```markdown
@author jackzhenguo
@desc How to quickly count the null values in every column of a DataFrame
@tag
@version 
@date 2020/12/10
```

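### Example 184: How to quickly count the null values in every column of a DataFrame?

Real-world data almost always contains null values. How can you quickly count the nulls in every column of a DataFrame?

Pandas makes this very convenient; one line of code is enough: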

```python
data.isnull().sum()
```

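`data.isnull()`: checks, element by element, whether each value is null.

`.sum()`: by default sums along axis 0, which gives one count per column.

This trick is very handy on real data.

Read the Titanic survival prediction dataset:

```python
import pandas as pd

data = pd.read_csv('titanicdataset-traincsv/train.csv')
```

Check the null values: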

```python
data.isnull().sum()
```

Result:

```python
PassengerId      0
Survived         0
Pclass           0
Name             0
Sex              0
Age            177
SibSp            0
Parch            0
Ticket           0
Fare             0
Cabin          687
Embarked         2
dtype: int64
```

The `Age` column has 177 null values.

The `Cabin` column has 687 null values.

The `Embarked` column has 2 null values.
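
The same one-liner can be tried without downloading the dataset. The sketch below uses a hypothetical toy frame; `isnull().mean()` is added as a closely related call that gives the share of missing values instead of the count.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data with a few missing values.
data = pd.DataFrame({
    "age": [22.0, np.nan, 35.0, np.nan],
    "cabin": [None, "C85", None, None],
    "fare": [7.25, 71.28, 8.05, 8.46],
})

null_counts = data.isnull().sum()    # one count per column
null_share = data.isnull().mean()    # fraction of missing values per column

print(null_counts.to_dict())
```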

<center>[Previous example](183.md)    [Next example](185.md)</center>

================================================
FILE: md/185.md
================================================

```markdown
@author jackzhenguo
@desc
@tag
@version 
@date 2020/03/23
```

### Example 185: Reorder the columns of a DataFrame

Here are 2 simple tricks. First, build the data:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randint(0, 20, size=(5, 7)),
                  columns=list('ABCDEFG'))
df
```

Method 1, the straightforward way:

```python
df2 = df[["A", "C", "D", "F", "E", "G", "B"]]
df2
```

Method 2, also worth knowing:

```python
cols = df.columns[[0, 2, 3, 5, 4, 6, 1]]
df3 = df[cols]
df3
```

This gives the same result as method 1.

<center>[Previous example](184.md)    [Next example](186.md)</center>

================================================
FILE: md/186.md
================================================

```markdown
@author jackzhenguo
@desc
@tag
@version 
@date 2020/03/24
```

### Example 186: Use count to count occurrences of a term

Read the IMDB-Movie-Data dataset, which has 1000 rows:

```python
import pandas as pd

df = pd.read_csv("../input/imdb-data/IMDB-Movie-Data.csv")
df['Title']
```

The `Title` column prints as:

```python
0      Guardians of the Galaxy
1                   Prometheus
2                        Split
3                         Sing
4                Suicide Squad
                ...
995       Secret in Their Eyes
996            Hostel: Part II
997     Step Up 2: The Streets
998               Search Party
999                 Nine Lives
Name: Title, Length: 1000, dtype: object
```

A title consists of several words separated by spaces, so the word count is the number of spaces plus one:

```python
df["words_count"] = df["Title"].str.count(" ") + 1
df[["Title","words_count"]]
```
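
Counting spaces assumes exactly one space between words. A small sketch with hypothetical titles (the last one contains a double space) compares it with splitting on whitespace:

```python
import pandas as pd

# Hypothetical titles; the last one contains a double space.
titles = pd.Series(["Guardians of the Galaxy", "Split", "Suicide  Squad"])

by_spaces = titles.str.count(" ") + 1     # counts every space character
by_split = titles.str.split().str.len()   # splits on runs of whitespace

print(by_spaces.tolist())
print(by_split.tolist())
```

The two agree on clean data; `str.split()` is the more forgiving choice when spacing is irregular.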


<center>[Previous example](185.md)    [Next example](187.md)</center>

================================================
FILE: md/187.md
================================================

```markdown
@author jackzhenguo
@desc
@tag
@version 
@date 2020/03/25
```

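### Example 187: Use split to compute the difference in minutes between HH:mm times

`split` is a more efficient implementation. The columns first need to be converted to `str`: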

```python
df['a'] = df['a'].astype(str)
df['b'] = df['b'].astype(str)
```

Next, split:

```python
df['asplit'] = df['a'].str.split(':')
df['bsplit'] = df['b'].str.split(':')
```

Use `apply` on each element to convert it to a number of minutes:

```python
df['amins'] = df['asplit'].apply(lambda x: int(x[0])*60 + int(x[1]))
df['bmins'] = df['bsplit'].apply(lambda x: int(x[0])*60 + int(x[1]))
```
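
The steps above stop at the per-column minute counts; the difference is then a plain subtraction. A self-contained sketch, assuming hypothetical `a` and `b` columns of HH:mm strings and a helper named `to_minutes` introduced only for this illustration:

```python
import pandas as pd

# Hypothetical start/end times in HH:mm form.
df = pd.DataFrame({"a": ["08:30", "23:10"], "b": ["09:15", "23:55"]})

def to_minutes(col: pd.Series) -> pd.Series:
    """Convert an HH:mm string column to minutes since midnight."""
    parts = col.astype(str).str.split(":", expand=True).astype(int)
    return parts[0] * 60 + parts[1]

df["diff_mins"] = to_minutes(df["b"]) - to_minutes(df["a"])
print(df["diff_mins"].tolist())
```

Using `expand=True` keeps the whole computation vectorized, so no `apply` with a lambda is needed.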

<center>[Previous example](186.md)    [Next example](188.md)</center>

================================================
FILE: md/188.md
================================================

```markdown
@author jackzhenguo
@desc
@tag
@version 
@date 2020/03/26
```

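### Example 188: A melt trick for reshaping data

The `melt` method fixes one column as one dimension and combines the names of the other columns into a second dimension, melting a wide table into a long one: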

```python
   zip_code  factory  warehouse  retail
0     12345      100        200       1
1     56789      400        300       2
2    101112      500        400       3
3    131415      600        500       4
```

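Fix the `zip_code` column and combine the three column names `factory`, `warehouse`, and `retail` into one dimension. Once the two dimensions are assembled this way, the data necessarily gets longer.

The `melt` method in pandas works as follows:

```python
In [49]: df_melt = df.melt(id_vars = "zip_code") 
```

If the `value_vars` parameter is not given, all remaining columns are treated as `value_vars` by default, so the result is: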

```python
    zip_code   variable  value
0      12345    factory    100
1      56789    factory    400
2     101112    factory    500
3     131415    factory    600
4      12345  warehouse    200
5      56789  warehouse    300
6     101112  warehouse    400
7     131415  warehouse    500
8      12345     retail      1
9      56789     retail      2
10    101112     retail      3
11    131415     retail      4
```

To look only at `factory` and `retail`, pass them as `value_vars`:

```python
In [62]: df_melt2 = df.melt(id_vars = "zip_code", value_vars=['factory', 'retail'])
```

Result:

```python
   zip_code variable  value
0     12345  factory    100
1     56789  factory    400
2    101112  factory    500
3    131415  factory    600
4     12345   retail      1
5     56789   retail      2
6    101112   retail      3
7    131415   retail      4
```

After melting, the data necessarily gets longer, because several columns are combined into one.
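
The example can be reproduced end to end by building the wide table first. In the sketch below, `var_name` and `value_name` are real `melt` parameters; the names `site` and `count` are hypothetical choices for this illustration.

```python
import pandas as pd

# The wide table from the start of this example.
df = pd.DataFrame({
    "zip_code": [12345, 56789, 101112, 131415],
    "factory": [100, 400, 500, 600],
    "warehouse": [200, 300, 400, 500],
    "retail": [1, 2, 3, 4],
})

# var_name / value_name rename the default "variable" / "value" columns.
long_df = df.melt(id_vars="zip_code", var_name="site", value_name="count")

print(long_df.shape)   # 4 rows x 3 melted columns gives 12 rows
```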

<center>[Previous example](187.md)    [Next example](189.md)</center>

================================================
FILE: md/189.md
================================================

```markdown
@author jackzhenguo
@desc
@tag
@version 
@date 2020/03/27
```

### Example 189: A pivot trick

`melt` melts the data, while `pivot` freezes it again; the two are inverse operations.

This is the melted data from the previous example:

```python
   zip_code variable  value
0     12345  factory    100
1     56789  factory    400
2    101112  factory    500
3    131415  factory    600
4     12345   retail      1
5     56789   retail      2
6    101112   retail      3
7    131415   retail      4
```

Now we want to restore it to:

```python
variable factory retail
zip_code               
12345        100      1
56789        400      2
101112       500      3
131415       600      4
```

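How can this be done?

It is easy with the `pivot` method:

```python
df_melt2.pivot(index='zip_code', columns='variable', values='value')
```

`index` sets the first axis, here `zip_code`. `columns` sets which column (or columns) supplies the distinct values that form the other axis; here it is `variable`, which has 2 distinct values, `factory` and `retail`. After pivoting they become the column names, that is, the axis with axis = 1. `values` selects the column that fills the cells; if it is omitted, the result gets a two-level column index instead of the single level shown above.

`pivot` has no aggregation ability; its upgraded version, `pivot_table`, can aggregate data.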
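
The difference between the two methods shows up as soon as an (index, column) pair occurs twice. A minimal sketch, rebuilding the melted data and adding one hypothetical duplicated row purely to trigger aggregation:

```python
import pandas as pd

# The melted data shown earlier in this example.
df_melt2 = pd.DataFrame({
    "zip_code": [12345, 56789, 101112, 131415] * 2,
    "variable": ["factory"] * 4 + ["retail"] * 4,
    "value": [100, 400, 500, 600, 1, 2, 3, 4],
})

wide = df_melt2.pivot(index="zip_code", columns="variable", values="value")
print(wide.loc[12345, "factory"], wide.loc[12345, "retail"])

# pivot raises ValueError on duplicate (index, columns) pairs; pivot_table aggregates them.
# The duplicated row is hypothetical, added only to show the aggregation.
dup = pd.concat([df_melt2, df_melt2.iloc[[0]]], ignore_index=True)
summed = dup.pivot_table(index="zip_code", columns="variable", values="value", aggfunc="sum")
print(summed.loc[12345, "factory"])
```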

<center>[Previous example](188.md)    [Next example](190.md)</center>

================================================
FILE: md/19.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/2/20
```

#### 19 A ready-to-use sorting function

Sorting:

```python
In [1]: a = [1,4,2,3,1]

In [2]: sorted(a,reverse=True)
Out[2]: [4, 3, 2, 1, 1]

In [3]: a = [{'name':'xiaoming','age':18,'gender':'male'},
   ...:      {'name':'xiaohong','age':20,'gender':'female'}]
In [4]: sorted(a,key=lambda x: x['age'],reverse=False)
Out[4]:
[{'name': 'xiaoming', 'age': 18, 'gender': 'male'},
 {'name': 'xiaohong', 'age': 20, 'gender': 'female'}]
```
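
Two properties of `sorted` are useful with records like these: the sort is stable, and the key may return a tuple to sort by several fields. A short sketch, extending the list above with a hypothetical third person and using `operator.itemgetter` instead of a lambda:

```python
from operator import itemgetter

# Hypothetical records, extending the example above with a third person.
people = [
    {"name": "xiaoming", "age": 18, "gender": "male"},
    {"name": "xiaohong", "age": 20, "gender": "female"},
    {"name": "xiaogang", "age": 18, "gender": "male"},
]

# Sort by age descending; sorted() is stable, so equal ages keep their input order.
by_age_desc = sorted(people, key=itemgetter("age"), reverse=True)

# Sort by several fields at once: age ascending, then name.
by_age_name = sorted(people, key=itemgetter("age", "name"))

print([p["name"] for p in by_age_desc])
print([p["name"] for p in by_age_name])
```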

<center>[Previous example](18.md)    [Next example](20.md)</center>

================================================
FILE: md/190.md
================================================

```markdown
@author jackzhenguo
@desc Randomly read K rows of a file, generating N files
@tag
@version 
@date 2020/03/28
```

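### Example 190: Randomly read K rows of a file, generating N files

```python
import random

import pandas as pd

def random_lines_save(filename, gen_file_cnt=10):
    """
    Randomly pick some rows of a file and save them; the number of
    files to generate is given by gen_file_cnt.

    @param: filename full path of the input file
    @param: gen_file_cnt number of files to generate
    """
    df = pd.read_excel(filename)
    for i in range(gen_file_cnt):
        n = random.randint(1, len(df))   # random row count for this file
        dfs = df.sample(n)
        # the file is named after its row count, so equal counts overwrite each other
        dfs.to_excel(str(n) + ".xlsx", index=False)
        print(str(n) + ".xlsx")
```

This is a practical function for generating N files, each holding a random number K of randomly chosen rows. Use case: the original file has many rows and you want to extract N random subsets from it.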

<center>[Previous example](189.md)    [Next example](191.md)</center>

================================================
FILE: md/191.md
================================================

```markdown
@author jackzhenguo
@desc Format a datetime column in Pandas
@tag
@version 
@date 2020/03/29
```

### Example 191: Format a datetime column in Pandas


```python
import pandas as pd 
from datetime import datetime

def series_dt_fmt(s: pd.Series, fmt: str) -> pd.Series:
    """
    Format column s according to fmt.
    s holds datetime values or datetime strings such as '2020-12-30 11:44:00'.
    """
    st = pd.to_datetime(s)
    return st.apply(lambda t: datetime.strftime(t, fmt))
```

Although the body is only two lines, it returns any `strftime` format directly, such as hours and minutes:

```python
s = pd.Series(['2020-12-30 11:44:00','2020-12-30 11:20:10'])

# keep only hours and minutes
fmt = '%H:%M'
series_dt_fmt(s,fmt)

# output
0    11:44
1    11:20
dtype: object
```
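
For comparison, pandas also offers a vectorized route through the `.dt` accessor. The sketch below reuses the same two sample strings and should produce the same result as the helper above:

```python
import pandas as pd

s = pd.Series(["2020-12-30 11:44:00", "2020-12-30 11:20:10"])

# Vectorized alternative: the .dt accessor applies strftime to the whole column.
hm = pd.to_datetime(s).dt.strftime("%H:%M")

print(hm.tolist())
```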

<center>[Previous example](190.md)    [Next example](192.md)</center>

================================================
FILE: md/192.md
================================================

```markdown
@author jackzhenguo
@desc
@tag
@version 
@date 2020/03/30
```

### 192: Create a SQLite connection

Write a Python program that creates a SQLite database, connects to it, and prints the SQLite version.

One solution:

```python
import sqlite3

sqlite_connection = None  # so the finally block is safe if connect() fails
try:
    sqlite_connection = sqlite3.connect('temp.db')
    cursor = sqlite_connection.cursor()
    print("Connected to SQLite.")
    sqlite_select_query = "select sqlite_version();"
    cursor.execute(sqlite_select_query)
    record = cursor.fetchall()
    print("SQLite database version: ", record)
    cursor.close()
except sqlite3.Error as error:
    print("Error connecting to SQLite:", error)
finally:
    if sqlite_connection:
        sqlite_connection.close()
        print("SQLite connection closed")
```
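
A more compact variant is sketched below. It assumes an in-memory database (`":memory:"`) so that no `temp.db` file is created, and uses `contextlib.closing` so that the cursor and the connection are closed even if a query fails.

```python
import sqlite3
from contextlib import closing

# ":memory:" creates a temporary in-memory database, so no file is left behind.
with closing(sqlite3.connect(":memory:")) as connection:
    with closing(connection.cursor()) as cursor:
        cursor.execute("select sqlite_version();")
        (version,) = cursor.fetchone()

print("SQLite version:", version)
```

The same version string is also available without a query as `sqlite3.sqlite_version`.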


<center>[Previous example](191.md)    [Next example](193.md)</center>

================================================
FILE: md/193.md
================================================

```markdown
@author jackzhenguo
@desc
@tag
@version 
@date 2020/03/31
```

### 193 Convert a JSON object to a Python object
The `loads` function of Python's `json` module converts a JSON object, given as a string, into a dict:

```python
In [1]: import json                                                             

In [2]: json_obj =  '{ "Name":"David", "Class":"I", "Age":6 }'                  

In [3]: python_obj = json.loads(json_obj)                                       

In [4]: type(python_obj)                                                        
Out[4]: dict
```

Print it to inspect the fields:
```python
print("\nJSON data:")
print(python_obj)
print("\nName: ",python_obj["Name"])
print("Class: ",python_obj["Class"])
print("Age: ",python_obj["Age"]) 
```
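
`loads` converts every JSON type, not only objects. A short sketch with a hypothetical JSON text covering the main cases:

```python
import json

# Hypothetical JSON text covering the main JSON types.
json_text = '{"name": "David", "scores": [90, 85.5], "passed": true, "nickname": null}'

obj = json.loads(json_text)

# object -> dict, array -> list, true/false -> True/False, null -> None
print(type(obj["scores"]).__name__, obj["passed"], obj["nickname"])
```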

<center>[Previous example](192.md)    [Next example](194.md)</center>

================================================
FILE: md/194.md
================================================

```markdown
@author jackzhenguo
@desc
@tag
@version 
@date 2020/04/01
```
#### 194 Convert a Python object to a JSON object

```python
import json
# a Python object (dict):
python_obj = {
  "name": "David",
  "class":"I",
  "age": 6  
}
print(type(python_obj))
```

Use `json.dumps` to convert it into a JSON string:
```python
# convert into JSON:
j_data = json.dumps(python_obj)

# result is a JSON string:
print(j_data)
```

##### Formatted JSON output

When a dict is converted to JSON, how can you make sure the keys are sorted and the output is indented by 4 spaces?

```python
json.dumps(j_str, sort_keys=True, indent=4)
```

Example:

```python
import json
j_str = {'4': 5, '6': 7, '1': 3, '2': 4}
print(json.dumps(j_str, sort_keys=True, indent=4))
```
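
To make the effect of the two options visible, the sketch below dumps the same dict with and without `indent` and confirms that formatting does not change the content:

```python
import json

data = {'4': 5, '6': 7, '1': 3, '2': 4}

compact = json.dumps(data, sort_keys=True)
pretty = json.dumps(data, sort_keys=True, indent=4)

print(compact)   # keys come out in sorted order
print(pretty)

# Formatting does not change the content: both strings parse back to the same dict.
assert json.loads(compact) == json.loads(pretty) == data
```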


<center>[Previous example](193.md)    [Next example](195.md)</center>


================================================
FILE: md/195.md
================================================

```markdown
@author jackzhenguo
@desc
@tag
@version 
@date 2020/04/02
```
#### 195 Find the 3 largest or smallest numbers in a list

Use the `nlargest` function from the `heapq` module:

```python
import heapq as hq
nums_list = [25, 35, 22, 85, 14, 65, 75, 22, 58]

# Find three largest values
largest_nums = hq.nlargest(3, nums_list)
print(largest_nums)
```

Likewise, for the 3 smallest numbers use `nsmallest` from the `heapq` module:

```python
import heapq as hq
nums_list = [25, 35, 22, 85, 14, 65, 75, 22, 58]
smallest_nums = hq.nsmallest(3, nums_list)
print("\nThree smallest numbers are:", smallest_nums)
```
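
Both functions also accept a `key` argument, which helps when the items are records rather than plain numbers. A sketch with hypothetical `(name, score)` pairs:

```python
import heapq

# Hypothetical records: (name, score) pairs.
scores = [("amy", 85), ("bob", 58), ("cat", 75), ("dan", 22), ("eve", 65)]

top2 = heapq.nlargest(2, scores, key=lambda item: item[1])
bottom2 = heapq.nsmallest(2, scores, key=lambda item: item[1])

print(top2)
print(bottom2)
```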


<center>[Previous example](194.md)    [Next example](196.md)</center>

================================================
FILE: md/196.md
================================================

```markdown
@author jackzhenguo
@desc
@tag
@version 
@date 2020/04/03
```

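### 196 Sort a list in ascending order with a heap

Using the `heapq` module, first heapify the list (a min-heap by default), then call `heappop` `len(nums_list)` times: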

```python
import heapq as hq

nums_list = [18, 14, 10, 9, 8, 7, 9, 3, 2, 4, 1]
hq.heapify(nums_list)
s_result = [hq.heappop(nums_list) for _ in range(len(nums_list))]
print(s_result)
```


<center>[Previous example](195.md)    [Next example](197.md)</center>

================================================
FILE: md/197.md
================================================

```markdown
@author jackzhenguo
@desc
@tag
@version 
@date 2020/04/04
```

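The regular expressions below extract positive integers and floating-point numbers greater than 0. Corrections are welcome if anything has been missed.

```python
>>> import re
>>> # condensed form of the sample text given at the end of this example
>>> text = 'Shanghai Composite up 0.78% to 3446.73, Shenzhen Component up 0.91% to 13642.95, ChiNext up 1.06% to 2672.12, more than 3000 stocks rose'
>>> pat_integ = r'[1-9]+\d*'
>>> pat_float0 = r'0\.\d+[1-9]'
>>> pat_float1 = r'[1-9]\d*\.\d+'
>>> pat = r'%s|%s|%s' % (pat_float0, pat_float1, pat_integ)
>>> re.findall(pat, text)
['0.78', '3446.73', '0.91', '13642.95', '1.06', '2672.12', '3000']
```

Strings such as these should not count as valid numbers:

- 000
- 000100
- 0.00
- 000.00


Explanation: `*` means the preceding character appears 0 or more times, `+` means the preceding character appears 1 or more times, `\d` means a digit [0-9], `[1-9]` means one of 1, 2, 3, 4, 5, 6, 7, 8, 9, and `\.` means a decimal point.

Main considerations: the leftmost digit of a positive integer is greater than 0; a float of 1 or more must start with [1-9]; a float greater than 0 and less than 1 has exactly one 0 before the decimal point.


Day 163: use Python regular expressions to extract all floating-point numbers and integers from a piece of input text

For example: At the close, the Shanghai Composite Index rose 0.78% to 3446.73 points, the Shenzhen Component Index rose 0.91% to 13642.95 points, and the ChiNext Index rose 1.06% to 2672.12 points. The indices climbed in volatile afternoon trading; carbon-neutrality stocks stayed strong, with a wave of limit-up moves in the sector; environmental protection, property management, and ultra-high-voltage stocks extended their gains in the afternoon; digital currency stocks surged late in the session; steel, coal, and non-ferrous metal stocks were sluggish all day; theme stocks recovered overall in the afternoon; more than 3000 stocks rose across the two markets, and the money-making effect was fairly good.

Extract all floating-point numbers and integers: 0.78, 3446.73, 0.91, 13642.95, and so on.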
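
`re.findall` searches for substrings, so a string like `000100` still yields `100`. When the goal is to validate a whole string, `fullmatch` is one option. The sketch below combines the three alternatives from above as raw strings; the helper name `is_positive_number` is hypothetical.

```python
import re

# The three alternatives from above, combined for whole-string validation.
NUMBER = re.compile(r'0\.\d+[1-9]|[1-9]\d*\.\d+|[1-9]+\d*')

def is_positive_number(s: str) -> bool:
    """True only if the entire string matches one of the three patterns."""
    return NUMBER.fullmatch(s) is not None

for s in ['000', '000100', '0.00', '000.00', '0.78', '3000']:
    print(s, is_positive_number(s))

# findall searches for substrings, so it still finds '100' inside '000100'.
print(NUMBER.findall('000100'))
```

One limitation carried over from the original patterns: the first alternative needs at least two decimal digits, so a value such as `0.5` is rejected by this check.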


<center>[Previous example](196.md)    [Next example](198.md)</center>


================================================
FILE: md/198.md
================================================

```markdown
@author jackzhenguo
@desc
@tag
@version 
@date 2020/04/05
```
## 1 Common arithmetic operations

```python
x, y = 3, 2
print(x + y) # = 5 
print(x - y) # = 1 
print(x * y) # = 6 
print(x / y) # = 1.5 
print(x // y) # = 1 
print(x % y) # = 1 
print(-x) # = -3 
print(abs(-x)) # = 3 
print(int(3.9)) # = 3 
print(float(x)) # = 3.0 
print(x ** y) # = 9
```

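Most of the operators are self-explanatory. Note that the `//` operator performs floor division: the result is rounded down to an integer value (for example, `3 // 2 == 1`).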
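
"Rounded down" matters most for negative operands, where floor division and truncation differ. A short sketch:

```python
# Floor division rounds toward negative infinity, not toward zero.
print(-3 // 2)       # -2
print(int(-3 / 2))   # -1, int() truncates toward zero
print(-3 % 2)        # 1, the remainder takes the sign of the divisor

# The identity x == (x // y) * y + (x % y) holds for any nonzero y.
x, y = -3, 2
assert x == (x // y) * y + (x % y)
print(divmod(x, y))  # (-2, 1)
```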

<center>[Previous example](197.md)    [Next example](199.md)</center>


================================================
FILE: md/199.md
================================================

```markdown
@author jackzhenguo
@desc Spherical distance between two points
@tag
@version 
@date 2020/04/06
```

```python
import math

EARTH_RADIUS = 6378.137  # equatorial radius in kilometers

# Convert degrees to radians
def get_radian(degree):
    return math.radians(degree)

# Great-circle distance between two points given by latitude/longitude, in kilometers
def get_distance(lat1, lng1, lat2, lng2):
    radLat1 = get_radian(lat1)
    radLat2 = get_radian(lat2)
    a = radLat1 - radLat2  # latitude difference
    b = get_radian(lng1) - get_radian(lng2)  # longitude difference
    s = 2 * math.asin(math.sqrt(math.pow(math.sin(a / 2), 2) +
                                math.cos(radLat1) * math.cos(radLat2) * math.pow(math.sin(b / 2), 2)))
    s = s * EARTH_RADIUS
    return s
```		     
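
The function above is the haversine formula. A compact restatement with two sanity checks is sketched below; the coordinates in the second call are arbitrary values chosen for illustration.

```python
import math

EARTH_RADIUS = 6378.137  # kilometers, same constant as above

def get_distance(lat1, lng1, lat2, lng2):
    """Haversine great-circle distance in kilometers."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi1 - phi2
    dlmb = math.radians(lng1) - math.radians(lng2)
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS * math.asin(math.sqrt(h))

# One degree of longitude along the equator equals R * pi / 180.
print(round(get_distance(0, 0, 0, 1), 2))
# Identical points are zero kilometers apart.
print(get_distance(39.9, 116.4, 39.9, 116.4))
```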

<center>[Previous example](198.md)    [Next example](200.md)</center>

================================================
FILE: md/2.md
================================================
```markdown
@author jackzhenguo
@desc Base conversion
@date 2019/2/10
```

#### 2 Base conversion

Decimal to binary:
```python
In [1]: bin(10)                                                                 
Out[1]: '0b1010'
```

Decimal to octal:
```python
In [2]: oct(9)                                                                  
Out[2]: '0o11'
```

Decimal to hexadecimal:
```python
In [3]: hex(15)                                                                 
Out[3]: '0xf'
```
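
The reverse direction uses `int` with an explicit base, and `format` gives the digits without the prefix. A short sketch:

```python
# int() with an explicit base reverses bin(), oct(), and hex().
assert int('0b1010', 2) == 10
assert int('0o11', 8) == 9
assert int('0xf', 16) == 15

# format() gives the digits without the prefix; "#" puts the prefix back.
print(format(10, 'b'))    # 1010
print(format(15, '#x'))   # 0xf
print(int('ff', 16))      # 255
```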


<center>[Previous example](1.md)    [Next example](3.md)</center>

================================================
FILE: md/20.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/2/20
```

#### 20 The sum function

Summation:

```python
In [181]: a = [1,4,2,3,1]

In [182]: sum(a)
Out[182]: 11

In [185]: sum(a,10) # the initial value of the sum is 10
Out[185]: 21
```

<center>[Previous example](19.md)    [Next example](21.md)</center>

================================================
FILE: md/200.md
================================================

```markdown
@author jackzhenguo
@desc Detect a file's encoding
@tag
@version 
@date 2020/04/07
```

```python
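import chardet
from chardet.universaldetector import UniversalDetector

def get_encoding(file):
    """Detect the encoding by reading the whole file at once."""
    with open(file, "rb") as f:
        cs = chardet.detect(f.read())
        return cs['encoding']

def get_encoding_incremental(file):
    """Feed the file line by line and stop once the detector is confident."""
    # The original code placed this after the return above, where it could never run.
    detector = UniversalDetector()
    with open(file, "rb") as f:
        for line in f:
            detector.feed(line)
            if detector.done:
                break
    detector.close()
    return detector.result  # a dict that includes the 'encoding' key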
```

<center>[Previous example](199.md)    [Next example](201.md)</center>

================================================
FILE: md/201.md
================================================

```markdown
@author jackzhenguo
@desc Format a JSON string
@tag
@version 
@date 2020/04/08
```
```python
import json


def format_json(json_str: str):
    dic = json.loads(json_str)

    # With indent set, a ', ' item separator would leave a trailing space on every line.
    js = json.dumps(dic,
                    sort_keys=True,
                    ensure_ascii=False,
                    indent=4,
                    separators=(',', ': '))
    return js
```
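
A usage sketch follows. It restates the function in a shortened form so that it runs on its own, and uses a hypothetical input with non-ASCII text to show what `ensure_ascii=False` does.

```python
import json

def format_json(json_str: str) -> str:
    dic = json.loads(json_str)
    return json.dumps(dic, sort_keys=True, ensure_ascii=False, indent=4)

raw = '{"name": "张三", "age": 18}'   # hypothetical input
formatted = format_json(raw)
print(formatted)
```

With `ensure_ascii=False` the name is written as is; with the default `True` it would appear as `\uXXXX` escapes.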
		     

<center>[Previous example](200.md)    [Next example](202.md)</center>

================================================
FILE: md/202.md
================================================

```markdown
@author jackzhenguo
@desc
@tag
@version 
@date 2020/04/09
```
		     

<center>[Previous example](201.md)    [Next example](203.md)</center>

================================================
FILE: md/203.md
================================================

```markdown
@author jackzhenguo
@desc
@tag
@version 
@date 2020/04/10
```
		     

<center>[Previous example](202.md)    [Next example](204.md)</center>

================================================
FILE: md/204.md
================================================

```markdown
@author jackzhenguo
@desc
@tag
@version 
@date 2020/04/11
```
		     

<center>[Previous example](203.md)    [Next example](205.md)</center>

================================================
FILE: md/205.md
================================================

```markdown
@author jackzhenguo
@desc
@tag
@version 
@date 2020/04/12
```
		     

<center>[Previous example](204.md)    [Next example](206.md)</center>

================================================
FILE: md/206.md
================================================

```markdown
@author jackzhenguo
@desc
@tag
@version 
@date 2020/04/13
```
		     

<center>[Previous example](205.md)    [Next example](207.md)</center>

================================================
FILE: md/207.md
================================================

```markdown
@author jackzhenguo
@desc
@tag
@version 
@date 2020/04/14
```
		     

<center>[Previous example](206.md)    [Next example](208.md)</center>

================================================
FILE: md/208.md
================================================

```markdown
@author jackzhenguo
@desc
@tag
@version 
@date 2020/04/15
```
		     

<center>[Previous example](207.md)    [Next example](209.md)</center>

================================================
FILE: md/209.md
================================================

```markdown
@author jackzhenguo
@desc
@tag
@version 
@date 2020/04/16
```
		     

<center>[Previous example](208.md)    [Next example](210.md)</center>

================================================
FILE: md/21.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/2/20
```

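#### 21 nonlocal in nested functions

The keyword `nonlocal` is commonly used in nested functions, here to declare the variable `i` as non-local.
Without the declaration, `i += 1` makes `i` a local variable of `wrapper`; because `i += 1` reads `i` before it has been assigned in that scope, Python raises `UnboundLocalError`.

```python
import time

n = 3  # assumption: number of failures after which the elapsed time is printed

def excepter(f):
    i = 0
    t1 = time.time()
    def wrapper(): 
        try:
            f()
        except Exception as e:
            nonlocal i
            i += 1
            print(f'{e.args[0]}: {i}')
            t2 = time.time()
            if i == n:
                print(f'spending time:{round(t2-t1,2)}')
    return wrapper
```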
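
The effect of `nonlocal` is easiest to see in a pair of minimal closures, one with the declaration and one without. The function names below are hypothetical.

```python
def make_counter():
    count = 0

    def increment():
        nonlocal count   # rebind the variable of the enclosing function
        count += 1
        return count

    return increment

counter = make_counter()
print(counter(), counter(), counter())

def make_broken_counter():
    count = 0

    def increment():
        count += 1       # no nonlocal: count is treated as local here
        return count

    return increment

try:
    make_broken_counter()()
except UnboundLocalError as e:
    print("UnboundLocalError:", e)
```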

<center>[Previous example](20.md)    [Next example](22.md)</center>

================================================
FILE: md/210.md
================================================

```markdown
@author jackzhenguo
@desc
@tag
@version 
@date 2020/04/17
```
		     

<center>[Previous example](209.md)    [Next example](211.md)</center>

================================================
FILE: md/211.md
================================================

```markdown
@author jackzhenguo
@desc
@tag
@version 
@date 2020/04/18
```
		     

<center>[Previous example](210.md)    [Next example](212.md)</center>

================================================
FILE: md/212.md
================================================

```markdown
@author jackzhenguo
@desc
@tag
@version 
@date 2020/04/19
```
		     

<center>[Previous example](211.md)    [Next example](213.md)</center>

================================================
FILE: md/213.md
================================================

```markdown
@author jackzhenguo
@desc
@tag
@version 
@date 2020/04/20
```
		     

<center>[Previous example](212.md)    [Next example](214.md)</center>

================================================
FILE: md/214.md
================================================

```markdown
@author jackzhenguo
@desc
@tag
@version 
@date 2020/04/21
```
		     

<center>[Previous example](213.md)    [Next example](215.md)</center>

================================================
FILE: md/215.md
================================================

```markdown
@author jackzhenguo
@desc
@tag
@version 
@date 2020/04/22
```
		     

<center>[Previous example](214.md)    [Next example](216.md)</center>

================================================
FILE: md/216.md
================================================

```markdown
@author jackzhenguo
@desc
@tag
@version 
@date 2020/04/23
```
		     

<center>[Previous example](215.md)    [Next example](217.md)</center>

================================================
FILE: md/217.md
================================================

```markdown
@author jackzhenguo
@desc
@tag
@version 
@date 2020/04/24
```
		     

<center>[Previous example](216.md)    [Next example](218.md)</center>

================================================
FILE: md/218.md
================================================

```markdown
@author jackzhenguo
@desc
@tag
@version 
@date 2020/04/25
```
		     

<center>[Previous example](217.md)    [Next example](219.md)</center>

================================================
FILE: md/219.md
================================================

```markdown
@author jackzhenguo
@desc
@tag
@version 
@date 2020/04/26
```
		     

<center>[Previous example](218.md)    [Next example](220.md)</center>

================================================
FILE: md/22.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/2/23
```

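#### 22 Declaring global variables with global

First, why does `global` exist? A variable is referenced by several functions, and we want a global variable to be shared by all of them. Some readers may think this is simple enough and write: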

```python
i = 5
def f():
    print(i)

def g():
    print(i)
    pass

f()
g()

```

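Both `f` and `g` can share the variable `i`, and the program raises no error, so it is still unclear why `global` is needed.

But suppose we want a function that increments `i`, like this: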

```python
def h():
    i += 1

h()
```

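Running the program now fails with an `UnboundLocalError`. When compiling `i += 1`, Python treats `i` as a local variable of `h()`; since no value has been assigned to `i` inside the function, an error is raised.

`global` was introduced to solve this problem. Inside `h`, it explicitly tells the compiler that `i` is a global variable, so the compiler looks for `i` outside the function. After `i += 1` runs, `i` is still a global variable, with its value increased by 1: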

```python
i = 0
def h():
    global i
    i += 1

h()
print(i)
```

<center>[Previous example](21.md)    [Next example](23.md)</center>

================================================
FILE: md/220.md
================================================

```markdown
@author jackzhenguo
@desc
@tag
@version 
@date 2020/04/27
```
		     

<center>[Previous example](219.md)    [Next example](221.md)</center>

================================================
FILE: md/221.md
================================================

```markdown
@author jackzhenguo
@desc
@tag
@version 
@date 2020/04/28
```
		     

<center>[Previous example](220.md)    [Next example](222.md)</center>

================================================
FILE: md/222.md
================================================

```markdown
@author jackzhenguo
@desc
@tag
@version 
@date 2020/04/29
```
		     

<center>[Previous example](221.md)    [Next example](223.md)</center>

================================================
FILE: md/223.md
================================================

```markdown
@author jackzhenguo
@desc
@tag
@version 
@date 2020/04/30
```
		     

<center>[Previous example](222.md)    [Next example](224.md)</center>

================================================
FILE: md/224.md
================================================

```markdown
@author jackzhenguo
@desc
@tag
@version 
@date 2020/05/01
```
		     

<center>[Previous example](223.md)    [Next example](225.md)</center>

================================================
FILE: md/225.md
================================================

```markdown
@author jackzhenguo
@desc
@tag
@version 
@date 2020/05/02
```
		     

<center>[Previous example](224.md)    [Next example](226.md)</center>

================================================
FILE: md/226.md
================================================

```markdown
@author jackzhenguo
@desc
@tag
@version 
@date 2020/05/03
```
		     

<center>[Previous example](225.md)    [Next example](227.md)</center>

================================================
FILE: md/227.md
================================================

```markdown
@author jackzhenguo
@desc
@tag
@version 
@date 2020/05/04
```
		     

<center>[Previous example](226.md)    [Next example](228.md)</center>

================================================
FILE: md/228.md
================================================

```markdown
@author jackzhenguo
@desc
@tag
@version 
@date 2020/05/05
```
		     

<center>[Previous example](227.md)    [Next example](229.md)</center>

================================================
FILE: md/229.md
================================================

```markdown
@author jackzhenguo
@desc
@tag
@version 
@date 2020/05/06
```
		     

<center>[Previous example](228.md)    [Next example](230.md)</center>

================================================
FILE: md/23.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/2/23
```

#### 23 Swapping two elements

To understand how swapping two elements works, first understand packing and unpacking:

```python
In [1]: a=[3,1]                                                                 

# unpack
In [2]: a0,a1 = a                                                               

In [3]: a0                                                                      
Out[3]: 3

In [4]: a1                                                                      
Out[4]: 1

# pack
In [5]: b = a0, a1                                                              

In [6]: b                                                                       
Out[6]: (3, 1)

```

So swapping two elements with `b, a = a, b` actually packs `a, b` into the tuple `(a, b)` first, and then unpacks `(a, b)` into `b, a`.

```python
def swap(a, b):
    return b, a


print(swap(1, 0))  # (0,1)
```
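
The same idiom works directly on variables and on list elements, without a helper function. A short sketch:

```python
a, b = 1, 0
b, a = a, b          # the right side is packed first, then unpacked into b and a
print(a, b)

nums = [3, 1, 2]
nums[0], nums[2] = nums[2], nums[0]   # the same idiom works for list elements
print(nums)
```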

<center>[Previous example](22.md)    [Next example](24.md)</center>


================================================
FILE: md/230.md
================================================

```markdown
@author jackzhenguo
@desc
@tag
@version 
@date 2020/05/07
```
		     

<center>[Previous example](229.md)    [Next example](231.md)</center>

================================================
FILE: md/231.md
================================================

```markdown
@author jackzhenguo
@desc
@tag
@version 
@date 2020/05/08
```
		     

<center>[Previous example](230.md)    [Next example](232.md)</center>

================================================
FILE: md/232.md
================================================

```markdown
@author jackzhenguo
@desc
@tag
@version 
@date 2020/05/09
```
		     

<center>[Previous example](231.md)    [Next example](233.md)</center>

================================================
FILE: md/233.md
================================================

```markdown
@author jackzhenguo
@desc
@tag
@version 
@date 2020/05/10
```
		     

<center>[Previous example](232.md)    [Next example](234.md)</center>

================================================
FILE: md/24.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/2/25
```

#### 24 Working with function objects

```python
In [31]: def f():
    ...:     print('i\'m f')

In [32]: def g():
    ...:     print('i\'m g')

In [33]: [f,g][1]()
i'm g
```

Create a list of function objects and call the one you want by its index, which makes uniform dispatch convenient.
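
The same idea extends from a list to a dict, which gives dispatch by name instead of by index. A minimal sketch with hypothetical `add` and `sub` functions:

```python
def add(x, y):
    return x + y

def sub(x, y):
    return x - y

# Functions are ordinary objects, so they can be stored as dict values.
operations = {"+": add, "-": sub}

def calculate(op, x, y):
    return operations[op](x, y)

print(calculate("+", 5, 3))
print(calculate("-", 5, 3))
```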

<center>[Previous example](23.md)    [Next example](25.md)</center>

================================================
FILE: md/25.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/2/25
```

#### 25 Generate a reversed sequence

```python
list(range(10,-1,-1))
# [10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
```

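When the third argument is negative, the sequence counts down from the first argument and stops before the second argument (the bound itself is excluded).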

<center>[Previous example](24.md)    [Next example](26.md)</center>

================================================
FILE: md/26.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/2/27
```

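#### 26 Examples of the five kinds of function parameters

Using Python's five kinds of parameters: positional, keyword, default, variable positional, and variable keyword.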

```python
def f(a,*b,c=10,**d):
  print(f'a:{a},b:{b},c:{c},d:{d}')
```

*The default parameter `c` cannot come after the variable keyword parameter `d`.*

Call `f`:

```python
In [10]: f(1,2,5,width=10,height=20)
a:1,b:(2, 5),c:10,d:{'width': 10, 'height': 20}
```

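The arguments for the variable positional parameter `b` are collected into the tuple `(2, 5)`; `c` takes its default value 10; `d` is collected into a dict.

Call `f` again: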

```python
In [11]: f(a=1,c=12)
a:1,b:(),c:12,d:{}
```

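Passed as `a=1`, `a` is a keyword argument; `b` and `d` receive nothing; `c` receives 12 instead of its default value.

Note that the parameter `a` can be passed either as `f(1)` or as `f(a=1)`. The second form is more readable and is recommended. To force callers to use `f(a=1)`, add an **asterisk** in front: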

```python
def f(*,a,**b):
  print(f'a:{a},b:{b}')
```

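Calling `f(1)` now raises: `TypeError: f() takes 0 positional arguments but 1 was given`

Only `f(a=1)` works.

This shows the leading `*` at work: `a` can now only be passed by keyword. How can you check a parameter's kind? With Python's `inspect` module: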

```python
In [21]: from inspect import signature

In [22]: for name,val in signature(f).parameters.items():
    ...:     print(name,val.kind)
    ...:
a KEYWORD_ONLY
b VAR_KEYWORD
```

The kind of parameter `a` is `KEYWORD_ONLY`, meaning it can only be passed by keyword.

However, if `f` is defined as:

```python
def f(a,*b):
  print(f'a:{a},b:{b}')
```

Check the parameter kinds:

```python
In [24]: for name,val in signature(f).parameters.items():
    ...:     print(name,val.kind)
    ...:
a POSITIONAL_OR_KEYWORD
b VAR_POSITIONAL
```

Here parameter `a` can be passed either by position or by keyword.
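As a compact recap of the parameter kinds discussed above, the sketch below defines one function that uses all of them and lists the kind that `inspect.signature` reports for each. The function name `demo` and its parameter names are hypothetical and not part of the original example, and the positional-only marker `/` assumes Python 3.8 or later.

```python
from inspect import signature

# Hypothetical function, for illustration only ('/' requires Python 3.8+).
def demo(p, /, a, *b, c=10, **d):
    return p, a, b, c, d

# Map each parameter name to the name of its kind.
kinds = {name: param.kind.name
         for name, param in signature(demo).parameters.items()}

for name, kind in kinds.items():
    print(name, kind)

print(demo(1, 2, 3, c=4, x=5))
```

Parameters declared after `*b` (here `c`) are keyword-only, which is the same effect the bare `*` produced earlier.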

<center>[Previous example](25.md)    [Next example](27.md)</center>

================================================
FILE: md/27.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/2/27
```

#### 27 Using slice objects

Create a sequence that represents a cake, `cake1`:

```python
In [1]: cake1 = list(range(5,0,-1))

In [2]: b = cake1[1:10:2]

In [3]: b
Out[3]: [4, 2]

In [4]: cake1
Out[4]: [5, 4, 3, 2, 1]
```

Create another sequence:

```python
In [5]: from random import randint
   ...: cake2 = [randint(1,100) for _ in range(100)]
   ...: # cut the same way (indices 1 to 9, step 2) to get slice d
   ...: d = cake2[1:10:2]
In [6]: d
Out[6]: [75, 33, 63, 93, 15]
```

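We used the same cut on two different cakes, `cake1` and `cake2`. This way of cutting turns out to be a real `classic`, so we want to apply it to more container objects.

So why not wrap the cut itself in an object? That is exactly what a slice object is.

Defining a slice object is simple. Here is the cut above as a slice object: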

```python
perfect_cake_slice_way = slice(1,10,2)
# cut cake1 and cake2 with it
cake1_slice = cake1[perfect_cake_slice_way] 
cake2_slice = cake2[perfect_cake_slice_way]

In [11]: cake1_slice
Out[11]: [4, 2]

In [12]: cake2_slice
Out[12]: [75, 33, 63, 93, 15]
```

The results are the same as above.

`slice` objects work just as well for reverse slicing:

```python
a = [1,3,5,7,9,0,3,5,7]
a_ = a[5:1:-1]

named_slice = slice(5,1,-1)
a_slice = a[named_slice] 

In [14]: a_
Out[14]: [0, 9, 7, 5]

In [15]: a_slice
Out[15]: [0, 9, 7, 5]
```

When the same slice is used repeatedly, extract it into a named slice object: it can be reused and it makes the code more readable.
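One place where named slices tend to pay off is parsing fixed-width text. The sketch below is only an illustration: the record layout, the names `NAME`, `AGE`, `CITY`, and the sample data are all assumptions made up for this example.

```python
# Hypothetical fixed-width layout: 8 chars name, 3 chars age, rest is city.
NAME = slice(0, 8)
AGE = slice(8, 11)
CITY = slice(11, None)

records = [
    "xiaoming 21Beijing",
    "xiaohong 19Nanjing",
]

# The same named slices are applied to every record.
parsed = [(r[NAME].strip(), int(r[AGE]), r[CITY]) for r in records]
print(parsed)

# slice.indices(length) shows how a slice is clipped for a given length.
clipped = slice(1, 10, 2).indices(5)
print(clipped)
```

The call to `indices(5)` explains why `cake1[1:10:2]` works on a five-element list: the stop value is clipped to the length of the sequence.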

<center>[Previous example](26.md)    [Next example](28.md)</center>

================================================
FILE: md/28.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/3/2
```

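#### 28 An animated demonstration of lambda functions

Some readers have said they are not comfortable with `lambda` functions and asked for an explanation.

Take the `lambda` in the following function: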

```python
def max_len(*lists):
    return max(*lists, key=lambda v: len(v))
```

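Two points of confusion:

- What values does the parameter `v` take?
- Does a `lambda` function have a return value? If so, what is it?

Call the function above to find the longest of the following three lists: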

```python
r = max_len([1, 2, 3], [4, 5, 6, 7], [8])
print(f'The longest list is {r}')
```


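Conclusions:

- The parameter `v` takes each element of `lists` in turn, that is, each element of the argument `tuple`.

- The return value of the `lambda` function is the value of the expression after the colon in `lambda v:`.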
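To make both conclusions visible, the sketch below replaces the `lambda` with an equivalent named function that prints each value of `v` and what it returns. The names `max_len_verbose` and `key` are introduced here for illustration only.

```python
def max_len_verbose(*lists):
    # Plays the same role as `lambda v: len(v)`, but traces each call.
    def key(v):
        result = len(v)
        print(f'v={v} -> returns {result}')
        return result
    return max(*lists, key=key)

r = max_len_verbose([1, 2, 3], [4, 5, 6, 7], [8])
print(r)
```

`max` calls `key` once per list and returns the list whose key value is largest.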

<center>[Previous example](27.md)    [Next example](29.md)</center>

================================================
FILE: md/29.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/3/2
```

#### 29 Converting to a dict

Create a dictionary

```python
# Method 1: use dict
In [1]: dict()
Out[1]: {}
In [2]: dict(a='a',b='b')
Out[2]: {'a': 'a', 'b': 'b'}

# Method 2: zip
In [3]: dict(zip(['a','b'],[1,2]))
Out[3]: {'a': 1, 'b': 2}

# Method 3: a list of tuples
In [4]: dict([('a',1),('b',2)])
Out[4]: {'a': 1, 'b': 2}

# Method 4: a dict-like string
In [1]: s = "{'a':1, 'b':2}"
In [2]: eval(s)
Out[2]: {'a': 1, 'b': 2}
```
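Because `eval` executes any expression it is given, the string-based method is risky when the string comes from outside the program. A possible alternative, shown here as a sketch rather than as the author's method, is `ast.literal_eval` from the standard library, which only accepts Python literals.

```python
import ast

s = "{'a': 1, 'b': 2}"
d = ast.literal_eval(s)      # accepts Python literals only
print(d)

try:
    # A function call is not a literal, so it is refused instead of executed.
    ast.literal_eval("__import__('os').getcwd()")
    rejected = False
except ValueError:
    rejected = True
print(rejected)
```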

<center>[Previous example](28.md)    [Next example](30.md)</center>


================================================
FILE: md/3.md
================================================
```markdown
@author jackzhenguo
@desc Converting between integers and ASCII characters
@date 2019/2/10
```

#### 3 Converting between integers and ASCII characters

The `ASCII character` for a decimal integer
```python
In [1]: chr(65)
Out[1]: 'A'
```

The decimal integer for an `ASCII character`
```python
In [1]: ord('A')
Out[1]: 65
```

<center>[Previous example](2.md)    [Next example](4.md)</center>

================================================
FILE: md/30.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/3/4
```

#### 30 Frozen sets

Create a set that cannot be modified.

```python
In [1]: frozenset([1,1,3,2,3])
Out[1]: frozenset({1, 2, 3})
```

Because it cannot be modified, it has no `add` or `pop` methods like `set` does.

<center>[Previous example](29.md)    [Next example](31.md)</center>

================================================
FILE: md/31.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/3/4
```

#### 31 Converting to a set

Returns a set object; a set contains no duplicate elements:

```python
In [1]: a = [1,4,2,3,1]

In [2]: set(a)
Out[2]: {1, 2, 3, 4}

In [3]: b = set(a)

In [4]: b.add(5)

In [5]: b
Out[5]: {1, 2, 3, 4, 5}

In [6]: b.pop()
Out[6]: 1

In [7]: b
Out[7]: {2, 3, 4, 5}

In [8]: b.pop()
Out[8]: 2

In [9]: b
Out[9]: {3, 4, 5}

# Note: pop removes an arbitrary element from the set
In [10]: help(b.pop)
Help on built-in function pop:

pop(...) method of builtins.set instance
    Remove and return an arbitrary set element.
    Raises KeyError if the set is empty.
```

<center>[Previous example](30.md)    [Next example](32.md)</center>


================================================
FILE: md/32.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/3/4
```

#### 32 Converting to a tuple

`tuple()` converts an iterable into an immutable sequence

 ```python
 In [16]: i_am_list = [1,3,5]
 In [17]: i_am_tuple = tuple(i_am_list)
 In [18]: i_am_tuple
 Out[18]: (1, 3, 5)
 ```

<center>[Previous example](31.md)    [Next example](33.md)</center>

================================================
FILE: md/33.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/3/6
```

#### 33 Checking whether an object is callable

Check whether an object can be called

```python
In [1]: callable(str)
Out[1]: True

In [2]: callable(int)
Out[2]: True
```

```python
In [18]: class Student(): 
    ...:     def __init__(self,id,name): 
    ...:         self.id = id 
    ...:         self.name = name 
    ...:     def __repr__(self): 
    ...:         return 'id = '+self.id +', name = '+self.name 
    ...

In [19]: xiaoming = Student('001','xiaoming')                                   

In [20]: callable(xiaoming)                                                     
Out[20]: False
```
To make `xiaoming()` work, define the `__call__` method on the `Student` class:

```python
In [1]: class Student():
    ...:     def __init__(self,id,name):
    ...:         self.id = id
    ...:         self.name = name
    ...:     def __repr__(self):
    ...:         return 'id = '+self.id +', name = '+self.name
    ...:     def __call__(self):
    ...:         print('I can be called')
    ...:         print(f'my name is {self.name}')
    ...: 

In [2]: t = Student('001','xiaoming')

In [3]: t()
I can be called
my name is xiaoming
```

<center>[Previous example](32.md)    [Next example](34.md)</center>

================================================
FILE: md/34.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/3/10
```

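#### 34 Showing an object with ascii

`ascii()` calls the object's `__repr__` method and returns the result with any non-ASCII characters escaped. In the example below the return value is a string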

```python
>>> class Student():
    def __init__(self,id,name):
        self.id = id
        self.name = name
    def __repr__(self):
        return 'id = '+self.id +', name = '+self.name
```
Usage:
```python
>>> xiaoming = Student(id='1',name='xiaoming')
>>> xiaoming
id = 1, name = xiaoming
>>> ascii(xiaoming)
'id = 1, name = xiaoming'
```

<center>[Previous example](33.md)    [Next example](35.md)</center>

================================================
FILE: md/35.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/3/10
```

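#### 35 Class methods

A function decorated with `classmethod` can be called without creating an instance and does not take a `self` parameter.

Its first parameter must instead be `cls`, which refers to the class itself and can be used to access class attributes, call class methods, create instances, and so on.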

```python
In [1]: class Student():
    ...:     def __init__(self,id,name):
    ...:         self.id = id
    ...:         self.name = name
    ...:     def __repr__(self):
    ...:         return 'id = '+self.id +', name = '+self.name
    ...:     @classmethod
    ...:     def f(cls):
    ...:         print(cls)
```
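A common use of `cls` is an alternate constructor. The sketch below assumes a made-up text format `"id,name"`; the method name `from_string` is hypothetical and is not part of the original `Student` class.

```python
class Student:
    def __init__(self, id, name):
        self.id = id
        self.name = name

    def __repr__(self):
        return 'id = ' + self.id + ', name = ' + self.name

    @classmethod
    def from_string(cls, text):
        # Hypothetical alternate constructor: parses "id,name".
        id_, name = text.split(',')
        return cls(id_, name)   # cls is the class the method was called on

# Called on the class itself; no instance is needed beforehand.
xiaoming = Student.from_string('001,xiaoming')
print(xiaoming)
```

Because the method builds the object through `cls` rather than naming `Student` directly, a subclass calling `from_string` would get an instance of the subclass.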

<center>[Previous example](34.md)    [Next example](36.md)</center>

================================================
FILE: md/36.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/3/10
```

#### 36 Deleting attributes dynamically

```python
>>> class Student():
    def __init__(self,id,name):
        self.id = id
        self.name = name
    def __repr__(self):
        return 'id = '+self.id +', name = '+self.name
```
Usage:
```python
>>> xiaoming = Student(id='1',name='xiaoming')
>>> xiaoming
id = 1, name = xiaoming
>>> ascii(xiaoming)
'id = 1, name = xiaoming'
```

Delete an attribute of the object

```python
In [1]: delattr(xiaoming,'id')

In [2]: hasattr(xiaoming,'id')
Out[2]: False
```

<center>[Previous example](35.md)    [Next example](37.md)</center>

================================================
FILE: md/37.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/3/10
```

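#### 37 Listing all of an object's methods

Without an argument, `dir()` returns the names in the `current scope` (variables, functions, and defined types); with an argument, it returns the list of attributes and methods of that `argument`.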

```python
In [96]: dir(xiaoming)
Out[96]:
['__class__',
 '__delattr__',
 '__dict__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__le__',
 '__lt__',
 '__module__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 '__weakref__',
 
 'name']
```

<center>[Previous example](36.md)    [Next example](38.md)</center>

================================================
FILE: md/38.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/3/10
```

#### 38 Getting an object's attribute dynamically

Get an attribute of an object

```python
In [1]: class Student():
   ...:     def __init__(self,id,name):
   ...:         self.id = id
   ...:         self.name = name
   ...:     def __repr__(self):
   ...:         return 'id = '+self.id +', name = '+self.name

In [2]: xiaoming = Student(id='001',name='xiaoming')
In [3]: getattr(xiaoming,'name') # get the value of the name attribute of the xiaoming instance
Out[3]: 'xiaoming'
```
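`getattr` also accepts an optional third argument that is returned when the attribute does not exist. The sketch below reuses a reduced version of the `Student` class; the attribute name `address` and the default `'unknown'` are chosen only for illustration.

```python
class Student:
    def __init__(self, id, name):
        self.id = id
        self.name = name

xiaoming = Student(id='001', name='xiaoming')

# With a default, a missing attribute does not raise an error.
address = getattr(xiaoming, 'address', 'unknown')

# Attribute names can come from data, which is the point of "dynamic" access.
fields = {attr: getattr(xiaoming, attr) for attr in ('id', 'name')}

# Without a default, a missing attribute raises AttributeError.
try:
    getattr(xiaoming, 'address')
    missing = False
except AttributeError:
    missing = True

print(address, fields, missing)
```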

<center>[Previous example](37.md)    [Next example](39.md)</center>

================================================
FILE: md/39.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/3/10
```

#### 39 Checking whether an object has an attribute

```python
In [1]: class Student():
   ...:     def __init__(self,id,name):
   ...:         self.id = id
   ...:         self.name = name
   ...:     def __repr__(self):
   ...:         return 'id = '+self.id +', name = '+self.name

In [2]: xiaoming = Student(id='001',name='xiaoming')
In [3]: hasattr(xiaoming,'name')
Out[3]: True

In [4]: hasattr(xiaoming,'address')
Out[4]: False
```

<center>[Previous example](38.md)    [Next example](40.md)</center>

================================================
FILE: md/4.md
================================================
```markdown
@author jackzhenguo
@desc Checking that all elements are true
@date 2019/2/10
```

#### 4 Checking that all elements are true

Returns `True` if all elements are true, otherwise `False`
```python
In [5]: all([1,0,3,6])                                                          
Out[5]: False
```
```python
In [6]: all([1,2,3])                                                            
Out[6]: True
```

<center>[Previous example](3.md)    [Next example](5.md)</center>

================================================
FILE: md/40.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/3/10
```

#### 40 An object's house number (identity)

```python
In [1]: class Student():
   ...:     def __init__(self,id,name):
   ...:         self.id = id
   ...:         self.name = name
   ...:     def __repr__(self):
   ...:         return 'id = '+self.id +', name = '+self.name

In [2]: xiaoming = Student(id='001',name='xiaoming')
```

Return the object's identity, which in CPython is its memory address

```python
In [1]: id(xiaoming)
Out[1]: 98234208
```

<center>[Previous example](39.md)    [Next example](41.md)</center>

================================================
FILE: md/41.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/3/13
```

#### 41 isinstance

Check whether *object* is an instance of the class *classinfo*; returns `True` if it is

```python
In [1]: class Student():
   ...:     def __init__(self,id,name):
   ...:         self.id = id
   ...:         self.name = name
   ...:     def __repr__(self):
   ...:         return 'id = '+self.id +', name = '+self.name

In [2]: xiaoming = Student(id='001',name='xiaoming')

In [3]: isinstance(xiaoming,Student)
Out[3]: True
```

<center>[Previous example](40.md)    [Next example](42.md)</center>

================================================
FILE: md/42.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/3/13
```

#### 42 issubclass: checking parent-child relationships

```python
In [1]: class undergraduate(Student):
    ...:     def studyClass(self):
    ...:         pass
    ...:     def attendActivity(self):
    ...:         pass

In [2]: issubclass(undergraduate,Student)
Out[2]: True

In [3]: issubclass(object,Student)
Out[3]: False

In [4]: issubclass(Student,object)
Out[4]: True
```

If class is a subclass of any element of the classinfo tuple, `True` is also returned

```python
In [1]: issubclass(int,(int,float))
Out[1]: True
```

<center>[Previous example](41.md)    [Next example](43.md)</center>

================================================
FILE: md/43.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/3/15
```

#### 43 The root of all objects

object is the base class of all classes

```python
In [1]: o = object()

In [2]: type(o)
Out[2]: object
```

<center>[Previous example](42.md)    [Next example](44.md)</center>

================================================
FILE: md/44.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/3/16
```

#### 44 Two ways to create a property

`property` returns a property attribute. Typical usage:

```python
class C:
    def __init__(self):
        self._x = None

    def getx(self):
        return self._x

    def setx(self, value):
        self._x = value

    def delx(self):
        del self._x
    # create the property attribute with the property class
    x = property(getx, setx, delx, "I'm the 'x' property.")
```

The same effect can be achieved with Python decorators:

```python
class C:
    def __init__(self):
        self._x = None

    @property
    def x(self):
        return self._x

    @x.setter
    def x(self, value):
        self._x = value

    @x.deleter
    def x(self):
        del self._x
```
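Neither definition shows the property in use, so here is a short sketch of what happens on read, write, and delete. It repeats the decorator version of `C` so that it runs on its own; the value `42` is arbitrary.

```python
class C:
    def __init__(self):
        self._x = None

    @property
    def x(self):
        """I'm the 'x' property."""
        return self._x

    @x.setter
    def x(self, value):
        self._x = value

    @x.deleter
    def x(self):
        del self._x

c = C()
c.x = 42                       # runs the setter
value = c.x                    # runs the getter
del c.x                        # runs the deleter
gone = not hasattr(c, '_x')    # the underlying attribute has been removed
print(value, gone)
```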

<center>[Previous example](43.md)    [Next example](45.md)</center>

================================================
FILE: md/45.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/3/10
```

#### 45 Checking an object's type

*class* `type`(*name*, *bases*, *dict*)

When called with one argument, returns the type of *object*:

```python
In [1]: class Student():
   ...:     def __init__(self,id,name):
   ...:         self.id = id
   ...:         self.name = name
   ...:     def __repr__(self):
   ...:         return 'id = '+self.id +', name = '+self.name
   ...: 

In [2]: xiaoming = Student(id='001',name='xiaoming')
In [3]: type(xiaoming)
Out[3]: __main__.Student

In [4]: type(tuple())
Out[4]: tuple
```

<center>[Previous example](44.md)    [Next example](46.md)</center>

================================================
FILE: md/46.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/3/4
```

#### 46 An introduction to metaclasses

`xiaoming`, `xiaohong`, and `xiaozhang` are all students; this group is called `Student`.

The usual way to define a class in Python is with the `class` keyword

```python
In [36]: class Student(object):
    ...:     pass
```

`xiaoming`, `xiaohong`, and `xiaozhang` are instances of the class:

```python
xiaoming = Student()
xiaohong = Student()
xiaozhang = Student()
```

Once created, the `__class__` attribute of xiaoming returns the `Student` class

```python
In [38]: xiaoming.__class__
Out[38]: __main__.Student
```

The question is: does the `Student` class have a `__class__` attribute too, and if so, what does it return?

```python
In [39]: xiaoming.__class__.__class__
Out[39]: type
```

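No error is raised, and the result is `type`

So we may guess that the type of the `Student` class is `type`

In other words, the `Student` class is itself an **object**, and its type is `type`

So in Python everything is an object, and **classes are objects too**

In Python, the class that describes the `Student` class is called a metaclass.

Following this logic, the class that describes a metaclass would be called a *meta-metaclass*. Just kidding: the class that describes a metaclass is also called a metaclass.

A sharp reader will ask: since the `Student` class can create instances, can the `type` class create instances too? If it can, the instances it creates are classes. Well spotted!

That is right: the `type` class can certainly create instances, for example the `Student` class.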

```python
In [40]: Student = type('Student',(),{})

In [41]: Student
Out[41]: __main__.Student
```

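It behaves the same as the `Student` class created with the `class` keyword.

Because Python classes are also objects, they can be handled much like the objects `xiaoming` and `xiaohong`. They support:

- assignment
- copying
- adding attributes
- being passed as function arguments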

```python
In [43]: StudentMirror = Student # assign the class directly
In [44]: Student.class_property = 'class_property' # add a class attribute
In [46]: hasattr(Student, 'class_property')
Out[46]: True
```
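The three-argument form of `type` also accepts base classes and a dict of attributes. The sketch below builds a small hierarchy without the `class` keyword; the names `Person`, `greet`, and the attribute values are assumptions made up for this illustration.

```python
def greet(self):
    return f'hello, {self.name}'

# type(name, bases, dict): the dict supplies attributes and methods.
Person = type('Person', (), {'name': 'anonymous', 'greet': greet})
# The bases tuple supplies parent classes.
Student = type('Student', (Person,), {'school': 'hypothetical school'})

s = Student()
s.name = 'xiaoming'
print(s.greet())
print(type(Student), issubclass(Student, Person))
```

Both classes are instances of `type`, which is what makes `type` their metaclass.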

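Metaclasses are not used all that often, and knowing this much may be enough for many situations. Even `Tim Peters`, a leading figure in the Python community, said:

"Metaclasses are deeper magic than 99% of users should ever worry about."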

<center>[Previous example](45.md)    [Next example](47.md)</center>

================================================
FILE: md/47.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/3/4
```

#### 47 Enumerate objects

Returns an enumerate object; each call to `next()` on it returns a tuple.

```python
In [1]: s = ["a","b","c"]
    ...: for i ,v in enumerate(s,1):
    ...:     print(i,v)
    ...:
1 a
2 b
3 c
```

<center>[Previous example](46.md)    [Next example](48.md)</center>

================================================
FILE: md/48.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/3/4
```

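#### 48 Checking how many bytes a variable uses
`getsizeof` reports the size of an object in bytes (the object itself, not the objects it refers to).
Observation: the dict takes more space than the list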

```python
In [1]: import sys

In [3]: a = [('a',1),('b',2)]                                                                                                             

In [5]: sys.getsizeof(a)                                                        
Out[5]: 88


In [6]: a = {'a':1,'b':2.0}                                                     
In [7]: sys.getsizeof(a)                                                        
Out[7]: 248

```

<center>[Previous example](47.md)    [Next example](49.md)</center>


================================================
FILE: md/49.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/3/4
```

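#### 49 Filter

Define the filter condition in a function; `filter` iterates over the elements and keeps those for which the function returns a true value: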

```python
In [1]: fil = filter(lambda x: x>10,[1,11,2,45,7,6,13])

In [2]: list(fil)
Out[2]: [11, 45, 13]
```

<center>[Previous example](48.md)    [Next example](50.md)</center>

================================================
FILE: md/5.md
================================================
```markdown
@author jackzhenguo
@desc Checking that at least one element is true
@date 2019/2/10
```

#### 5 Checking that at least one element is true

Returns `True` if at least one element is true, otherwise `False`
```python
In [7]: any([0,0,0,[]])                                                         
Out[7]: False
```

```python
In [8]: any([0,0,1])                                                            
Out[8]: True
```

<center>[Previous example](4.md)    [Next example](6.md)</center>

================================================
FILE: md/50.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/3/4
```

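#### 50 Getting an object's hash value

Returns the hash value of an object. Note that instances of user-defined classes are hashable by default, whereas mutable objects such as `list`, `dict`, and `set` are unhashable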

  ```python
In [1]: hash(xiaoming)
Out[1]: 6139638

In [2]: hash([1,2,3])
# TypeError: unhashable type: 'list'
  ```
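One caveat about user-defined classes: a class that defines `__eq__` without also defining `__hash__` gets `__hash__` set to `None`, so its instances become unhashable. The sketch below illustrates this with two hypothetical classes, `Point` and `HashablePoint`, which are not part of the original examples.

```python
class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __eq__(self, other):
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)
    # No __hash__ here: defining __eq__ alone makes instances unhashable.

class HashablePoint(Point):
    def __hash__(self):
        # Hash the same fields that __eq__ compares.
        return hash((self.x, self.y))

try:
    hash(Point(1, 2))
    point_hashable = True
except TypeError:
    point_hashable = False

# Equal, hashable points collapse to a single set element.
unique_count = len({HashablePoint(1, 2), HashablePoint(1, 2)})
print(point_hashable, unique_count)
```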

<center>[Previous example](49.md)    [Next example](51.md)</center>

================================================
FILE: md/51.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/3/4
```

#### 51 One-step help with help

Returns the help documentation for an object

```python
In [1]: help(xiaoming)
Help on Student in module __main__ object:

class Student(builtins.object)
 |  Methods defined here:
 |
 |  __init__(self, id, name)
 |
 |  __repr__(self)
 |
 |  Data descriptors defined here:
 |
 |  __dict__
 |      dictionary for instance variables (if defined)
 |
 |  __weakref__
 |      list of weak references to the object (if defined)
```

<center>[Previous example](50.md)    [Next example](52.md)</center>

================================================
FILE: md/52.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/3/4
```

#### 52 Getting user input

Get what the user types

```python
In [1]: input()
aa
Out[1]: 'aa'
```

<center>[Previous example](51.md)    [Next example](53.md)</center>

================================================
FILE: md/53.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/3/8
```

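#### 53 Creating an iterator

`iter(obj, sentinel)` returns an iterator. `sentinel` may be omitted; when it is given, `obj` must be callable, and iteration stops as soon as a call returns the sentinel value.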

```python
In [1]: lst = [1,3,5]

In [2]: for i in iter(lst):
    ...:     print(i)
    ...:
1
3
5
```

```python
In [1]: class TestIter(object):
    ...:     def __init__(self):
    ...:         self.l=[1,3,2,3,4,5]
    ...:         self.i=iter(self.l)
    ...:     def __call__(self):  # instances of a class that defines __call__ are callable
    ...:         item = next(self.i)
    ...:         print ("__call__ is called,which would return",item)
    ...:         return item
    ...:     def __iter__(self): # supports the iteration protocol (defines __iter__())
    ...:         print ("__iter__ is called!!")
    ...:         return iter(self.l)
In [2]: t = TestIter()
In [3]: t() # t can be called because __call__ is implemented
__call__ is called,which would return 1
Out[3]: 1

In [4]: for e in TestIter(): # a TestIter instance can be iterated because __iter__ is implemented
    ...:     print(e)
    ...: 
__iter__ is called!!
1
3
2
3
4
5
```
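The examples above only use the one-argument form. As a sketch of the two-argument form `iter(callable, sentinel)`, the snippet below reads lines from an in-memory stream until a marker line appears; the stream content and the `'STOP\n'` marker are made up for this illustration.

```python
import io

stream = io.StringIO('life\nis\nshort\nSTOP\nignored\n')

# iter(callable, sentinel): keep calling stream.readline()
# until it returns exactly 'STOP\n'.
lines = [line.strip() for line in iter(stream.readline, 'STOP\n')]
print(lines)
```

The sentinel itself is not included in the result, and nothing after it is read.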

<center>[Previous example](52.md)    [Next example](54.md)</center>

================================================
FILE: md/54.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/3/20
```

#### 54 Reading and writing files, and the table of mode values

`open` returns a file object

```python
In [1]: fo = open('D:/a.txt',mode='r', encoding='utf-8')

In [2]: fo.read()
Out[2]: '\ufefflife is not so long,\nI use Python to play.'
In [3]: fo.close() # close the file object
```

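Table of mode values:

| Character | Meaning |
| :---- | :------------------------------- |
| `'r'` | open for reading (default) |
| `'w'` | open for writing, truncating the file first |
| `'x'` | open for exclusive creation, failing if the file already exists |
| `'a'` | open for writing, appending to the end of the file if it exists |
| `'b'` | binary mode |
| `'t'` | text mode (default) |
| `'+'` | open for updating (reading and writing) |

Reading files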

```python
import os

# create a directory if it does not exist yet
def mkdir(path):
    isexists = os.path.exists(path)
    if not isexists:
        os.mkdir(path)

# read a file and return its contents
def openfile(filename):
    f = open(filename)
    fllist = f.read()
    f.close()
    return fllist  # return what was read

Writing files

```python
# write to a file
# example1
# 'w': if the file exists, truncate it and then write; otherwise create it
f = open(r"./data/test.txt", "w", encoding="utf-8")
print(f.write("测试文件写入"))
f.close()

# example2
# 'a': if the file exists, append to the end; otherwise create it
f = open(r"./data/test.txt", "a", encoding="utf-8")
print(f.write("测试文件写入"))
f.close()

# example3
# the with statement closes the file automatically, even if an exception is raised
with open(r"./data/test.txt", "w") as f:
    f.write("hello world!")
```
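The write examples depend on a `./data` directory that may not exist. As a self-contained sketch of the `'w'`, `'a'`, `'r'`, and `'x'` modes from the table, the snippet below works inside a temporary directory; the file name `test.txt` and the written text are arbitrary choices for this illustration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'test.txt')   # hypothetical file name

    with open(path, 'w', encoding='utf-8') as f:   # 'w' creates or truncates
        f.write('first line\n')
    with open(path, 'a', encoding='utf-8') as f:   # 'a' appends at the end
        f.write('second line\n')
    with open(path, 'r', encoding='utf-8') as f:   # 'r' reads
        content = f.read()

    try:
        open(path, 'x')                            # 'x' refuses an existing file
        x_failed = False
    except FileExistsError:
        x_failed = True

print(content)
print(x_failed)
```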

<center>[Previous example](53.md)    [Next example](55.md)</center>

================================================
FILE: md/55.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/3/10
```

#### 55 Creating a range sequence

1) range(stop)
2) range(start, stop[,step])

Creates an immutable sequence:

```python
In [1]: range(11)
Out[1]: range(0, 11)

In [2]: range(0,11,1)
Out[2]: range(0, 11)
```

<center>[Previous example](54.md)    [Next example](56.md)</center>

================================================
FILE: md/56.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/3/20
```

#### 56 The reverse iterator reversed

```python
In [1]: rev = reversed([1,4,2,3,1])

In [2]: for i in rev:
     ...:     print(i)
     ...:
1
3
2
4
1
```

<center>[Previous example](55.md)    [Next example](57.md)</center>

================================================
FILE: md/57.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/3/20
```

#### 57 The zip iterator

Creates an iterator that aggregates elements from each of the iterables:

```python
In [1]: x = [3,2,1]
In [2]: y = [4,5,6]
In [3]: list(zip(y,x))
Out[3]: [(4, 3), (5, 2), (6, 1)]

In [4]: a = range(5)
In [5]: b = list('abcde')
In [6]: b
Out[6]: ['a', 'b', 'c', 'd', 'e']
In [7]: [str(y) + str(x) for x,y in zip(a,b)]
Out[7]: ['a0', 'b1', 'c2', 'd3', 'e4']
```

<center>[Previous example](56.md)    [Next example](58.md)</center>

================================================
FILE: md/58.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/3/23
```

#### 58 An example of using operator

```python
from operator import (add, sub)


def add_or_sub(a, b, oper):
    return (add if oper == '+' else sub)(a, b)


add_or_sub(1, 2, '-')  # -1
```

<center>[Previous example](57.md)    [Next example](59.md)</center>

================================================
FILE: md/59.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/3/25
```

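#### 59 Transferring JSON objects

Object serialization is the process of converting an in-memory object into a form that can be stored or transmitted. In many scenarios it is inconvenient to transmit a class instance directly.

Once the object is serialized, things become much easier, because by convention calls between interfaces and web requests usually carry JSON strings.

In practice, it is usually class instances that get serialized. First create a Student type and two instances of it.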

```python
class Student():
    def __init__(self,**args):
        self.ids = args['ids']
        self.name = args['name']
        self.address = args['address']
xiaoming = Student(ids = 1,name = 'xiaoming',address = '北京')
xiaohong = Student(ids = 2,name = 'xiaohong',address = '南京')
```

Import the json module and call the dump method to serialize the list [xiaoming,xiaohong] to the file json.txt.

```python
import json

with open('json.txt', 'w', encoding='utf-8') as f:
    json.dump([xiaoming,xiaohong], f, default=lambda obj: obj.__dict__, ensure_ascii=False, indent=2, sort_keys=True)
```

The resulting file content is as follows:

```json
[
  {
    "address": "北京",
    "ids": 1,
    "name": "xiaoming"
  },
  {
    "address": "南京",
    "ids": 2,
    "name": "xiaohong"
  }
]
```
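Serialization is usually paired with the reverse step. As a sketch, and assuming the receiver knows that every JSON object represents a `Student`, `json.loads` can rebuild instances through its `object_hook` parameter. The snippet uses `dumps`/`loads` on a string instead of a file so that it runs on its own.

```python
import json

class Student:
    def __init__(self, **args):
        self.ids = args['ids']
        self.name = args['name']
        self.address = args['address']

xiaoming = Student(ids=1, name='xiaoming', address='北京')

# Serialize: objects json cannot handle are passed to `default`.
text = json.dumps(xiaoming, default=lambda obj: obj.__dict__, ensure_ascii=False)

# Deserialize: every decoded JSON object (dict) is passed to `object_hook`.
restored = json.loads(text, object_hook=lambda d: Student(**d))

print(text)
print(restored.ids, restored.name, restored.address)
```

This simple hook only works because the data contains nothing but flat `Student` objects; nested or mixed structures would need a hook that inspects the dict first.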

<center>[Previous example](58.md)    [Next example](60.md)</center>

================================================
FILE: md/6.md
================================================
```markdown
@author jackzhenguo
@desc Testing truthiness
@date 2019/2/12
```

#### 6 Testing truthiness

Test whether an object is truthy (True) or falsy (False).
```python
In [9]: bool([0,0,0])                                                           
Out[9]: True

In [10]: bool([])                                                               
Out[10]: False

In [11]: bool([1,0,1])                                                          
Out[11]: True
```


<center>[Previous example](5.md)    [Next example](7.md)</center>

================================================
FILE: md/60.md
================================================
#### 60 A calculator without if and else

```python
from operator import add, sub, mul, truediv, pow


def calculator(a, b, k):
    return {
        '+': add,
        '-': sub,
        '*': mul,
        '/': truediv,
        '**': pow
    }[k](a, b)


calculator(1, 2, '+')  # 3
calculator(3, 4, '**')  # 81
```

<center>[Previous example](59.md)    [Next example](61.md)</center>

================================================
FILE: md/61.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/3/25
```

#### 61 Mean after dropping the highest and lowest values

```python
def score_mean(lst):
    lst = sorted(lst)  # sort a copy so the caller's list is not modified
    lst2=lst[1:(len(lst)-1)]  # drop the lowest and the highest score
    return round((sum(lst2)/len(lst2)),1)

lst=[9.1, 9.0,8.1, 9.7, 19,8.2, 8.6,9.8]
score_mean(lst) # 9.1
```

<center>[Previous example](60.md)    [Next example](62.md)</center>

================================================
FILE: md/62.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/4/2
```

#### 62 Printing the 9x9 multiplication table

Print a multiplication table in the following format

```python
1*1=1
1*2=2   2*2=4
1*3=3   2*3=6   3*3=9
1*4=4   2*4=8   3*4=12  4*4=16
1*5=5   2*5=10  3*5=15  4*5=20  5*5=25
1*6=6   2*6=12  3*6=18  4*6=24  5*6=30  6*6=36
1*7=7   2*7=14  3*7=21  4*7=28  5*7=35  6*7=42  7*7=49
1*8=8   2*8=16  3*8=24  4*8=32  5*8=40  6*8=48  7*8=56  8*8=64
1*9=9   2*9=18  3*9=27  4*9=36  5*9=45  6*9=54  7*9=63  8*9=72  9*9=81
```

There are 9 rows in total, and column `j` of row `i` is `j*i`,

where

 `i` ranges over `1<=i<=9`

 `j` ranges over `1<=j<=i`

Turning the description above into code:

```python
for i in range(1, 10):
    for j in range(1, i+1):
        print('%d*%d=%d' % (j, i, j * i) , end="\t")
    print()
```

<center>[Previous example](61.md)    [Next example](63.md)</center>

================================================
FILE: md/63.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/4/2
```

#### 63 A recursive flatten function

Given the following array:

```
[[[1,2,3],[4,5]]]
```

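how can it be fully flattened into one dimension? The `flatten` in this small example is recursive; its two parameters are the array to flatten and the output array.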

```python
from collections.abc import Iterable

def flatten(lst, out_lst=None):
    if out_lst is None:
        out_lst = []
    for i in lst:
        if isinstance(i, Iterable): # check whether i is iterable
            flatten(i, out_lst)  # recurse into the nested element
        else:
            out_lst.append(i)    # collect the result
    return out_lst
```

调用`flatten`:

```python
print(flatten([[1,2,3],[4,5]]))
print(flatten([[1,2,3],[4,5]], [6,7]))
print(flatten([[[1,2,3],[4,5,6]]]))
# Output:
[1, 2, 3, 4, 5]
[6, 7, 1, 2, 3, 4, 5]
[1, 2, 3, 4, 5, 6]
```
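One limitation worth knowing: strings are iterable too, and iterating a one-character string yields that same string, so the function above would recurse without end on string elements. The variant below is a sketch of one possible way to handle this by treating `str` and `bytes` as atoms; the name `flatten_safe` is introduced here for illustration.

```python
from collections.abc import Iterable

def flatten_safe(lst, out_lst=None):
    if out_lst is None:
        out_lst = []
    for i in lst:
        # Treat str and bytes as atoms instead of descending into them.
        if isinstance(i, Iterable) and not isinstance(i, (str, bytes)):
            flatten_safe(i, out_lst)
        else:
            out_lst.append(i)
    return out_lst

result = flatten_safe([['ab', [1, 2]], ('c', 3)])
print(result)
```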

The `flatten` in numpy differs subtly from the function above:

```python
import numpy
b = numpy.array([[1,2,3],[4,5]], dtype=object)  # ragged rows need dtype=object in recent NumPy
b.flatten()
array([list([1, 2, 3]), list([4, 5])], dtype=object)
```

<center>[Previous example](62.md)    [Next example](64.md)</center>

================================================
FILE: md/64.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/4/2
```

#### 64 Splitting a list into equal chunks

```python
from math import ceil

def divide(lst, size):
    if size <= 0:
        return [lst]
    return [lst[i * size:(i+1)*size] for i in range(0, ceil(len(lst) / size))]
```

Test example:

```python
r = divide([1, 3, 5, 7, 9], 2)
print(r)  # [[1, 3], [5, 7], [9]]

r = divide([1, 3, 5, 7, 9], 0)
print(r)  # [[1, 3, 5, 7, 9]]

r = divide([1, 3, 5, 7, 9], -3)
print(r)  # [[1, 3, 5, 7, 9]]
```


<center>[Previous example](63.md)    [Next example](65.md)</center>

================================================
FILE: md/65.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/4/6
```

#### 65 Compacting a list

```python
def filter_false(lst):
    return list(filter(bool, lst))


r = filter_false([None, 0, False, '', [], 'ok', [1, 2]])
print(r)  # ['ok', [1, 2]]

```

<center>[Previous example](64.md)    [Next example](66.md)</center>

================================================
FILE: md/66.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/4/6
```

#### 66 Finding the longest list

```python
def max_length(*lst):
    return max(*lst, key=lambda v: len(v))


r = max_length([1, 2, 3], [4, 5, 6, 7], [8])
print(f'The longest list is {r}')  # [4, 5, 6, 7]

r = max_length([1, 2, 3], [4, 5, 6, 7], [8, 9])
print(f'The longest list is {r}')  # [4, 5, 6, 7]
```

<center>[Previous example](65.md)    [Next example](67.md)</center>

================================================
FILE: md/67.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/4/6
```

#### 67 Finding the mode

```python
def top1(lst):
    return max(lst, default='the list is empty', key=lambda v: lst.count(v))
```

Test example:

```python
lst = [1, 3, 3, 2, 1, 1, 2]
r = top1(lst)
print(f'The most frequent element in {lst} is: {r}')
# The most frequent element in [1, 3, 3, 2, 1, 1, 2] is: 1
```
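Calling `lst.count(v)` for every element makes `top1` quadratic in the length of the list. A possible alternative, sketched here with `collections.Counter` from the standard library, counts all elements in one pass; the function name `top1_counter` and its `default` parameter are assumptions for this illustration.

```python
from collections import Counter

def top1_counter(lst, default=None):
    # Return the most frequent element, or `default` for an empty list.
    if not lst:
        return default
    return Counter(lst).most_common(1)[0][0]

r = top1_counter([1, 3, 3, 2, 1, 1, 2])
print(r)
```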


<center>[Previous example](66.md)    [Next example](68.md)</center>

================================================
FILE: md/68.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/4/10
```

#### 68 The maximum across several lists
```python 
def max_lists(*lst):
    return max(max(*lst, key=lambda v: max(v)))


r = max_lists([1, 2, 3], [6, 7, 8], [4, 5])
print(r)  # 8
```

<center>[Previous example](67.md)    [Next example](69.md)</center>

================================================
FILE: md/69.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/4/10
```

#### 69 Checking a list for duplicates

```python
def has_duplicates(lst):
    return len(lst) != len(set(lst))


x = [1, 1, 2, 2, 3, 2, 3, 4, 5, 6]
y = [1, 2, 3, 4, 5]
has_duplicates(x)  # True
has_duplicates(y)  # False
```

<center>[Previous example](68.md)    [Next example](70.md)</center>

================================================
FILE: md/7.md
================================================
```markdown
@author jackzhenguo
@desc Creating a complex number
@date 2019/2/13
```

#### 7 Creating a complex number

Create a complex number

```python
In [1]: complex(1,2)
Out[1]: (1+2j)
```


<center>[Previous example](6.md)    [Next example](8.md)</center>

================================================
FILE: md/70.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/4/10
```

#### 70 Reversing a list in one line

```python
def reverse(lst):
    return lst[::-1]


r = reverse([1, -2, 3, 4, 1, 2])
print(r)  # [2, 1, 4, 3, -2, 1]
```

<center>[Previous example](69.md)    [Next example](71.md)</center>

================================================
FILE: md/71.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/4/14
```

#### 71 Arithmetic sequences of floats

```python
def float_range(start, stop, n):
    start,stop,n = float('%.2f' % start), float('%.2f' % stop),int('%.d' % n)
    step = (stop-start)/n
    lst = [start]
    while n > 0:
        start,n = start+step,n-1
        lst.append(round((start), 2))
    return lst
```

Test example:

```python
float_range(1, 8, 10) 
# [1.0, 1.7, 2.4, 3.1, 3.8, 4.5, 5.2, 5.9, 6.6, 7.3, 8.0]
```


<center>[Previous example](70.md)    [Next example](72.md)</center>

================================================
FILE: md/72.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/4/15
```

#### 72 Grouping by a condition

```python
def bif_by(lst, f):
    return [ [x for x in lst if f(x)],[x for x in lst if not f(x)]]
```

Example:

```python
records = [25,89,31,34] 
bif_by(records, lambda x: x<80) # [[25, 31, 34], [89]]
```


<center>[Previous example](71.md)    [Next example](73.md)</center>

================================================
FILE: md/73.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/4/15
```

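#### 73 Element-wise vector operations with map

`map` accepts several iterables and applies the function to their items in parallel:

`map(function, iterable1, iterable2)`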

```python
lst1 = [1, 2, 3, 4, 5, 6]
lst2 = [3, 4, 5, 6, 3, 2]
list(map(lambda x, y: x*y + 1, lst1, lst2))
# [4, 9, 16, 25, 16, 13]
```
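
One detail that may matter when passing several iterables: `map` stops as soon as the shortest one is exhausted. The following is a minimal sketch with made-up lists `a` and `b` (not part of the original example):

```python
from operator import mul

a = [1, 2, 3, 4]
b = [10, 20]  # deliberately shorter than a

# map pairs the items position by position and stops at the shortest input
products = list(map(mul, a, b))
print(products)  # [10, 40]
```

If silently dropping the extra items is not wanted, checking the lengths before calling `map` is one simple option.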

<center>[Previous example](72.md)    [Next example](74.md)</center>

================================================
FILE: md/74.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/4/15
```

#### 74 Dictionary entries with the largest value

```python
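def max_pairs(dic):
    # An empty dict has no maximum; return an empty list so the return type stays a list
    if len(dic) == 0:
        return []
    max_val = max(dic.values())
    # Keep every (key, value) pair whose value equals the maximum
    return [item for item in dic.items() if item[1] == max_val]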
```

Example:

```python
r = max_pairs({'a': -10, 'b': 5, 'c': 3, 'd': 5})
print(r)  # [('b', 5), ('d', 5)]
```


<center>[Previous example](73.md)    [Next example](75.md)</center>

================================================
FILE: md/75.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/4/15
```

#### 75 Merge two dictionaries

Since Python 3.5, two dictionaries can be merged in a single line with dictionary unpacking:

```python
def merge_dict(dic1, dic2):
    return {**dic1, **dic2} 
```

Test:

```python
merge_dict({'a': 1, 'b': 2}, {'c': 3}) 
# {'a': 1, 'b': 2, 'c': 3}
```
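
The test above uses disjoint keys. As a small sketch of what happens when a key occurs in both inputs (the dictionaries `left` and `right` are made up for illustration): the value from the right-hand dictionary wins, and neither input is modified.

```python
def merge_dict(dic1, dic2):
    # Later unpacking overrides earlier keys
    return {**dic1, **dic2}


left = {'a': 1, 'b': 2}
right = {'b': 20, 'c': 3}
merged = merge_dict(left, right)
print(merged)  # {'a': 1, 'b': 20, 'c': 3}
print(left)    # {'a': 1, 'b': 2}, the inputs stay untouched
```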


<center>[Previous example](74.md)    [Next example](76.md)</center>

================================================
FILE: md/76.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/4/19
```

#### 76 Top-n keys of a dictionary

Return the keys of dictionary d that hold the n largest values:

```python
from heapq import nlargest
def topn_dict(d, n):
    return nlargest(n, d, key=lambda k: d[k])
```

Test:

```python
topn_dict({'a': 10, 'b': 8, 'c': 9, 'd': 10}, 3)  
# ['a', 'd', 'c']
```


<center>[Previous example](75.md)    [Next example](77.md)</center>

================================================
FILE: md/77.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/4/19
```

#### 77 Anagrams

Two strings that contain the same letters in a different order are anagrams of each other.

```python
from collections import Counter

# Two strings are anagrams when every character occurs equally often in both

def anagram(str1, str2):
    return Counter(str1) == Counter(str2)

anagram('eleven+two', 'twelve+one')  # True, a remarkable pair of anagrams
anagram('eleven', 'twelve')  # False
```

<center>[Previous example](76.md)    [Next example](78.md)</center>

================================================
FILE: md/78.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/4/23
```

#### 78 Merge dictionaries logically

(1) Two ways to merge dictionaries

This is the usual way to merge dictionaries:

```python
dic1 = {'x': 1, 'y': 2 }
dic2 = {'y': 3, 'z': 4 }
merged1 = {**dic1, **dic2} # {'x': 1, 'y': 3, 'z': 4}
```

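Setting `merged1['x'] = 10` leaves the value of `x` in `dic1` unchanged, because `merged1` is a newly created dictionary.

`ChainMap` behaves differently: internally it keeps a list that references the original dictionaries instead of copying them. After merging with `ChainMap`, setting `merged2['x'] = 10` therefore changes the value of `x` in `dic1` as well. The merged view is created like this: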

```python
from collections import ChainMap
merged2 = ChainMap(dic1,dic2)
print(merged2) # ChainMap({'x': 1, 'y': 2}, {'y': 3, 'z': 4})
```
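
To make the difference concrete, here is a small sketch that performs the assignment described above on both kinds of merge. It also shows two further `ChainMap` behaviours: lookups return the value from the first mapping that has the key, and writes always go to the first mapping.

```python
from collections import ChainMap

dic1 = {'x': 1, 'y': 2}
dic2 = {'y': 3, 'z': 4}

# Unpacking builds an independent copy
merged1 = {**dic1, **dic2}
merged1['x'] = 10
print(dic1['x'])  # 1, dic1 is not affected

# ChainMap is a view over the original dictionaries
merged2 = ChainMap(dic1, dic2)
print(merged2['y'])  # 2, the first mapping that has the key wins
merged2['x'] = 10
print(dic1['x'])  # 10, the write went through to dic1

# Writes always land in the first mapping, even for keys found only in dic2
merged2['z'] = 40
print(dic1)  # {'x': 10, 'y': 2, 'z': 40}
print(dic2)  # {'y': 3, 'z': 4}
```

Note the opposite precedence: with unpacking the last dictionary wins (`merged1['y']` is 3), with `ChainMap` the first one does.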

<center>[Previous example](77.md)    [Next example](79.md)</center>

================================================
FILE: md/79.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/4/26
```

#### 79  Named tuples

Define a tuple type named `Point` with the fields `x`, `y` and `z`:

```python
from collections import namedtuple
Point = namedtuple('Point', ['x', 'y', 'z'])  
lst = [Point(1.5, 2, 3.0), Point(-0.3, -1.0, 2.1), Point(1.3, 2.8, -2.5)]
print(lst[0].y - lst[1].y)
```

Code written with named tuples is more readable, and the benefit grows with the number of fields that have to be handled.
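
As a brief sketch of what named access adds on top of a plain tuple, the following uses the same `Point` type with two standard helpers, `_asdict` and `_replace`; the variable names `p` and `moved` are chosen only for illustration.

```python
from collections import namedtuple

Point = namedtuple('Point', ['x', 'y', 'z'])
p = Point(1.5, 2, 3.0)

# A field can be read by name or by position
print(p.y == p[1])  # True

# Convert to a plain dict, e.g. for serialization
as_dict = dict(p._asdict())
print(as_dict)  # {'x': 1.5, 'y': 2, 'z': 3.0}

# Named tuples are immutable; _replace returns a modified copy
moved = p._replace(z=0.0)
print(moved)  # Point(x=1.5, y=2, z=0.0)
print(p.z)    # 3.0, the original is unchanged
```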

<center>[Previous example](78.md)    [Next example](80.md)</center>

================================================
FILE: md/8.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/2/10
```

#### 8 Quotient and remainder

`divmod` returns the quotient and the remainder as a pair:

```python
In [1]: divmod(10,3)
Out[1]: (3, 1)
```


<center>[Previous example](7.md)    [Next example](9.md)</center>

================================================
FILE: md/80.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/5/1
```

#### 80 Random sampling with sample

Use `sample` to draw a random sample; the example below draws 10 items at random from a list of 100.

```python
from random import randint, sample
lst = [randint(0, 50) for _ in range(100)]
print(lst[:5])  # e.g. [38, 19, 11, 3, 6]
lst_sample = sample(lst, 10)
print(lst_sample)  # e.g. [33, 40, 35, 49, 24, 15, 48, 29, 37, 24]
```

<center>[Previous example](79.md)    [Next example](81.md)</center>

================================================
FILE: md/81.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/5/1
```

#### 81 Shuffle a dataset with shuffle

Use `shuffle` to reorder a dataset randomly. Note that `shuffle` reorders `lst` in place, which saves memory.

```python
from random import randint, shuffle
lst = [randint(0, 50) for _ in range(100)]
shuffle(lst)  # reorders lst in place and returns None
print(lst[:5])  # e.g. [50, 3, 48, 1, 26]
```

<center>[Previous example](80.md)    [Next example](82.md)</center>

================================================
FILE: md/82.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/5/1
```

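#### 82 Ten uniformly distributed points

`uniform(a, b)` from the random module returns a random float N with a <= N <= b (whether the endpoint b can actually occur depends on floating-point rounding).

The following generates 10 uniformly distributed 2D points: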

```python
In [1]: from random import uniform

In [2]: [(uniform(0,10),uniform(0,10)) for _ in range(10)]
Out[2]: 
[(9.244361194237328, 7.684326645514235),
 (8.129267671737324, 9.988395854203773),
 (9.505278771040661, 2.8650440524834107),
 (3.84320100484284, 1.7687190176304601),
 (6.095385729409376, 2.377133802224657),
 (8.522913365698605, 3.2395995841267844),
 (8.827829601859406, 3.9298809217233766),
 (1.4749644859469302, 8.038753079253127),
 (9.005430657826324, 7.58011186920019),
 (8.700789540392917, 1.2217577293254112)]
```

<center>[Previous example](81.md)    [Next example](83.md)</center>

================================================
FILE: md/83.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/5/3
```

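#### 83 Ten Gaussian-distributed points

`gauss(u, sigma)` from the random module returns a value drawn from a Gaussian distribution with mean u and standard deviation sigma. The following generates 10 2D points whose error (y-2*x-1) follows a Gaussian distribution with mean 0 and standard deviation 1: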

```python
from random import gauss
x = range(10)
y = [2*xi+1+gauss(0,1) for xi in x]
points = list(zip(x,y))
# The 10 points, for example:
# [(0, -0.86789025305992),
#  (1, 4.738439437453464),
#  (2, 5.190278040856102),
#  (3, 8.05270893133576),
#  (4, 9.979481700775292),
#  (5, 11.960781766216384),
#  (6, 13.025427054303737),
#  (7, 14.02384035204836),
#  (8, 15.33755823101161),
#  (9, 17.565074449028497)]
```

<center>[Previous example](82.md)    [Next example](84.md)</center>

================================================
FILE: md/84.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/5/4
```

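#### 84 Chain small containers into one with chain

`chain` links a and b into a single iterable. It is memory-efficient because no merged copy is built, and it reads more elegantly.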

```python
from itertools import chain
a = [1,3,5,0]
b = (2,4,6)

for i in chain(a, b):
    print(i)
# Output:
# 1
# 3
# 5
# 0
# 2
# 4
# 6
```
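
When the containers themselves arrive as one iterable, `chain.from_iterable` does the same job. A short sketch, where the `nested` list is a made-up input:

```python
from itertools import chain

a = [1, 3, 5, 0]
b = (2, 4, 6)

# Materialize the chained items only when a real list is needed
merged = list(chain(a, b))
print(merged)  # [1, 3, 5, 0, 2, 4, 6]

# from_iterable takes a single iterable of iterables
nested = [a, b, 'xy']
flat = list(chain.from_iterable(nested))
print(flat)  # [1, 3, 5, 0, 2, 4, 6, 'x', 'y']
```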

<center>[Previous example](83.md)    [Next example](85.md)</center>

================================================
FILE: md/85.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/5/5
```

#### 85 product example

```python
def product(*args, repeat=1):
    pools = [tuple(pool) for pool in args] * repeat
    result = [[]]
    for pool in pools:
        result = [x+[y] for x in result for y in pool]
    for prod in result:
        yield tuple(prod)
```


Call the function:

```python
rtn = product('xyz', '12', repeat=3)
print(list(rtn))
```
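
The call above yields 216 tuples, which is a lot to read. As a sketch for checking the function on a smaller input, the following repeats its definition and compares it with `itertools.product` from the standard library; the variable names `small` and `big` are chosen for illustration.

```python
import itertools


def product(*args, repeat=1):
    # Repeat the input pools, then extend every partial result by each item of the next pool
    pools = [tuple(pool) for pool in args] * repeat
    result = [[]]
    for pool in pools:
        result = [x + [y] for x in result for y in pool]
    for prod in result:
        yield tuple(prod)


small = list(product('ab', '12'))
print(small)  # [('a', '1'), ('a', '2'), ('b', '1'), ('b', '2')]

big = list(product('xyz', '12', repeat=3))
print(len(big))  # 216, i.e. (3 * 2) ** 3
print(big == list(itertools.product('xyz', '12', repeat=3)))  # True
```

Unlike `itertools.product`, this version builds the whole result list before yielding, so the standard-library function is likely the better choice for large inputs.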

<center>[Previous example](84.md)    [Next example](86.md)</center>

================================================
FILE: md/86.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/5/6
```

#### 86 Two ways to reverse a string

```python
st="python"
```

Method 1:

```python
''.join(reversed(st))
```

Method 2:

```python
st[::-1]
```


<center>[Previous example](85.md)    [Next example](87.md)</center>

================================================
FILE: md/87.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/5/7
```

#### 87 Join strings with join

Join a list of strings with commas:

```python
In [4]: mystr = ['1','2','java','4','python','java','7','8','java','python','11','java','13','14']

In [5]: ','.join(mystr) # join the strings with commas
Out[5]: '1,2,java,4,python,java,7,8,java,python,11,java,13,14'
```

<center>[Previous example](86.md)    [Next example](88.md)</center>

================================================
FILE: md/88.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/5/3
```

#### 88 Byte length of a string

```python
def str_byte_len(mystr):
    # Encode to UTF-8 and count the resulting bytes
    return len(mystr.encode('utf-8'))
```

Test:

```python
str_byte_len('i love python')  # 13 (bytes)
str_byte_len('字符')  # 6 (bytes): each of the two Chinese characters takes 3 bytes in UTF-8
```


<center>[Previous example](87.md)    [Next example](89.md)</center>

================================================
FILE: md/89.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/5/10
```

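### 89 What does the `r` prefix do in a regex?

A regular expression is often written with an `r` in front of it. The prefix tells the interpreter that the string that follows is a raw string, to be taken literally. For example: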

```python
s1 = r'\n.*'
print(s1) 
```

It tells the interpreter that the first character of `s1` is `\` and the second is `n`. The printed result is the string itself:

```python
\n.*
```

Without the `r` prefix, that is:

```python
s2 = '\n.*'
print(s2)
```

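the interpreter treats the first two characters `\n` as an escape sequence that stands for a newline, so the output is a line break followed by `.*`: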

```python
.*
```
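
The difference can also be checked without printing. The sketch below compares the lengths of the two strings and then shows a case where the prefix changes what a pattern matches: `\b` is a word boundary for the regex engine, but a backspace character in an ordinary string literal. The sample text `'cat concat'` is made up for illustration.

```python
import re

s1 = r'\n.*'
s2 = '\n.*'
print(len(s1), len(s2))  # 4 3: backslash and n are two characters in the raw string

# Raw string: the regex engine receives \b and treats it as a word boundary
raw_hits = re.findall(r'\bcat\b', 'cat concat')
print(raw_hits)  # ['cat']

# Ordinary string: Python has already turned \b into a backspace character
plain_hits = re.findall('\bcat\b', 'cat concat')
print(plain_hits)  # []
```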

<center>[Previous example](88.md)    [Next example](90.md)</center>

================================================
FILE: md/9.md
================================================
```markdown
@author jackzhenguo
@desc Convert to float
@date 2019/2/15
```
#### 9 Convert to float

Convert an integer or a numeric string to a floating-point number:

```python
In [1]: float(3)
Out[1]: 3.0
```

```python
In [1]: float('3')
Out[1]: 3.0
```

Largest representable float:
```python
In [3]: import sys

In [4]: sys.float_info.max
Out[4]: 1.7976931348623157e+308
```

Positive and negative infinity:
```python
float('inf') # positive infinity
float('-inf') # negative infinity
```

If the value cannot be converted to a float, a `ValueError` is raised:

```python
In [2]: float('a')
# ValueError: could not convert string to float: 'a'
```

<center>[Previous example](8.md)    [Next example](10.md)</center>


================================================
FILE: md/90.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/5/15
```

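### 90 What is an atomic operation?

In the microscopic world, if atoms are defined as the most basic units that make up matter, then an atom cannot be divided any further. By analogy, an atom in a regular expression is a piece of the pattern that cannot be split any further.

For example, `+` in a regex means that the atom before it occurs at least once.

In `66+`, the first character is a literal 6, while the second 6 together with the `+` that follows means that the character 6 occurs at least once. Taken together, a string needs at least two adjacent 6s to match this regex (explained in detail later).

`\w+` means that any one character from *letters, digits and underscore* (what `\w` stands for) occurs at least once, so `\w` is an atom.

So an ordinary character is an atom, and so is a general character in a regex (covered later).

Keep the concept of an *atom* in mind.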
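
A short sketch may help to see which part of a pattern a quantifier applies to. It uses `re.fullmatch` on made-up strings. The third call relies on parentheses, which turn a sequence of characters into a single atom; groups are only introduced later, so treat that line as a preview.

```python
import re

# + applies only to the atom directly before it, here the single character b
only_b = bool(re.fullmatch(r'ab+', 'abbb'))
print(only_b)  # True

not_ab = bool(re.fullmatch(r'ab+', 'abab'))
print(not_ab)  # False: + does not repeat the sequence ab

# Parentheses make ab one atom, so + repeats the whole pair
grouped = bool(re.fullmatch(r'(ab)+', 'abab'))
print(grouped)  # True

# \w is one atom that stands for a whole set of characters
words = re.findall(r'\w+', 'regex_101 is fun')
print(words)  # ['regex_101', 'is', 'fun']
```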

<center>[Previous example](89.md)    [Next example](91.md)</center>

================================================
FILE: md/91.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/6/3
```

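### 91 How do escapes work in a regex?

Regular expressions define a few escape sequences of their own.

An escape character `\` followed by another character changes the meaning of that character: it no longer stands for itself but takes on a new meaning.

For example, `w` on its own is just the English letter `w` with no other meaning. With the escape character `\` in front, the meaning changes substantially: `\` and `w` are read together, and the regex engine interprets them as matching any one character from the following set: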

```python
pat = r'\w'
```

which, for ASCII text (or with the `re.ASCII` flag), is equivalent to:

```python
pat = ('[0123456789'
       'AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsTtUuVvWwXxYyZz'
       '_]')
```

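That is, it matches any one character from the set of digits, upper- and lower-case letters, and the underscore `_`.

A single escape `\w` stands for that whole long set, which is much shorter to write. Because such escapes are used so often in regexes, they are called **general regex characters**.

There are a few more general regex characters like this, covered later.

Once the rule is clear, the others follow by analogy, so understanding them should pose no problem.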
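
One caveat about the character set listed above: for `str` patterns in Python 3, `\w` also matches letters and digits outside ASCII unless the `re.ASCII` flag is passed. A minimal sketch with a made-up sample string:

```python
import re

text = 'a_1-é'

# Default for str patterns: Unicode word characters are included
unicode_hits = re.findall(r'\w', text)
print(unicode_hits)  # ['a', '_', '1', 'é']

# With re.ASCII, \w is limited to [a-zA-Z0-9_]
ascii_hits = re.findall(r'\w', text, re.ASCII)
print(ascii_hits)  # ['a', '_', '1']

# The explicit character class gives the same result as the ASCII variant
explicit_hits = re.findall(r'[0-9A-Za-z_]', text)
print(explicit_hits == ascii_hits)  # True
```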

<center>[Previous example](90.md)    [Next example](92.md)</center>

================================================
FILE: md/92.md
================================================
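### 92 Plain literal search

The plainest search writes exactly what you are looking for and uses no regex syntax at all. Below is a passage about the novel *A Thousand Splendid Suns*; the task is to find the word `friendship`, which may occur several times: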

```
s = """
# Mariam is only fifteen 
# when she is sent to Kabul to marry the troubled and bitter Rasheed,
# who is thirty years her senior. 
# Nearly two decades later, 
# in a climate of growing unrest, tragedy strikes fifteen-year-old Laila, 
# who must leave her home and join Mariam's unhappy household. 
# Laila and Mariam are to find consolation in each other, 
# their friendship to grow as deep as the bond between sisters, 
# as strong as the ties between mother and daughter. 
# With the passing of time comes Taliban rule over Afghanistan, 
# the streets of Kabul loud with the sound of gunfire and bombs, 
# life a desperate struggle against starvation, brutality and fear, 
# the women's endurance tested beyond their worst imaginings. 
# Yet love can move a person to act in unexpected ways, 
# lead them to overcome the most daunting obstacles with a startling heroism. 
# In the end it is love that triumphs over death and destruction. 
# A Thousand Splendid Suns is an unforgettable portrait of a wounded country and
#  a deeply moving story of family and friendship. 
#  It is a beautiful, heart-wrenching story of an unforgiving time, 
#  an unlikely bond and an indestructible love.
"""
```

Before using regexes, import the `re` module, define the pattern, then call `findall` to get every match:

```python
import re
pat = 'friendship'
result = re.findall(pat,s)
print(result) 

# Two matches are found:
# ['friendship', 'friendship']
```

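That is the plainest use of a regex. Finding every word with the prefix grow, such as grows or growing, is awkward with a plain literal search.

Combining the metacharacters, general characters and capture groups introduced in the following examples, however, handles such complex matching problems.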
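
As a preview of where this is heading, here is one possible pattern for the grow prefix. It is a sketch that uses `\b` (word boundary) and `\w*` (zero or more word characters); the short `text` is an excerpt of the passage above.

```python
import re

text = ("in a climate of growing unrest, "
        "their friendship to grow as deep as the bond between sisters")

# \b anchors the match at the start of a word, \w* takes the rest of the word
grow_words = re.findall(r'\bgrow\w*', text)
print(grow_words)  # ['growing', 'grow']
```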

<center>[Previous example](91.md)    [Next example](93.md)</center>

================================================
FILE: md/93.md
================================================
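### 93 Searching with general characters

In regexes, general characters are characters that help us write matching rules more concisely.

For the text above, the pattern below finds substrings that start with d: [a-z] stands for any one lower-case English letter and {7} means that such a letter occurs 7 times (explained later), so each matched substring has length 1+7=8: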

```python
pat = 'd[a-z]{7}'
result = re.findall(pat,s)
```

The matches are:

```python
['daughter', 'desperat', 'daunting', 'destruct', 'destruct']
```

Similarly, the pattern `pat = 'd[a-z]{10}'` matches:

```
['destruction', 'destructibl']
```

The pattern `pat = 'd[a-z]{11}'` matches:

```
['destructible']
```

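The general character `[a-z]` is very convenient: 5 characters express all 26 lower-case letters. Note, however, that `[a-z]` matches exactly **one** of the 26 lower-case letters.

General characters with a similar function include:

```
[A-Z]  matches one upper-case English letter
[0-9]  matches one digit between 0 and 9
```

There are even more powerful general characters:

```
\s  matches a whitespace character such as a space, \n, \t, \r, \f or \v
\w  matches any letter, digit or underscore
\d  matches a decimal digit 0-9
```

\S, \W and \D match the complements of the sets matched by \s, \w and \d respectively; for example, \S matches any character other than those matched by \s.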
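
The following sketch runs each of these general characters against a small made-up string, so the sets and their complements can be compared side by side:

```python
import re

text = 'Room 42,\tfloor 3\n'

digits = re.findall(r'\d', text)
print(digits)  # ['4', '2', '3']

spaces = re.findall(r'\s', text)
print(spaces)  # [' ', '\t', ' ', '\n']

words = re.findall(r'\w+', text)
print(words)  # ['Room', '42', 'floor', '3']

upper = re.findall(r'[A-Z]', text)
print(upper)  # ['R']

# \D is the complement of \d: runs of non-digits
non_digits = re.findall(r'\D+', 'a1b22c')
print(non_digits)  # ['a', 'b', 'c']
```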


<center>[Previous example](92.md)    [Next example](94.md)</center>

================================================
FILE: md/94.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/7/3
```

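### 94 Searching with metacharacters

*Meta* can be understood as describing the kind of thing that follows it: a *metaclass* is a class that creates and describes classes, and a *metamodel* is a model that describes a model. By extension, a *metacharacter* is a character that describes characters.

With that in mind, look at the most widely used metacharacter in regexes, `+`: it describes how many times the preceding atom occurs, namely once or more.

For example, when looking for a memorable phone number, the regex `66+` means that the atom `6` before `+` occurs at least once; together with the first 6, the number must contain at least two adjacent 6s. The numbers `18612652166` and `17566665656` therefore qualify, while `18616161616` does not.

Other metacharacters work in a similar way and are listed below without further explanation:

```
* the preceding atom repeats 0, 1 or more times
? the preceding atom repeats 0 or 1 times
+ the preceding atom repeats 1 or more times
{n} the preceding atom occurs exactly n times
{n,} the preceding atom occurs at least n times
{n,m} the preceding atom occurs between n and m times (inclusive)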
```
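
The phone-number example and a few of the quantifiers from the list can be tried out directly. In this sketch the three numbers are the ones from the text; the other sample strings are made up.

```python
import re

numbers = ['18612652166', '17566665656', '18616161616']

# 66+ requires at least two adjacent 6s somewhere in the number
lucky = [n for n in numbers if re.search(r'66+', n)]
print(lucky)  # ['18612652166', '17566665656']

# {2,} takes the longest run of at least two 6s
runs = re.findall(r'6{2,}', '17566665656')
print(runs)  # ['6666']

# {2} takes exactly two at a time, so the run of four yields two matches
pairs = re.findall(r'6{2}', '17566665656')
print(pairs)  # ['66', '66']

# ? makes the preceding atom optional
spellings = re.findall(r'colou?r', 'color colour')
print(spellings)  # ['color', 'colour']

# * allows the preceding atom to be absent
stars = re.findall(r'ab*', 'a ab abbb')
print(stars)  # ['a', 'ab', 'abbb']
```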

<center>[Previous example](93.md)    [Next example](95.md)</center>

================================================
FILE: md/95.md
================================================
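## 95 Capturing substrings

This example takes a passage from my column *Python 60 Days* and extracts the links in it, to show how useful extracting substrings is.

Here is the text first (abridged and edited), assigned to the variable `urls`: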

```
urls = """
基于 Python 的包更是枝繁叶茂，遍地开花，“Tiobe 编程语言排行榜”最新统计显示 Python 是增长最快的语言。

![image-20200131192231967](https://images.gitbook.cn/2020-02-05-014719.png)

接下来，与大家，还有远在美国做 AI 博士后研究的 Alicia，一起开始我们的 60 天 Python 探索之旅吧。

所有的这些考虑，都是为了让大家在短时间内掌握 Python 技术栈，多一个生存的本领。拿到理想的 Offer 后，早日过上自己想要的生活。

让我们开始吧。

如下，按照是否为静态/动态语言，弱类型/强类型两个维度，

总结常用的语言分类。

![image-20200205155429583](https://images.gitbook.cn/2020-02-05-080211.png) ### 四大基本语法
"""
```

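You might quickly write the following regex:

```
# The metacharacter . matches any single character except \n
# The metacharacter * matches the preceding atom 0 or more times
pat = r'https:.*' 
```

Then import the `re` module and call `findall` to find all matches: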

```python
import re
result = re.findall(pat,urls)
print(result)
```

The output is shown below. There are 2 matches, but each matched link carries extra trailing characters, so the result is wrong:

```
['https://images.gitbook.cn/2020-02-05-014719.png)',
 'https://images.gitbook.cn/2020-02-05-080211.png) ### 四大基本语法']
```

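Let us refine the regex slightly:

```python
# Add \) so that the match must end with a right parenthesis
pat = r'https:.*\)'
```

The output, shown below, is somewhat better, but each match still includes the right parenthesis, so it is still wrong:

```
['https://images.gitbook.cn/2020-02-05-014719.png)',
 'https://images.gitbook.cn/2020-02-05-080211.png)']
```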

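This is why the skill of **extracting** a substring matters. It is easy to do: wrap the part you want returned in a pair of parentheses, as follows:

```python
# Wrap the substring you want returned in a pair of parentheses
pat = r'(https:.*)\)'
```

The result is now exactly right, with no extra characters. Wrapping the wanted substring in parentheses also has a technical name: **capturing** or **grouping**.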
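
The final pattern can be tried on a shortened input. In the sketch below, `text` keeps only the two image lines of the passage (with placeholder alt texts), and `same_line` is a made-up string with hypothetical `example.com` links that shows a limit of the greedy `.*`: with two links on one line it swallows everything up to the last `)`. Excluding `)` inside the group is one possible way around that.

```python
import re

text = """![a](https://images.gitbook.cn/2020-02-05-014719.png)
![b](https://images.gitbook.cn/2020-02-05-080211.png) ### heading
"""

# With one group in the pattern, findall returns only the captured part
links = re.findall(r'(https:.*)\)', text)
print(links)
# ['https://images.gitbook.cn/2020-02-05-014719.png', 'https://images.gitbook.cn/2020-02-05-080211.png']

# Hypothetical input with two links on the same line
same_line = '(https://example.com/1.png) and (https://example.com/2.png)'
greedy = re.findall(r'(https:.*)\)', same_line)
print(greedy)  # ['https://example.com/1.png) and (https://example.com/2.png']

# [^)]* stops at the first right parenthesis
strict = re.findall(r'(https:[^)]*)\)', same_line)
print(strict)  # ['https://example.com/1.png', 'https://example.com/2.png']
```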

<center>[Previous example](94.md)    [Next example](96.md)</center>

================================================
FILE: md/96.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/7/21
```

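#### 96 Greedy and non-greedy capture

Capturing is very useful, but using it requires distinguishing greedy from non-greedy capture. A greedy capture takes as many characters as possible while the pattern still matches; a non-greedy capture takes as few as possible.

Here is a contrived example under ideal conditions:

```
htmlContent = """
        <div><div><h2>This is a level-2 heading</h2></div><div><p> This is a paragraph</p></div></div>
"""
```

A greedy capture uses `(.*)`, as follows:

```
import re

pat = r"<div>(.*)</div>"

result = re.findall(pat,htmlContent)
```

The result is shown below. The capture is as long as possible and does not stop at the first `</div>`:

```
['<div><h2>This is a level-2 heading</h2></div><div><p> This is a paragraph</p></div>']
```

The non-greedy regex is `<div>(.*?)</div>`, as follows:

```
pat = r"<div>(.*?)</div>"

result = re.findall(pat,htmlContent)

print(result)
```

The result has two elements: the capture stops at the first `</div>`, and the search then continues and captures the second substring:

```
['<div><h2>This is a level-2 heading</h2>', 
  '<p> This is a paragraph</p>']
```

The example above only demonstrates the difference between the two. Real HTML contains line breaks and is far more complex, so the greedy and non-greedy forms may not even produce different results there, but the difference still needs to be understood.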
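
The remark about line breaks can be made concrete. By default `.` does not match `\n`, so on multi-line HTML neither form matches anything; passing the `re.DOTALL` flag lets `.` cross line breaks, and the greedy/non-greedy difference reappears. The `html` string is a made-up sample.

```python
import re

html = "<div>\n<p>first</p>\n</div>\n<div>\n<p>second</p>\n</div>"

# Without DOTALL, . stops at the newline right after <div>
no_flag = re.findall(r"<div>(.*?)</div>", html)
print(no_flag)  # []

# Non-greedy with DOTALL: one capture per <div> ... </div> pair
lazy = re.findall(r"<div>(.*?)</div>", html, re.DOTALL)
print(lazy)  # ['\n<p>first</p>\n', '\n<p>second</p>\n']

# Greedy with DOTALL: a single capture that runs to the last </div>
greedy = re.findall(r"<div>(.*)</div>", html, re.DOTALL)
print(greedy)  # ['\n<p>first</p>\n</div>\n<div>\n<p>second</p>\n']
```

For real pages with nested tags, a dedicated HTML parser is usually a more robust choice than a regex.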

<center>[Previous example](95.md)    [Next example](97.md)</center>

================================================
FILE: md/97.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/8/3
```

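#### 97 Password checks with a regex

Password requirements:

1) The password must be 6 to 20 characters long;

2) The password may contain only English letters and digits.

```python
import re

pat = re.compile(r'\w{6,20}') # Wrong: \w matches letters, digits and the underscore, and the requirements rule out underscores
# Safest choice: [0-9a-zA-Z] satisfies "only English letters and digits"
# (for str patterns \d also matches non-ASCII digits, so 0-9 is spelled out)
pat = re.compile(r'[0-9a-zA-Z]{6,20}')
```
Use `fullmatch`, the safest method, to check that the whole string matches:
```python
pat.fullmatch('qaz12') # None: shorter than 6 characters
pat.fullmatch('qaz12wsxedcrfvtgb67890942234343434') # None: longer than 20 characters
pat.fullmatch('qaz_231') # None: contains an underscore
pat.fullmatch('n0passw0Rd')
# <re.Match object; span=(0, 10), match='n0passw0Rd'>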
```
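
The check can be wrapped in a small helper that returns a boolean. This is a sketch: the names `PASSWORD_RE` and `is_valid_password` are hypothetical, and the character class is spelled `[0-9a-zA-Z]` on the assumption that only ASCII digits should pass. The last lines show why `fullmatch` is preferred over `match`, which only anchors at the start of the string.

```python
import re

# Hypothetical helper names; 6 to 20 ASCII letters or digits
PASSWORD_RE = re.compile(r'[0-9a-zA-Z]{6,20}')


def is_valid_password(password):
    # fullmatch succeeds only if the entire string fits the pattern
    return PASSWORD_RE.fullmatch(password) is not None


for candidate in ['qaz12', 'qaz_231', 'n0passw0Rd', 'a' * 21]:
    print(candidate, is_valid_password(candidate))

# match() accepts a valid prefix and ignores the invalid rest
prefix_only = PASSWORD_RE.match('qaz231_!') is not None
print(prefix_only)                    # True, although the string is invalid
print(is_valid_password('qaz231_!'))  # False
```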

<center>[Previous example](96.md)    [Next example](98.md)</center>

================================================
FILE: md/98.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/8/8
```

#### 98 Scrape the title of the Baidu home page

```python
import re
from urllib import request

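# Fetch the content of the Baidu home page
data=request.urlopen("http://www.baidu.com/").read().decode()

# Inspect the page and decide on the regex
pat=r'<title>(.*?)</title>'

result=re.search(pat,data)
print(result)  # <re.Match object; span=(1358, 1382), match='<title>百度一下，你就知道</title>'>

# group() would return the whole match including the tags; group(1) is the captured title
result.group(1) # 百度一下，你就知道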
```
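
The same extraction can be tried offline on a literal string, which also makes it easy to handle a page without a title. The helper name `extract_title` and the sample HTML below are made up for this sketch; the `re.IGNORECASE` and `re.DOTALL` flags are an added precaution for upper-case tags and titles that span lines.

```python
import re


def extract_title(html):
    # re.search returns None when the pattern is absent, so check before calling group
    m = re.search(r'<title>(.*?)</title>', html, re.IGNORECASE | re.DOTALL)
    return m.group(1).strip() if m else None


sample = '<html><head><TITLE>\n  Example page\n</TITLE></head></html>'
print(extract_title(sample))             # Example page
print(extract_title('<p>no title</p>'))  # None
```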

<center>[Previous example](97.md)    [Next example](99.md)</center>

================================================
FILE: md/99.md
================================================
```markdown
@author jackzhenguo
@desc 
@date 2019/8/12
```

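#### 99 Batch conversion to camel case

Convert database field names to camel case in batch.

Analysis:

```python
# Regex pieces used below
# \s matches one of: [ \t\n\r\f\v]
# A|B matches either A or B
# re.sub(pattern, repl, string):
# substitute, i.e. replace every match of pattern in string with repl.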
```


```python
# title(): capitalize the first letter of every word, for example:
# 'Hello world'.title() # 'Hello World'
```


```python
import re

# print(re.sub(r"\s|_|", "", "He llo_worl\td"))
s = re.sub(r"(\s|_|-)+", " ",
           'some_database_field_name').title().replace(" ", "")  
# Result: SomeDatabaseFieldName
```


```python
# The first character is now upper case and has to be lowered
s = s[0].lower()+s[1:]  # final result
```

 
Putting the analysis together gives the following code:

```python
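import re
def camel(s):
    # Collapse whitespace, underscores and hyphens, capitalize each word, then join the words
    s = re.sub(r"(\s|_|-)+", " ", s).title().replace(" ", "")
    # s[:1] instead of s[0] so that an empty string does not raise IndexError
    return s[:1].lower() + s[1:]

# Batch conversion
def batch_camel(slist):
    return [camel(s) for s in slist]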
```

Test result:

```python
s = batch_camel(['student_id', 'student\tname', 'student-add'])
print(s)
# Result:
# ['studentId', 'studentName', 'studentAdd']
```
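
Because the conversion relies on `str.title()`, a few inputs behave in ways that may be surprising. The sketch below repeats `camel` (using `s[:1]` so that an empty string does not fail, a small deviation if the definition above uses `s[0]`) and runs it on made-up field names:

```python
import re


def camel(s):
    s = re.sub(r"(\s|_|-)+", " ", s).title().replace(" ", "")
    return s[:1].lower() + s[1:]


# title() lower-cases everything after the first letter of a word
print(camel('user_ID'))       # userId

# title() also starts a new word after a digit
print(camel('field2name'))    # field2Name

# Leading separators are dropped
print(camel('__created_at'))  # createdAt

# The empty string stays empty
print(repr(camel('')))        # ''
```

If acronyms such as `ID` have to be preserved, `title()` is probably not the right building block and the words would need to be capitalized one by one.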

<center>[Previous example](98.md)    [Next example](100.md)</center>

================================================
FILE: md/batch.py
================================================
"""
@author jackzhenguo
@desc Generate template files in batch
@tag datetime file
@version v1.2
@date 2020/8/23
"""

import os
import calendar
from datetime import date,datetime

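def getEverydaySince(year,month,day,n=10):
    """Yield n consecutive dates starting from year/month/day."""
    i = 0
    _, days = calendar.monthrange(year, month)
    while i < n: 
        d = date(year,month,day)    
        if day == days:
            # Last day of the month: roll over to the next month
            month,day = month+1,0
            # Wrap to January of the next year before asking for the month length,
            # because monthrange rejects month 13
            if month == 13:
                year,month = year+1,1
            _, days = calendar.monthrange(year, month)
        yield d
        day += 1
        i += 1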


def batchCreate(ps='.md',start=100,n=100,path='.',year=2020,month=2,day=1):
  
    for i,day in zip(range(start,start+n),\
                     getEverydaySince(year,month,day,n) \
                    ):
        with open(path+'/'+str(i)+ps,'w') as fw:
            
            fw.write("""
```markdown
@author jackzhenguo
@desc
@tag
@version 
@date {}
```
		     """.format(datetime.strftime(day,'%Y/%m/%d'))\
		     )
            print(day)
    print('done')


================================================
FILE: script/add_nav.py
================================================
# function: append a navigation footer to the bottom of each *.md example
# author: zhenguo
# date: 2021.2.27
# version: 1.0

import os

for file in os.listdir('../md'):
    if os.path.splitext(file)[-1] == '.md':
        with open('../md/' + file, 'a') as f:
            file_name = os.path.splitext(file)[0]
            try:
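                c = '\n\n<center>[Previous example](%s.md)    [Next example](%s.md)</center>' % (str(int(file_name) - 1), str(int(file_name) + 1))
                f.write(c)
                print('File %s written successfully' % (file,))
            except ValueError as ex:
                # int() failed: the file name is not a number, so no footer is written
                print(ex)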