Repository: jackzhenguo/python-small-examples Branch: master Commit: aa11da6ac252 Files: 238 Total size: 177.6 KB Directory structure: gitextract_72lj_okl/ ├── .gitignore ├── README.md ├── dev/ │ └── python-dev.md ├── md/ │ ├── 1.md │ ├── 10.md │ ├── 100.md │ ├── 101.md │ ├── 102.md │ ├── 103.md │ ├── 104.md │ ├── 105.md │ ├── 106.md │ ├── 107.md │ ├── 108.md │ ├── 109.md │ ├── 11.md │ ├── 110.md │ ├── 111.md │ ├── 112.md │ ├── 113.md │ ├── 114.md │ ├── 115.md │ ├── 116.md │ ├── 117.md │ ├── 118.md │ ├── 119.md │ ├── 12.md │ ├── 120.md │ ├── 121.md │ ├── 122.md │ ├── 123.md │ ├── 124.md │ ├── 125.md │ ├── 126.md │ ├── 127.md │ ├── 128.md │ ├── 129.md │ ├── 13.md │ ├── 130.md │ ├── 131.md │ ├── 132.md │ ├── 133.md │ ├── 134.md │ ├── 135.md │ ├── 136.md │ ├── 137.md │ ├── 138.md │ ├── 139.md │ ├── 14.md │ ├── 140.md │ ├── 141.md │ ├── 142.md │ ├── 143.md │ ├── 144.md │ ├── 145.md │ ├── 146.md │ ├── 147.md │ ├── 148.md │ ├── 149.md │ ├── 15.md │ ├── 150.md │ ├── 151.md │ ├── 152.md │ ├── 153.md │ ├── 154.md │ ├── 155.md │ ├── 156.md │ ├── 157.md │ ├── 158.md │ ├── 159.md │ ├── 16.md │ ├── 160.md │ ├── 161.md │ ├── 162.md │ ├── 163.md │ ├── 164.md │ ├── 165.md │ ├── 166.md │ ├── 167.md │ ├── 168.md │ ├── 169.md │ ├── 17.md │ ├── 170.md │ ├── 171.md │ ├── 172.md │ ├── 173.md │ ├── 174.md │ ├── 175.md │ ├── 176.md │ ├── 177.md │ ├── 178.md │ ├── 179.md │ ├── 18.md │ ├── 180.md │ ├── 181.md │ ├── 182.md │ ├── 183.md │ ├── 184.md │ ├── 185.md │ ├── 186.md │ ├── 187.md │ ├── 188.md │ ├── 189.md │ ├── 19.md │ ├── 190.md │ ├── 191.md │ ├── 192.md │ ├── 193.md │ ├── 194.md │ ├── 195.md │ ├── 196.md │ ├── 197.md │ ├── 198.md │ ├── 199.md │ ├── 2.md │ ├── 20.md │ ├── 200.md │ ├── 201.md │ ├── 202.md │ ├── 203.md │ ├── 204.md │ ├── 205.md │ ├── 206.md │ ├── 207.md │ ├── 208.md │ ├── 209.md │ ├── 21.md │ ├── 210.md │ ├── 211.md │ ├── 212.md │ ├── 213.md │ ├── 214.md │ ├── 215.md │ ├── 216.md │ ├── 217.md │ ├── 218.md │ ├── 219.md │ ├── 22.md │ ├── 220.md │ ├── 221.md │ ├── 222.md 
│ ├── 223.md │ ├── 224.md │ ├── 225.md │ ├── 226.md │ ├── 227.md │ ├── 228.md │ ├── 229.md │ ├── 23.md │ ├── 230.md │ ├── 231.md │ ├── 232.md │ ├── 233.md │ ├── 24.md │ ├── 25.md │ ├── 26.md │ ├── 27.md │ ├── 28.md │ ├── 29.md │ ├── 3.md │ ├── 30.md │ ├── 31.md │ ├── 32.md │ ├── 33.md │ ├── 34.md │ ├── 35.md │ ├── 36.md │ ├── 37.md │ ├── 38.md │ ├── 39.md │ ├── 4.md │ ├── 40.md │ ├── 41.md │ ├── 42.md │ ├── 43.md │ ├── 44.md │ ├── 45.md │ ├── 46.md │ ├── 47.md │ ├── 48.md │ ├── 49.md │ ├── 5.md │ ├── 50.md │ ├── 51.md │ ├── 52.md │ ├── 53.md │ ├── 54.md │ ├── 55.md │ ├── 56.md │ ├── 57.md │ ├── 58.md │ ├── 59.md │ ├── 6.md │ ├── 60.md │ ├── 61.md │ ├── 62.md │ ├── 63.md │ ├── 64.md │ ├── 65.md │ ├── 66.md │ ├── 67.md │ ├── 68.md │ ├── 69.md │ ├── 7.md │ ├── 70.md │ ├── 71.md │ ├── 72.md │ ├── 73.md │ ├── 74.md │ ├── 75.md │ ├── 76.md │ ├── 77.md │ ├── 78.md │ ├── 79.md │ ├── 8.md │ ├── 80.md │ ├── 81.md │ ├── 82.md │ ├── 83.md │ ├── 84.md │ ├── 85.md │ ├── 86.md │ ├── 87.md │ ├── 88.md │ ├── 89.md │ ├── 9.md │ ├── 90.md │ ├── 91.md │ ├── 92.md │ ├── 93.md │ ├── 94.md │ ├── 95.md │ ├── 96.md │ ├── 97.md │ ├── 98.md │ ├── 99.md │ └── batch.py └── script/ └── add_nav.py ================================================ FILE CONTENTS ================================================ ================================================ FILE: .gitignore ================================================ .idea .vscode .github img/*.html notebook venv ================================================ FILE: README.md ================================================
at 0x0000000005DE75D0, file "", line 1>
In [4]: exec(r)
helloworld
s = """
def f():
    a = 100 % 52
    print(a)
f()
"""
r = compile(s,"", "exec")
exec(r)
```
Output
48
[Previous example](15.md) [Next example](17.md)
================================================
FILE: md/160.md
================================================
```markdown
@author jackzhenguo
@desc
@tag
@version
@date 2020/02/27
```
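### The pitfall of fast list replication
In Python, applying `*` to a list quickly repeats its elements:
```python
a = [1,3,5] * 3 # [1,3,5,1,3,5,1,3,5]
a[0] = 10 # [10,3,5,1,3,5,1,3,5]
```
If the list elements are themselves compound types such as lists or dicts: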
```python
a = [[1,3,5],[2,4]] * 3 # [[1, 3, 5], [2, 4], [1, 3, 5], [2, 4], [1, 3, 5], [2, 4]]
a[0][0] = 10 #
```
The result may surprise you: `a[2][0]` and `a[4][0]` are also changed to 10:
```python
[[10, 3, 5], [2, 4], [10, 3, 5], [2, 4], [10, 3, 5], [2, 4]]
```
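This is because `*` copies only references to the compound objects (a shallow copy), so `id(a[0])` and `id(a[2])` are equal: both slots point at the same list. To get independent inner lists, build each one separately: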
```python
a = [[] for _ in range(3)]
```
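The sharing described above can be checked directly with `is`. The sketch below is only an illustration: the `template` name and the use of `copy.deepcopy` are assumptions added here, not part of the original example.

```python
import copy

# Replicating with * repeats references to the same inner lists.
shared = [[1, 3, 5], [2, 4]] * 3
print(shared[0] is shared[2])   # both slots hold one and the same object

# Assumed alternative: build every repetition from a deep copy of a template.
template = [[1, 3, 5], [2, 4]]
independent = [copy.deepcopy(inner) for _ in range(3) for inner in template]
independent[0][0] = 10          # changes only the first inner list
print(independent)
```

Copying each inner list costs more memory than `*`, which is the price of getting elements that can be modified independently.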
[Previous example](159.md) [Next example](161.md)
================================================
FILE: md/161.md
================================================
```markdown
@author jackzhenguo
@desc
@tag
@version
@date 2020/02/28
```
### String interning
```python
In [1]: a = 'something'
...: b = 'some'+'thing'
...: id(a)==id(b)
Out[1]: True
```
The example above returns `True`, so why does the one below return `False`?
```python
In [1]: a = '@zglg.com'
In [2]: b = '@zglg'+'.com'
In [3]: id(a)==id(b)
Out[3]: False
```
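This comes from a CPython compile-time optimization known as `string interning`; only strings made up solely of letters, digits, or underscores are interned automatically. It is an implementation detail, so the result can differ between Python versions.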
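When identity really matters, interning can be requested explicitly with `sys.intern`. This is a minimal sketch; the `parts` list is an assumption used only to build the strings at run time.

```python
import sys

# Build both strings at run time so the compiler cannot merge them as constants.
parts = ['@zglg', '.com']
a = ''.join(parts)
b = ''.join(parts)

print(a == b)                           # equal by value
print(sys.intern(a) is sys.intern(b))   # the same object once both are interned
```

In everyday code, compare strings with `==`; identity checks on strings depend on interpreter details.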
[Previous example](160.md) [Next example](162.md)
================================================
FILE: md/162.md
================================================
```markdown
@author jackzhenguo
@desc
@tag
@version
@date 2020/02/29
```
### Immutable objects with equal values
```python
In [5]: d = {}
...: d[1] = 'java'
...: d[1.0] = 'python'
In [6]: d
Out[6]: {1: 'python'}
# the key=1, value='java' pair seems to have vanished
In [7]: d[1]
Out[7]: 'python'
In [8]: d[1.0]
Out[8]: 'python'
```
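This is because immutable objects that compare equal always have the `same hash value` in Python: `1 == 1.0` is true and `hash(1) == hash(1.0)`, so the dict treats them as the same key.
Because of `hash collisions`, objects with different values may also share a hash value.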
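The two conditions a dict relies on, equal hashes and equal values, can be verified in a short sketch that reuses the same keys as the example:

```python
d = {}
d[1] = 'java'
d[1.0] = 'python'   # 1.0 == 1 and the hashes match, so this overwrites the value

print(1 == 1.0, hash(1) == hash(1.0))
print(d)                      # one entry: the first key object is kept, the value replaced
print(type(next(iter(d))))    # the surviving key is still the int
```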
[Previous example](161.md) [Next example](163.md)
================================================
FILE: md/163.md
================================================
```markdown
@author jackzhenguo
@desc
@tag
@version
@date 2020/03/01
```
### Object destruction order
Create a class `SE`:
```python
class SE(object):
    def __init__(self):
        print('init')
    def __del__(self):
        print('del')
```
Create two SE instances and compare them with `is`:
```python
In [63]: SE() is SE()
init
init
del
del
Out[63]: False
```
Create two SE instances and compare them with `id`:
```python
In [64]: id(SE()) == id(SE())
init
del
init
del
Out[64]: True
```
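When `id` is called, Python creates an SE instance, obtains its memory address with `id`, and then destroys and discards the object.
When this happens twice in a row, CPython may hand the same memory address to the second object, so the two id values are equal.
With `is`, however, both objects are alive at the same time, as the print order shows.
[Previous example](162.md) [Next example](164.md)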
================================================
FILE: md/164.md
================================================
```markdown
@author jackzhenguo
@desc
@tag
@version
@date 2020/03/02
```
### Understanding `for` properly
```python
In [65]: for i in range(5):
    ...:     print(i)
    ...:     i = 10
0
1
2
3
4
```
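Why doesn't the loop exit after one iteration?
Because of how `for` works in Python, `i = 10` does not affect the loop. On each iteration the next item produced by `range(5)` is fetched and assigned to the target variable `i`, overwriting whatever the loop body stored there.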
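Spelling the loop out with `iter` and `next` makes this visible. This is a rough hand-written equivalent, and the function name `run_loop` is made up for the illustration:

```python
def run_loop(iterable):
    """Hand-written equivalent of: for i in iterable: record i; i = 10."""
    seen = []
    iterator = iter(iterable)
    while True:
        try:
            i = next(iterator)   # the loop target is rebound on every pass
        except StopIteration:
            break
        seen.append(i)
        i = 10                   # has no effect on what next() returns
    return seen

print(run_loop(range(5)))
```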
[Previous example](163.md) [Next example](165.md)
================================================
FILE: md/165.md
================================================
```markdown
@author jackzhenguo
@desc
@tag
@version
@date 2020/03/03
```
### Understanding evaluation timing
```python
array = [1, 3, 5]
g = (x for x in array if array.count(x) > 0)
```
`g` is a generator, and `list(g)` returns `[1,3,5]` because every element certainly appears at least once. Nothing surprising so far. But look at the next example:
```python
array = [1, 3, 5]
g = (x for x in array if array.count(x) > 0)
array = [5, 7, 9]
```
What is `list(g)` now? It looks just like the previous example, so you might expect `[1,3,5]` again, but:
```python
In [74]: list(g)
Out[74]: [5]
```
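This looks baffling. The reason is:
In a generator expression, the `in` clause is evaluated when the generator is declared, while the condition clause is evaluated when the generator runs.
So the code: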
```python
array = [1, 3, 5]
g = (x for x in array if array.count(x) > 0)
array = [5, 7, 9]
```
is equivalent to:
```python
g = (x for x in [1,3,5] if [5,7,9].count(x) > 0)
```
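A minimal sketch reproducing the effect, plus one possible way to avoid it: wrapping the generator in a function so that both clauses read the same local name. The helper `make_gen` is an assumption added for illustration.

```python
array = [1, 3, 5]
g = (x for x in array if array.count(x) > 0)   # the iterable is captured here
array = [5, 7, 9]                               # rebinding affects only the condition
result = list(g)
print(result)

def make_gen(values):
    # Both the source and the condition use the local name `values`,
    # so rebinding the caller's variable later cannot change the outcome.
    return (x for x in values if values.count(x) > 0)

array = [1, 3, 5]
g2 = make_gen(array)
array = [5, 7, 9]
stable = list(g2)
print(stable)
```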
[Previous example](164.md) [Next example](166.md)
================================================
FILE: md/166.md
================================================
```markdown
@author jackzhenguo
@desc
@tag
@version
@date 2020/03/04
```
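### A mistake when creating an empty set
This is a Python set: `{1,3,5}`. It contains no duplicate elements and is widely used for tasks such as deduplication. Creating an empty set like this is wrong: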
```python
empty = {} #NO!
```
Python interprets it as a dict.
Use the built-in function `set()` to create an empty set:
```python
empty = set() #YES!
```
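A quick check of the types involved; the variable names are chosen only for this sketch:

```python
maybe_empty_set = {}
real_empty_set = set()

print(type(maybe_empty_set))   # a dict, not a set
print(type(real_empty_set))

# Non-empty literals are unambiguous: braces around bare values make a set.
unique = {1, 3, 5, 3, 1}
print(unique)
```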
[Previous example](165.md) [Next example](167.md)
================================================
FILE: md/167.md
================================================
```markdown
@author jackzhenguo
@desc
@tag
@version
@date 2020/03/05
```
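### pyecharts fails to plot NumPy data
echarts is widely used, and pyecharts, the package that combines echarts with python, is just as handy. However, plotting fails when NumPy data is passed in like this: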
```python
from pyecharts.charts import Bar
import pyecharts.options as opts
import numpy as np
c = (
Bar()
.add_xaxis([1, 2, 3, 4, 5])
# plotting fails when NumPy data is passed in!
.add_yaxis("商家A", np.array([0.1, 0.2, 0.3, 0.4, 0.5]))
)
c.render()
```
So pyecharts does not handle NumPy data here; pass a native Python list instead:
```python
from pyecharts.charts import Bar
import pyecharts.options as opts
import numpy as np
c = (
Bar()
.add_xaxis([1, 2, 3, 4, 5])
# pass a native Python list
.add_yaxis("商家A", np.array([0.1, 0.2, 0.3, 0.4, 0.5]).tolist())
)
c.render()
```
[Previous example](166.md) [Next example](168.md)
================================================
FILE: md/168.md
================================================
```markdown
@author jackzhenguo
@desc
@tag
@version
@date 2020/03/06
```
### A package for cleaner exception output
Tidy up exception messages with one line of code
```python
pip install pretty-errors
```
Write a function to test it:
```python
def divided_zero():
    for i in range(10, -1, -1):
        print(10/i)
divided_zero()
```
Before importing `pretty_errors`, the error output is rather verbose:
```python
Traceback (most recent call last):
File "c:\Users\HUAWEI\.vscode\extensions\ms-python.python-2019.11.50794\pythonFiles\ptvsd_launcher.py", line 43, in
main(ptvsdArgs)
File "c:\Users\HUAWEI\.vscode\extensions\ms-python.python-2019.11.50794\pythonFiles\lib\python\old_ptvsd\ptvsd\__main__.py",
line 432, in main
run()
File "c:\Users\HUAWEI\.vscode\extensions\ms-python.python-2019.11.50794\pythonFiles\lib\python\old_ptvsd\ptvsd\__main__.py",
line 316, in run_file
runpy.run_path(target, run_name='__main__')
File "D:\anaconda3\lib\runpy.py", line 263, in run_path
pkg_name=pkg_name, script_name=fname)
File "D:\anaconda3\lib\runpy.py", line 96, in _run_module_code
mod_name, mod_spec, pkg_name, script_name)
File "D:\anaconda3\lib\runpy.py", line 85, in _run_code
exec(code, run_globals)
File "d:\source\sorting-visualizer-master\sorting\debug_test.py", line 6, in
divided_zero()
File "d:\source\sorting-visualizer-master\sorting\debug_test.py", line 3, in divided_zero
print(10/i)
ZeroDivisionError: division by zero
```
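Now `import` the `pretty_errors` package we just installed:
```python
import pretty_errors
def divided_zero():
    for i in range(10, -1, -1):
        print(10/i)
divided_zero()
```
Look at the error output now: it is trimmed down to just 2 lines, with the redundant information removed: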
```python
ZeroDivisionError:
division by zero
```
The full output is shown in the picture below:
[Previous example](167.md) [Next example](169.md)
================================================
FILE: md/169.md
================================================
```markdown
@author jackzhenguo
@desc
@tag
@version
@date 2020/03/07
```
#### The image-processing package Pillow
Rotate and scale an image in two lines of code
First install Pillow:
```python
pip install pillow
```
Rotate the image below by 45 degrees:
```python
In [1]: from PIL import Image
In [2]: im = Image.open('./img/plotly2.png')
In [4]: im.rotate(45).show()
```
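The result after rotating by 45 degrees
Scale the image proportionally:
```python
# Image.ANTIALIAS was removed in Pillow 10; Image.LANCZOS is the same filter
im.thumbnail((128,72),Image.LANCZOS)
```
The result after scaling:

The result after applying a filter: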
```python
from PIL import ImageFilter
im.filter(ImageFilter.CONTOUR).show()
```
[Previous example](168.md) [Next example](170.md)
================================================
FILE: md/17.md
================================================
```markdown
@author jackzhenguo
@desc
@date 2019/2/15
```
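#### 17 Evaluating expressions
`eval` treats a string as a valid expression, evaluates it, and returns the result; it can also be used to extract the content held in a string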
```python
In [1]: s = "1 + 3 +5"
In [2]: eval(s)
Out[2]: 9
s = ["{'小汽车':10, '面包车':8}", "{'面包车':5}"]
from collections import defaultdict
d = defaultdict(int)
for item in s:
    my_dict = eval(item)
    print(type(my_dict))
    for key in my_dict:
        d[key] += my_dict[key]
print(d)
defaultdict(<class 'int'>, {'小汽车': 10, '面包车': 13})
```
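`eval` executes whatever code the string contains, so for strings that hold plain literals the standard library's `ast.literal_eval` is a safer choice. A minimal sketch with the same input shape as above; the English keys and variable names are assumptions for the illustration:

```python
import ast
from collections import defaultdict

# Dict literals stored as strings, as in the example above.
items = ["{'car': 10, 'van': 8}", "{'van': 5}"]

totals = defaultdict(int)
for item in items:
    parsed = ast.literal_eval(item)   # accepts literals only, never runs code
    for key, value in parsed.items():
        totals[key] += value

print(dict(totals))
```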
[Previous example](16.md) [Next example](18.md)
================================================
FILE: md/170.md
================================================
```markdown
@author jackzhenguo
@desc
@tag
@version
@date 2020/03/08
```
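### Find the encoding in one line of code
In high spirits, you scrape a piece of `content` from a web page.
But as soon as you `print` it, the excitement fades, because this is what you see: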
```markdown
b'\xc8\xcb\xc9\xfa\xbf\xe0\xb6\xcc\xa3\xac\xce\xd2\xd3\xc3Python'
```
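What is this? All those x's and c's!
On a second look, it is a byte string (`bytes`), and `\x` introduces a hexadecimal byte value.
Next you will want to turn it into something a human can read, and `decode` comes to mind: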
```python
In [3]: b'\xc8\xcb\xc9\xfa\xbf\xe0\xb6\xcc\xa3\xac\xce\xd2\xd3\xc3Python'.decode()
UnicodeDecodeError Traceback (most recent call last)
in
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc8 in position 0: invalid continuation byte
```
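Right away, a bucket of cold water: an exception is raised.
As the message says, `UnicodeDecodeError` means the bytes could not be decoded.
It turns out that `decode` uses `utf-8` as its default encoding,
so we can rule out `utf-8` as the encoding of b'\xc8\xcb\xc9\xfa\xbf\xe0\xb6\xcc\xa3\xac\xce\xd2\xd3\xc3Python'.
But this is not a four-option multiple-choice question where the wrong answers can be eliminated one by one!
There are dozens of encodings; ruling them out one at a time is not realistic.
So should we just guess?
**Life is short, I use Python**
**Python would never let you work that hard**
Let's try to solve it in three lines of code
**Step 1: install chardet** (the name is short for char detect) with `pip install chardet`
**Step 2: `import chardet`**
**Step 3: get the result**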
```python
In [5]: import chardet
In [6]: chardet.detect(b'\xc8\xcb\xc9\xfa\xbf\xe0\xb6\xcc\xa3\xac\xce\xd2\xd3\xc3Python')
Out[6]: {'encoding': 'GB2312', 'confidence': 0.99, 'language': 'Chinese'}
```
Encoding: gb2312
Decode the byte string:
```python
In [7]: b'\xc8\xcb\xc9\xfa\xbf\xe0\xb6\xcc\xa3\xac\xce\xd2\xd3\xc3Python'.decode('gb2312')
Out[7]: '人生苦短,我用Python'
```
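For comparison, the "rule them out one by one" approach mentioned earlier can be sketched with the standard library alone. The helper name `first_working_encoding` and the candidate list are assumptions; note that a decode that succeeds does not prove the encoding is the right one, which is why a statistical detector is more reliable.

```python
def first_working_encoding(data, candidates):
    """Return (encoding, text) for the first candidate that decodes without error."""
    for encoding in candidates:
        try:
            return encoding, data.decode(encoding)
        except UnicodeDecodeError:
            continue
    return None, None

raw = b'\xc8\xcb\xc9\xfa\xbf\xe0\xb6\xcc\xa3\xac\xce\xd2\xd3\xc3Python'
encoding, text = first_working_encoding(raw, ['utf-8', 'gb2312'])
print(encoding, text)
```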
At present, the `chardet` package can detect dozens of encodings.
[Previous example](169.md) [Next example](171.md)
================================================
FILE: md/171.md
================================================
```markdown
@author jackzhenguo
@desc
@tag
@version
@date 2020/03/09
```
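Q: Does a subclass inherit its parent's methods?
A: An instance of the subclass inherits the parent's `static_method` static method; because that method refers to `Foo` by name, calling it still uses the parent's method and class attributes. `class_method`, by contrast, receives `Son` as `cls`, so it uses the subclass's `averag`, `X`, and `Y`.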
```python
# coding:utf-8
class Foo(object):
    X = 1
    Y = 2
    @staticmethod
    def averag(*mixes):
        return sum(mixes) / len(mixes)
    @staticmethod
    def static_method():
        return Foo.averag(Foo.X, Foo.Y)
    @classmethod
    def class_method(cls):
        return cls.averag(cls.X, cls.Y)
class Son(Foo):
    X = 3
    Y = 5
    @staticmethod
    def averag(*mixes):
        return sum(mixes) / 3
p = Son()
print(p.static_method())
print(p.class_method())
# 1.5
# 2.6666666666666665
```
[Previous example](170.md) [Next example](172.md)
================================================
FILE: md/172.md
================================================
```markdown
@author jackzhenguo
@desc NumPy's pad method
@tag NumPy
@version v1.0
@date 2020/11/27
```
Today's topic is a handy NumPy function, `pad`, which extends an array outward by a number of layers.
```python
In [1]: import numpy as np
In [2]: help(np.pad)
In [4]: a = np.ones((3,4))
Out[4]:
array([[1., 1., 1., 1.],
[1., 1., 1., 1.],
[1., 1., 1., 1.]])
```
By default, np.pad extends the original array outward by pad_width layers on every side and fills them with zeros:
```python
In [6]: np.pad(a,pad_width=2)
Out[6]:
array([[0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 1., 1., 1., 1., 0., 0.],
[0., 0., 1., 1., 1., 1., 0., 0.],
[0., 0., 1., 1., 1., 1., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0., 0., 0., 0.]])
```
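This function is useful for padding arrays with values and plays an important role in convolution.
That was example 172 of python-small-examples: NumPy's pad method.
[Previous example](171.md) [Next example](173.md)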
================================================
FILE: md/173.md
================================================
```markdown
@author jackzhenguo
@desc Create a matrix whose sub-diagonal holds 1, 2, 3, 4
@tag
@version
@date 2020/03/11
```
```python
In [1]: import numpy as np
In [2]: Z = np.diag(1+np.arange(4),k=-1)
...: print(Z)
[[0 0 0 0 0]
[1 0 0 0 0]
[0 2 0 0 0]
[0 0 3 0 0]
[0 0 0 4 0]]
```
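Here the `k` parameter selects the diagonal: a value greater than 0 shifts it k places above the main diagonal, and a value less than 0 shifts it k places below.
[Previous example](172.md) [Next example](174.md)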
================================================
FILE: md/174.md
================================================
```markdown
@author jackzhenguo
@desc Binning data with cut
@tag
@version
@date 2020/11/28
```
Example 174: binning data with cut
Convert percentage scores into the four grades A, B, C, D; the bins are [0,60,75,90,100] and the labels are ['D', 'C', 'B', 'A']:
```python
# generate 20 random integers in [1, 100)
In [30]: a = np.random.randint(1,100,20)
In [31]: a
Out[31]:
array([48, 22, 46, 84, 13, 52, 36, 35, 27, 99, 31, 37, 15, 31, 5, 46, 98,99, 60, 43])
# bin with cut
In [33]: pd.cut(a, [0,60,75,90,100], labels = ['D', 'C', 'B', 'A'])
Out[33]:
[D, D, D, B, D, ..., D, A, A, D, D]
Length: 20
Categories (4, object): [D < C < B < A]
```
After binning, 48 maps to D, 22 to D, 46 to D, 84 to B, ...
[Previous example](173.md) [Next example](175.md)
================================================
FILE: md/175.md
================================================
```markdown
@author jackzhenguo
@desc Dropping and filling null values
@tag
@version
@date 2020/03/13
```
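Dropping null values
np.nan is the usual null value in pandas. Use dropna to filter nulls: axis 0 works on rows and 1 on columns; how defaults to any, meaning a row or column is dropped as soon as it contains one nan, while all drops it only when every value is nan
```python
# axis 0 works on rows; all drops only rows whose values are all nan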
df.dropna(axis=0, how='all')
```
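Filling null values
Nulls are usually filled with a statistic such as the mean, mode, or median, using the fillna function:
```python
# fill the nulls in column a with the column mean; assign the result back,
# because inplace=True on a selected column may not update df in recent pandas
df["a"] = df["a"].fillna(df["a"].mean())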
```
[Previous example](174.md) [Next example](176.md)
================================================
FILE: md/176.md
================================================
```markdown
@author jackzhenguo
@desc Speed up pip installs 100x with one line
@tag
@version
@date 2020/03/14
```
The usual way to install with pip:
```python
pip install scrapy
```
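This install can be painfully slow or even fail with a timeout. You might then lengthen the socket timeout with the `--default-timeout` option:
```python
pip --default-timeout=600 install scrapy
```
But that does not make the install any faster. Instead, add one option that points pip at a mirror index: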
```python
-i https://pypi.tuna.tsinghua.edu.cn/simple
```
The full install command:
```python
pip --default-timeout=600 install scrapy -i https://pypi.tuna.tsinghua.edu.cn/simple
```
For later installs you can copy this command directly; packages will install much faster.
[Previous example](175.md) [Next example](177.md)
================================================
FILE: md/177.md
================================================
```markdown
@author jackzhenguo
@desc A great data-analysis tool: deepnote
@tag
@version
@date 2020/03/15
```
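A great tool much like jupyter notebook: deepnote
jupyter notebook is one of the most convenient notebooks for running python, widely used in data analysis and data science. I recently found a very handy notebook that is compatible with jupyter notebook: deepnote
It is also free to use! https://deepnote.com/
Its features: real-time collaboration, running in the cloud
To try it out, run code with shift+enter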
------
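Running code feels great:
Invite teammates straight into your notebook to collaborate and develop faster:
It is one more option; switching between the two works well!
[Previous example](176.md) [Next example](178.md)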
================================================
FILE: md/178.md
================================================
```markdown
@author jackzhenguo
@desc Removing special characters with apply
@tag
@version
@date 2020/03/16
```
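### Removing special characters with apply
When the cells of a column contain special characters such as punctuation, use the element-wise method apply to strip them:
```python
import string
exclude = set(string.punctuation)
def remove_punctuation(x):
    x = ''.join(ch for ch in x if ch not in exclude)
    return x
# original df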
Out[26]:
a b
0 c,d edc.rc
1 3 3
2 d ef 4
# strip punctuation from column a
In [27]: df.a = df.a.apply(remove_punctuation)
In [28]: df
Out[28]:
a b
0 cd edc.rc
1 3 3
2 d ef 4
```
[Previous example](177.md) [Next example](179.md)
================================================
FILE: md/179.md
================================================
```markdown
@author jackzhenguo
@desc Feature engineering on a column with map
@tag
@version
@date 2020/03/17
```
**Feature engineering on a column with map**
First generate the data:
```python
import pandas as pd
d = {
"gender":["male", "female", "male","female"],
"color":["red", "green", "blue","green"],
"age":[25, 30, 15, 32]
}
df = pd.DataFrame(d)
df
```
On the `gender` column, use the map method to quickly apply the following mapping:
```python
d = {"male": 0, "female": 1}
df["gender2"] = df["gender"].map(d)
```
[Previous example](178.md) [Next example](180.md)
================================================
FILE: md/18.md
================================================
```markdown
@author jackzhenguo
@desc
@date 2019/2/16
```
#### 18 String formatting
Format a string for output; format(value, format_spec) essentially calls value's `__format__(format_spec)` method.
```
In [104]: print("i am {0},age{1}".format("tom",18))
i am tom,age18
```
| Number | Format | Output | Description |
| ---------- | ------- | --------- | ---------------------------- |
| 3.1415926 | {:.2f} | 3.14 | Keep two decimal places |
| 3.1415926 | {:+.2f} | +3.14 | Keep two decimal places, with sign |
| -1 | {:+.2f} | -1.00 | Keep two decimal places, with sign |
| 2.71828 | {:.0f} | 3 | No decimal places |
| 5 | {:0>2d} | 05 | Zero padding (pad on the left, width 2) |
| 5 | {:x<4d} | 5xxx | Pad with x (pad on the right, width 4) |
| 10 | {:x<4d} | 10xx | Pad with x (pad on the right, width 4) |
| 1000000 | {:,} | 1,000,000 | Comma as thousands separator |
| 0.25 | {:.2%} | 25.00% | Percentage format |
| 1000000000 | {:.2e} | 1.00e+09 | Exponent notation |
| 18 | {:>10d} | `'        18'` | Right-aligned (default, width 10) |
| 18 | {:<10d} | `'18        '` | Left-aligned (width 10) |
| 18 | {:^10d} | `'    18    '` | Centered (width 10) |
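A few rows of the table can be reproduced with the built-in `format` function, `str.format`, and f-strings, which all accept the same format specs. The `price` and `label` variables are assumptions used only for this sketch:

```python
# Each call uses a format spec taken from the table.
two_places = format(3.1415926, '.2f')
zero_padded = format(5, '0>2d')
thousands = '{:,}'.format(1000000)
percent = '{:.2%}'.format(0.25)
centered = format(18, '^10d')
print(two_places, zero_padded, thousands, percent, repr(centered))

# f-strings take the same spec after the colon.
price = 1234.5
label = f'{price:,.2f}'
print(label)
```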
[Previous example](17.md) [Next example](19.md)
================================================
FILE: md/180.md
================================================
```markdown
@author jackzhenguo
@desc Converting a category column to numbers
@tag
@version
@date 2020/03/18
```
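Example 180: **converting a category column to numbers**
When a column can take only a limited set of enumerated values, it often needs to be converted to numbers; use get_dummies or define your own function: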
```python
pd.get_dummies(df['a'])
```
A custom function combined with apply:
```python
def c2n(x):
    if x=='A':
        return 95
    if x=='B':
        return 80
df['a'].apply(c2n)
```
[Previous example](179.md) [Next example](181.md)
================================================
FILE: md/181.md
================================================
```markdown
@author jackzhenguo
@desc Ranking with rank
@tag
@version
@date 2020/03/19
```
Example 181: **ranking with rank**
The rank method produces numeric ranks; with ascending set to False, the higher the exam score, the higher the rank:
```python
In [36]: df = pd.DataFrame({'a':[46, 98,99, 60, 43]})
In [53]: df['a'].rank(ascending=False)
Out[53]:
0 4.0
1 2.0
2 1.0
3 3.0
4 5.0
```
[Previous example](180.md) [Next example](182.md)
================================================
FILE: md/182.md
================================================
```markdown
@author jackzhenguo
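@desc Downsampling data, changing the step from hours to days
@tag
@version
@date 2020/03/20
```
Example 182: **downsampling data, changing the step from hours to days**
For time-series data with an hourly step, is there a quick trick to downsample it to daily data? First generate some test data: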
```python
import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.randint(1,10,size=(240,3)), \
columns = ['商品编码','商品销量','商品库存'])
```
```python
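# pd.util.testing.makeDateIndex was removed in pandas 2.0; date_range builds the same hourly index
df.index = pd.date_range("2000-01-01", periods=240, freq="h")  # use freq="H" on pandas older than 2.2
df
```
The trick: use the resample method to aggregate by day (D)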
```python
day_df = df.resample("D")["商品销量"].sum().to_frame()
day_df
```
The result has 10 rows: 240 hours is exactly 10 days:
[Previous example](181.md) [Next example](183.md)
================================================
FILE: md/183.md
================================================
```markdown
@author jackzhenguo
@desc How to quickly generate time-series data with Pandas?
@tag
@version
@date 2020/03/21
```
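### Example 183: How to quickly generate time-series data with Pandas?
Time-series problems come up fairly often in everyday work.
Here is a small trick: use `pd.util.testing.makeTimeDataFrame` (available in older pandas releases; `pandas.util.testing` was removed in pandas 2.0).
A single line of code produces a DataFrame whose index is a time series: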
```python
import pandas as pd
pd.util.testing.makeTimeDataFrame(10)
```
Result:
```markdown
A B C D
2000-01-03 0.932776 -1.509302 0.285825 0.941729
2000-01-04 0.565230 -1.598449 -0.786274 -0.221476
2000-01-05 -0.152743 -0.392053 -0.127415 0.841907
2000-01-06 1.321998 -0.927537 0.205666 -0.041110
2000-01-07 0.324359 1.512743 0.553633 0.392068
2000-01-10 -0.566780 0.201565 -0.801172 -1.165768
2000-01-11 -0.259348 -0.035893 -1.363496 0.475600
2000-01-12 -0.341700 -1.438874 -0.260598 -0.283653
2000-01-13 -1.085183 0.286239 2.475605 -1.068053
2000-01-14 -0.057128 -0.602625 0.461550 0.033472
```
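The interval of the time series is configurable, and so are the default four columns A B C D.
```python
import numpy as np
df = pd.DataFrame(np.random.randint(1,1000,size=(10,3)),
                  columns = ['商品编码','商品销量','商品库存'])
# hourly index starting at 2000-01-01; use freq="H" on pandas older than 2.2
df.index = pd.date_range("2000-01-01", periods=10, freq="h")
```
Result: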
```markdown
商品编码 商品销量 商品库存
2000-01-01 00:00:00 99 264 98
2000-01-01 01:00:00 294 406 827
2000-01-01 02:00:00 89 221 931
2000-01-01 03:00:00 962 153 956
2000-01-01 04:00:00 538 46 374
2000-01-01 05:00:00 226 973 750
2000-01-01 06:00:00 193 866 7
2000-01-01 07:00:00 300 129 474
2000-01-01 08:00:00 966 372 835
2000-01-01 09:00:00 687 493 910
```
[Previous example](182.md) [Next example](184.md)
================================================
FILE: md/184.md
================================================
```markdown
@author jackzhenguo
@desc How to quickly count the null values in every DataFrame column
@tag
@version
@date 2020/12/10
```
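### Example 184: How to quickly count the null values in every DataFrame column?
Real-world data inevitably contains null values. How can you quickly count the nulls in every column of a DataFrame?
Pandas makes this very easy; one line of code is enough: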
```python
data.isnull().sum()
```
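data.isnull(): checks element by element whether each value is null.
.sum(): by default performs a reduce-style sum along axis 0, giving one count per column.
Try this trick on real data; it is very satisfying.
Read the Titanic survival-prediction dataset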
```python
data = pd.read_csv('titanicdataset-traincsv/train.csv')
```
Check for null values:
```python
data.isnull().sum()
```
Result:
```python
PassengerId 0
Survived 0
Pclass 0
Name 0
Sex 0
Age 177
SibSp 0
Parch 0
Ticket 0
Fare 0
Cabin 687
Embarked 2
dtype: int64
```
The Age column has 177 null values
The Cabin column has 687 null values
The Embarked column has 2 null values
[Previous example](183.md) [Next example](185.md)
================================================
FILE: md/185.md
================================================
```markdown
@author jackzhenguo
@desc
@tag
@version
@date 2020/03/23
```
### Example 185: Reordering the columns of a DataFrame
Here are 2 simple tricks. First build the data:
```python
df = pd.DataFrame(np.random.randint(0,20,size=(5,7)) \
,columns=list('ABCDEFG'))
df
```
Method 1, the straightforward way:
```python
df2 = df[["A", "C", "D", "F", "E", "G", "B"]]
df2
```
Method 2, also worth knowing:
```python
cols = df.columns[[0, 2 , 3, 5, 4, 6, 1]]
df3 = df[cols]
df3
```
This gives the same result as method 1.
[Previous example](184.md) [Next example](186.md)
================================================
FILE: md/186.md
================================================
```markdown
@author jackzhenguo
@desc
@tag
@version
@date 2020/03/24
```
### Example 186: Counting occurrences with count
Read the IMDB-Movie-Data dataset, 1000 rows:
```python
df = pd.read_csv("../input/imdb-data/IMDB-Movie-Data.csv")
df['Title']
```
Print the `Title` column:
```python
0 Guardians of the Galaxy
1 Prometheus
2 Split
3 Sing
4 Suicide Squad
...
995 Secret in Their Eyes
996 Hostel: Part II
997 Step Up 2: The Streets
998 Search Party
999 Nine Lives
Name: Title, Length: 1000, dtype: object
```
Each title consists of several words separated by spaces, so the word count is the number of spaces plus 1.
```python
df["words_count"] = df["Title"].str.count(" ") + 1
df[["Title","words_count"]]
```
[Previous example](185.md) [Next example](187.md)
================================================
FILE: md/187.md
================================================
```markdown
@author jackzhenguo
@desc
@tag
@version
@date 2020/03/25
```
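### Example 187: using split to get the difference in minutes between HH:mm times
split is a more efficient implementation; the columns still need to be converted to str first: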
```python
df['a'] = df['a'].astype(str)
df['b'] = df['b'].astype(str)
```
Then split:
```python
df['asplit'] = df['a'].str.split(':')
df['bsplit'] = df['b'].str.split(':')
```
Use apply on each element to convert it to a number of minutes:
```python
df['amins'] = df['asplit'].apply(lambda x: int(x[0])*60 + int(x[1]))
df['bmins'] = df['bsplit'].apply(lambda x: int(x[0])*60 + int(x[1]))
```
[Previous example](186.md) [Next example](188.md)
================================================
FILE: md/188.md
================================================
```markdown
@author jackzhenguo
@desc
@tag
@version
@date 2020/03/26
```
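### Example 188: reshaping data with melt
The melt method keeps one column fixed as one dimension and combines the names of the other columns into another dimension, melting a wide table into a long one: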
```python
zip_code factory warehouse retail
0 12345 100 200 1
1 56789 400 300 2
2 101112 500 400 3
3 131415 600 500 4
```
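Fix the `zip_code` column and combine the three column names `factory`, `warehouse`, `retail` into one dimension; once the two dimensions are assembled this way, the data necessarily becomes longer.
The pandas melt method works like this:
```python
In [49]: df_melt = df.melt(id_vars = "zip_code")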
```
If `value_vars` is not given, melt treats all remaining columns as value_vars, so the result is:
```python
zip_code variable value
0 12345 factory 100
1 56789 factory 400
2 101112 factory 500
3 131415 factory 600
4 12345 warehouse 200
5 56789 warehouse 300
6 101112 warehouse 400
7 131415 warehouse 500
8 12345 retail 1
9 56789 retail 2
10 101112 retail 3
11 131415 retail 4
```
To look at factory and retail only, assign them to `value_vars`:
```python
In [62]: df_melt2 = df.melt(id_vars = "zip_code",value_vars=['factory','retail'])
```
Result:
```python
zip_code variable value
0 12345 factory 100
1 56789 factory 400
2 101112 factory 500
3 131415 factory 600
4 12345 retail 1
5 56789 retail 2
6 101112 retail 3
7 131415 retail 4
```
After melt, several columns are combined into one, so the data necessarily becomes longer.
[Previous example](187.md) [Next example](189.md)
================================================
FILE: md/189.md
================================================
```markdown
@author jackzhenguo
@desc
@tag
@version
@date 2020/03/27
```
### Example 189: reshaping with pivot
melt melts data, while `pivot` freezes it back; the two are inverse operations.
Here is the data produced by melt above:
```python
zip_code variable value
0 12345 factory 100
1 56789 factory 400
2 101112 factory 500
3 131415 factory 600
4 12345 retail 1
5 56789 retail 2
6 101112 retail 3
7 131415 retail 4
```
Now we want to restore it to:
```python
variable factory retail
zip_code
12345 100 1
56789 400 2
101112 500 3
131415 600 4
```
How can this be done?
It is easy with the `pivot` method:
```python
df_melt2.pivot(index='zip_code',columns='variable',values='value')
```
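index sets the first axis, here zip_code; columns chooses the column (or columns) whose distinct values form the other axis, here the variable column. It has 2 distinct values, factory and retail, which become the column names after pivot, that is, the axis = 1 axis
pivot cannot aggregate; its more powerful sibling `pivot_table` can aggregate data.
[Previous example](188.md) [Next example](190.md)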
================================================
FILE: md/19.md
================================================
```markdown
@author jackzhenguo
@desc
@date 2019/2/20
```
#### 19 A ready-to-use sorting function
Sorting:
```python
In [1]: a = [1,4,2,3,1]
In [2]: sorted(a,reverse=True)
Out[2]: [4, 3, 2, 1, 1]
In [3]: a = [{'name':'xiaoming','age':18,'gender':'male'},{'name':'xiaohong','age':20,'gender':'female'}]
In [4]: sorted(a,key=lambda x: x['age'],reverse=False)
Out[4]:
[{'name': 'xiaoming', 'age': 18, 'gender': 'male'},
{'name': 'xiaohong', 'age': 20, 'gender': 'female'}]
```
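The `key` argument also accepts `operator.itemgetter`, which makes multi-key sorts compact. A short sketch under the assumption of one extra record (`xiaogang`) added to show how ties behave:

```python
from operator import itemgetter

people = [
    {'name': 'xiaoming', 'age': 18, 'gender': 'male'},
    {'name': 'xiaohong', 'age': 20, 'gender': 'female'},
    {'name': 'xiaogang', 'age': 18, 'gender': 'male'},   # assumed extra record
]

# Sort by age descending; sorted() is stable, so ties keep their input order.
by_age_desc = sorted(people, key=itemgetter('age'), reverse=True)
names_desc = [p['name'] for p in by_age_desc]

# Sort by several keys at once: age ascending, then name ascending.
by_age_then_name = sorted(people, key=itemgetter('age', 'name'))
names_multi = [p['name'] for p in by_age_then_name]

print(names_desc)
print(names_multi)
```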
[Previous example](18.md) [Next example](20.md)
================================================
FILE: md/190.md
================================================
```markdown
@author jackzhenguo
@desc Randomly read K rows of a file and generate N files
@tag
@version
@date 2020/03/28
```
### Example 190: Randomly read K rows of a file and generate N files
```python
import random
import pandas as pd

def random_lines_save(filename,gen_file_cnt=10):
    """
    Randomly sample rows from a file and save them; the number of files
    to generate is given by gen_file_cnt.
    @param: filename full path of the input file
    @param: gen_file_cnt number of files to generate
    """
    df = pd.read_excel(filename)
    for i in range(gen_file_cnt):
        n = random.randint(1,len(df))
        dfs = df.sample(n)
        dfs.to_excel(str(n)+".xlsx",index=False)
        print(str(n)+".xlsx")
```
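This is a handy function for generating N files of K random rows each. Use case: the original file has many rows and you want to extract random combinations of them into N files.
[Previous example](189.md) [Next example](191.md)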
================================================
FILE: md/191.md
================================================
```markdown
@author jackzhenguo
@desc Formatting a Pandas datetime column
@tag
@version
@date 2020/03/29
```
### Example 191: Formatting a Pandas datetime column
```python
import pandas as pd
from datetime import datetime, time
def series_dt_fmt(s:pd.Series,fmt:str)-> pd.Series:
    """
    Format column s according to fmt.
    s holds datetime values or datetime strings such as '2020-12-30 11:44:00'
    """
    st = pd.to_datetime(s)
    return st.apply(lambda t: datetime.strftime(t,fmt))
```
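It is only two lines of code, yet it is quite flexible; for example, it can directly return just the hour and minute:
```python
s = pd.Series(['2020-12-30 11:44:00','2020-12-30 11:20:10'])
# keep only the hour and minute
fmt = '%H:%M'
series_dt_fmt(s,fmt)
# output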
0 11:44
1 11:20
dtype: object
```
[Previous example](190.md) [Next example](192.md)
================================================
FILE: md/192.md
================================================
```markdown
@author jackzhenguo
@desc
@tag
@version
@date 2020/03/30
```
### 192: Creating a SQLite connection
Write a Python program that creates a SQLite database, connects to it, and prints the SQLite version
One solution:
```python
import sqlite3
sqlite_Connection = None  # defined up front so the finally block is safe if connect() fails
try:
    sqlite_Connection = sqlite3.connect('temp.db')
    conn = sqlite_Connection.cursor()
    print("Connected to SQLite.")
    sqlite_select_Query = "select sqlite_version();"
    conn.execute(sqlite_select_Query)
    record = conn.fetchall()
    print("SQLite database version is ", record)
    conn.close()
except sqlite3.Error as error:
    print("Error while connecting to SQLite:", error)
finally:
    if (sqlite_Connection):
        sqlite_Connection.close()
        print("SQLite connection closed")
```
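The same check can be written more compactly with `contextlib.closing`, which closes the connection and cursor even when the query raises. This is a sketch, not the original solution: the function name `sqlite_version` and the in-memory database path are assumptions chosen so that no file is created.

```python
import sqlite3
from contextlib import closing

def sqlite_version(db_path=':memory:'):
    """Return the SQLite library version reported by the database engine."""
    # closing() guarantees close() is called even if the query raises.
    with closing(sqlite3.connect(db_path)) as connection:
        with closing(connection.cursor()) as cursor:
            cursor.execute('select sqlite_version();')
            (version,) = cursor.fetchone()
    return version

print('SQLite version:', sqlite_version())
```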
That was example 192; I hope it is useful to you.
[Previous example](191.md) [Next example](193.md)
================================================
FILE: md/193.md
================================================
```markdown
@author jackzhenguo
@desc
@tag
@version
@date 2020/03/31
```
### 193 Converting a JSON object to a Python object
The `loads` function of Python's `json` module converts a JSON object into a dict, as shown below:
```python
In [1]: import json
In [2]: json_obj = '{ "Name":"David", "Class":"I", "Age":6 }'
In [3]: python_obj = json.loads(json_obj)
In [4]: type(python_obj)
Out[4]: dict
```
Print the parsed object and its attributes
```python
print("\nJSON data:")
print(python_obj)
print("\nName: ",python_obj["Name"])
print("Class: ",python_obj["Class"])
print("Age: ",python_obj["Age"])
```
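`json.loads` also maps JSON arrays, booleans, and `null` to Python lists, `bool`, and `None`, and raises `json.JSONDecodeError` on invalid input. A small sketch; the helper `parse_json` and the sample fields are assumptions for illustration:

```python
import json

def parse_json(text):
    """Return the parsed Python object, or None when the text is not valid JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None

student = parse_json('{"Name": "David", "Scores": [90, 85.5], "Graduated": false, "Nick": null}')
print(type(student["Scores"]), type(student["Graduated"]), student["Nick"])

broken = parse_json("{'Name': 'David'}")   # single quotes are not valid JSON
print(broken)
```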
[Previous example](192.md) [Next example](194.md)
================================================
FILE: md/194.md
================================================
```markdown
@author jackzhenguo
@desc
@tag
@version
@date 2020/04/01
```
#### 194 Converting a Python object to a JSON object
```python
import json
# a Python object (dict):
python_obj = {
"name": "David",
"class":"I",
"age": 6
}
print(type(python_obj))
```
Use the `json.dumps` function to convert it to a JSON string:
```python
# convert into JSON:
j_data = json.dumps(python_obj)
# result is a JSON string:
print(j_data)
```
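##### Converting to formatted JSON
How do you make sure that, after a dict is converted to JSON, the keys are sorted and the output is indented by 4 spaces?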
```python
json.dumps(j_str, sort_keys=True, indent=4)
```
Example:
```python
import json
j_str = {'4': 5, '6': 7, '1': 3, '2': 4}
print(json.dumps(j_str, sort_keys=True, indent=4))
```
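Two other `json.dumps` options are often useful alongside `sort_keys` and `indent`: `ensure_ascii=False` keeps non-ASCII characters readable, and `separators` produces compact output. The `record` dict below is a made-up example:

```python
import json

record = {'name': 'José', 'class': 'I', 'age': 6}   # assumed record with a non-ASCII value

escaped = json.dumps(record)                       # non-ASCII characters become \uXXXX escapes
readable = json.dumps(record, ensure_ascii=False)  # keep the characters as they are
compact = json.dumps({'4': 5, '1': 3}, sort_keys=True, separators=(',', ':'))

print(escaped)
print(readable)
print(compact)
```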
[Previous example](193.md) [Next example](195.md)
================================================
FILE: md/195.md
================================================
```markdown
@author jackzhenguo
@desc
@tag
@version
@date 2020/04/02
```
#### 195 Finding the 3 largest or smallest numbers in a list
Use the nlargest function from the heap module heapq:
```python
import heapq as hq
nums_list = [25, 35, 22, 85, 14, 65, 75, 22, 58]
# Find three largest values
largest_nums = hq.nlargest(3, nums_list)
print(largest_nums)
```
Likewise, to get the 3 smallest numbers, use the nsmallest function from heapq:
```python
import heapq as hq
nums_list = [25, 35, 22, 85, 14, 65, 75, 22, 58]
smallest_nums = hq.nsmallest(3, nums_list)
print("\nThree smallest numbers are:", smallest_nums)
```
[Previous example](194.md) [Next example](196.md)
================================================
FILE: md/196.md
================================================
```markdown
@author jackzhenguo
@desc
@tag
@version
@date 2020/04/03
```
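### 196 Sorting a list in ascending order with a heap
Using the `heapq` module, first heapify the list (`heapq` builds a min-heap), then call `heappop` `len(nums_list)` times: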
```python
import heapq as hq
nums_list = [18, 14, 10, 9, 8, 7, 9, 3, 2, 4, 1]
hq.heapify(nums_list)
s_result = [hq.heappop(nums_list) for _ in range(len(nums_list))]
print(s_result)
```
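The snippet above empties `nums_list` as it pops. A minimal sketch of a reusable version that works on a copy, so the caller's list is left untouched (the function name `heap_sort` is an assumption for illustration):

```python
import heapq

def heap_sort(items):
    """Return a new ascending list; the input is not modified."""
    heap = list(items)     # copy so the caller's data survives
    heapq.heapify(heap)    # turn the copy into a min-heap in place
    return [heapq.heappop(heap) for _ in range(len(heap))]

nums = [18, 14, 10, 9, 8, 7, 9, 3, 2, 4, 1]
print(heap_sort(nums))   # [1, 2, 3, 4, 7, 8, 9, 9, 10, 14, 18]
print(nums)              # the original order is preserved
```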
[Previous example](195.md) [Next example](197.md)
================================================
FILE: md/197.md
================================================
```markdown
@author jackzhenguo
@desc
@tag
@version
@date 2020/04/04
```
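The regular expressions below extract positive integers and floats greater than 0. Corrections are welcome if anything has been missed.
```python
>>> import re
>>> # r holds the sample text quoted in the example further down
>>> pat_integ = r'[1-9]+\d*'
>>> pat_float0 = r'0\.\d+[1-9]'
>>> pat_float1 = r'[1-9]\d*\.\d+'
>>> pat = r'%s|%s|%s' % (pat_float0, pat_float1, pat_integ)
>>> re.findall(pat, r)
['0.78', '3446.73', '0.91', '13642.95', '1.06', '2672.12', '3000']
```
The patterns are meant to reject the following strings as whole numbers (note that `re.findall` can still return a substring such as `100` from `000100`):
000
000100
0.00
000.00
Explanation: `*` matches the preceding element zero or more times, `+` matches it one or more times, `\d` matches a digit [0-9], `[1-9]` matches any of 1 through 9, and `\.` matches a literal decimal point.
Key considerations: the leftmost digit of a positive integer must be greater than 0; a float of 1 or more must start with [1-9]; a float between 0 and 1 has exactly one 0 before the decimal point.
Day 163: use a Python regular expression to extract all floats and integers from a piece of input text.
For example: At the close, the Shanghai Composite Index rose 0.78% to 3446.73 points, the Shenzhen Component Index rose 0.91% to 13642.95 points, and the ChiNext Index rose 1.06% to 2672.12 points. The indexes climbed in afternoon trading; carbon-neutrality stocks stayed strong, with a wave of limit-up gains across the sector; environmental protection, property management, and ultra-high-voltage stocks extended their gains in the afternoon; digital currency stocks surged late in the session; steel, coal, and non-ferrous metals were sluggish all day; theme stocks recovered overall in the afternoon, and more than 3000 stocks rose across the two markets.
Extract all floats and integers: 0.78, 3446.73, 0.91, 13642.95, and so on.
[Previous example](196.md) [Next example](198.md)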
================================================
FILE: md/198.md
================================================
```markdown
@author jackzhenguo
@desc
@tag
@version
@date 2020/04/05
```
## 1 Common arithmetic operations
```python
x, y = 3, 2
print(x + y) # = 5
print(x - y) # = 1
print(x * y) # = 6
print(x / y) # = 1.5
print(x // y) # = 1
print(x % y) # = 1
print(-x) # = -3
print(abs(-x)) # = 3
print(int(3.9)) # = 3
print(float(x)) # = 3.0
print(x ** y) # = 9
```
Most of these operators are self-explanatory. Note that `//` performs floor division: the result is rounded down (for example, `3 // 2 == 1`).
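Rounding down means rounding toward negative infinity, which differs from truncation when an operand is negative. A short sketch:

```python
import math

print(7 // 2)            # 3
print(-7 // 2)           # -4: rounded down, not toward zero
print(int(-7 / 2))       # -3: int() truncates toward zero
print(math.floor(-3.5))  # -4: same direction as //
print(divmod(-7, 2))     # (-4, 1): quotient and remainder together
```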
[Previous example](197.md) [Next example](199.md)
================================================
FILE: md/199.md
================================================
```markdown
@author jackzhenguo
@desc Great-circle distance between two points
@tag
@version
@date 2020/04/06
```
```python
import math
EARTH_RADIUS = 6378.137  # Earth's equatorial radius in kilometers
# Convert degrees to radians
def get_radian(degree):
    return degree * math.pi / 180.0
# Compute the distance between two points from their latitude and longitude; the result is in kilometers
def get_distance(lat1, lng1, lat2, lng2):
    radLat1 = get_radian(lat1)
    radLat2 = get_radian(lat2)
    a = radLat1 - radLat2  # latitude difference
    b = get_radian(lng1) - get_radian(lng2)  # longitude difference
    s = 2 * math.asin(math.sqrt(math.pow(math.sin(a / 2), 2) +
        math.cos(radLat1) * math.cos(radLat2) * math.pow(math.sin(b / 2), 2)))
    s = s * EARTH_RADIUS
    return s
```
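As a usage sketch, here is the same haversine formula written with `math.radians` and checked on two cases whose answers are known in advance. The function name `haversine_km` is an assumption, not part of the original:

```python
import math

EARTH_RADIUS_KM = 6378.137  # same radius as above, in kilometers

def haversine_km(lat1, lng1, lat2, lng2):
    """Great-circle distance in kilometers between two (lat, lng) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi1 - phi2
    d_lambda = math.radians(lng1) - math.radians(lng2)
    h = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))

# Identical points are 0 km apart
print(haversine_km(39.9, 116.4, 39.9, 116.4))
# Two opposite points on the equator are half a circumference apart (pi * R)
print(haversine_km(0, 0, 0, 180))
```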
[Previous example](198.md) [Next example](200.md)
================================================
FILE: md/2.md
================================================
```markdown
@author jackzhenguo
@desc Base conversion
@date 2019/2/10
```
#### 2 Base conversion
Decimal to binary:
```python
In [1]: bin(10)
Out[1]: '0b1010'
```
Decimal to octal:
```python
In [2]: oct(9)
Out[2]: '0o11'
```
Decimal to hexadecimal:
```python
In [3]: hex(15)
Out[3]: '0xf'
```
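The reverse direction uses `int` with an explicit base, and `format` gives the digits without the prefix. A minimal sketch:

```python
# Parse strings in other bases back to decimal integers
print(int('0b1010', 2))   # 10
print(int('11', 8))       # 9
print(int('f', 16))       # 15

# format() produces the digits without the 0b/0o/0x prefix
print(format(10, 'b'))    # 1010
print(format(255, 'x'))   # ff
```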
[Previous example](1.md) [Next example](3.md)
================================================
FILE: md/20.md
================================================
```markdown
@author jackzhenguo
@desc
@date 2019/2/20
```
#### 20 The sum function
Summing a list:
```python
In [181]: a = [1,4,2,3,1]
In [182]: sum(a)
Out[182]: 11
In [185]: sum(a,10) # start the sum from an initial value of 10
Out[185]: 21
```
[Previous example](19.md) [Next example](21.md)
================================================
FILE: md/200.md
================================================
```markdown
@author jackzhenguo
@desc Detecting a file's encoding
@tag
@version
@date 2020/04/07
```
```python
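import chardet
from chardet import UniversalDetector
# Approach 1: read the whole file and detect its encoding in one call
def get_encoding(file):
    with open(file, "rb") as f:
        cs = chardet.detect(f.read())
    return cs['encoding']
# Approach 2: feed the file line by line and stop once the detector is confident
# (the function name is illustrative; the result is a dict that includes 'encoding')
def get_encoding_incremental(file):
    detector = UniversalDetector()
    with open(file, "rb") as f:
        for line in f.readlines():
            detector.feed(line)
            if detector.done:
                break
    detector.close()
    return detector.result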
```
[Previous example](199.md) [Next example](201.md)
================================================
FILE: md/201.md
================================================
```markdown
@author jackzhenguo
@desc Pretty-printing a JSON string
@tag
@version
@date 2020/04/08
```
```python
import json
def format_json(json_str: str):
    dic = json.loads(json_str)
    js = json.dumps(dic,
                    sort_keys=True,
                    ensure_ascii=False,
                    indent=4,
                    separators=(', ', ': '))
    return js
```
[Previous example](200.md) [Next example](202.md)
================================================
FILE: md/202.md
================================================
```markdown
@author jackzhenguo
@desc
@tag
@version
@date 2020/04/09
```
[Previous example](201.md) [Next example](203.md)
================================================
FILE: md/203.md
================================================
```markdown
@author jackzhenguo
@desc
@tag
@version
@date 2020/04/10
```
[Previous example](202.md) [Next example](204.md)
================================================
FILE: md/204.md
================================================
```markdown
@author jackzhenguo
@desc
@tag
@version
@date 2020/04/11
```
[Previous example](203.md) [Next example](205.md)
================================================
FILE: md/205.md
================================================
```markdown
@author jackzhenguo
@desc
@tag
@version
@date 2020/04/12
```
[Previous example](204.md) [Next example](206.md)
================================================
FILE: md/206.md
================================================
```markdown
@author jackzhenguo
@desc
@tag
@version
@date 2020/04/13
```
[Previous example](205.md) [Next example](207.md)
================================================
FILE: md/207.md
================================================
```markdown
@author jackzhenguo
@desc
@tag
@version
@date 2020/04/14
```
[Previous example](206.md) [Next example](208.md)
================================================
FILE: md/208.md
================================================
```markdown
@author jackzhenguo
@desc
@tag
@version
@date 2020/04/15
```
[Previous example](207.md) [Next example](209.md)
================================================
FILE: md/209.md
================================================
```markdown
@author jackzhenguo
@desc
@tag
@version
@date 2020/04/16
```
[Previous example](208.md) [Next example](210.md)
================================================
FILE: md/21.md
================================================
```markdown
@author jackzhenguo
@desc
@date 2019/2/20
```
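#### 21 Using nonlocal in nested functions
The `nonlocal` keyword is typically used in nested functions to declare that a variable such as `i` belongs to an enclosing function's scope rather than the local one.
Without the declaration, `i += 1` makes `i` a local variable of `wrapper`; because `i` is then read before it has been assigned in that scope, Python raises `UnboundLocalError`.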
```python
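import time
# n (the number of failures to wait for) is assumed to be defined elsewhere
def excepter(f):
    i = 0
    t1 = time.time()
    def wrapper():
        nonlocal i  # declared before use so that i += 1 rebinds excepter's i
        try:
            f()
        except Exception as e:
            i += 1
            print(f'{e.args[0]}: {i}')
            t2 = time.time()
            if i == n:
                print(f'spending time:{round(t2-t1,2)}')
    return wrapper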
```
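A smaller self-contained sketch of the same mechanism: a closure that counts its own calls. The names `make_counter` and `counter` are assumptions for illustration:

```python
def make_counter():
    count = 0

    def counter():
        nonlocal count   # rebind the variable in make_counter's scope
        count += 1
        return count

    return counter

c = make_counter()
print(c())  # 1
print(c())  # 2
```

Removing the `nonlocal count` line makes the first call fail with `UnboundLocalError`.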
[Previous example](20.md) [Next example](22.md)
================================================
FILE: md/210.md
================================================
```markdown
@author jackzhenguo
@desc
@tag
@version
@date 2020/04/17
```
[Previous example](209.md) [Next example](211.md)
================================================
FILE: md/211.md
================================================
```markdown
@author jackzhenguo
@desc
@tag
@version
@date 2020/04/18
```
[Previous example](210.md) [Next example](212.md)
================================================
FILE: md/212.md
================================================
```markdown
@author jackzhenguo
@desc
@tag
@version
@date 2020/04/19
```
[Previous example](211.md) [Next example](213.md)
================================================
FILE: md/213.md
================================================
```markdown
@author jackzhenguo
@desc
@tag
@version
@date 2020/04/20
```
[Previous example](212.md) [Next example](214.md)
================================================
FILE: md/214.md
================================================
```markdown
@author jackzhenguo
@desc
@tag
@version
@date 2020/04/21
```
[Previous example](213.md) [Next example](215.md)
================================================
FILE: md/215.md
================================================
```markdown
@author jackzhenguo
@desc
@tag
@version
@date 2020/04/22
```
[Previous example](214.md) [Next example](216.md)
================================================
FILE: md/216.md
================================================
```markdown
@author jackzhenguo
@desc
@tag
@version
@date 2020/04/23
```
[Previous example](215.md) [Next example](217.md)
================================================
FILE: md/217.md
================================================
```markdown
@author jackzhenguo
@desc
@tag
@version
@date 2020/04/24
```
[Previous example](216.md) [Next example](218.md)
================================================
FILE: md/218.md
================================================
```markdown
@author jackzhenguo
@desc
@tag
@version
@date 2020/04/25
```
[Previous example](217.md) [Next example](219.md)
================================================
FILE: md/219.md
================================================
```markdown
@author jackzhenguo
@desc
@tag
@version
@date 2020/04/26
```
[Previous example](218.md) [Next example](220.md)
================================================
FILE: md/22.md
================================================
```markdown
@author jackzhenguo
@desc
@date 2019/2/23
```
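#### 22 Declaring global variables with global
First, why does `global` exist? Suppose several functions use the same variable and you want all of them to share it. You might think that is easy and simply write:
```python
i = 5
def f():
    print(i)
def g():
    print(i)
    pass
f()
g()
```
Both `f` and `g` can read the shared variable `i`, and the program runs without error, so it is still not obvious why `global` is needed.
But suppose a function needs to increment `i`, like this:
```python
def h():
    i += 1
h()
```
Running this fails with `UnboundLocalError`. Because `i += 1` assigns to `i`, Python treats `i` as a local variable of `h()`, and that local has no value yet when it is read, so the error is raised.
`global` exists to solve exactly this problem. Inside `h`, it explicitly tells Python that `i` is the global variable, so the name is looked up at module level; after `i += 1` runs, `i` is still global and its value has increased by 1:
```python
i = 0
def h():
    global i
    i += 1
h()
print(i)
```
[Previous example](21.md) [Next example](23.md)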
================================================
FILE: md/220.md
================================================
```markdown
@author jackzhenguo
@desc
@tag
@version
@date 2020/04/27
```
[Previous example](219.md) [Next example](221.md)
================================================
FILE: md/221.md
================================================
```markdown
@author jackzhenguo
@desc
@tag
@version
@date 2020/04/28
```
[Previous example](220.md) [Next example](222.md)
================================================
FILE: md/222.md
================================================
```markdown
@author jackzhenguo
@desc
@tag
@version
@date 2020/04/29
```
[Previous example](221.md) [Next example](223.md)
================================================
FILE: md/223.md
================================================
```markdown
@author jackzhenguo
@desc
@tag
@version
@date 2020/04/30
```
[Previous example](222.md) [Next example](224.md)
================================================
FILE: md/224.md
================================================
```markdown
@author jackzhenguo
@desc
@tag
@version
@date 2020/05/01
```
[Previous example](223.md) [Next example](225.md)
================================================
FILE: md/225.md
================================================
```markdown
@author jackzhenguo
@desc
@tag
@version
@date 2020/05/02
```
[Previous example](224.md) [Next example](226.md)
================================================
FILE: md/226.md
================================================
```markdown
@author jackzhenguo
@desc
@tag
@version
@date 2020/05/03
```
[Previous example](225.md) [Next example](227.md)
================================================
FILE: md/227.md
================================================
```markdown
@author jackzhenguo
@desc
@tag
@version
@date 2020/05/04
```
[Previous example](226.md) [Next example](228.md)
================================================
FILE: md/228.md
================================================
```markdown
@author jackzhenguo
@desc
@tag
@version
@date 2020/05/05
```
[Previous example](227.md) [Next example](229.md)
================================================
FILE: md/229.md
================================================
```markdown
@author jackzhenguo
@desc
@tag
@version
@date 2020/05/06
```
[Previous example](228.md) [Next example](230.md)
================================================
FILE: md/23.md
================================================
```markdown
@author jackzhenguo
@desc
@date 2019/2/23
```
#### 23 Swapping two elements
To understand how swapping two elements works, you first need to know what packing and unpacking are:
```python
In [1]: a=[3,1]
# unpack
In [2]: a0,a1 = a
In [3]: a0
Out[3]: 3
In [4]: a1
Out[4]: 1
# pack
In [5]: b = a0, a1
In [6]: b
Out[6]: (3, 1)
```
So swapping two elements with `b, a = a, b` really means first packing `a, b` into the tuple `(a, b)`, then unpacking that tuple into `b` and `a`:
```python
def swap(a, b):
    return b, a
print(swap(1, 0)) # (0, 1)
```
[Previous example](22.md) [Next example](24.md)
================================================
FILE: md/230.md
================================================
```markdown
@author jackzhenguo
@desc
@tag
@version
@date 2020/05/07
```
[Previous example](229.md) [Next example](231.md)
================================================
FILE: md/231.md
================================================
```markdown
@author jackzhenguo
@desc
@tag
@version
@date 2020/05/08
```
[Previous example](230.md) [Next example](232.md)
================================================
FILE: md/232.md
================================================
```markdown
@author jackzhenguo
@desc
@tag
@version
@date 2020/05/09
```
[Previous example](231.md) [Next example](233.md)
================================================
FILE: md/233.md
================================================
```markdown
@author jackzhenguo
@desc
@tag
@version
@date 2020/05/10
```
[Previous example](232.md) [Next example](234.md)
================================================
FILE: md/24.md
================================================
```markdown
@author jackzhenguo
@desc
@date 2019/2/25
```
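#### 24 Working with function objects
```python
In [31]: def f():
    ...:     print('i\'m f')
In [32]: def g():
    ...:     print('i\'m g')
In [33]: [f,g][1]()
i'm g
```
Putting function objects in a list lets you pick one by index and call them all in a uniform way.
[Previous example](23.md) [Next example](25.md)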
================================================
FILE: md/25.md
================================================
```markdown
@author jackzhenguo
@desc
@date 2019/2/25
```
#### 25 Generating a descending sequence
```python
list(range(10,-1,-1))
# [10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
```
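When the third argument is negative, the sequence counts down from the first argument and stops before reaching the second argument (which is excluded).
[Previous example](24.md) [Next example](26.md)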
================================================
FILE: md/26.md
================================================
```markdown
@author jackzhenguo
@desc
@date 2019/2/27
```
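#### 26 Examples of the five kinds of function parameters
Python has five kinds of parameters: positional, keyword, default, variable positional (`*b` below), and variable keyword (`**d` below).
```python
def f(a,*b,c=10,**d):
    print(f'a:{a},b:{b},c:{c},d:{d}')
```
*The default parameter `c` cannot come after the variable keyword parameter `d`.*
Call `f`: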
```python
In [10]: f(1,2,5,width=10,height=20)
a:1,b:(2, 5),c:10,d:{'width': 10, 'height': 20}
```
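The extra positional arguments are collected into `b` as the tuple `(2, 5)`; `c` takes its default value 10; the extra keyword arguments are collected into `d` as a dict.
Call `f` again: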
```python
In [11]: f(a=1,c=12)
a:1,b:(),c:12,d:{}
```
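Passing `a=1` makes `a` a keyword argument; `b` and `d` receive nothing, and `c` is given 12 instead of its default.
Note that `a` can be passed either as `f(1)` or as `f(a=1)`. The second form is more readable and is recommended. To force callers to write `f(a=1)`, put a bare **asterisk** before the parameter:
```python
def f(*,a,**b):
    print(f'a:{a},b:{b}')
```
Calling `f(1)` now raises `TypeError: f() takes 0 positional arguments but 1 was given`
Only `f(a=1)` works.
This shows that the leading `*` takes effect: `a` can now only be passed by keyword. How can you check a parameter's kind? Use `signature` from Python's `inspect` module:
```python
In [21]: from inspect import signature
In [22]: for name,val in signature(f).parameters.items():
    ...:     print(name,val.kind)
    ...: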
a KEYWORD_ONLY
b VAR_KEYWORD
```
You can see that `a` has kind `KEYWORD_ONLY`, meaning it can only be passed by keyword.
However, if `f` is defined as:
```python
def f(a,*b):
    print(f'a:{a},b:{b}')
```
Check the parameter kinds:
```python
In [24]: for name,val in signature(f).parameters.items():
    ...:     print(name,val.kind)
    ...:
a POSITIONAL_OR_KEYWORD
b VAR_POSITIONAL
```
Here `a` can be passed either positionally or by keyword.
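The `inspect` module itself distinguishes five parameter kinds. A self-contained sketch that prints all of them from one signature; the function `demo` and its parameter names are made up for illustration, and the positional-only marker `/` requires Python 3.8 or later:

```python
from inspect import signature

def demo(pos_only, /, normal, *args, kw_only, **kwargs):
    pass

for name, param in signature(demo).parameters.items():
    print(name, param.kind.name)
# pos_only POSITIONAL_ONLY
# normal POSITIONAL_OR_KEYWORD
# args VAR_POSITIONAL
# kw_only KEYWORD_ONLY
# kwargs VAR_KEYWORD
```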
[Previous example](25.md) [Next example](27.md)
================================================
FILE: md/27.md
================================================
```markdown
@author jackzhenguo
@desc
@date 2019/2/27
```
#### 27 Using slice objects
Create a sequence representing a cake, `cake1`:
```python
In [1]: cake1 = list(range(5,0,-1))
In [2]: b = cake1[1:10:2]
In [3]: b
Out[3]: [4, 2]
In [4]: cake1
Out[4]: [5, 4, 3, 2, 1]
```
Create another sequence:
```python
In [5]: from random import randint
   ...: cake2 = [randint(1,100) for _ in range(100)]
   ...: # apply the same cut (indices 1 through 9, step 2) to get slice d
   ...: d = cake2[1:10:2]
In [6]: d
Out[6]: [75, 33, 63, 93, 15]
```
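As you can see, we cut both cakes, `cake1` and `cake2`, in the same way. Suppose this cut turns out to be a classic and we want to apply it to many more containers.
Why not wrap the cut itself in an object? That is what the slice object is for.
Defining a slice object is simple. The cut above becomes:
```python
perfect_cake_slice_way = slice(1,10,2)
# cut cake1 and cake2 with it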
cake1_slice = cake1[perfect_cake_slice_way]
cake2_slice = cake2[perfect_cake_slice_way]
In [11]: cake1_slice
Out[11]: [4, 2]
In [12]: cake2_slice
Out[12]: [75, 33, 63, 93, 15]
```
The results match those above.
Slice objects work just as well for reversed slicing:
```python
a = [1,3,5,7,9,0,3,5,7]
a_ = a[5:1:-1]
named_slice = slice(5,1,-1)
a_slice = a[named_slice]
In [14]: a_
Out[14]: [0, 9, 7, 5]
In [15]: a_slice
Out[15]: [0, 9, 7, 5]
```
When the same slice is used repeatedly, factoring it out into a slice object gives you reuse and more readable code.
[Previous example](26.md) [Next example](28.md)
================================================
FILE: md/28.md
================================================
```markdown
@author jackzhenguo
@desc
@date 2019/3/2
```
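#### 28 A walkthrough of lambda functions
Some readers have said they are not comfortable with `lambda` functions and asked for an explanation.
Take the `lambda` in the following function, for example:
```python
def max_len(*lists):
    return max(*lists, key=lambda v: len(v))
```
There are two common questions:
- What values does the parameter `v` take?
- Does a `lambda` function have a return value? If so, what is it?
Call the function above to find the longest of the following three lists: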
```python
r = max_len([1, 2, 3], [4, 5, 6, 7], [8])
print(f'The longest list is {r}')
```
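Conclusions:
- `v` takes each element of `lists` in turn, that is, each element of the argument `tuple`.
- The return value of a `lambda` is the value of the expression after the colon in `lambda v:`.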
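To see both answers concretely, this sketch applies the same `lambda` by hand to each list and prints what it returns; `max` then picks the argument with the largest key:

```python
key = lambda v: len(v)   # the same lambda as in max_len

lists = ([1, 2, 3], [4, 5, 6, 7], [8])
for v in lists:
    # v is one list at a time; the lambda returns its length
    print(v, '->', key(v))

print(max(*lists, key=key))   # [4, 5, 6, 7]
```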
[Previous example](27.md) [Next example](29.md)
================================================
FILE: md/29.md
================================================
```markdown
@author jackzhenguo
@desc
@date 2019/3/2
```
#### 29 Converting to a dict
Ways to create a dict:
```python
# Method 1: dict()
In [1]: dict()
Out[1]: {}
In [2]: dict(a='a',b='b')
Out[2]: {'a': 'a', 'b': 'b'}
# Method 2: zip
In [3]: dict(zip(['a','b'],[1,2]))
Out[3]: {'a': 1, 'b': 2}
# Method 3: a list of tuples
In [4]: dict([('a',1),('b',2)])
Out[4]: {'a': 1, 'b': 2}
# Method 4: a dict-like string
In [1]: s = "{'a':1, 'b':2}"
In [2]: eval(s)
Out[2]: {'a': 1, 'b': 2}
```
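`eval` executes arbitrary code, so it is risky on untrusted input. A hedged alternative for Method 4 is `ast.literal_eval` from the standard library, which accepts only Python literals:

```python
import ast

s = "{'a':1, 'b':2}"
d = ast.literal_eval(s)   # parses literals only, never runs code
print(d)                  # {'a': 1, 'b': 2}

try:
    ast.literal_eval("__import__('os').getcwd()")
except ValueError as e:
    print('rejected:', type(e).__name__)
```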
[Previous example](28.md) [Next example](30.md)
================================================
FILE: md/3.md
================================================
```markdown
@author jackzhenguo
@desc Converting between integers and ASCII characters
@date 2019/2/10
```
#### 3 Converting between integers and ASCII characters
The `ASCII character` for a decimal integer:
```python
In [1]: chr(65)
Out[1]: 'A'
```
The decimal code of an `ASCII character`:
```python
In [1]: ord('A')
Out[1]: 65
```
[Previous example](2.md) [Next example](4.md)
================================================
FILE: md/30.md
================================================
```markdown
@author jackzhenguo
@desc
@date 2019/3/4
```
#### 30 Frozen sets
Create an immutable set:
```python
In [1]: frozenset([1,1,3,2,3])
Out[1]: frozenset({1, 2, 3})
```
Because it is immutable, it has no `add` or `pop` methods as `set` does.
[Previous example](29.md) [Next example](31.md)
================================================
FILE: md/31.md
================================================
```markdown
@author jackzhenguo
@desc
@date 2019/3/4
```
#### 31 Converting to a set
`set()` returns a set object, which cannot contain duplicate elements:
```python
In [1]: a = [1,4,2,3,1]
In [2]: set(a)
Out[2]: {1, 2, 3, 4}
In [3]: b = set(a)
In [4]: b.add(5)
In [5]: b
Out[5]: {1, 2, 3, 4, 5}
In [6]: b.pop()
Out[6]: 1
In [7]: b
Out[7]: {2, 3, 4, 5}
In [8]: b.pop()
Out[8]: 2
In [9]: b
Out[9]: {3, 4, 5}
# Note: pop removes and returns an arbitrary element of the set
In [10]: help(b.pop)
Help on built-in function pop:
pop(...) method of builtins.set instance
Remove and return an arbitrary set element.
Raises KeyError if the set is empty.
```
[Previous example](30.md) [Next example](32.md)
================================================
FILE: md/32.md
================================================
```markdown
@author jackzhenguo
@desc
@date 2019/3/4
```
#### 32 Converting to a tuple
`tuple()` converts an iterable into an immutable sequence:
```python
In [16]: i_am_list = [1,3,5]
In [17]: i_am_tuple = tuple(i_am_list)
In [18]: i_am_tuple
Out[18]: (1, 3, 5)
```
[Previous example](31.md) [Next example](33.md)
================================================
FILE: md/33.md
================================================
```markdown
@author jackzhenguo
@desc
@date 2019/3/6
```
#### 33 Checking whether an object is callable
Check whether an object can be called:
```python
In [1]: callable(str)
Out[1]: True
In [2]: callable(int)
Out[2]: True
```
```python
In [18]: class Student():
    ...:     def __init__(self,id,name):
    ...:         self.id = id
    ...:         self.name = name
    ...:     def __repr__(self):
    ...:         return 'id = '+self.id +', name = '+self.name
    ...:
In [19]: xiaoming = Student('001','xiaoming')
In [20]: callable(xiaoming)
Out[20]: False
```
To make `xiaoming()` work, define the `__call__` method on the `Student` class:
```python
In [1]: class Student():
   ...:     def __init__(self,id,name):
   ...:         self.id = id
   ...:         self.name = name
   ...:     def __repr__(self):
   ...:         return 'id = '+self.id +', name = '+self.name
   ...:     def __call__(self):
   ...:         print('I can be called')
   ...:         print(f'my name is {self.name}')
   ...:
In [2]: t = Student('001','xiaoming')
In [3]: t()
I can be called
my name is xiaoming
```
[Previous example](32.md) [Next example](34.md)
================================================
FILE: md/34.md
================================================
```markdown
@author jackzhenguo
@desc
@date 2019/3/10
```
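#### 34 Displaying an object with ascii
`ascii()` calls the object's `__repr__` method and returns the result with any non-ASCII characters escaped; in the example below the result is a string:
```python
>>> class Student():
...     def __init__(self,id,name):
...         self.id = id
...         self.name = name
...     def __repr__(self):
...         return 'id = '+self.id +', name = '+self.name
```
Usage: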
```python
>>> xiaoming = Student(id='1',name='xiaoming')
>>> xiaoming
id = 1, name = xiaoming
>>> ascii(xiaoming)
'id = 1, name = xiaoming'
```
[Previous example](33.md) [Next example](35.md)
================================================
FILE: md/35.md
================================================
```markdown
@author jackzhenguo
@desc
@date 2019/3/10
```
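#### 35 Class methods
A function decorated with `classmethod` can be called without creating an instance and takes no `self` parameter.
Its first parameter must instead be `cls`, which refers to the class itself and can be used to access class attributes, call class methods, create instances, and so on.
```python
In [1]: class Student():
   ...:     def __init__(self,id,name):
   ...:         self.id = id
   ...:         self.name = name
   ...:     def __repr__(self):
   ...:         return 'id = '+self.id +', name = '+self.name
   ...:     @classmethod
   ...:     def f(cls):
   ...:         print(cls)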
```
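One common use of `cls` is an alternative constructor. A minimal sketch, in which the method `from_string` and its `'id,name'` input format are assumptions for illustration:

```python
class Student:
    def __init__(self, id, name):
        self.id = id
        self.name = name

    def __repr__(self):
        return 'id = ' + self.id + ', name = ' + self.name

    @classmethod
    def from_string(cls, text):
        # cls is the class itself, so cls(...) creates a new instance
        id, name = text.split(',')
        return cls(id, name)

s = Student.from_string('001,xiaoming')   # called on the class, no instance needed
print(s)   # id = 001, name = xiaoming
```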
[Previous example](34.md) [Next example](36.md)
================================================
FILE: md/36.md
================================================
```markdown
@author jackzhenguo
@desc
@date 2019/3/10
```
#### 36 Deleting attributes dynamically
```python
>>> class Student():
...     def __init__(self,id,name):
...         self.id = id
...         self.name = name
...     def __repr__(self):
...         return 'id = '+self.id +', name = '+self.name
```
Usage:
```python
>>> xiaoming = Student(id='1',name='xiaoming')
>>> xiaoming
id = 1, name = xiaoming
>>> ascii(xiaoming)
'id = 1, name = xiaoming'
```
Delete an attribute of the object:
```python
In [1]: delattr(xiaoming,'id')
In [2]: hasattr(xiaoming,'id')
Out[2]: False
```
[Previous example](35.md) [Next example](37.md)
================================================
FILE: md/37.md
================================================
```markdown
@author jackzhenguo
@desc
@date 2019/3/10
```
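#### 37 Listing all attributes and methods of an object
Called without an argument, `dir()` returns the names of the variables, functions, and types defined in the `current scope`; called with an argument, it returns the list of that `argument`'s attributes and methods.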
```python
In [96]: dir(xiaoming)
Out[96]:
['__class__',
'__delattr__',
'__dict__',
'__dir__',
'__doc__',
'__eq__',
'__format__',
'__ge__',
'__getattribute__',
'__gt__',
'__hash__',
'__init__',
'__init_subclass__',
'__le__',
'__lt__',
'__module__',
'__ne__',
'__new__',
'__reduce__',
'__reduce_ex__',
'__repr__',
'__setattr__',
'__sizeof__',
'__str__',
'__subclasshook__',
'__weakref__',
'name']
```
[Previous example](36.md) [Next example](38.md)
================================================
FILE: md/38.md
================================================
```markdown
@author jackzhenguo
@desc
@date 2019/3/10
```
#### 38 Getting an attribute dynamically
Get an attribute of an object:
```python
In [1]: class Student():
   ...:     def __init__(self,id,name):
   ...:         self.id = id
   ...:         self.name = name
   ...:     def __repr__(self):
   ...:         return 'id = '+self.id +', name = '+self.name
In [2]: xiaoming = Student(id='001',name='xiaoming')
In [3]: getattr(xiaoming,'name') # get the value of the name attribute of the xiaoming instance
Out[3]: 'xiaoming'
```
[Previous example](37.md) [Next example](39.md)
================================================
FILE: md/39.md
================================================
```markdown
@author jackzhenguo
@desc
@date 2019/3/10
```
#### 39 Checking whether an object has an attribute
```python
In [1]: class Student():
   ...:     def __init__(self,id,name):
   ...:         self.id = id
   ...:         self.name = name
   ...:     def __repr__(self):
   ...:         return 'id = '+self.id +', name = '+self.name
In [2]: xiaoming = Student(id='001',name='xiaoming')
In [3]: hasattr(xiaoming,'name')
Out[3]: True
In [4]: hasattr(xiaoming,'address')
Out[4]: False
```
[Previous example](38.md) [Next example](40.md)
================================================
FILE: md/4.md
================================================
```markdown
@author jackzhenguo
@desc Checking that all elements are truthy
@date 2019/2/10
```
#### 4 Checking that all elements are truthy
`all()` returns `True` if every element is truthy, otherwise `False`:
```python
In [5]: all([1,0,3,6])
Out[5]: False
```
```python
In [6]: all([1,2,3])
Out[6]: True
```
[Previous example](3.md) [Next example](5.md)
================================================
FILE: md/40.md
================================================
```markdown
@author jackzhenguo
@desc
@date 2019/3/10
```
#### 40 An object's identity
```python
In [1]: class Student():
   ...:     def __init__(self,id,name):
   ...:         self.id = id
   ...:         self.name = name
   ...:     def __repr__(self):
   ...:         return 'id = '+self.id +', name = '+self.name
In [2]: xiaoming = Student(id='001',name='xiaoming')
```
`id()` returns the object's identity, which in CPython is its memory address:
```python
In [1]: id(xiaoming)
Out[1]: 98234208
```
[上一个例子](39.md) [下一个例子](41.md)
================================================
FILE: md/41.md
================================================
```markdown
@author jackzhenguo
@desc
@date 2019/3/13
```
#### 41 isinstance
Checks whether *object* is an instance of the class *classinfo*; returns `True` if it is:
```python
In [1]: class Student():
   ...:     def __init__(self,id,name):
   ...:         self.id = id
   ...:         self.name = name
   ...:     def __repr__(self):
   ...:         return 'id = '+self.id +', name = '+self.name
In [2]: xiaoming = Student(id='001',name='xiaoming')
In [3]: isinstance(xiaoming,Student)
Out[3]: True
```
[上一个例子](40.md) [下一个例子](42.md)
================================================
FILE: md/42.md
================================================
```markdown
@author jackzhenguo
@desc
@date 2019/3/13
```
#### 42 Checking class inheritance with issubclass
```python
In [1]: class undergraduate(Student):
   ...:     def studyClass(self):
   ...:         pass
   ...:     def attendActivity(self):
   ...:         pass
In [2]: issubclass(undergraduate,Student)
Out[2]: True
In [3]: issubclass(object,Student)
Out[3]: False
In [4]: issubclass(Student,object)
Out[4]: True
```
If *class* is a subclass of any element of a *classinfo* tuple, `issubclass` also returns `True`:
```python
In [1]: issubclass(int,(int,float))
Out[1]: True
```
[上一个例子](41.md) [下一个例子](43.md)
================================================
FILE: md/43.md
================================================
```markdown
@author jackzhenguo
@desc
@date 2019/3/15
```
#### 43 The root of all objects
`object` is the base class of all classes:
```python
In [1]: o = object()
In [2]: type(o)
Out[2]: object
```
[上一个例子](42.md) [下一个例子](44.md)
================================================
FILE: md/44.md
================================================
```markdown
@author jackzhenguo
@desc
@date 2019/3/16
```
#### 44 Two ways to create a property
`property()` returns a property attribute. Typical usage:
```python
class C:
    def __init__(self):
        self._x = None
    def getx(self):
        return self._x
    def setx(self, value):
        self._x = value
    def delx(self):
        del self._x
    # create the property with the property class
    x = property(getx, setx, delx, "I'm the 'x' property.")
```
Using decorators, the following code achieves the same effect:
```python
class C:
    def __init__(self):
        self._x = None
    @property
    def x(self):
        return self._x
    @x.setter
    def x(self, value):
        self._x = value
    @x.deleter
    def x(self):
        del self._x
```
[上一个例子](43.md) [下一个例子](45.md)
================================================
FILE: md/45.md
================================================
```markdown
@author jackzhenguo
@desc
@date 2019/3/10
```
#### 45 Checking an object's type
*class* `type`(*name*, *bases*, *dict*)
With a single argument, `type` returns the type of *object*:
```python
In [1]: class Student():
   ...:     def __init__(self,id,name):
   ...:         self.id = id
   ...:         self.name = name
   ...:     def __repr__(self):
   ...:         return 'id = '+self.id +', name = '+self.name
   ...:
In [2]: xiaoming = Student(id='001',name='xiaoming')
In [3]: type(xiaoming)
Out[3]: __main__.Student
In [4]: type(tuple())
Out[4]: tuple
```
[上一个例子](44.md) [下一个例子](46.md)
================================================
FILE: md/46.md
================================================
```markdown
@author jackzhenguo
@desc
@date 2019/3/4
```
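#### 46 An introduction to metaclasses
`xiaoming`, `xiaohong`, and `xiaozhang` are all students; this group is called `Student`.
The usual way to define a class in Python is with the `class` keyword:
```python
In [36]: class Student(object):
    ...:     pass
```
`xiaoming`, `xiaohong`, and `xiaozhang` are instances of the class: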
```python
xiaoming = Student()
xiaohong = Student()
xiaozhang = Student()
```
After creation, the `__class__` attribute of `xiaoming` returns the `Student` class:
```python
In [38]: xiaoming.__class__
Out[38]: __main__.Student
```
The question is: does the `Student` class itself have a `__class__` attribute, and if so, what does it return?
```python
In [39]: xiaoming.__class__.__class__
Out[39]: type
```
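The program runs without error and returns `type`.
So we might guess that the type of the `Student` class is `type`.
In other words, the `Student` class is itself an **object**, and its type is `type`.
So in Python everything is an object, and **classes are objects too**.
In Python, the class that describes the `Student` class is called a metaclass.
Following this logic, the class that describes a metaclass would be a *meta-metaclass*. Just kidding: the class that describes a metaclass is also called a metaclass.
A sharp reader might ask: since the `Student` class can create instances, can the `type` class create instances too? If so, the instances it creates would be classes. Exactly right!
`type` can indeed create instances, such as the `Student` class: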
```python
In [40]: Student = type('Student',(),{})
In [41]: Student
Out[41]: __main__.Student
```
It is identical to the `Student` class created with the `class` keyword.
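The third argument to `type` is the class namespace, so attributes and methods can be supplied there as well. A minimal sketch (the attribute `school` and the method `greet` are made up for illustration):

```python
def greet(self):
    return f'hello from {type(self).__name__}'

# type(name, bases, namespace) builds the class at runtime
Student = type('Student', (object,), {'school': 'unknown', 'greet': greet})

s = Student()
print(s.school)        # unknown
print(s.greet())       # hello from Student
print(type(Student))   # <class 'type'>
```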
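Because Python classes are also objects, they can be handled much like the `xiaoming` and `xiaohong` objects. They support:
- assignment
- copying
- adding attributes
- being passed as function arguments
```python
In [43]: StudentMirror = Student # assign the class directly
In [44]: Student.class_property = 'class_property' # add a class attribute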
In [46]: hasattr(Student, 'class_property')
Out[46]: True
```
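Metaclasses really are not used very often, and knowing this much is probably enough for many situations. Even `Tim Peters`, a leading figure in the Python community, has said:
“Metaclasses are deeper magic than 99% of users should ever worry about.”
[Previous example](45.md) [Next example](47.md)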
================================================
FILE: md/47.md
================================================
```markdown
@author jackzhenguo
@desc
@date 2019/3/4
```
#### 47 Enumerate objects
`enumerate()` returns an enumerate object; each call to `next()` on it returns a tuple of (index, value).
```python
In [1]: s = ["a","b","c"]
   ...: for i ,v in enumerate(s,1):
   ...:     print(i,v)
   ...:
1 a
2 b
3 c
```
[上一个例子](46.md) [下一个例子](48.md)
================================================
FILE: md/48.md
================================================
```markdown
@author jackzhenguo
@desc
@date 2019/3/4
```
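#### 48 Checking how many bytes a variable occupies
`sys.getsizeof` returns the size of an object in bytes.
In this example the dict takes more space than the list (exact sizes depend on the Python version and platform):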
```python
In [1]: import sys
In [3]: a = [('a',1),('b',2)]
In [5]: sys.getsizeof(a)
Out[5]: 88
In [6]: a = {'a':1,'b':2.0}
In [7]: sys.getsizeof(a)
Out[7]: 248
```
[上一个例子](47.md) [下一个例子](49.md)
================================================
FILE: md/49.md
================================================
```markdown
@author jackzhenguo
@desc
@date 2019/3/4
```
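#### 49 Filter
Define the filter condition in a function; `filter` iterates over the elements and keeps those for which the function returns `True`: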
```python
In [1]: fil = filter(lambda x: x>10,[1,11,2,45,7,6,13])
In [2]: list(fil)
Out[2]: [11, 45, 13]
```
[上一个例子](48.md) [下一个例子](50.md)
================================================
FILE: md/5.md
================================================
```markdown
@author jackzhenguo
@desc 至少一个为真检查
@date 2019/2/10
```
#### 5 Checking that at least one element is truthy
`any()` returns `True` if at least one element is truthy, otherwise `False`:
```python
In [7]: any([0,0,0,[]])
Out[7]: False
```
```python
In [8]: any([0,0,1])
Out[8]: True
```
[上一个例子](4.md) [下一个例子](6.md)
================================================
FILE: md/50.md
================================================
```markdown
@author jackzhenguo
@desc
@date 2019/3/4
```
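#### 50 Getting an object's hash value
`hash()` returns an object's hash value. Note that instances of user-defined classes are hashable by default, while mutable built-ins such as `list`, `dict`, and `set` are unhashable: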
```python
In [1]: hash(xiaoming)
Out[1]: 6139638
In [2]: hash([1,2,3])
# TypeError: unhashable type: 'list'
```
[上一个例子](49.md) [下一个例子](51.md)
================================================
FILE: md/51.md
================================================
```markdown
@author jackzhenguo
@desc
@date 2019/3/4
```
#### 51 One-step help with help
`help()` shows the documentation for an object:
```python
In [1]: help(xiaoming)
Help on Student in module __main__ object:
class Student(builtins.object)
| Methods defined here:
|
| __init__(self, id, name)
|
| __repr__(self)
|
| Data descriptors defined here:
|
| __dict__
| dictionary for instance variables (if defined)
|
| __weakref__
| list of weak references to the object (if defined)
```
[上一个例子](50.md) [下一个例子](52.md)
================================================
FILE: md/52.md
================================================
```markdown
@author jackzhenguo
@desc
@date 2019/3/4
```
### 52 Getting user input
`input()` reads a line typed by the user:
```python
In [1]: input()
aa
Out[1]: 'aa'
```
[上一个例子](51.md) [下一个例子](53.md)
================================================
FILE: md/53.md
================================================
```markdown
@author jackzhenguo
@desc
@date 2019/3/8
```
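#### 53 Creating an iterator
`iter(obj, sentinel)` returns an iterator. `sentinel` is optional; when it is given, `obj` must be callable, and iteration stops as soon as a call returns the sentinel value.
```python
In [1]: lst = [1,3,5]
In [2]: for i in iter(lst):
   ...:     print(i)
   ...: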
1
3
5
```
```python
In [1]: class TestIter(object):
   ...:     def __init__(self):
   ...:         self.l=[1,3,2,3,4,5]
   ...:         self.i=iter(self.l)
   ...:     def __call__(self):  # instances of a class that defines __call__ are callable
   ...:         item = next(self.i)
   ...:         print ("__call__ is called,which would return",item)
   ...:         return item
   ...:     def __iter__(self):  # supports the iteration protocol (defines __iter__())
   ...:         print ("__iter__ is called!!")
   ...:         return iter(self.l)
In [2]: t = TestIter()
In [3]: t() # t can be called because __call__ is implemented
__call__ is called,which would return 1
Out[3]: 1
In [4]: for e in TestIter(): # TestIter instances can be iterated because __iter__ is implemented
   ...:     print(e)
   ...:
__iter__ is called!!
1
3
2
3
4
5
```
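The two-argument form `iter(callable, sentinel)` is not exercised by the examples above. A minimal sketch, assuming an in-memory `io.StringIO` stream as a stand-in for a real file, reads fixed-size chunks until `read` returns the empty string:

```python
import io
from functools import partial

# StringIO stands in for a real file object in this sketch
stream = io.StringIO("abcdefgh")

# stream.read(3) is called repeatedly; iteration stops when it returns ''
chunks = list(iter(partial(stream.read, 3), ''))
print(chunks)  # ['abc', 'def', 'gh']
```

The sentinel itself is never yielded, so the empty string does not appear in `chunks`.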
[上一个例子](52.md) [下一个例子](54.md)
================================================
FILE: md/54.md
================================================
```markdown
@author jackzhenguo
@desc
@date 2019/3/20
```
#### 54 File reading and writing, and the table of mode values
`open` returns a file object
```python
In [1]: fo = open('D:/a.txt',mode='r', encoding='utf-8')
In [2]: fo.read()
Out[2]: '\ufefflife is not so long,\nI use Python to play.'
In [3]: fo.close() # close the file object
```
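Table of mode values:
| Character | Meaning |
| :---- | :------------------------------- |
| `'r'` | open for reading (default) |
| `'w'` | open for writing, truncating the file first |
| `'x'` | exclusive creation; fails if the file already exists |
| `'a'` | open for writing, appending to the end of the file if it exists |
| `'b'` | binary mode |
| `'t'` | text mode (default) |
| `'+'` | open for updating (reading and writing) |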
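Reading files
```python
import os

# Create a directory if it does not already exist
def mkdir(path):
    isexists = os.path.exists(path)
    if not isexists:
        os.mkdir(path)

# Read the contents of a file
def openfile(filename):
    f = open(filename)
    fllist = f.read()
    f.close()
    return fllist  # return the contents that were read
```
Writing files
```python
# Write to a file
# example1
# 'w': write; an existing file is truncated first, a missing file is created
f = open(r"./data/test.txt", "w", encoding="utf-8")
print(f.write("testing file write"))
f.close()
# example2
# 'a': append; writes go to the end of an existing file, a missing file is created
f = open(r"./data/test.txt", "a", encoding="utf-8")
print(f.write("testing file write"))
f.close()
# example3
# with closes the file automatically, even if an exception is raised inside the block
with open(r"./data/test.txt", "w") as f:
    f.write("hello world!")
```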
[上一个例子](53.md) [下一个例子](55.md)
================================================
FILE: md/55.md
================================================
```markdown
@author jackzhenguo
@desc
@date 2019/3/10
```
#### 55 Create a range sequence
1) range(stop)
2) range(start, stop[,step])
Generates an immutable sequence:
```python
In [1]: range(11)
Out[1]: range(0, 11)
In [2]: range(0,11,1)
Out[2]: range(0, 11)
```
[上一个例子](54.md) [下一个例子](56.md)
================================================
FILE: md/56.md
================================================
```markdown
@author jackzhenguo
@desc
@date 2019/3/20
```
#### 56 Reverse iterator: reversed
```python
In [1]: rev = reversed([1,4,2,3,1])
In [2]: for i in rev:
   ...:     print(i)
   ...:
1
3
2
4
1
```
[上一个例子](55.md) [下一个例子](57.md)
================================================
FILE: md/57.md
================================================
```markdown
@author jackzhenguo
@desc
@date 2019/3/20
```
#### 57 zip iterator
Creates an iterator that aggregates elements from each of the iterables:
```python
In [1]: x = [3,2,1]
In [2]: y = [4,5,6]
In [3]: list(zip(y,x))
Out[3]: [(4, 3), (5, 2), (6, 1)]
In [4]: a = range(5)
In [5]: b = list('abcde')
In [6]: b
Out[6]: ['a', 'b', 'c', 'd', 'e']
In [7]: [str(y) + str(x) for x,y in zip(a,b)]
Out[7]: ['a0', 'b1', 'c2', 'd3', 'e4']
```
[上一个例子](56.md) [下一个例子](58.md)
================================================
FILE: md/58.md
================================================
```markdown
@author jackzhenguo
@desc
@date 2019/3/23
```
#### 58 Example of using operator
```python
from operator import (add, sub)

def add_or_sub(a, b, oper):
    return (add if oper == '+' else sub)(a, b)

add_or_sub(1, 2, '-') # -1
```
[上一个例子](57.md) [下一个例子](59.md)
================================================
FILE: md/59.md
================================================
```markdown
@author jackzhenguo
@desc
@date 2019/3/25
```
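#### 59 Transfer JSON objects
Object serialization means converting an in-memory object into a form that can be stored or transmitted. In many scenarios, passing a class instance around directly is inconvenient.
Once the object is serialized, things become easier, because by convention calls between interfaces and web requests generally exchange JSON strings.
In practice, it is usually class instances that get serialized. First define a Student class and create two instances.
```python
class Student():
    def __init__(self,**args):
        self.ids = args['ids']
        self.name = args['name']
        self.address = args['address']

xiaoming = Student(ids = 1,name = 'xiaoming',address = '北京')
xiaohong = Student(ids = 2,name = 'xiaohong',address = '南京')
```
Import the json module and call dump to serialize the list [xiaoming,xiaohong] to the file json.txt.
```python
import json
# encoding='utf-8' so the non-ASCII text written by ensure_ascii=False is stored portably
with open('json.txt', 'w', encoding='utf-8') as f:
    json.dump([xiaoming,xiaohong], f, default=lambda obj: obj.__dict__, ensure_ascii=False, indent=2, sort_keys=True)
```
The contents of the generated file are as follows: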
```json
[
{
"address":"北京",
"ids":1,
"name":"xiaoming"
},
{
"address":"南京",
"ids":2,
"name":"xiaohong"
}
]
```
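Going the other way, the JSON text can be turned back into `Student` instances. The following is a minimal sketch that works on an in-memory string rather than on `json.txt`; using `object_hook` for the reconstruction is one possible approach, not something the original example prescribes.

```python
import json


class Student:
    def __init__(self, **args):
        self.ids = args['ids']
        self.name = args['name']
        self.address = args['address']


# Serialize to a string instead of a file to keep the sketch self-contained
text = json.dumps(
    [Student(ids=1, name='xiaoming', address='北京')],
    default=lambda obj: obj.__dict__,
    ensure_ascii=False,
)

# object_hook is called with every decoded JSON object (a dict)
students = json.loads(text, object_hook=lambda d: Student(**d))
print(students[0].name)     # xiaoming
print(students[0].address)  # 北京
```

This works here because every JSON object in the payload is a flat `Student` record; nested objects of other kinds would need a more selective hook.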
[上一个例子](58.md) [下一个例子](60.md)
================================================
FILE: md/6.md
================================================
```markdown
@author jackzhenguo
@desc Test truthiness
@date 2019/2/12
```
#### 6 Test truthiness
Tests whether an object is truthy (`True`) or falsy (`False`).
```python
In [9]: bool([0,0,0])
Out[9]: True
In [10]: bool([])
Out[10]: False
In [11]: bool([1,0,1])
Out[11]: True
```
[上一个例子](5.md) [下一个例子](7.md)
================================================
FILE: md/60.md
================================================
#### 60 A calculator without if and else
```python
from operator import add, sub, mul, truediv, pow

def calculator(a, b, k):
    return {
        '+': add,
        '-': sub,
        '*': mul,
        '/': truediv,
        '**': pow
    }[k](a, b)

calculator(1, 2, '+') # 3
calculator(3, 4, '**') # 81
```
[上一个例子](59.md) [下一个例子](61.md)
================================================
FILE: md/61.md
================================================
```markdown
@author jackzhenguo
@desc
@date 2019/3/25
```
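#### 61 Mean after dropping the highest and lowest scores
```python
def score_mean(lst):
    ordered = sorted(lst)  # sort a copy so the caller's list is left untouched
    lst2 = ordered[1:(len(ordered)-1)]  # drop the lowest and the highest score
    return round((sum(lst2)/len(lst2)),1)

lst=[9.1, 9.0,8.1, 9.7, 19,8.2, 8.6,9.8]
score_mean(lst) # 9.1
```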
[上一个例子](60.md) [下一个例子](62.md)
================================================
FILE: md/62.md
================================================
```markdown
@author jackzhenguo
@desc
@date 2019/4/2
```
#### 62 Print the 9×9 multiplication table
Print a multiplication table in the following format
```python
1*1=1
1*2=2 2*2=4
1*3=3 2*3=6 3*3=9
1*4=4 2*4=8 3*4=12 4*4=16
1*5=5 2*5=10 3*5=15 4*5=20 5*5=25
1*6=6 2*6=12 3*6=18 4*6=24 5*6=30 6*6=36
1*7=7 2*7=14 3*7=21 4*7=28 5*7=35 6*7=42 7*7=49
1*8=8 2*8=16 3*8=24 4*8=32 5*8=40 6*8=48 7*8=56 8*8=64
1*9=9 2*9=18 3*9=27 4*9=36 5*9=45 6*9=54 7*9=63 8*9=72 9*9=81
```
There are 9 rows in total, and the entry in column `j` of row `i` is `j*i`,
where
`i` ranges over `1<=i<=9`
`j` ranges over `1<=j<=i`
Turning this description into code gives:
```python
for i in range(1, 10):
    for j in range(1, i+1):
        print('%d*%d=%d' % (j, i, j * i), end="\t")
    print()
```
[上一个例子](61.md) [下一个例子](63.md)
================================================
FILE: md/63.md
================================================
```markdown
@author jackzhenguo
@desc
@date 2019/4/2
```
#### 63 Recursive flatten function
Given the following array:
```
[[[1,2,3],[4,5]]]
```
how can it be fully flattened to one dimension? This small example implements a recursive `flatten`; its two parameters are the array to flatten and the output array.
```python
from collections.abc import Iterable

def flatten(lst, out_lst=None):
    if out_lst is None:
        out_lst = []
    for i in lst:
        if isinstance(i, Iterable):  # check whether i is iterable
            flatten(i, out_lst)  # recurse into the nested iterable
        else:
            out_lst.append(i)  # collect the result
    return out_lst
```
Calling `flatten`:
```python
print(flatten([[1,2,3],[4,5]]))
print(flatten([[1,2,3],[4,5]], [6,7]))
print(flatten([[[1,2,3],[4,5,6]]]))
# Result:
# [1, 2, 3, 4, 5]
# [6, 7, 1, 2, 3, 4, 5]
# [1, 2, 3, 4, 5, 6]
```
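numpy's `flatten` differs subtly from the function above: for a ragged nested list the array stores the inner lists as objects, and `flatten` does not expand them (recent NumPy versions require `dtype=object` to build such an array):
```python
import numpy
b = numpy.array([[1,2,3],[4,5]], dtype=object)
b.flatten()
# array([list([1, 2, 3]), list([4, 5])], dtype=object)
```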
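One caveat of checking `Iterable` alone is that strings are iterable too, and each character of a string is itself a string, so a recursive flatten never bottoms out on them. A possible variant, sketched here under the assumption that strings and bytes should be kept whole (`flatten_safe` is a name chosen for this illustration), is a generator:

```python
from collections.abc import Iterable


def flatten_safe(items):
    """Yield the leaves of a nested iterable, treating str/bytes as leaves."""
    for item in items:
        if isinstance(item, Iterable) and not isinstance(item, (str, bytes)):
            yield from flatten_safe(item)  # descend into nested containers
        else:
            yield item


print(list(flatten_safe([[1, 'ab'], [[2, 3]], 'cd'])))
# [1, 'ab', 2, 3, 'cd']
```

Being a generator, it produces elements lazily and needs no output-list parameter.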
[上一个例子](62.md) [下一个例子](64.md)
================================================
FILE: md/64.md
================================================
```markdown
@author jackzhenguo
@desc
@date 2019/4/2
```
#### 64 Split a list into equal chunks
```python
from math import ceil

def divide(lst, size):
    if size <= 0:
        return [lst]
    return [lst[i * size:(i+1)*size] for i in range(0, ceil(len(lst) / size))]
```
Test example:
```python
r = divide([1, 3, 5, 7, 9], 2)
print(r) # [[1, 3], [5, 7], [9]]
r = divide([1, 3, 5, 7, 9], 0)
print(r) # [[1, 3, 5, 7, 9]]
r = divide([1, 3, 5, 7, 9], -3)
print(r) # [[1, 3, 5, 7, 9]]
```
[上一个例子](63.md) [下一个例子](65.md)
================================================
FILE: md/65.md
================================================
```markdown
@author jackzhenguo
@desc
@date 2019/4/6
```
#### 65 Compact a list (drop falsy values)
```python
def filter_false(lst):
    return list(filter(bool, lst))

r = filter_false([None, 0, False, '', [], 'ok', [1, 2]])
print(r) # ['ok', [1, 2]]
```
[上一个例子](64.md) [下一个例子](66.md)
================================================
FILE: md/66.md
================================================
```markdown
@author jackzhenguo
@desc
@date 2019/4/6
```
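#### 66 Find the longest list
```python
def max_length(*lst):
    # pass the tuple itself so that a single argument is handled correctly
    return max(lst, key=len)

r = max_length([1, 2, 3], [4, 5, 6, 7], [8])
print(f'The longest list is {r}') # The longest list is [4, 5, 6, 7]
r = max_length([1, 2, 3], [4, 5, 6, 7], [8, 9])
print(f'The longest list is {r}') # The longest list is [4, 5, 6, 7]
```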
[上一个例子](65.md) [下一个例子](67.md)
================================================
FILE: md/67.md
================================================
```markdown
@author jackzhenguo
@desc
@date 2019/4/6
```
#### 67 Find the mode
```python
def top1(lst):
    return max(lst, default='the list is empty', key=lambda v: lst.count(v))
```
Test example:
```python
lst = [1, 3, 3, 2, 1, 1, 2]
r = top1(lst)
print(f'The most frequent element in {lst} is: {r}')
# The most frequent element in [1, 3, 3, 2, 1, 1, 2] is: 1
```
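`top1` calls `lst.count` once per element, which is quadratic in the length of the list. For longer lists, one alternative is `collections.Counter`, which counts in a single pass. A minimal sketch (the name `top1_counter` and its `default` parameter are chosen for this illustration):

```python
from collections import Counter


def top1_counter(lst, default=None):
    """Return the most frequent element of lst, or default if lst is empty."""
    if not lst:
        return default
    # most_common(1) returns a list with one (element, count) pair
    return Counter(lst).most_common(1)[0][0]


print(top1_counter([1, 3, 3, 2, 1, 1, 2]))  # 1
print(top1_counter([]))                     # None
```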
[上一个例子](66.md) [下一个例子](68.md)
================================================
FILE: md/68.md
================================================
```markdown
@author jackzhenguo
@desc
@date 2019/4/10
```
#### 68 Maximum value across several lists
```python
def max_lists(*lst):
    return max(max(*lst, key=lambda v: max(v)))

r = max_lists([1, 2, 3], [6, 7, 8], [4, 5])
print(r) # 8
```
[上一个例子](67.md) [下一个例子](69.md)
================================================
FILE: md/69.md
================================================
```markdown
@author jackzhenguo
@desc
@date 2019/4/10
```
#### 69 Check a list for duplicates
```python
def has_duplicates(lst):
    # a set drops repeated elements, so a shorter set means duplicates exist
    return len(lst) != len(set(lst))

x = [1, 1, 2, 2, 3, 2, 3, 4, 5, 6]
y = [1, 2, 3, 4, 5]
has_duplicates(x) # True
has_duplicates(y) # False
```
[上一个例子](68.md) [下一个例子](70.md)
================================================
FILE: md/7.md
================================================
```markdown
@author jackzhenguo
@desc Create a complex number
@date 2019/2/13
```
#### 7 Create a complex number
Creates a complex number
```python
In [1]: complex(1,2)
Out[1]: (1+2j)
```
[上一个例子](6.md) [下一个例子](8.md)
================================================
FILE: md/70.md
================================================
```markdown
@author jackzhenguo
@desc
@date 2019/4/10
```
#### 70 Reverse a list in one line
```python
def reverse(lst):
    return lst[::-1]

r = reverse([1, -2, 3, 4, 1, 2])
print(r) # [2, 1, 4, 3, -2, 1]
```
[上一个例子](69.md) [下一个例子](71.md)
================================================
FILE: md/71.md
================================================
```markdown
@author jackzhenguo
@desc
@date 2019/4/14
```
#### 71 Arithmetic sequence of floats
```python
def float_range(start, stop, n):
    start,stop,n = float('%.2f' % start), float('%.2f' % stop),int('%.d' % n)
    step = (stop-start)/n
    lst = [start]
    while n > 0:
        start,n = start+step,n-1
        lst.append(round((start), 2))
    return lst
```
Test example:
```python
float_range(1, 8, 10)
# [1.0, 1.7, 2.4, 3.1, 3.8, 4.5, 5.2, 5.9, 6.6, 7.3, 8.0]
```
[上一个例子](70.md) [下一个例子](72.md)
================================================
FILE: md/72.md
================================================
```markdown
@author jackzhenguo
@desc
@date 2019/4/15
```
#### 72 Partition by a condition
```python
def bif_by(lst, f):
    return [ [x for x in lst if f(x)],[x for x in lst if not f(x)]]
```
Test example:
```python
records = [25,89,31,34]
bif_by(records, lambda x: x<80) # [[25, 31, 34], [89]]
```
[上一个例子](71.md) [下一个例子](73.md)
================================================
FILE: md/73.md
================================================
```markdown
@author jackzhenguo
@desc
@date 2019/4/15
```
#### 73 Vector arithmetic with map
A function applied across multiple sequences
`map(function, iterable1, iterable2)`
```python
lst1=[1,2,3,4,5,6]
lst2=[3,4,5,6,3,2]
list(map(lambda x,y:x*y+1,lst1,lst2))
### [4, 9, 16, 25, 16, 13]
```
[上一个例子](72.md) [下一个例子](74.md)
================================================
FILE: md/74.md
================================================
```markdown
@author jackzhenguo
@desc
@date 2019/4/15
```
#### 74 Dictionary entries with the largest value
```python
def max_pairs(dic):
    if len(dic) == 0:
        return dic
    max_val = max(map(lambda v: v[1], dic.items()))
    return [item for item in dic.items() if item[1] == max_val]
```
Test example:
```python
r = max_pairs({'a': -10, 'b': 5, 'c': 3, 'd': 5})
print(r) # [('b', 5), ('d', 5)]
```
[上一个例子](73.md) [下一个例子](75.md)
================================================
FILE: md/75.md
================================================
```markdown
@author jackzhenguo
@desc
@date 2019/4/15
```
#### 75 Merge two dictionaries
A one-line dictionary merge, supported since Python 3.5
```python
def merge_dict(dic1, dic2):
    return {**dic1, **dic2}
```
Test:
```python
merge_dict({'a': 1, 'b': 2}, {'c': 3})
# {'a': 1, 'b': 2, 'c': 3}
```
[上一个例子](74.md) [下一个例子](76.md)
================================================
FILE: md/76.md
================================================
```markdown
@author jackzhenguo
@desc
@date 2019/4/19
```
#### 76 Top-n keys of a dictionary
Returns the keys that correspond to the n largest values in dictionary d
```python
from heapq import nlargest

def topn_dict(d, n):
    return nlargest(n, d, key=lambda k: d[k])
```
Test:
```python
topn_dict({'a': 10, 'b': 8, 'c': 9, 'd': 10}, 3)
# ['a', 'd', 'c']
```
[上一个例子](75.md) [下一个例子](77.md)
================================================
FILE: md/77.md
================================================
```markdown
@author jackzhenguo
@desc
@date 2019/4/19
```
#### 77 Anagrams
Two strings that contain the same letters in a different order are anagrams of each other
```python
from collections import Counter

def anagram(str1, str2):
    return Counter(str1) == Counter(str2)

anagram('eleven+two', 'twelve+one') # True: a remarkable pair of anagrams
anagram('eleven', 'twelve') # False
```
[上一个例子](76.md) [下一个例子](78.md)
================================================
FILE: md/78.md
================================================
```markdown
@author jackzhenguo
@desc
@date 2019/4/23
```
#### 78 Logically merging dictionaries
(1) Two ways to merge dictionaries
This is the usual way to merge dictionaries
```python
dic1 = {'x': 1, 'y': 2 }
dic2 = {'y': 3, 'z': 4 }
merged1 = {**dic1, **dic2} # {'x': 1, 'y': 3, 'z': 4}
```
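After setting `merged1['x'] = 10`, the value of x in dic1 is `unchanged`, because `merged1` is a newly created `new dictionary`.
`ChainMap` is different: internally it keeps a list that holds the original dictionaries. So when dictionaries are combined with ChainMap, setting `merged2['x'] = 10` does `change` the value of x in dic1. The ChainMap is built as follows: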
```python
from collections import ChainMap
merged2 = ChainMap(dic1,dic2)
print(merged2) # ChainMap({'x': 1, 'y': 2}, {'y': 3, 'z': 4})
```
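The difference between the two approaches shows up as soon as the merged result is modified. A minimal sketch that reuses the dictionaries above:

```python
from collections import ChainMap

dic1 = {'x': 1, 'y': 2}
dic2 = {'y': 3, 'z': 4}

# {**a, **b} builds an independent copy
merged1 = {**dic1, **dic2}
merged1['x'] = 10
print(dic1['x'])     # 1: dic1 is unaffected

# ChainMap keeps references to the original dictionaries
merged2 = ChainMap(dic1, dic2)
print(merged2['y'])  # 2: lookups return the first mapping that has the key
merged2['x'] = 10
print(dic1['x'])     # 10: the write went through to dic1

# Writes always go to the first mapping, even for keys found only in dic2
merged2['z'] = 99
print(dic1)          # {'x': 10, 'y': 2, 'z': 99}
print(dic2)          # {'y': 3, 'z': 4}
```

Note the lookup order as well: `{**dic1, **dic2}` lets the later dictionary win (`y` is 3), whereas `ChainMap(dic1, dic2)` lets the earlier one win (`y` is 2).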
[上一个例子](77.md) [下一个例子](79.md)
================================================
FILE: md/79.md
================================================
```markdown
@author jackzhenguo
@desc
@date 2019/4/26
```
#### 79 Named tuples
Define a tuple type named Point with the fields `x`, `y`, `z`
```python
from collections import namedtuple
Point = namedtuple('Point', ['x', 'y', 'z'])
lst = [Point(1.5, 2, 3.0), Point(-0.3, -1.0, 2.1), Point(1.3, 2.8, -2.5)]
print(lst[0].y - lst[1].y)
```
Code written with named tuples is more readable, and the benefit becomes more pronounced when a record has a large number of fields.
[上一个例子](78.md) [下一个例子](80.md)
================================================
FILE: md/8.md
================================================
```markdown
@author jackzhenguo
@desc
@date 2019/2/10
```
#### 8 Quotient and remainder
Returns the quotient and the remainder together
```python
In [1]: divmod(10,3)
Out[1]: (3, 1)
```
[上一个例子](7.md) [下一个例子](9.md)
================================================
FILE: md/80.md
================================================
```markdown
@author jackzhenguo
@desc
@date 2019/5/1
```
#### 80 sample: random sampling
Use `sample` to draw a random sample; the example below draws 10 items at random from 100.
```python
from random import randint,sample
lst = [randint(0,50) for _ in range(100)]
print(lst[:5])# [38, 19, 11, 3, 6]
lst_sample = sample(lst,10)
print(lst_sample) # [33, 40, 35, 49, 24, 15, 48, 29, 37, 24]
```
[上一个例子](79.md) [下一个例子](81.md)
================================================
FILE: md/81.md
================================================
```markdown
@author jackzhenguo
@desc
@date 2019/5/1
```
#### 81 shuffle: reshuffle a dataset
Use `shuffle` to reshuffle a dataset. Note that `shuffle` shuffles lst in place, which saves memory.
```python
from random import randint, shuffle
lst = [randint(0,50) for _ in range(100)]
shuffle(lst)
print(lst[:5]) # [50, 3, 48, 1, 26]
```
[上一个例子](80.md) [下一个例子](82.md)
================================================
FILE: md/82.md
================================================
```markdown
@author jackzhenguo
@desc
@date 2019/5/1
```
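#### 82 Ten uniformly distributed points
`uniform(a,b)` in the random module returns a random float N with a <= N <= b; whether the endpoint b can actually be returned depends on floating-point rounding.
The following generates 10 uniformly distributed 2-D points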
```python
from random import uniform
In [1]: [(uniform(0,10),uniform(0,10)) for _ in range(10)]
Out[1]:
[(9.244361194237328, 7.684326645514235),
(8.129267671737324, 9.988395854203773),
(9.505278771040661, 2.8650440524834107),
(3.84320100484284, 1.7687190176304601),
(6.095385729409376, 2.377133802224657),
(8.522913365698605, 3.2395995841267844),
(8.827829601859406, 3.9298809217233766),
(1.4749644859469302, 8.038753079253127),
(9.005430657826324, 7.58011186920019),
(8.700789540392917, 1.2217577293254112)]
```
[上一个例子](81.md) [下一个例子](83.md)
================================================
FILE: md/83.md
================================================
```markdown
@author jackzhenguo
@desc
@date 2019/5/3
```
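#### 83 Ten Gaussian-distributed points
`gauss(u,sigma)` in the random module returns a value drawn from a Gaussian distribution with mean u and standard deviation sigma. The following generates 10 2-D points whose error (y-2*x-1) follows a Gaussian distribution with mean 0 and standard deviation 1: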
```python
from random import gauss
x = range(10)
y = [2*xi+1+gauss(0,1) for xi in x]
points = list(zip(x,y))
### The 10 2-D points:
[(0, -0.86789025305992),
(1, 4.738439437453464),
(2, 5.190278040856102),
(3, 8.05270893133576),
(4, 9.979481700775292),
(5, 11.960781766216384),
(6, 13.025427054303737),
(7, 14.02384035204836),
(8, 15.33755823101161),
(9, 17.565074449028497)]
```
[上一个例子](82.md) [下一个例子](84.md)
================================================
FILE: md/84.md
================================================
```markdown
@author jackzhenguo
@desc
@date 2019/5/4
```
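#### 84 chain: join small containers into a large one
`chain` iterates over a and then b without building a combined copy, so it is memory-efficient and also reads cleanly.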
```python
from itertools import chain
a = [1,3,5,0]
b = (2,4,6)
for i in chain(a,b):
print(i)
### Result
1
3
5
0
2
4
6
```
[上一个例子](83.md) [下一个例子](85.md)
================================================
FILE: md/85.md
================================================
```markdown
@author jackzhenguo
@desc
@date 2019/5/5
```
#### 85 product example
```python
def product(*args, repeat=1):
    pools = [tuple(pool) for pool in args] * repeat
    result = [[]]
    for pool in pools:
        result = [x+[y] for x in result for y in pool]
    for prod in result:
        yield tuple(prod)
```
Calling the function:
```python
rtn = product('xyz', '12', repeat=3)
print(list(rtn))
```
[上一个例子](84.md) [下一个例子](86.md)
================================================
FILE: md/86.md
================================================
```markdown
@author jackzhenguo
@desc
@date 2019/5/6
```
#### 86 Two ways to reverse a string
```python
st="python"
```
Method 1:
```python
''.join(reversed(st))
```
Method 2:
```python
st[::-1]
```
[上一个例子](85.md) [下一个例子](87.md)
================================================
FILE: md/87.md
================================================
```markdown
@author jackzhenguo
@desc
@date 2019/5/7
```
#### 87 join: concatenate strings
Join strings with commas
```python
In [4]: mystr = ['1','2','java','4','python','java','7','8','java','python','11','java','13','14']
In [5]: ','.join(mystr) # join the strings with commas
Out[5]: '1,2,java,4,python,java,7,8,java,python,11,java,13,14'
```
[上一个例子](86.md) [下一个例子](88.md)
================================================
FILE: md/88.md
================================================
```markdown
@author jackzhenguo
@desc
@date 2019/5/3
```
#### 88 Byte length of a string
```python
def str_byte_len(mystr):
    return (len(mystr.encode('utf-8')))
```
Test:
```python
str_byte_len('i love python') # 13 (bytes)
str_byte_len('字符') # 6 (bytes)
```
[上一个例子](87.md) [下一个例子](89.md)
================================================
FILE: md/89.md
================================================
```markdown
@author jackzhenguo
@desc
@date 2019/5/10
```
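### 89 What does the `r` prefix in regular expressions do?
You will often see the character `r` in front of a regular expression. It tells the interpreter that what follows is a raw string, in which backslashes are taken literally. For example: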
```python
s1 = r'\n.*'
print(s1)
```
It tells the interpreter that the first character of `s1` is `\` and the second is `n`. Printing it shows the string itself:
```python
\n.*
```
Without the `r` prefix, that is:
```python
s2 = '\n.*'
print(s2)
```
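the interpreter treats the first two characters `\n` as an escape sequence meaning a newline, so the printed result is a line break followed by `.*`, as shown below: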
```python
.*
```
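Comparing lengths makes the difference concrete. A small sketch using the same two strings:

```python
s1 = r'\n.*'  # raw: backslash, 'n', '.', '*'
s2 = '\n.*'   # not raw: newline, '.', '*'

print(len(s1))         # 4
print(len(s2))         # 3
print(s1 == '\\n.*')   # True: the raw string equals the escaped-backslash form
print(s2[0] == '\n')   # True: the first character is a newline
```

When either string is used as a pattern, the `re` module treats it as "newline followed by any characters", because `re` interprets the two-character sequence `\n` itself. The raw prefix matters most for sequences such as `\b` or `\d`, which mean something different to Python string literals than to `re`, or nothing at all.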
[上一个例子](88.md) [下一个例子](90.md)
================================================
FILE: md/9.md
================================================
```markdown
@author jackzhenguo
@desc Convert to float
@date 2019/2/15
```
#### 9 Convert to float
Converts an integer or a numeric string to a float
```python
In [1]: float(3)
Out[1]: 3.0
```
```python
In [1]: float('3')
Out[1]: 3.0
```
Largest representable float
```python
import sys
In[4]: sys.float_info.max
Out[4]: 1.7976931348623157e+308
```
Positive and negative infinity
```python
float('inf') # positive infinity
float('-inf') # negative infinity
```
If the value cannot be converted to a float, a `ValueError` is raised:
```python
In [2]: float('a')
# ValueError: could not convert string to float: 'a'
```
[上一个例子](8.md) [下一个例子](10.md)
================================================
FILE: md/90.md
================================================
```markdown
@author jackzhenguo
@desc
@date 2019/5/15
```
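### 90 What is an atomic operation?
In the microscopic world, if we define an atom as the most basic unit that things are made of, then an atom is something that cannot be divided any further. By analogy, an atomic operation in a regular expression is a piece of the expression that cannot be split any further.
For example, `+` in a regular expression means that the preceding atom occurs at least once.
In `66+`, the first character is 6, and the second 6 together with the `+` means that the character 6 occurs at least once; taken together, a string needs at least two adjacent 6s to match this expression (covered in detail later).
`\w+` means that any single *letter, digit, or underscore* (what `\w` stands for) occurs at least once, so `\w` is an atom.
So ordinary characters are atoms, and so are the generic regex characters (covered later).
Keep this notion of an *atom* in mind.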
[上一个例子](89.md) [下一个例子](91.md)
================================================
FILE: md/91.md
================================================
```markdown
@author jackzhenguo
@desc
@date 2019/6/3
```
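### 91 How should escapes in regular expressions be understood?
The regular-expression world defines several escape sequences of its own.
An escape character `\` followed by a character changes the meaning of that character: it is no longer itself but takes on a new meaning.
For example, `w` on its own is simply the letter `w` with no other meaning. With a `\` in front, the meaning changes substantially: `\` and `w` are read together, and the regex engine interprets them as matching any one character from the following set:
```python
pat = r'\w'
```
For ASCII text this is equivalent to:
```python
pat = '[0123456789AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsTtUuVvWwXxYyZz_]'
```
That is, it matches any one character from the set of digits, upper- and lower-case letters, and the underscore `_`.
As you can see, the single escape `\w` stands for that whole long set. It is concise and used all the time in regular expressions, which is why it is called a **generic regex character**.
There are a few more generic regex characters like it, covered later.
Once the rule is clear, the rest follows by analogy, so the others should pose no problem.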
[上一个例子](90.md) [下一个例子](92.md)
================================================
FILE: md/92.md
================================================
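### 92 Plain literal search
A plain search means writing exactly what you are looking for, without using any regex rules. Below is a passage about the novel *A Thousand Splendid Suns*; the task is to find the word `friendship`, which may occur more than once: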
```
s = """
# Mariam is only fifteen
# when she is sent to Kabul to marry the troubled and bitter Rasheed,
# who is thirty years her senior.
# Nearly two decades later,
# in a climate of growing unrest, tragedy strikes fifteen-year-old Laila,
# who must leave her home and join Mariam's unhappy household.
# Laila and Mariam are to find consolation in each other,
# their friendship to grow as deep as the bond between sisters,
# as strong as the ties between mother and daughter.
# With the passing of time comes Taliban rule over Afghanistan,
# the streets of Kabul loud with the sound of gunfire and bombs,
# life a desperate struggle against starvation, brutality and fear,
# the women's endurance tested beyond their worst imaginings.
# Yet love can move a person to act in unexpected ways,
# lead them to overcome the most daunting obstacles with a startling heroism.
# In the end it is love that triumphs over death and destruction.
# A Thousand Splendid Suns is an unforgettable portrait of a wounded country and
# a deeply moving story of family and friendship.
# It is a beautiful, heart-wrenching story of an unforgiving time,
# an unlikely bond and an indestructible love.
"""
```
Before using regular expressions, import the re module, then define the pattern, and use `findall` to find all matches
```python
import re
pat = 'friendship'
result = re.findall(pat,s)
print(result)
# Two matches found:
# ['friendship', 'friendship']
```
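That is the plainest possible use of a regular expression. If we wanted to find words with the prefix grow, such as grows or growing, a plain search would be awkward.
However, by combining the metacharacters, generic characters, and capture groups introduced next, complex matching problems become manageable.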
[上一个例子](91.md) [下一个例子](93.md)
================================================
FILE: md/93.md
================================================
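### 93 Searching with generic characters
In regular expressions, generic characters are characters that help us write matching rules more concisely.
Using the text above, the pattern below finds substrings that start with d, followed by [a-z], which stands for any one lowercase English letter, with {7} meaning that such a letter occurs 7 times (quantifiers are covered later). Each matched substring therefore has length 1+7=8: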
```python
pat = 'd[a-z]{7}'
result = re.findall(pat,s)
```
The matches are:
```python
['daughter', 'desperat', 'daunting', 'destruct', 'destruct']
```
Similarly, the pattern `pat = 'd[a-z]{10}'` matches:
```
['destruction', 'destructibl']
```
The pattern `pat = 'd[a-z]{11}'` matches:
```
['destructible']
```
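As you can see, the generic character `[a-z]` is very convenient: five characters express all 26 lowercase letters. Note, though, that `[a-z]` matches any **one** of the 26 lowercase letters.
Generic characters with a similar function include:
```
[A-Z] matches one uppercase English letter
[0-9] matches one digit from 0 to 9
```
There are even more powerful generic characters:
```
\s matches a whitespace character, such as a space, \t, \n, \r, \f, \v
\w matches any letter, digit, or underscore
\d matches a decimal digit 0-9
```
\S, \W, and \D match the complements of the sets matched by \s, \w, and \d; for example, \S matches any character other than those matched by \s.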
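A short sketch of these character classes in action; the sample string `text` is made up for this illustration:

```python
import re

# Hypothetical sample text containing a space, a tab, and a newline
text = "Order 66:\tship_2 items\n"

print(re.findall(r'\d', text))   # ['6', '6', '2']
print(re.findall(r'\s', text))   # [' ', '\t', ' ', '\n']
print(re.findall(r'\w+', text))  # ['Order', '66', 'ship_2', 'items']
print(re.findall(r'\S+', text))  # ['Order', '66:', 'ship_2', 'items']
```

The last two lines differ only in the colon: `:` is not a word character, so `\w+` stops before it, but it is a non-whitespace character, so `\S+` includes it.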
[上一个例子](92.md) [下一个例子](94.md)
================================================
FILE: md/94.md
================================================
```markdown
@author jackzhenguo
@desc
@date 2019/7/3
```
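### 94 Searching with metacharacters
It may help to read *meta* as "something that describes the thing after it": a *metaclass* is a class that creates and describes classes, and a *metamodel* is a model that describes a model. By extension, a *metacharacter* is a character that describes other characters.
With that in mind, look at the most widely used metacharacter in regular expressions, `+`. It describes how many times the preceding atom occurs: once or more.
For example, when looking for a lucky phone number, the regular expression `66+` says that the atom `6` before `+` occurs at least once; together with the first 6, this means the number contains at least two adjacent 6s. So the numbers `18612652166` and `17566665656` qualify, while `18616161616` does not.
Metacharacters with similar functions include the following. They work alike, so they are not described one by one: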
```
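* the preceding atom repeats zero or more times
? the preceding atom repeats zero times or once
+ the preceding atom repeats one or more times
{n} the preceding atom occurs exactly n times
{n,} the preceding atom occurs at least n times
{n,m} the preceding atom occurs between n and m times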
```
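The phone-number example and a few of the quantifiers can be tried out as follows. This is a minimal sketch; the short test strings are made up for the illustration.

```python
import re

# The three phone numbers discussed above
numbers = ['18612652166', '17566665656', '18616161616']
lucky = [n for n in numbers if re.search(r'66+', n)]
print(lucky)  # ['18612652166', '17566665656']

# * : zero or more b's after a
print(re.findall(r'ab*', 'a ab abbb'))  # ['a', 'ab', 'abbb']
# ? : zero or one b after a
print(re.findall(r'ab?', 'a ab abbb'))  # ['a', 'ab', 'ab']
# {n,m} : between 2 and 3 sixes
print(re.findall(r'6{2,3}', '6 66 666 6666'))  # ['66', '666', '666']
```

In the last call the run `6666` yields only one match: the quantifier takes three sixes, and the single six left over is too short to match again.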
[上一个例子](93.md) [下一个例子](95.md)
================================================
FILE: md/95.md
================================================
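## 95 Capturing substrings
This example takes a passage from my column *Python 60 Days* and extracts the links in it, to show how useful substring extraction is.
Here is the text (abridged and modified), assigned to the variable `urls`:
```
urls = """
Python's package ecosystem is flourishing; the latest "Tiobe programming language index" shows that Python is the fastest-growing language.
![](https://images.gitbook.cn/2020-02-05-014719.png)
Next, together with all of you and with Alicia, who is doing postdoctoral AI research in the United States, let us begin our 60-day Python journey.
All of these considerations are meant to help you master the Python stack in a short time and gain one more skill to make a living. After landing the offer you want, may you soon live the life you want.
Let's get started.
Below, common languages are classified along two dimensions: static/dynamic and weakly/strongly typed.
![](https://images.gitbook.cn/2020-02-05-080211.png) ### Four basic syntax elements
"""
```
You might quickly come up with the following regular expression:
```
# the metacharacter . matches any single character except \n
# the metacharacter * matches the preceding atom zero or more times
pat = r'https:.*'
```
Then import the `re` module and use `findall` to find all matches:
```python
import re
result = re.findall(pat,urls)
print(result)
```
The output is shown below. There are 2 matches, but each one includes extra trailing characters, so the match is wrong:
```
['https://images.gitbook.cn/2020-02-05-014719.png)',
'https://images.gitbook.cn/2020-02-05-080211.png) ### Four basic syntax elements']
```
Let's refine the regular expression slightly:
```python
# add \) so that the matched substring must end with a right parenthesis
pat = r'https:.*\)'
```
The output is now as follows. It is a little better, but each match still includes the right parenthesis, so it is still wrong:
```
['https://images.gitbook.cn/2020-02-05-014719.png)',
'https://images.gitbook.cn/2020-02-05-080211.png)']
```
This is why the skill of **extracting** a substring matters. Doing so is simple: just wrap the part you want returned in a pair of parentheses, as shown below:
```python
# wrap the substring to be returned in a pair of parentheses
pat = r'(https:.*)\)'
```
Now the result is exactly right, with no extra characters. Wrapping the wanted substring in parentheses has a technical name: **capturing** or **grouping**.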
[上一个例子](94.md) [下一个例子](96.md)
================================================
FILE: md/96.md
================================================
```markdown
@author jackzhenguo
@desc
@date 2019/7/21
```
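#### 96 Greedy and non-greedy capture
Capturing is very useful, but one distinction matters when using it: greedy versus non-greedy capture. A greedy capture returns a match that contains as many characters as possible while still satisfying the pattern; a non-greedy capture takes as few characters as possible while still satisfying it.
Let's make up an idealized example: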
```
htmlContent = """
这是二级标题
这是一个段落>/p>
"""
```
A greedy capture uses `(.*)`, as follows:
```
pat = r"(.*)"
result = re.findall(pat,htmlContent)
```
The result is as follows: the capture is as long as possible and does not stop at the first closing tag:
```
['这是二级标题
这是一个段落>/p>
']
```
A non-greedy capture uses `(.*?)`, as follows:
```
pat = r"(.*?)"
result = re.findall(pat,htmlContent)
print(result)
```
The result has two elements: the capture stops at the first closing tag and then goes on to capture the second substring:
```
['这是二级标题
',
' 这是一个段落>/p>']
```
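The example above only serves to show the difference between the two. Real HTML contains line breaks and is far more complicated, so the greedy and non-greedy forms may well produce the same result there, but the difference is still worth understanding.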
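The contrast can be reproduced with a self-contained sketch. The HTML fragment and the `<div>` wrapper below are assumptions made for this illustration, not the author's original input:

```python
import re

# Hypothetical single-line HTML fragment with two <div> elements
html = "<div>first</div><div>second</div>"

# Greedy: .* runs to the last </div> it can reach
greedy = re.findall(r"<div>(.*)</div>", html)
# Non-greedy: .*? stops at the first </div> it meets
lazy = re.findall(r"<div>(.*?)</div>", html)

print(greedy)  # ['first</div><div>second']
print(lazy)    # ['first', 'second']
```

Because `.` does not match a newline by default, the two forms would give the same result if each `<div>` sat on its own line.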
[上一个例子](95.md) [下一个例子](97.md)
================================================
FILE: md/97.md
================================================
```markdown
@author jackzhenguo
@desc
@date 2019/8/3
```
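#### 97 Password check with a regular expression
Password requirements:
1) the password is 6 to 20 characters long;
2) the password contains only English letters and digits
```python
import re
pat = re.compile(r'\w{6,20}') # wrong: \w matches letters, digits, and the underscore, but underscores are not allowed here
# The safest approach: [\da-zA-Z] satisfies "only English letters and digits"
pat = re.compile(r'[\da-zA-Z]{6,20}')
```
Use the strictest method, `fullmatch`, to check whether the entire string matches:
```python
pat.fullmatch('qaz12') # None: shorter than 6
pat.fullmatch('qaz12wsxedcrfvtgb67890942234343434') # None: longer than 20
pat.fullmatch('qaz_231') # None: contains an underscore
pat.fullmatch('n0passw0Rd')
Out[4]: <re.Match object; span=(0, 10), match='n0passw0Rd'>
```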
[上一个例子](96.md) [下一个例子](98.md)
================================================
FILE: md/98.md
================================================
```markdown
@author jackzhenguo
@desc
@date 2019/8/8
```
#### 98 Scrape the title of the Baidu home page
```python
import re
from urllib import request
# fetch the content of the Baidu home page
data=request.urlopen("http://www.baidu.com/").read().decode()
# inspect the page to decide on the regular expression
pat=r'<title>(.*?)</title>'
result=re.search(pat,data)
print(result)
result.group(1) # 百度一下,你就知道
```
[上一个例子](97.md) [下一个例子](99.md)
================================================
FILE: md/99.md
================================================
```markdown
@author jackzhenguo
@desc
@date 2019/8/12
```
#### 99 Batch conversion to camel case
Convert database field names to camel case in bulk
Analysis
```python
# Explanation of the regex pieces used
# \s matches: [ \t\n\r\f\v]
# A|B: matches either A or B
# re.sub(pattern, newchar, string):
# substitute: replace every substring that matches pattern with newchar.
```
```python
# title(): capitalize the first letter of each word, for example:
# 'Hello world'.title() # 'Hello World'
```
```python
# print(re.sub(r"\s|_|", "", "He llo_worl\td"))
s = re.sub(r"(\s|_|-)+", " ",
           'some_database_field_name').title().replace(" ", "")
# result: SomeDatabaseFieldName
```
```python
# the first character is now uppercase and needs to be lowercased
s = s[0].lower()+s[1:] # final result
```
Putting the analysis together gives the following code:
```python
import re

def camel(s):
    s = re.sub(r"(\s|_|-)+", " ", s).title().replace(" ", "")
    return s[0].lower() + s[1:]

# batch conversion
def batch_camel(slist):
    return [camel(s) for s in slist]
```
Test result:
```python
s = batch_camel(['student_id', 'student\tname', 'student-add'])
print(s)
# result
# ['studentId', 'studentName', 'studentAdd']
```
[上一个例子](98.md) [下一个例子](100.md)
================================================
FILE: md/batch.py
================================================
"""
@author jackzhenguo
@desc Generate template files in bulk
@tag datetime file
@version v1.2
@date 2020/8/23
"""
import os
import calendar
from datetime import date,datetime
def getEverydaySince(year,month,day,n=10):
    """Yield n consecutive dates starting from year/month/day."""
    i = 0
    _, days = calendar.monthrange(year, month)
    while i < n:
        d = date(year,month,day)
        if day == days:
            # last day of the month: roll over to the next month
            month,day = month+1,0
            if month == 13:
                # roll over to January of the next year before asking for its length
                year,month = year+1,1
            _, days = calendar.monthrange(year, month)
        yield d
        day += 1
        i += 1

def batchCreate(ps='.md',start=100,n=100,path='.',year=2020,month=2,day=1):
    for i,day in zip(range(start,start+n),\
                     getEverydaySince(year,month,day,n) \
                     ):
        with open(path+'/'+str(i)+ps,'w') as fw:
            fw.write("""
```markdown
@author jackzhenguo
@desc
@tag
@version
@date {}
```
""".format(datetime.strftime(day,'%Y/%m/%d'))\
            )
        print(day)
    print('done')
================================================
FILE: script/add_nav.py
================================================
# function: append a navigation footer to the bottom of each *.md example
# author: zhenguo
# date: 2021.2.27
# version: 1.0
import os
for file in os.listdir('../md'):
    if os.path.splitext(file)[-1] == '.md':
        with open('../md/' + file, 'a') as f:
            file_name = os.path.splitext(file)[0]
            try:
                # int() raises ValueError for file names that are not numbers
                c = '\n\n[上一个例子](%s.md) [下一个例子](%s.md) ' % (str(int(file_name) - 1), str(int(file_name) + 1))
                f.write(c)
                print('wrote navigation to %s' % (file,))
            except ValueError as ex:
                print(ex)