View on GitHub

wtfpython-cn

wtfpython的中文翻译/施工结束/ 能力有限,欢迎帮我改进翻译

What the f*ck Python! 🐍

一些有趣且鲜为人知的 Python 特性.

翻译版本: English Vietnamese Tiếng Việt Spanish Español Korean 한국어 Russian Русский German Deutsch Add translation

其他模式: Interactive

WTFPL 2.0 Commit id

Python, 是一个设计优美的解释型高级语言, 它提供了很多能让程序员感到舒适的功能特性. 但有的时候, Python 的一些输出结果对于初学者来说似乎并不是那么一目了然.

这个有趣的项目意在收集 Python 中那些难以理解和反人类直觉的例子以及鲜为人知的功能特性, 并尝试讨论这些现象背后真正的原理!

虽然下面的有些例子并不一定会让你觉得 WTFs, 但它们依然有可能会告诉你一些你所不知道的 Python 有趣特性. 我觉得这是一种学习编程语言内部原理的好办法, 而且我相信你也会从中获得乐趣!

如果您是一位经验比较丰富的 Python 程序员, 你可以尝试挑战看是否能一次就找到例子的正确答案. 你可能对其中的一些例子已经比较熟悉了, 那这也许能唤起你当年踩这些坑时的甜蜜回忆 :sweat_smile:

PS: 如果你不是第一次读了, 你可以在这里获取变动内容.

那么, 让我们开始吧…

Table of Contents/目录

Structure of the Examples/示例结构

所有示例的结构都如下所示:

> 一个精选的标题 *

标题末尾的星号表示该示例在第一版中不存在,是最近添加的.

# 准备代码.
# 释放魔法...

Output (Python version):

>>> 触发语句
出乎意料的输出结果

(可选): 对意外输出结果的简短描述.

💡 说明:

  • 简要说明发生了什么以及为什么会发生.
    如有必要, 举例说明
    

    Output:

    >>> 触发语句 # 一些让魔法变得容易理解的例子
    # 一些正常的输入
    

注意: 所有的示例都在 Python 3.5.2 版本的交互解释器上测试过, 如果不特别说明应该适用于所有 Python 版本.

Usage/用法

我个人建议, 最好依次阅读下面的示例, 并对每个示例:

PS: 你也可以在命令行阅读 WTFpython. 我们有 pypi 包 和 npm 包(支持代码高亮).(译: 这两个都是英文版的)

安装 npm 包 wtfpython

$ npm install -g wtfpython

或者, 安装 pypi 包 wtfpython

$ pip install wtfpython -U

现在, 在命令行中运行 wtfpython, 你就可以开始浏览了.


👀 Examples/示例

Section: Strain your brain!/大脑运动!

> First things first!/要事优先 *

众所周知,Python 3.8 推出”海象”运算符 (:=) 方便易用,让我们一起看看。

1.

# Python 版本 3.8+

>>> a = "wtf_walrus"
>>> a
'wtf_walrus'

>>> a := "wtf_walrus"
File "<stdin>", line 1
    a := "wtf_walrus"
      ^
SyntaxError: invalid syntax

>>> (a := "wtf_walrus") # 该语句有效
'wtf_walrus'
>>> a
'wtf_walrus'

2 .

# Python 版本 3.8+

>>> a = 6, 9
>>> a
(6, 9)

>>> (a := 6, 9)
(6, 9)
>>> a
6

>>> a, b = 6, 9 # 典型拆包操作
>>> a, b
(6, 9)
>>> (a, b = 16, 19) # 出错啦
  File "<stdin>", line 1
    (a, b = 16, 19)
          ^
SyntaxError: invalid syntax

>>> (a, b := 16, 19) # 这里的结果是一个奇怪的三元组
(6, 16, 19)

>>> a # a值仍然没变?
6

>>> b
16

💡 说明

“海象”运算符简介

海象运算符 (:=) 在Python 3.8中被引入,用来在表达式中为变量赋值。

def some_func():
        # 假设这儿有一些耗时的计算
        # time.sleep(1000)
        return 5

# 引入“海象”运算符前的例子
if some_func():
        print(some_func()) # 糟糕的案例——函数运算了两次

# 或者,加以改进:
a = some_func()
if a:
    print(a)

# 有了“海象”运算符,你可以写的更简洁:
if a := some_func():
        print(a)

输出 (> Python 3.8):

5
5
5

这样既减少了一行代码,又避免了两次调用 some_func 函数。


> Strings can be tricky sometimes/微妙的字符串 *

1.

>>> a = "some_string"
>>> id(a)
140420665652016
>>> id("some" + "_" + "string") # 注意两个的id值是相同的.
140420665652016

2.

>>> a = "wtf"
>>> b = "wtf"
>>> a is b
True

>>> a = "wtf!"
>>> b = "wtf!"
>>> a is b
False

>>> a, b = "wtf!", "wtf!"
>>> a is b 
True # 3.7 版本返回结果为 False.

3.

>>> 'a' * 20 is 'aaaaaaaaaaaaaaaaaaaa'
True
>>> 'a' * 21 is 'aaaaaaaaaaaaaaaaaaaaa'
False # 3.7 版本返回结果为 True

很好理解, 对吧?

💡 说明:


> Be careful with chained operations/小心链式操作

>>> (False == False) in [False] # 可以理解
False
>>> False == (False in [False]) # 可以理解
False
>>> False == False in [False] # 为毛?
True

>>> True is False == False
False
>>> False is False is False
True

>>> 1 > 0 < 1
True
>>> (1 > 0) < 1
False
>>> 1 > (0 < 1)
False

💡 说明:

根据 https://docs.python.org/3/reference/expressions.html#comparisons

形式上, 如果 a, b, c, …, y, z 是表达式, 而 op1, op2, …, opN 是比较运算符, 那么除了每个表达式最多只出现一次以外 a op1 b op2 c … y opN z 就等于 a op1 b and b op2 c and … y opN z.

虽然上面的例子似乎很愚蠢, 但是像 a == b == c0 <= x <= 100 就很棒了.


> How not to use is operator/为什么不使用 is 操作符

下面是一个在互联网上非常有名的例子.

1.

>>> a = 256
>>> b = 256
>>> a is b
True

>>> a = 257
>>> b = 257
>>> a is b
False

2.

>>> a = []
>>> b = []
>>> a is b
False

>>> a = tuple()
>>> b = tuple()
>>> a is b
True

3. Output

>>> a, b = 257, 257
>>> a is b
True

Output (Python 3.7.x specifically)

>>> a, b = 257, 257
>>> a is b
False

💡 说明:

is== 的区别

256 是一个已经存在的对象, 而 257 不是

当你启动Python 的时候, 数值为 -5256 的对象就已经被分配好了. 这些数字因为经常被使用, 所以会被提前准备好.

Python 通过这种创建小整数池的方式来避免小整数频繁的申请和销毁内存空间.

引用自 https://docs.python.org/3/c-api/long.html

当前的实现为-5到256之间的所有整数保留一个整数对象数组, 当你创建了一个该范围内的整数时, 你只需要返回现有对象的引用. 所以改变1的值是有可能的. 我怀疑这种行为在Python中是未定义行为. :-)

>>> id(256)
10922528
>>> a = 256
>>> b = 256
>>> id(a)
10922528
>>> id(b)
10922528
>>> id(257)
140084850247312
>>> x = 257
>>> y = 257
>>> id(x)
140084850247440
>>> id(y)
140084850247344

这里解释器并没有智能到能在执行 y = 257 时意识到我们已经创建了一个整数 257, 所以它在内存中又新建了另一个对象.

类似的优化也适用于其他不可变对象,例如空元组。由于列表是可变的,这就是为什么 [] is [] 将返回 False() is () 将返回 True。 这解释了我们的第二个代码段。而第三个呢:

ab 在同一行中使用相同的值初始化时,会指向同一个对象.

>>> a, b = 257, 257
>>> id(a)
140640774013296
>>> id(b)
140640774013296
>>> a = 257
>>> b = 257
>>> id(a)
140640774013392
>>> id(b)
140640774013488

> Hash brownies/是时候来点蛋糕了!

1.

some_dict = {}
some_dict[5.5] = "JavaScript"
some_dict[5.0] = "Ruby"
some_dict[5] = "Python"

Output:

>>> some_dict[5.5]
"JavaScript"
>>> some_dict[5.0] # "Python" 消除了 "Ruby" 的存在?
"Python"
>>> some_dict[5] 
"Python"

>>> complex_five = 5 + 0j
>>> type(complex_five)
complex
>>> some_dict[complex_five]
"Python"

为什么到处都是Python?

💡 说明


> Deep down, we’re all the same./本质上,我们都一样. *

class WTF:
  pass

Output:

>>> WTF() == WTF() # 两个不同的对象应该不相等
False
>>> WTF() is WTF() # 也不相同
False
>>> hash(WTF()) == hash(WTF()) # 哈希值也应该不同
True
>>> id(WTF()) == id(WTF())
True

💡 说明:


> Disorder within order/有序中潜藏着无序 *

from collections import OrderedDict

dictionary = dict()
dictionary[1] = 'a'; dictionary[2] = 'b';

ordered_dict = OrderedDict()
ordered_dict[1] = 'a'; ordered_dict[2] = 'b';

another_ordered_dict = OrderedDict()
another_ordered_dict[2] = 'b'; another_ordered_dict[1] = 'a';

class DictWithHash(dict):
    """
    实现了 __hash__ 魔法方法的dict类
    """
    __hash__ = lambda self: 0

class OrderedDictWithHash(OrderedDict):
    """
    实现了 __hash__ 魔法方法的OrderedDict类
    """
    __hash__ = lambda self: 0

Output

>>> dictionary == ordered_dict # 如果 a == b
True
>>> dictionary == another_ordered_dict # 且 b == c
True
>>> ordered_dict == another_ordered_dict # 那么为什么 c == a 不成立??
False

# 众所周知,set数据结构储存不重复元素,
# 让我们生成以上字典的 set 数据类型,看看会发生什么……

>>> len({dictionary, ordered_dict, another_ordered_dict})
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'dict'

# dict类没有实现 __hash__ ,出错可以理解,接下来使用我们派生的类。

>>> dictionary = DictWithHash()
>>> dictionary[1] = 'a'; dictionary[2] = 'b';
>>> ordered_dict = OrderedDictWithHash()
>>> ordered_dict[1] = 'a'; ordered_dict[2] = 'b';
>>> another_ordered_dict = OrderedDictWithHash()
>>> another_ordered_dict[2] = 'b'; another_ordered_dict[1] = 'a';
>>> len({dictionary, ordered_dict, another_ordered_dict})
1
>>> len({ordered_dict, another_ordered_dict, dictionary}) # 交换顺序
2

到底发生了什么?

💡 说明:


> Keep trying…/不停的try *

def some_func():
    try:
        return 'from_try'
    finally:
        return 'from_finally'

def another_func(): 
    for _ in range(3):
        try:
            continue
        finally:
            print("Finally!")

def one_more_func(): # A gotcha!
    try:
        for i in range(3):
            try:
                1 / i
            except ZeroDivisionError:
                # Let's throw it here and handle it outside for loop
                raise ZeroDivisionError("A trivial divide by zero error")
            finally:
                print("Iteration", i)
                break
    except ZeroDivisionError as e:
        print("Zero division error occurred", e)

Output:

>>> some_func()
'from_finally'

>>> another_func()
Finally!
Finally!
Finally!

>>> 1 / 0
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ZeroDivisionError: division by zero

>>> one_more_func()
Iteration 0

💡 说明:


> For what?/为什么?

some_string = "wtf"
some_dict = {}
for i, some_dict[i] in enumerate(some_string):
    pass

Output:

>>> some_dict # 创建了索引字典.
{0: 'w', 1: 't', 2: 'f'}

💡 说明:


> Evaluation time discrepancy/执行时机差异

1.

array = [1, 8, 15]
# 一个典型的生成器表达式
g = (x for x in array if array.count(x) > 0)
array = [2, 8, 22]

Output:

>>> print(list(g)) #其他的值去哪儿了?
[8]

2.

array_1 = [1,2,3,4]
g1 = (x for x in array_1)
array_1 = [1,2,3,4,5]

array_2 = [1,2,3,4]
g2 = (x for x in array_2)
array_2[:] = [1,2,3,4,5]

Output:

>>> print(list(g1))
[1,2,3,4]

>>> print(list(g2))
[1,2,3,4,5]

3.

array_3 = [1, 2, 3]
array_4 = [10, 20, 30]
gen = (i + j for i in array_3 for j in array_4)

array_3 = [4, 5, 6]
array_4 = [400, 500, 600]

Output:

>>> print(list(gen))
[401, 501, 601, 402, 502, 602, 403, 503, 603]

💡 说明


> is not ... is not is (not ...)/is not ... 不是 is (not ...)

>>> 'something' is not None
True
>>> 'something' is (not None)
False

💡 说明:


> A tic-tac-toe where X wins in the first attempt!/一蹴即至!

# 我们先初始化一个变量row
row = [""]*3 #row i['', '', '']
# 并创建一个变量board
board = [row]*3

Output:

>>> board
[['', '', ''], ['', '', ''], ['', '', '']]
>>> board[0]
['', '', '']
>>> board[0][0]
''
>>> board[0][0] = "X"
>>> board
[['X', '', ''], ['X', '', ''], ['X', '', '']]

我们有没有赋值过3个 “X” 呢?

💡 说明:

当我们初始化 row 变量时, 下面这张图展示了内存中的情况。

image

而当通过对 row 做乘法来初始化 board 时, 内存中的情况则如下图所示 (每个元素 board[0], board[1]board[2] 都和 row 一样引用了同一列表.)

image

我们可以通过不使用变量 row 生成 board 来避免这种情况. (这个issue提出了这个需求.)

>>> board = [['']*3 for _ in range(3)]
>>> board[0][0] = "X"
>>> board
[['X', '', ''], ['', '', ''], ['', '', '']]

> Schrödinger’s variable/薛定谔的变量 *

funcs = []
results = []
for x in range(7):
    def some_func():
        return x
    funcs.append(some_func)
    results.append(some_func()) # 注意这里函数被执行了

funcs_results = [func() for func in funcs]

Output:

>>> results
[0, 1, 2, 3, 4, 5, 6]
>>> funcs_results
[6, 6, 6, 6, 6, 6, 6]

即使每次在迭代中将 some_func 加入 funcs 前的 x 值都不相同, 所有的函数还是都返回6.

// 再换个例子

>>> powers_of_x = [lambda x: x**i for i in range(10)]
>>> [f(2) for f in powers_of_x]
[512, 512, 512, 512, 512, 512, 512, 512, 512, 512]

💡 说明:

由于 x 是一个全局值,我们可以通过更新 x 来更改 funcs 用来查找和返回的值:

>>> x = 42
>>> [func() for func in funcs]
[42, 42, 42, 42, 42, 42, 42]
funcs = []
for x in range(7):
    def some_func(x=x):
        return x
    funcs.append(some_func)

Output:

>>> funcs_results = [func() for func in funcs]
>>> funcs_results
[0, 1, 2, 3, 4, 5, 6]

此时,不再使用全局变量 x

>>> inspect.getclosurevars(funcs[0])
ClosureVars(nonlocals={}, globals={}, builtins={}, unbound=set())

> The chicken-egg problem/先有鸡还是先有蛋 *

1.

>>> isinstance(3, int)
True
>>> isinstance(type, object)
True
>>> isinstance(object, type)
True

那么到底谁是“最终”的基类呢?下边顺便列出更多的令人困惑的地方

2.

>>> class A: pass
>>> isinstance(A, A)
False
>>> isinstance(type, type)
True
>>> isinstance(object, object)
True

3.

>>> issubclass(int, object)
True
>>> issubclass(type, object)
True
>>> issubclass(object, type)
False

💡 说明


> Subclass relationships/子类关系 *

Output:

>>> from collections.abc import Hashable
>>> issubclass(list, object)
True
>>> issubclass(object, Hashable)
True
>>> issubclass(list, Hashable)
False

子类关系应该是可传递的, 对吧? (即, 如果 AB 的子类, BC 的子类, 那么 A 应该C 的子类.)

💡 说明:


> Methods equality and identity/方法的相等性和唯一性 *

1.

class SomeClass:
    def method(self):
        pass

    @classmethod
    def classm(cls):
        pass

    @staticmethod
    def staticm():
        pass

Output:

>>> print(SomeClass.method is SomeClass.method)
True
>>> print(SomeClass.classm is SomeClass.classm)
False
>>> print(SomeClass.classm == SomeClass.classm)
True
>>> print(SomeClass.staticm is SomeClass.staticm)
True

访问 classm 两次,我们得到一个相等的对象,但不是同一个? 让我们看看 SomeClass 的实例会发生什么:

2.

o1 = SomeClass()
o2 = SomeClass()

Output:

>>> print(o1.method == o2.method)
False
>>> print(o1.method == o1.method)
True
>>> print(o1.method is o1.method)
False
>>> print(o1.classm is o1.classm)
False
>>> print(o1.classm == o1.classm == o2.classm == SomeClass.classm)
True
>>> print(o1.staticm is o1.staticm is o2.staticm is SomeClass.staticm)
True

访问 ` classm or method 两次, 为 SomeClass` 的同一个实例创建了相等但是不同的对象。

💡 说明

>>> o1.method
<bound method SomeClass.method of <__main__.SomeClass object at ...>>
>>> SomeClass.method
<function SomeClass.method at ...>
>>> o1.classm
<bound method SomeClass.classm of <class '__main__.SomeClass'>>
>>> SomeClass.classm
<bound method SomeClass.classm of <class '__main__.SomeClass'>>
>>> o1.staticm
<function SomeClass.staticm at ...>
>>> SomeClass.staticm
<function SomeClass.staticm at ...>

> All-true-ation/返回True的all函数 *

>>> all([True, True, True])
True
>>> all([True, True, False])
False

>>> all([])
True
>>> all([[]])
False
>>> all([[[]]])
True

为什么会有这种True-False的变化?

💡 说明


> The surprising comma/意外的逗号

Output:

>>> def f(x, y,):
...     print(x, y)
...
>>> def g(x=4, y=5,):
...     print(x, y)
...
>>> def h(x, **kwargs,):
  File "<stdin>", line 1
    def h(x, **kwargs,):
                     ^
SyntaxError: invalid syntax
>>> def h(*args,):
  File "<stdin>", line 1
    def h(*args,):
                ^
SyntaxError: invalid syntax

💡 说明:

> Strings and the backslashes/字符串与反斜杠

Output:

>>> print("\"")
"

>>> print(r"\"")
\"

>>> print(r"\")
File "<stdin>", line 1
    print(r"\")
              ^
SyntaxError: EOL while scanning string literal

>>> r'\'' == "\\'"
True

💡 说明:


> not knot!/别纠结!

x = True
y = False

Output:

>>> not x == y
True
>>> x == not y
  File "<input>", line 1
    x == not y
           ^
SyntaxError: invalid syntax

💡 说明:


> Half triple-quoted strings/三个引号

Output:

>>> print('wtfpython''')
wtfpython
>>> print("wtfpython""")
wtfpython
>>> # 下面的语句会抛出 `SyntaxError` 异常
>>> # print('''wtfpython')
>>> # print("""wtfpython")

💡 说明:


> What’s wrong with booleans?/布尔你咋了?

1.

# 一个简单的例子, 统计下面可迭代对象中的布尔型值的个数和整型值的个数
mixed_list = [False, 1.0, "some_string", 3, True, [], False]
integers_found_so_far = 0
booleans_found_so_far = 0

for item in mixed_list:
    if isinstance(item, int):
        integers_found_so_far += 1
    elif isinstance(item, bool):
        booleans_found_so_far += 1

Output:

>>> integers_found_so_far
4
>>> booleans_found_so_far
0

2.

>>> some_bool = True
>>> "wtf" * some_bool
'wtf'
>>> some_bool = False
>>> "wtf" * some_bool
''

3.

def tell_truth():
    True = False
    if True == False:
        print("I have lost faith in truth!")

Output (< 3.x):

>>> tell_truth()
I have lost faith in truth!

💡 说明:

  >>> int(True)
  1
  >>> int(False)
  0

> Class attributes and instance attributes/类属性和实例属性

1.

class A:
    x = 1

class B(A):
    pass

class C(A):
    pass

Output:

>>> A.x, B.x, C.x
(1, 1, 1)
>>> B.x = 2
>>> A.x, B.x, C.x
(1, 2, 1)
>>> A.x = 3
>>> A.x, B.x, C.x
(3, 2, 3)
>>> a = A()
>>> a.x, A.x
(3, 3)
>>> a.x += 1
>>> a.x, A.x
(4, 3)

2.

class SomeClass:
    some_var = 15
    some_list = [5]
    another_list = [5]
    def __init__(self, x):
        self.some_var = x + 1
        self.some_list = self.some_list + [x]
        self.another_list += [x]

Output:

>>> some_obj = SomeClass(420)
>>> some_obj.some_list
[5, 420]
>>> some_obj.another_list
[5, 420]
>>> another_obj = SomeClass(111)
>>> another_obj.some_list
[5, 111]
>>> another_obj.another_list
[5, 420, 111]
>>> another_obj.another_list is SomeClass.another_list
True
>>> another_obj.another_list is some_obj.another_list
True

💡 说明:


> yielding None/生成 None

some_iterable = ('a', 'b')

def some_func(val):
    return "something"

Output:

>>> [x for x in some_iterable]
['a', 'b']
>>> [(yield x) for x in some_iterable]
<generator object <listcomp> at 0x7f70b0a4ad58>
>>> list([(yield x) for x in some_iterable])
['a', 'b']
>>> list((yield x) for x in some_iterable)
['a', None, 'b', None]
>>> list(some_func((yield x)) for x in some_iterable)
['a', 'something', 'b', 'something']

💡 说明:


> Yielding from… return!/生成器里的return *

1.

def some_func(x):
    if x == 3:
        return ["wtf"]
    else:
        yield from range(x)

Output (> 3.3):

>>> list(some_func(3))
[]

"wtf" 去哪儿了?是因为yield from的一些特殊效果吗?让我们验证一下

2.

def some_func(x):
    if x == 3:
        return ["wtf"]
    else:
        for i in range(x):
          yield i

Output:

>>> list(some_func(3))
[]

同样的结果,这里也不起作用。

💡 说明

“… 生成器中的 return expr 会导致在退出生成器时引发 StopIteration(expr)。”


> Nan-reflexivity/Nan的自反性

1.

a = float('inf')
b = float('nan')
c = float('-iNf')  # 这些字符串不区分大小写
d = float('nan')

Output:

>>> a
inf
>>> b
nan
>>> c
-inf
>>> float('some_other_string')
ValueError: could not convert string to float: some_other_string
>>> a == -c #inf==inf
True
>>> None == None # None==None
True
>>> b == d #但是 nan!=nan
False
>>> 50/a
0.0
>>> a/a
nan
>>> 23 + b
nan

2.

>>> x = float('nan')
>>> y = x / x
>>> y is y # 同一性(identity)具备
True
>>> y == y # y不具备相等性(equality)
False
>>> [y] == [y] # 但包含y的列表验证相等性(equality)成功了
True

💡 说明:

'inf''nan' 是特殊的字符串(不区分大小写), 当显示转换成 float 型时, 它们分别用于表示数学意义上的 “无穷大” 和 “非数字”.


> Mutating the immutable!/强人所难

some_tuple = ("A", "tuple", "with", "values")
another_tuple = ([1, 2], [3, 4], [5, 6])

Output:

>>> some_tuple[2] = "change this"
TypeError: 'tuple' object does not support item assignment
>>> another_tuple[2].append(1000) # 这里不出现错误
>>> another_tuple
([1, 2], [3, 4], [5, 6, 1000])
>>> another_tuple[2] += [99, 999]
TypeError: 'tuple' object does not support item assignment
>>> another_tuple
([1, 2], [3, 4], [5, 6, 1000, 99, 999])

我还以为元组是不可变的呢…

💡 说明:

(译: 对于不可变对象, 这里指tuple, += 并不是原子操作, 而是 extend= 两个动作, 这里 = 操作虽然会抛出异常, 但 extend 操作已经修改成功了. 详细解释可以看这里)


> The disappearing variable from outer scope/消失的外部变量

e = 7
try:
    raise Exception()
except Exception as e:
    pass

Output (Python 2.x):

>>> print(e)
# prints nothing

Output (Python 3.x):

>>> print(e)
NameError: name 'e' is not defined

💡 说明:


> The mysterious key type conversion/神秘的键型转换 *

class SomeClass(str):
    pass

some_dict = {'s':42}

Output:

>>> type(list(some_dict.keys())[0])
str
>>> s = SomeClass('s')
>>> some_dict[s] = 40
>>> some_dict # 预期: 两个不同的键值对
{'s': 40}
>>> type(list(some_dict.keys())[0])
str

💡 说明:


> Let’s see if you can guess this?/看看你能否猜到这一点?

a, b = a[b] = {}, 5

Output:

>>> a
{5: ({...}, 5)}

💡 说明:


> Exceeds the limit for integer string conversion/整型转字符串越界

>>> # Python 3.10.6
>>> int("2" * 5432)

>>> # Python 3.10.8
>>> int("2" * 5432)

Output:

>>> # Python 3.10.6
222222222222222222222222222222222222222222222222222222222222222...

>>> # Python 3.10.8
Traceback (most recent call last):
   ...
ValueError: Exceeds the limit (4300) for integer string conversion:
   value has 5432 digits; use sys.set_int_max_str_digits()
   to increase the limit.

💡 说明:

更多更改设置上限的操作细节查看文档


Section: Slippery Slopes/滑坡谬误

> Modifying a dictionary while iterating over it/迭代字典时的修改

x = {0: None}

for i in x:
    del x[i]
    x[i+1] = None
    print(i)

Output (Python 2.7- Python 3.5):

0
1
2
3
4
5
6
7

是的, 它运行了八次然后才停下来.

💡 说明:


> Stubborn del operator/坚强的 del *

class SomeClass:
    def __del__(self):
        print("Deleted!")

Output: 1.

>>> x = SomeClass()
>>> y = x
>>> del x # 这里应该会输出 "Deleted!"
>>> del y
Deleted!

唷, 终于删除了. 你可能已经猜到了在我们第一次尝试删除 x 时是什么让 __del__ 免于被调用的. 那让我们给这个例子增加点难度.

2.

>>> x = SomeClass()
>>> y = x
>>> del x
>>> y # 检查一下y是否存在
<__main__.SomeClass instance at 0x7f98a1a67fc8>
>>> del y # 像之前一样, 这里应该会输出 "Deleted!"
>>> globals() # 好吧, 并没有. 让我们看一下所有的全局变量
Deleted!
{'__builtins__': <module '__builtin__' (built-in)>, 'SomeClass': <class __main__.SomeClass at 0x7f98a1a5f668>, '__package__': None, '__name__': '__main__', '__doc__': None}

好了,现在它被删除了 :confused:

💡 说明:


> The out of scope variable/外部作用域变量

1.

a = 1
def some_func():
    return a

def another_func():
    a += 1
    return a

2.

def some_closure_func():
    a = 1
    def some_inner_func():
        return a
    return some_inner_func()

def another_closure_func():
    a = 1
    def another_inner_func():
        a += 1
        return a
    return another_inner_func()

Output:

>>> some_func()
1
>>> another_func()
UnboundLocalError: local variable 'a' referenced before assignment

>>> some_closure_func()
1
>>> another_closure_func()
UnboundLocalError: local variable 'a' referenced before assignment

💡 说明:


> Deleting a list item while iterating/迭代列表时删除元素

list_1 = [1, 2, 3, 4]
list_2 = [1, 2, 3, 4]
list_3 = [1, 2, 3, 4]
list_4 = [1, 2, 3, 4]

for idx, item in enumerate(list_1):
    del item

for idx, item in enumerate(list_2):
    list_2.remove(item)

for idx, item in enumerate(list_3[:]):
    list_3.remove(item)

for idx, item in enumerate(list_4):
    list_4.pop(idx)

Output:

>>> list_1
[1, 2, 3, 4]
>>> list_2
[2, 4]
>>> list_3
[]
>>> list_4
[2, 4]

你能猜到为什么输出是 [2, 4] 吗?

💡 说明:

del, removepop 的不同:

为什么输出是 [2, 4]?


> Lossy zip of iterators/丢三落四的zip *

>>> numbers = list(range(7))
>>> numbers
[0, 1, 2, 3, 4, 5, 6]
>>> first_three, remaining = numbers[:3], numbers[3:]
>>> first_three, remaining
([0, 1, 2], [3, 4, 5, 6])
>>> numbers_iter = iter(numbers)
>>> list(zip(numbers_iter, first_three)) 
[(0, 0), (1, 1), (2, 2)]
# so far so good, let's zip the remaining
>>> list(zip(numbers_iter, remaining))
[(4, 3), (5, 4), (6, 5)]

numbers 列表中的元素 3 哪里去了?

💡 说明


> Loop variables leaking out!/循环变量泄漏!

1.

for x in range(7):
    if x == 6:
        print(x, ': for x inside loop')
print(x, ': x in global')

Output:

6 : for x inside loop
6 : x in global

但是 x 从未在循环外被定义…

2.

# 这次我们先初始化x
x = -1
for x in range(7):
    if x == 6:
        print(x, ': for x inside loop')
print(x, ': x in global')

Output:

6 : for x inside loop
6 : x in global

3.

x = 1
print([x for x in range(5)])
print(x, ': x in global')

Output (on Python 2.x):

[0, 1, 2, 3, 4]
(4, ': x in global')

Output (on Python 3.x):

[0, 1, 2, 3, 4]
1 : x in global

💡 说明:


> Beware of default mutable arguments!/当心默认的可变参数!

def some_func(default_arg=[]):
    default_arg.append("some_string")
    return default_arg

Output:

>>> some_func()
['some_string']
>>> some_func()
['some_string', 'some_string']
>>> some_func([])
['some_string']
>>> some_func()
['some_string', 'some_string', 'some_string']

💡 说明:


> Catching the Exceptions/捕获异常

some_list = [1, 2, 3]
try:
    # 这里会抛出异常 ``IndexError``
    print(some_list[4])
except IndexError, ValueError:
    print("Caught!")

try:
    # 这里会抛出异常 ``ValueError``
    some_list.remove(4)
except IndexError, ValueError:
    print("Caught again!")

Output (Python 2.x):

Caught!

ValueError: list.remove(x): x not in list

Output (Python 3.x):

  File "<input>", line 3
    except IndexError, ValueError:
                     ^
SyntaxError: invalid syntax

💡 说明:


> Same operands, different story!/同人不同命!

1.

a = [1, 2, 3, 4]
b = a
a = a + [5, 6, 7, 8]

Output:

>>> a
[1, 2, 3, 4, 5, 6, 7, 8]
>>> b
[1, 2, 3, 4]

2.

a = [1, 2, 3, 4]
b = a
a += [5, 6, 7, 8]

Output:

>>> a
[1, 2, 3, 4, 5, 6, 7, 8]
>>> b
[1, 2, 3, 4, 5, 6, 7, 8]

💡 说明:


> Name resolution ignoring class scope/忽略类作用域的名称解析

1.

x = 5
class SomeClass:
    x = 17
    y = (x for i in range(10))

Output:

>>> list(SomeClass.y)[0]
5

2.

x = 5
class SomeClass:
    x = 17
    y = [x for i in range(10)]

Output (Python 2.x):

>>> SomeClass.y[0]
17

Output (Python 3.x):

>>> SomeClass.y[0]
5

💡 说明:


> Rounding like a banker/像银行家一样舍入 *

让我们实现一个简单的函数来获取列表的中间元素:

def get_middle(some_list):
    mid_index = round(len(some_list) / 2)
    return some_list[mid_index - 1]

Python 3.x:

>>> get_middle([1])  # looks good
1
>>> get_middle([1,2,3])  # looks good
2
>>> get_middle([1,2,3,4,5])  # huh?
2
>>> len([1,2,3,4,5]) / 2  # good
2.5
>>> round(len([1,2,3,4,5]) / 2)  # why?
2

似乎 Python 将 2.5 舍入到 2。

💡 说明

>>> round(0.5)
0
>>> round(1.5)
2
>>> round(2.5)
2
>>> import numpy  # numpy的结果也是一样
>>> numpy.round(0.5)
0.0
>>> numpy.round(1.5)
2.0
>>> numpy.round(2.5)
2.0

> Needles in a Haystack/大海捞针

迄今为止,每一位Python开发者都会遇到类似以下的情况。

1.

x, y = (0, 1) if True else None, None

Output:

>>> x, y  # 期望的结果是 (0, 1)
((0, 1), None)

2.

t = ('one', 'two')
for i in t:
    print(i)

t = ('one')
for i in t:
    print(i)

t = ()
print(t)

Output:

one
two
o
n
e
tuple()

3.

ten_words_list = [
    "some",
    "very",
    "big",
    "list",
    "that"
    "consists",
    "of",
    "exactly",
    "ten",
    "words"
]

Output

>>> len(ten_words_list)
9

4. 不够健壮的断言机制

a = "python"
b = "javascript"

Output:

# 带有失败警告信息的assert表达式
>>> assert(a == b, "Both languages are different")
# 未引发 AssertionError 

5.

some_list = [1, 2, 3]
some_dict = {
  "key_1": 1,
  "key_2": 2,
  "key_3": 3
}

some_list = some_list.append(4) 
some_dict = some_dict.update({"key_4": 4})

Output:

>>> print(some_list)
None
>>> print(some_dict)
None

6.

def some_recursive_func(a):
    if a[0] == 0:
        return
    a[0] -= 1
    some_recursive_func(a)
    return a

def similar_recursive_func(a):
    if a == 0:
        return a
    a -= 1
    similar_recursive_func(a)
    return a

Output:

>>> some_recursive_func([5, 0])
[0, 0]
>>> similar_recursive_func(5)
4

💡 说明:


> Splitsies/分割函数 *

>>> 'a'.split()
['a']

# is same as
>>> 'a'.split(' ')
['a']

# but
>>> len(''.split())
0

# isn't the same as
>>> len(''.split(' '))
1

💡 说明


> Wild imports/通配符导入方式 *

# File: module.py

def some_weird_name_func_():
    print("works!")

def _another_weird_name_func():
    print("works!")

Output

>>> from module import *
>>> some_weird_name_func_()
"works!"
>>> _another_weird_name_func()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name '_another_weird_name_func' is not defined

💡 说明


> All sorted?/都排序了吗? *

>>> x = 7, 8, 9
>>> sorted(x) == x
False
>>> sorted(x) == sorted(x)
True

>>> y = reversed(x)
>>> sorted(y) == sorted(y)
False

💡 说明


> Midnight time doesn’t exist?/不存在的午夜?

from datetime import datetime

midnight = datetime(2018, 1, 1, 0, 0)
midnight_time = midnight.time()

noon = datetime(2018, 1, 1, 12, 0)
noon_time = noon.time()

if midnight_time:
    print("Time at midnight is", midnight_time)

if noon_time:
    print("Time at noon is", noon_time)

Output:

('Time at noon is', datetime.time(12, 0))

midnight_time 并没有被输出.

💡 说明:

在Python 3.5之前, 如果 datetime.time 对象存储的UTC的午夜时间(译: 就是 00:00), 那么它的布尔值会被认为是 False. 当使用 if obj: 语句来检查 obj 是否为 null 或者某些“空”值的时候, 很容易出错.


Section: The Hidden treasures!/隐藏的宝藏!

本节包含了一些像我这样的大多数初学者都不知道的关于Python的鲜为人知的有趣的事情(好吧,现在不是了)。

> Okay Python, Can you make me fly?/Python, 可否带我飞? *

好, 去吧.

import antigravity

Output: 嘘.. 这是个超级秘密.

💡 说明:


> goto, but why?/goto, 但为什么? *

from goto import goto, label
for i in range(9):
    for j in range(9):
        for k in range(9):
            print("I'm trapped, please rescue!")
            if k == 2:
                goto .breakout # 从多重循环中跳出
label .breakout
print("Freedom!")

Output (Python 2.3):

I'm trapped, please rescue!
I'm trapped, please rescue!
Freedom!

💡 说明:


> Brace yourself!/做好思想准备 *

如果你不喜欢在Python中使用空格来表示作用域, 你可以导入 C 风格的 {},

from __future__ import braces

Output:

  File "some_file.py", line 1
    from __future__ import braces
SyntaxError: not a chance

想用大括号 braces? 没门! 觉得不爽, 请去用java。那么,另一个令人惊讶的事情,找一找在 __future__ 模块中,哪里引发了 SyntaxError code?

💡 说明:


> Let’s meet Friendly Language Uncle For Life/让生活更友好 *

Output (Python 3.x)

>>> from __future__ import barry_as_FLUFL
>>> "Ruby" != "Python" # 这里没什么疑问
  File "some_file.py", line 1
    "Ruby" != "Python"
              ^
SyntaxError: invalid syntax

>>> "Ruby" <> "Python"
True

这就对了.

💡 说明:


> Even Python understands that love is complicated/连Python也知道爱是难言的 *

import this

等等, this 是什么? this 是爱 :heart:

Output:

The Zen of Python, by Tim Peters

Beautiful is better than ugly.
优美胜于丑陋(Python 以编写优美的代码为目标)
Explicit is better than implicit.
明了胜于晦涩(优美的代码应当是明了的,命名规范,风格相似)
Simple is better than complex.
简洁胜于复杂(优美的代码应当是简洁的,不要有复杂的内部实现)
Complex is better than complicated.
复杂胜于凌乱(如果复杂不可避免,那代码间也不能有难懂的关系,要保持接口简洁)
Flat is better than nested.
扁平胜于嵌套(优美的代码应当是扁平的,不能有太多的嵌套)
Sparse is better than dense.
间隔胜于紧凑(优美的代码有适当的间隔,不要奢望一行代码解决问题)
Readability counts.
可读性很重要(优美的代码一定是可读的)
Special cases aren't special enough to break the rules.
没有特例特殊到需要违背这些规则(这些规则至高无上)
Although practicality beats purity.
尽管我们更倾向于实用性
Errors should never pass silently.
不要安静的包容所有错误
Unless explicitly silenced.
除非你确定需要这样做(精准地捕获异常,不写 except:pass 风格的代码)
In the face of ambiguity, refuse the temptation to guess.
拒绝诱惑你去猜测的暧昧事物
There should be one-- and preferably only one --obvious way to do it.
而是尽量找一种,最好是唯一一种明显的解决方案(如果不确定,就用穷举法)
Although that way may not be obvious at first unless you're Dutch.
虽然这并不容易,因为你不是 Python 之父(这里的 Dutch 是指 Guido )
Now is better than never.
现在行动好过永远不行动
Although never is often better than *right* now.
尽管不行动要好过鲁莽行动
If the implementation is hard to explain, it's a bad idea.
如果你无法向人描述你的方案,那肯定不是一个好方案;
If the implementation is easy to explain, it may be a good idea.
如果你能轻松向人描述你的方案,那也许会是一个好方案(方案测评标准)
Namespaces are one honking great idea -- let's do more of those!
命名空间是一种绝妙的理念,我们应当多加利用(倡导与号召)

这是 Python 之禅!

>>> love = this
>>> this is love
True
>>> love is True
False
>>> love is False
False
>>> love is not True or False
True
>>> love is not True or False; love is love  # 爱是难言的
True

💡 说明:


> Yes, it exists!/是的, 它存在!

循环的 else. 一个典型的例子:

  def does_exists_num(l, to_find):
      for num in l:
          if num == to_find:
              print("Exists!")
              break
      else:
          print("Does not exist")

Output:

>>> some_list = [1, 2, 3, 4, 5]
>>> does_exists_num(some_list, 4)
Exists!
>>> does_exists_num(some_list, -1)
Does not exist

异常的 else . 例,

try:
    pass
except:
    print("Exception occurred!!!")
else:
    print("Try block executed successfully...")

Output:

Try block executed successfully...

💡 说明:


> Ellipsis/省略 *

def some_func():
    Ellipsis

Output

>>> some_func()
# 没有输出,也没有报错

>>> SomeRandomString
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'SomeRandomString' is not defined

>>> Ellipsis
Ellipsis

💡 说明


> Inpinity/无限 *

英文拼写是有意的, 请不要为此提交补丁. (译: 这里是为了突出 Python 中无限的定义与Pi有关, 所以将两个单词拼接了.)

Output (Python 3.x):

>>> infinity = float('infinity')
>>> hash(infinity)
314159
>>> hash(float('-inf'))
-314159

💡 说明:


> Let’s mangle/修饰时间! *

1.

class Yo(object):
    def __init__(self):
        self.__honey = True
        self.bro = True

Output:

>>> Yo().bro
True
>>> Yo().__honey
AttributeError: 'Yo' object has no attribute '__honey'
>>> Yo()._Yo__honey
True

2.

class Yo(object):
    def __init__(self):
        # 这次试试对称形式
        self.__honey__ = True
        self.bro = True

Output:

>>> Yo().bro
True

>>> Yo()._Yo__honey__
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'Yo' object has no attribute '_Yo__honey__'

为什么 Yo()._Yo__honey 能运行? 只有印度人理解.(译: 这个梗可能是指印度音乐人Yo Yo Honey Singh)

3.

_A__variable = "Some value"

class A(object):
    def some_func(self):
        return __variable # 没在任何地方初始化

Output:

>>> A().__variable
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'A' object has no attribute '__variable'

>>> A().some_func()
'Some value'

💡 说明:


Section: Appearances are deceptive!/外表是靠不住的!

> Skipping lines?/跳过一行?

Output:

>>> value = 11
>>> valuе = 32
>>> value
11

什么鬼?

注意: 如果你想要重现的话最简单的方法是直接复制上面的代码片段到你的文件或命令行里.

💡 说明:

一些非西方字符虽然看起来和英语字母相同, 但会被解释器识别为不同的字母.

>>> ord('е') # 西里尔语的 'e' (Ye)
1077
>>> ord('e') # 拉丁语的 'e', 用于英文并使用标准键盘输入
101
>>> 'е' == 'e'
False

>>> value = 42 # 拉丁语 e
>>> valuе = 23 # 西里尔语 'e', Python 2.x 的解释器在这会抛出 `SyntaxError` 异常
>>> value
42

内置的 ord() 函数可以返回一个字符的 Unicode 代码点, 这里西里尔语 ‘e’ 和拉丁语 ‘e’ 的代码点不同证实了上述例子.


> Teleportation/空间移动 *

import numpy as np

def energy_send(x):
    # 初始化一个 numpy 数组
    np.array([float(x)])

def energy_receive():
    # 返回一个空的 numpy 数组
    return np.empty((), dtype=np.float).tolist()

Output:

>>> energy_send(123.456)
>>> energy_receive()
123.456

谁来给我发个诺贝尔奖?

💡 说明:


> Well, something is fishy…/嗯,有些可疑…

def square(x):
    """
    一个通过加法计算平方的简单函数.
    """
    sum_so_far = 0
    for counter in range(x):
        sum_so_far = sum_so_far + x
  return sum_so_far

Output (Python 2.x):

>>> square(10)
10

难道不应该是100吗?

注意: 如果你无法重现, 可以尝试运行这个文件mixed_tabs_and_spaces.py.

💡 说明:


Section: Miscellaneous/杂项

> += is faster/更快的 +=

# 用 "+" 连接三个字符串:
>>> timeit.timeit("s1 = s1 + s2 + s3", setup="s1 = ' ' * 100000; s2 = ' ' * 100000; s3 = ' ' * 100000", number=100)
0.25748300552368164
# 用 "+=" 连接三个字符串:
>>> timeit.timeit("s1 += s2 + s3", setup="s1 = ' ' * 100000; s2 = ' ' * 100000; s3 = ' ' * 100000", number=100)
0.012188911437988281

💡 说明:


> Let’s make a giant string!/来做个巨大的字符串吧!

def add_string_with_plus(iters):
    s = ""
    for i in range(iters):
        s += "xyz"
    assert len(s) == 3*iters

def add_bytes_with_plus(iters):
    s = b""
    for i in range(iters):
        s += b"xyz"
    assert len(s) == 3*iters

def add_string_with_format(iters):
    fs = "{}"*iters
    s = fs.format(*(["xyz"]*iters))
    assert len(s) == 3*iters

def add_string_with_join(iters):
    l = []
    for i in range(iters):
        l.append("xyz")
    s = "".join(l)
    assert len(s) == 3*iters

def convert_list_to_string(l, iters):
    s = "".join(l)
    assert len(s) == 3*iters

Output:

>>> timeit(add_string_with_plus(10000))
1000 loops, best of 3: 972 µs per loop
>>> timeit(add_bytes_with_plus(10000))
1000 loops, best of 3: 815 µs per loop
>>> timeit(add_string_with_format(10000))
1000 loops, best of 3: 508 µs per loop
>>> timeit(add_string_with_join(10000))
1000 loops, best of 3: 878 µs per loop
>>> l = ["xyz"]*10000
>>> timeit(convert_list_to_string(l, 10000))
10000 loops, best of 3: 80 µs per loop

让我们将迭代次数增加10倍.

>>> timeit(add_string_with_plus(100000)) # 执行时间线性增加
100 loops, best of 3: 9.75 ms per loop
>>> timeit(add_bytes_with_plus(100000)) # 二次增加
1000 loops, best of 3: 974 ms per loop
>>> timeit(add_string_with_format(100000)) # 线性增加
100 loops, best of 3: 5.25 ms per loop
>>> timeit(add_string_with_join(100000)) # 线性增加
100 loops, best of 3: 9.85 ms per loop
>>> l = ["xyz"]*100000
>>> timeit(convert_list_to_string(l, 100000)) # 线性增加
1000 loops, best of 3: 723 µs per loop

💡 说明:


> Slowing down dict lookups/让字典的查找慢下来 *

some_dict = {str(i): 1 for i in range(1_000_000)}
another_dict = {str(i): 1 for i in range(1_000_000)}

Output:

>>> %timeit some_dict['5']
28.6 ns ± 0.115 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
>>> some_dict[1] = 1
>>> %timeit some_dict['5']
37.2 ns ± 0.265 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

>>> %timeit another_dict['5']
28.5 ns ± 0.142 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
>>> another_dict[1]  # Trying to access a key that doesn't exist
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 1
>>> %timeit another_dict['5']
38.5 ns ± 0.0913 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

为什么相同的查找会变得越来越慢?

💡 说明


> Bloating instance dicts/变臃肿的dict实例们 *

import sys

class SomeClass:
    def __init__(self):
        self.some_attr1 = 1
        self.some_attr2 = 2
        self.some_attr3 = 3
        self.some_attr4 = 4


def dict_size(o):
    return sys.getsizeof(o.__dict__)

Output: (Python 3.8, 其他 Python 3 的版本也许稍有不同)

>>> o1 = SomeClass()
>>> o2 = SomeClass()
>>> dict_size(o1)
104
>>> dict_size(o2)
104
>>> del o1.some_attr1
>>> o3 = SomeClass()
>>> dict_size(o3)
232
>>> dict_size(o1)
232

让我们在一个新的解释器中再试一次:

>>> o1 = SomeClass()
>>> o2 = SomeClass()
>>> dict_size(o1)
104  # 意料之中
>>> o1.some_attr5 = 5
>>> o1.some_attr6 = 6
>>> dict_size(o1)
360
>>> dict_size(o2)
272
>>> o3 = SomeClass()
>>> dict_size(o3)
232

是什么让那些字典变得臃肿? 为什么新创建的对象也会变臃肿?

💡 说明


> Minor Ones/小知识点


Contributing/贡献

欢迎各种补丁! 详情请看CONTRIBUTING.md.(译: 这是给原库提贡献的要求模版)

你可以通过新建 issue 或者在上 Gitter 与我们进行讨论.

(译: 如果你想对这个翻译项目提供帮助, 请看这里)

Acknowledgements/致谢

这个系列最初的想法和设计灵感来自于 Denys Dovhan 的项目 wtfjs. 社区的强大支持让它成长为现在的模样.

Some nice Links!/一些不错的资源

🎓 License/许可

CC 4.0

© Satwik Kansal

Help/帮助

如果您有任何想法或建议,欢迎分享.

Surprise your geeky pythonist friends?/想给你的极客朋友一个惊喜?

您可以使用这些快链向 Twitter 和 Linkedin 上的朋友推荐 wtfpython,

Twitter | Linkedin

Need a pdf version?/需要来一份pdf版的?

你可以快速在获得英文作者制作的版本.

Follow Commit/追踪Commit

这是中文版 fork 时所处的原库 Commit, 当原库更新时我会跟随更新.

Commit id