Some little-known Python tricks

startswith() and endswith() accept a tuple

When checking the start or end of a string against several candidate values, you can pass a tuple to startswith() and endswith():

# bad
if image.endswith('.jpg') or image.endswith('.png') or image.endswith('.gif'):
    pass
# good
if image.endswith(('.jpg', '.png', '.gif')):
    pass
# bad
if url.startswith('http:') or url.startswith('https:') or url.startswith('ftp:'):
    pass
# good
if url.startswith(('http:', 'https:', 'ftp:')):
    pass
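
One detail worth keeping in mind: the argument has to be a tuple, and a list raises TypeError. The sketch below uses made-up file names and an extension list (both are assumptions for illustration) and converts the list with tuple() before the check:

```python
# Hypothetical file names and extensions, used only for illustration.
images = ['a.jpg', 'b.txt', 'c.gif']
extensions = ['.jpg', '.png', '.gif']  # a list, e.g. loaded from configuration

# Passing a list raises TypeError, so record whether the call was accepted.
try:
    'a.jpg'.endswith(extensions)
    list_accepted = True
except TypeError:
    list_accepted = False

# Converting the list to a tuple makes the call valid.
matched = [name for name in images if name.endswith(tuple(extensions))]

print(list_accepted)  # False
print(matched)        # ['a.jpg', 'c.gif']
```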

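Setting the start parameter of enumerate() as the first index

When you iterate with enumerate() and also need the index, you can set the start parameter to choose the first index value: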

# bad
for index, v in enumerate(data):
    print(index+1, v)
# good
for index, v in enumerate(data, start=1):
    print(index, v)

Naming slices

When code is full of hard-coded slice indices, it becomes unreadable. Naming the slices solves this problem:

record = '....................100.................513.25......'
# bad
cost = int(record[20:23]) * float(record[40:46])
# good
SHARES = slice(20, 23)
PRICE = slice(40, 46)
cost = int(record[SHARES]) * float(record[PRICE])

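As a general rule, code littered with hard-coded index values is hard to read and hard to maintain. The built-in slice() function creates a slice object that can be used anywhere a slice is allowed. For example: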

>>> items = [0, 1, 2, 3, 4, 5, 6]
>>> a = slice(2, 4)
>>> items[2:4]
[2, 3]
>>> items[a]
[2, 3]
>>> items[a] = [-2, -3]
>>> items
[0, 1, -2, -3, 4, 5, 6]
>>> del items[a]
>>> items
[0, 1, 4, 5, 6]
>>>

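A context manager statement can manage several resources at once

Suppose you need to read a file, process its contents, and write the result to another file. You write Pythonic code, so you use context managers and happily produce something like this: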

with open('input.txt', 'r') as source:
    with open('output.txt', 'w') as target:
        target.write(source.read())

That is already good, but a single with statement can manage several resources at once, so the code above can also be written as:

with open('input.txt', 'r') as source, open('output.txt', 'w') as target:
    target.write(source.read())
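
To try this without touching real files, here is a minimal self-contained sketch. The temporary directory, the file contents, and the upper() call standing in for "processing" are all assumptions made for illustration:

```python
import os
import tempfile

# Work in a temporary directory so the sketch does not touch real files.
with tempfile.TemporaryDirectory() as tmp:
    src_path = os.path.join(tmp, 'input.txt')
    dst_path = os.path.join(tmp, 'output.txt')
    with open(src_path, 'w') as f:
        f.write('hello\n')

    # One with statement opens both files; both are closed when it exits.
    with open(src_path, 'r') as source, open(dst_path, 'w') as target:
        target.write(source.read().upper())

    both_closed = source.closed and target.closed
    with open(dst_path) as f:
        result = f.read()

print(both_closed)   # True
print(repr(result))  # 'HELLO\n'
```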

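The else clause

In Python the else clause can be used not only with if, but also with for, while, and try.
The else block of a for or while loop runs only when the loop finishes normally, that is, when it is not exited by break, return, or an exception.
For example: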

>>> for i in range(3):
...     print(i)
... else:
...     print('Iterated over everything')
... 
0
1
2
Iterated over everything
>>>

As shown above, the for loop finished normally, so the else block ran.

>>> for i in range(3):
...     if i == 2:
...         break
...     print(i)
... else:
...     print('Iterated over everything')
... 
0
1
>>>

So when a for loop does not run to completion (here it was ended by break), the else block is skipped.
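
The same rule applies to while loops. The following is a small sketch; the function name find_first_even and its inputs are hypothetical and chosen only to show both paths:

```python
def find_first_even(numbers):
    """Describe the first even number in numbers, using while/else."""
    i = 0
    while i < len(numbers):
        if numbers[i] % 2 == 0:
            result = 'found {}'.format(numbers[i])
            break
        i += 1
    else:
        # Runs only when the loop condition became false without a break.
        result = 'no even number'
    return result

print(find_first_even([1, 3, 4, 5]))  # found 4
print(find_first_even([1, 3, 5]))     # no even number
```
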
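For try, the else block runs only when the try block raises no exception. At first you might think an else clause is unnecessary in a try/except block.
After all, in the snippet below after_call() runs only if dangerous_call() does not raise, right?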

try:
    dangerous_call()
    after_call()
except OSError:
    log('OSError...')

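However, after_call() should not be inside the try block. For clarity, the try block should contain only the statements that may raise the expected exception. The following version is therefore cleaner: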

try:
    dangerous_call()
except OSError:
    log('OSError...')
else:
    after_call()

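Now it is clear that the try block guards against errors in dangerous_call(), not in after_call(). It is also clear that after_call() runs only if the try block raises no exception. Note, however, that exceptions raised in the else clause are not handled by the preceding except clauses, so if after_call() raises, this try statement will not catch it.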
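
The last point can be checked with a short sketch. Here dangerous_call() and after_call() are hypothetical stand-ins, and an outer try is added only to observe where the exception ends up:

```python
def dangerous_call():
    """Hypothetical stand-in that succeeds."""
    return 'ok'

def after_call():
    """Hypothetical stand-in that fails after the try block succeeded."""
    raise OSError('raised inside else')

events = []
try:
    try:
        dangerous_call()
    except OSError:
        events.append('inner except')  # not reached
    else:
        after_call()                   # raises OSError
except OSError as exc:
    # The exception skipped the inner except clause and arrived here.
    events.append('outer except: ' + str(exc))

print(events)  # ['outer except: raised inside else']
```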

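Combining a script with the interactive interpreter

You can run a Python script as shown below and drop straight into the interactive interpreter when it finishes. The advantage is that the objects created by the script are kept, so you can use them directly from the prompt.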

$ cat hello.py
#! /usr/bin/env python
# -*- coding: utf-8 -*-
# __author__ = "shuke"
# Date: 2018/2/7

info = {'name': 'shuke','age': 18}
li = [1,2,3,4,'A']
const = 1000
# Run the script, then enter the interactive interpreter
$ python -i hello.py
>>> info
{'name': 'shuke', 'age': 18}
>>> li
[1, 2, 3, 4, 'A']
>>> const
1000
>>> exit()

A simple tree representation with defaultdict

import json
import collections

tree = lambda: collections.defaultdict(tree)
root = tree()
root['menu']['id'] = 'file'
root['menu']['value'] = 'File'
root['menu']['menuitems']['new']['value'] = 'New'
root['menu']['menuitems']['new']['onclick'] = 'new();'
root['menu']['menuitems']['open']['value'] = 'Open'
root['menu']['menuitems']['open']['onclick'] = 'open();'
root['menu']['menuitems']['close']['value'] = 'Close'
root['menu']['menuitems']['close']['onclick'] = 'close();'
print(json.dumps(root, sort_keys=True, indent=4, separators=(',', ': ')))

"""
output: 
{
    "menu": {
        "id": "file",
        "menuitems": {
            "close": {
                "onclick": "close();",
                "value": "Close"
            },
            "new": {
                "onclick": "new();",
                "value": "New"
            },
            "open": {
                "onclick": "open();",
                "value": "Open"
            }
        },
        "value": "File"
    }
}
"""

Extended unpacking (Python 3 only)

>>> a, *b, c = [1, 2, 3, 4, 5]
>>> a
1
>>> b
[2, 3, 4]
>>> c
5

Slice assignment on lists

>>> a = [1, 2, 3, 4, 5]
>>> a[2:3] = [0, 0]
>>> a
[1, 2, 0, 0, 4, 5]
>>> a[1:1] = [8, 9]
>>> a
[1, 8, 9, 2, 0, 0, 4, 5]
>>> a[1:-1] = []
>>> a
[1, 5]

Naming list slices

>>> a = [0, 1, 2, 3, 4, 5]
>>> LASTTHREE = slice(-3, None)
>>> LASTTHREE
slice(-3, None, None)
>>> a[LASTTHREE]
[3, 4, 5]

Zipping and unzipping lists and iterators

a = [1, 2, 3]
b = ['a', 'b', 'c']

z = zip(a, b)

for i in zip(a, b):
    print(i)

for item in zip(*z):
    print(item)
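
The snippet above prints its results without showing them. The sketch below collects them into variables instead (the variable names are assumptions) and also shows that in Python 3 a zip object is an iterator that can be consumed only once:

```python
a = [1, 2, 3]
b = ['a', 'b', 'c']

pairs = list(zip(a, b))         # zip the two lists into pairs
numbers, letters = zip(*pairs)  # unzip the pairs back into two tuples

print(pairs)    # [(1, 'a'), (2, 'b'), (3, 'c')]
print(numbers)  # (1, 2, 3)
print(letters)  # ('a', 'b', 'c')

# In Python 3, zip() returns an iterator: a second pass yields nothing.
z = zip(a, b)
first_pass = list(z)
second_pass = list(z)
print(second_pass)  # []
```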

Flattening lists

>>> import itertools
>>> a = [[1, 2], [3, 4], [5, 6]]
>>> list(itertools.chain.from_iterable(a))
[1, 2, 3, 4, 5, 6]
 
>>> sum(a, [])
[1, 2, 3, 4, 5, 6]
 
>>> [x for l in a for x in l]
[1, 2, 3, 4, 5, 6]
 
>>> a = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
>>> [x for l1 in a for l2 in l1 for x in l2]
[1, 2, 3, 4, 5, 6, 7, 8]
 
>>> a = [1, 2, [3, 4], [[5, 6], [7, 8]]]
>>> flatten = lambda x: [y for l in x for y in flatten(l)] if type(x) is list else [x]
>>> flatten(a)
[1, 2, 3, 4, 5, 6, 7, 8]

Generator expressions

>>> g = (x ** 2 for x in range(5))
>>> next(g)
0
>>> next(g)
1
>>> next(g)
4
>>> next(g)
9
>>> next(g)
16
>>> next(g)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
>>> sum(x ** 3 for x in range(10))
2025
>>> sum(x ** 3 for x in range(10) if x % 3 == 1)
408
# Calling next() on an exhausted iterator raises StopIteration

dict.setdefault()

>>> request = {}
>>> request.setdefault(None,[]).append(123)
>>> print(request)
{None: [123]}
>>> request.setdefault(None,[]).append(456)
>>> print(request)
{None: [123, 456]}

Dictionary comprehensions

>>> m = {x: x ** 2 for x in range(5)}
>>> m
{0: 0, 1: 1, 2: 4, 3: 9, 4: 16}
>>> m = {x: 'A' + str(x) for x in range(10)}
>>> m
{0: 'A0', 1: 'A1', 2: 'A2', 3: 'A3', 4: 'A4', 5: 'A5', 6: 'A6', 7: 'A7', 8: 'A8', 9: 'A9'}

Inverting a dictionary with a dictionary comprehension

>>> m = {'a': 1, 'b': 2, 'c': 3, 'd': 4}
>>> m
{'a': 1, 'b': 2, 'c': 3, 'd': 4}
>>> {v: k for k, v in m.items()}
{1: 'a', 2: 'b', 3: 'c', 4: 'd'}

Named tuples

>>> import collections
>>> Point = collections.namedtuple('Point', ['x', 'y'])
>>> p = Point(x=1.0, y=2.0)
>>> p
Point(x=1.0, y=2.0)
>>> p.x
1.0
>>> p.y
2.0

Subclassing a named tuple

import collections


class Point(collections.namedtuple('PointBase', ['x', 'y'])):
    __slots__ = ()

    def __add__(self, other):
        return Point(x=self.x + other.x, y=self.y + other.y)


p = Point(x=1.0, y=2.0)
q = Point(x=2.0, y=3.0)
print(p + q)
"""
Point(x=3.0, y=5.0)
"""

Counting the most common elements in an iterable

>>> A = collections.Counter([1, 1, 2, 2, 3, 3, 3, 3, 4, 5, 6, 7])
>>> A
Counter({3: 4, 1: 2, 2: 2, 4: 1, 5: 1, 6: 1, 7: 1})
>>> A.most_common(1)
[(3, 4)]
>>> A.most_common(3)
[(3, 4), (1, 2), (2, 2)]

Double-ended queues

>>> Q = collections.deque()
>>> Q.append(1)
>>> Q.appendleft(2)
>>> Q.extend([3, 4])
>>> Q.extendleft([5, 6])
>>> Q
deque([6, 5, 2, 1, 3, 4])
>>> Q.pop()
4
>>> Q.popleft()
6
>>> Q
deque([5, 2, 1, 3])
>>> Q.rotate(3)
>>> Q
deque([2, 1, 3, 5])
>>> Q.rotate(-3)
>>> Q
deque([5, 2, 1, 3])

Deques with a maximum length

>>> last_three = collections.deque(maxlen=3)
>>> for i in range(10):
...     last_three.append(i)
...     print(','.join(str(x) for x in last_three))
...
0
0,1
0,1,2
1,2,3
2,3,4
3,4,5
4,5,6
5,6,7
6,7,8
7,8,9

The N largest and smallest elements of a list

>>> import random
>>> import heapq
>>> a = [random.randint(0, 100) for __ in range(100)]
>>> heapq.nsmallest(5, a)
[0, 1, 3, 4, 4]
>>> heapq.nlargest(5, a)
[100, 99, 98, 97, 96]

Cartesian product of two lists

>>> import itertools
>>> for p in itertools.product([1, 2, 3], [4, 5]):
...     print(p)
...
(1, 4)
(1, 5)
(2, 4)
(2, 5)
(3, 4)
(3, 5)
>>> for p in itertools.product([0, 1], repeat=4):
...     print(''.join(str(x) for x in p))
...
0000
0001
0010
0011
0100
0101
0110
0111
1000
1001
1010
1011
1100
1101
1110
1111

Combinations and combinations with replacement

>>> for c in itertools.combinations([1, 2, 3, 4, 5], 3):
...     print(''.join(str(x) for x in c))
...
123
124
125
134
135
145
234
235
245
345
>>> for c in itertools.combinations_with_replacement([1, 2, 3], 2):
...     print(''.join(str(x) for x in c))
...
11
12
13
22
23
33

Permutations of list elements

>>> for p in itertools.permutations([1, 2, 3, 4]):
...     print(''.join(str(x) for x in p))
...
1234
1243
1324
1342
1423
1432
2134
2143
2314
2341
2413
2431
3124
3142
3214
3241
3412
3421
4123
4132
4213
4231
4312
4321

Chaining iterators

>>> a = [1, 2, 3, 4]
>>> for p in itertools.chain(itertools.combinations(a, 2), itertools.combinations(a, 3)):
...     print(p)
...
(1, 2)
(1, 3)
(1, 4)
(2, 3)
(2, 4)
(3, 4)
(1, 2, 3)
(1, 2, 4)
(1, 3, 4)
(2, 3, 4)
>>> for subset in itertools.chain.from_iterable(itertools.combinations(a, n) for n in range(len(a) + 1)):
...     print(subset)
...
()
(1,)
(2,)
(3,)
(4,)
(1, 2)
(1, 3)
(1, 4)
(2, 3)
(2, 4)
(3, 4)
(1, 2, 3)
(1, 2, 4)
(1, 3, 4)
(2, 3, 4)
(1, 2, 3, 4)

Grouping rows by a column of a file

>>> import itertools
>>> with open('contactlenses.csv', 'r') as infile:
...     data = [line.strip().split(',') for line in infile]
...
>>> data = data[1:]
>>> def print_data(rows):
...     print('\n'.join('\t'.join('{: <16}'.format(s) for s in row) for row in rows))
...

>>> print_data(data)
young               myope                   no                      reduced                 none
young               myope                   no                      normal                  soft
young               myope                   yes                     reduced                 none
young               myope                   yes                     normal                  hard
young               hypermetrope            no                      reduced                 none
young               hypermetrope            no                      normal                  soft
young               hypermetrope            yes                     reduced                 none
young               hypermetrope            yes                     normal                  hard
pre-presbyopic      myope                   no                      reduced                 none
pre-presbyopic      myope                   no                      normal                  soft
pre-presbyopic      myope                   yes                     reduced                 none
pre-presbyopic      myope                   yes                     normal                  hard
pre-presbyopic      hypermetrope            no                      reduced                 none
pre-presbyopic      hypermetrope            no                      normal                  soft
pre-presbyopic      hypermetrope            yes                     reduced                 none
pre-presbyopic      hypermetrope            yes                     normal                  none
presbyopic          myope                   no                      reduced                 none
presbyopic          myope                   no                      normal                  none
presbyopic          myope                   yes                     reduced                 none
presbyopic          myope                   yes                     normal                  hard
presbyopic          hypermetrope            no                      reduced                 none
presbyopic          hypermetrope            no                      normal                  soft
presbyopic          hypermetrope            yes                     reduced                 none
presbyopic          hypermetrope            yes                     normal                  none

>>> data.sort(key=lambda r: r[-1])
>>> for value, group in itertools.groupby(data, lambda r: r[-1]):
...     print('-----------')
...     print('Group: ' + value)
...     print_data(group)
...
-----------
Group: hard
young               myope                   yes                     normal                  hard
young               hypermetrope            yes                     normal                  hard
pre-presbyopic      myope                   yes                     normal                  hard
presbyopic          myope                   yes                     normal                  hard
-----------
Group: none
young               myope                   no                      reduced                 none
young               myope                   yes                     reduced                 none
young               hypermetrope            no                      reduced                 none
young               hypermetrope            yes                     reduced                 none
pre-presbyopic      myope                   no                      reduced                 none
pre-presbyopic      myope                   yes                     reduced                 none
pre-presbyopic      hypermetrope            no                      reduced                 none
pre-presbyopic      hypermetrope            yes                     reduced                 none
pre-presbyopic      hypermetrope            yes                     normal                  none
presbyopic          myope                   no                      reduced                 none
presbyopic          myope                   no                      normal                  none
presbyopic          myope                   yes                     reduced                 none
presbyopic          hypermetrope            no                      reduced                 none
presbyopic          hypermetrope            yes                     reduced                 none
presbyopic          hypermetrope            yes                     normal                  none
-----------
Group: soft
young               myope                   no                      normal                  soft
young               hypermetrope            no                      normal                  soft
pre-presbyopic      myope                   no                      normal                  soft
pre-presbyopic      hypermetrope            no                      normal                  soft
presbyopic          hypermetrope            no                      normal                  soft

Reversing a string word by word

Reversing a string word by word is a common interview question. It is very simple to implement in Python:

def reverse_string_by_word(s):
    lst = s.split()  # split by blank space by default
    return ' '.join(lst[::-1])

s = 'Power of Love'
print(reverse_string_by_word(s))
# Love of Power

s = 'Hello    World!'
print(reverse_string_by_word(s))
# World! Hello

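The implementation above handles most cases, but it is not perfect. In the second string, for example, the exclamation mark stays attached to its word instead of being moved, and the number of spaces in the original string is not preserved. (In the example above there is more than one space between Hello and World.)

The result we expect looks like this:

print(reverse_string_by_word(s))
# Expected: !World    Hello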

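To improve the solution without overcomplicating it, the re module is a good choice. See the official documentation for re.split(). Let's look at some concrete examples: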

>>> import re
>>> s = 'Hello  World!'

>>> re.split(r'\s+', s)    # will discard blank spaces
['Hello', 'World!']

>>> re.split(r'(\s+)', s)  # will keep spaces as a group
['Hello', '  ', 'World!']

>>> s = '< Welcome to EF.COM! >'

>>> re.split(r'\s+', s)  # split by spaces
['<', 'Welcome', 'to', 'EF.COM!', '>']

>>> re.split(r'(\w+)', s)  # exactly split by word
['< ', 'Welcome', ' ', 'to', ' ', 'EF', '.', 'COM', '! >']

>>> re.split(r'(\s+|\w+)', s)  # split by space and word
['<', ' ', '', 'Welcome', '', ' ', '', 'to', '', ' ', '', 'EF', '.', 'COM', '!', ' ', '>']

>>> ''.join(re.split(r'(\s+|\w+)', s)[::-1])
'> !COM.EF to Welcome <'

>>> ''.join(re.split(r'(\s+)', s)[::-1])
'> EF.COM! to Welcome <'

>>> ''.join(re.split(r'(\w+)', s)[::-1])
'! >COM.EF to Welcome< '

If you find that reversing a sequence with a slice hurts readability, you can also write it like this:

>>> ''.join(reversed(re.split(r'(\s+|\w+)', s)))
'> !COM.EF to Welcome <'
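
Putting the pieces together, one possible improved version of reverse_string_by_word() is sketched below. It is only one way to combine the re.split() calls shown above, not necessarily the solution the author had in mind:

```python
import re

def reverse_string_by_word(s):
    """Reverse s token by token, keeping whitespace runs and punctuation."""
    # The capturing group keeps the matched separators (spaces and words)
    # in the result, so nothing from the original string is lost.
    tokens = re.split(r'(\s+|\w+)', s)
    return ''.join(reversed(tokens))

print(reverse_string_by_word('Power of Love'))    # Love of Power
print(reverse_string_by_word('Hello    World!'))  # !World    Hello
```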
