Python3 CookBook | Iterators and Generators

This article was first published in my Zhihu column.

All of the test code below is based on Python 3.

Reverse iteration

Iterating over a sequence in reverse is easy with the built-in reversed() function:

In [1]: a = [1, 2, 3, 4]
 
In [2]: for x in reversed(a):
   ...:     print(x)
   ...:
4
3
2
1

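Reverse iteration only works if the size of the object is known in advance (in practice, the object supports len() and integer indexing) or if the object implements the __reversed__() method. If neither condition holds, the object must first be converted to a list.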

# Print a file backwards
with open('somefile') as f:
    for line in reversed(list(f)):
        print(line, end='')
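The rule above can be checked directly. The following is a minimal sketch (the names numbers and gen are made up for illustration): reversed() works on a list, raises TypeError on a generator, and works again once the generator's values are collected into a list.

```python
numbers = [1, 2, 3]
# A list has a known length and supports indexing, so reversed() works directly.
from_list = list(reversed(numbers))

gen = (n * n for n in numbers)
try:
    reversed(gen)
    generator_ok = True
except TypeError:
    # A generator has neither a known size nor a __reversed__() method.
    generator_ok = False

# Converting to a list first makes reverse iteration possible.
from_gen = list(reversed(list(n * n for n in numbers)))

print(from_list)     # [3, 2, 1]
print(generator_ok)  # False
print(from_gen)      # [9, 4, 1]
```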

One caveat: if the iterable contains a large number of elements, converting it to a list consumes a lot of memory.

To avoid this problem, implement the __reversed__() method on a custom class, as in the following code:

#!/usr/bin/env python3
# -*- coding: utf-8 -*-

def reverse_iterate():
    # reversed() calls Countdown.__reversed__(): prints 1, 2, ..., 30
    for rr in reversed(Countdown(30)):
        print(rr)
    # Plain iteration calls Countdown.__iter__(): prints 30, 29, ..., 1
    for rr in Countdown(30):
        print(rr)


class Countdown:
    def __init__(self, start):
        self.start = start

    # Forward iterator
    def __iter__(self):
        n = self.start
        while n > 0:
            yield n
            n -= 1

    # Reverse iterator, called when reversed() is applied to the object
    def __reversed__(self):
        n = 1
        while n <= self.start:
            yield n
            n += 1


if __name__ == '__main__':
    reverse_iterate()

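This approach makes the code much more efficient, because the data no longer has to be loaded into a list before it can be iterated over in reverse.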
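To make the memory argument concrete, here is a small sketch that repeats the Countdown class from above and asks for only the first three values of a very large reversed countdown. The names big, rev and first_three are chosen for illustration. Because __reversed__() is a generator, nothing is computed until a value is requested.

```python
class Countdown:
    def __init__(self, start):
        self.start = start

    def __iter__(self):
        n = self.start
        while n > 0:
            yield n
            n -= 1

    def __reversed__(self):
        n = 1
        while n <= self.start:
            yield n
            n += 1


# list(Countdown(10**9)) would need several gigabytes of memory;
# the reversed generator only ever holds the current value.
big = Countdown(10**9)
rev = reversed(big)
first_three = [next(rev) for _ in range(3)]
print(first_three)  # [1, 2, 3]
```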

Slicing iterators

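Slicing is very convenient when working with lists. Unfortunately, iterators do not support the standard slice operation, mainly because the length of an iterator or generator is not known in advance and they do not implement indexing.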

In [3]: def count(n):
   ...:     while True:
   ...:         yield n
   ...:         n += 1
   ...:
 
In [4]: c = count(0)
 
In [5]: c[10: 20]
-----------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-5-60489cd5ce42> in <module>()
----> 1 c[10: 20]
 
TypeError: 'generator' object is not subscriptable

To slice an iterator or generator, use the itertools.islice() function:

In [6]: import itertools
 
In [7]: for x in itertools.islice(c, 10, 20):
   ...:     print(x)
   ...:
10
11
12
13
14
15
16
17
18
19

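There is a catch, though: islice() consumes the data of the iterator passed to it. The first call above already used up the values 0 through 19, so calling the function again with the same arguments returns a different result.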

In [8]: for x in itertools.islice(c, 10, 20):
   ...:     print(x)
   ...:
   ...:
30
31
32
33
34
35
36
37
38
39

So, if you want to use the result of a slice more than once, you need to store the data, for example in a list.
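One simple way to store the data is to pass the slice to list(). The sketch below reuses the count() generator from this section; the name saved is an arbitrary choice.

```python
import itertools


def count(n):
    while True:
        yield n
        n += 1


c = count(0)
# Materialize the slice once; the resulting list can be reused freely.
saved = list(itertools.islice(c, 10, 20))

print(saved)       # [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
print(sum(saved))  # 145
print(next(c))     # 20, because the generator itself has moved past the slice
```

Note that the list only protects the slice: the underlying generator c has still been advanced and cannot be rewound.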

Iterating in sorted order over merged sorted iterables

Suppose you have several sorted sequences and want to merge them into a single new sorted sequence. How should you do it?

The heapq.merge() function solves this problem neatly:

In [9]: import heapq
 
In [10]: a = [1, 4, 7, 10]
 
In [11]: b = [2, 5, 6, 11]
 
In [12]: heapq.merge(a, b)
Out[12]: <generator object merge at 0x1087ab570>
 
In [13]: for x in heapq.merge(a, b):
    ...:     print(x)
    ...:
1
2
4
5
6
7
10
11

One thing to note: the sequences passed in must already be sorted.
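The sorting requirement is not enforced: heapq.merge() does not check its inputs, it only compares the current front item of each one. The following sketch (variable names are illustrative) shows what happens when one input is out of order.

```python
import heapq

sorted_result = list(heapq.merge([1, 4, 7], [2, 5, 6]))
# The second list below is NOT sorted; heapq.merge() does not detect this.
unsorted_result = list(heapq.merge([1, 4, 7], [6, 2, 5]))

print(sorted_result)    # [1, 2, 4, 5, 6, 7]
print(unsorted_result)  # [1, 4, 6, 2, 5, 7]
```

No error is raised in the second case; the output is simply no longer sorted.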

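If the sequences contain a very large number of elements, there is still no need to worry about efficiency. As the code above shows, heapq.merge() returns a generator rather than a list, so it never reads all of the input sequences into memory at once.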
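As a small illustration of this laziness, the sketch below merges two infinite sorted streams, which could never be turned into lists. It assumes itertools.count() as a stand-in data source and takes the first few merged values with itertools.islice(); the names evens, odds and first_six are made up for the example.

```python
import heapq
import itertools

evens = itertools.count(0, 2)  # 0, 2, 4, ... (never ends)
odds = itertools.count(1, 2)   # 1, 3, 5, ... (never ends)

# merge() pulls items from each input only as they are needed.
merged = heapq.merge(evens, odds)
first_six = list(itertools.islice(merged, 6))
print(first_six)  # [0, 1, 2, 3, 4, 5]
```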

To be continued...
