Python3 Cookbook Study Notes -- Iterators and Generators

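1. Manually Traversing an Iterator

You want to process every element of an iterable, but without using a for loop.

Call next() and catch the StopIteration exception in your code. Alternatively, pass next() a default value, which it returns to mark the end instead of raising.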

>>> items = [1, 2, 3]
>>> it = iter(items) # Invokes items.__iter__()
>>> 
>>> next(it) # Invokes it.__next__()
1
>>> next(it)
2
>>> next(it)
3
>>> next(it)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
>>> 
>>> next(it, None)

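What counts as an iterable? An iterable is an object that implements __iter__(), which returns an iterator. The iterator in turn implements __next__() (plus an __iter__() that returns itself). A list is iterable but is not itself an iterator, which is why the example above calls iter(items) first.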
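The two approaches described in this section (catching StopIteration, or passing a default to next()) can be sketched as loops. The helper names manual_iter and manual_iter_with_default are made up for this illustration and do not come from the book.

```python
def manual_iter(iterable):
    """Collect every element using next() instead of a for loop."""
    it = iter(iterable)          # calls iterable.__iter__()
    collected = []
    try:
        while True:
            collected.append(next(it))   # calls it.__next__()
    except StopIteration:
        pass                     # the iterator is exhausted
    return collected


def manual_iter_with_default(iterable):
    """Same idea, but use a sentinel default instead of catching the exception."""
    it = iter(iterable)
    end = object()               # unique marker that cannot collide with real data
    collected = []
    while True:
        item = next(it, end)
        if item is end:
            break
        collected.append(item)
    return collected


print(manual_iter([1, 2, 3]))
print(manual_iter_with_default([None, 'a']))
```

A private object() sentinel is used rather than None so that a None stored in the data is not mistaken for the end of the iteration.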

2. Delegating Iteration

class Node:

    def __init__(self, value):
        self._value = value
        self._children=[]

    def __repr__(self):
        return 'Node({!r})'.format(self._value)

    def add_child(self, child):
        self._children.append(child)

    def __iter__(self):
        return iter(self._children)


if __name__ == '__main__':
    root=Node(0)
    child1=Node(1)
    child2=Node(2)
    root.add_child(child1)
    root.add_child(child2)
    print(root)
    for child in root:
        print(child)

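Python's iterator protocol requires __iter__() to return an iterator object that implements __next__(). If all you are doing is iterating over the contents of another container, you don't need to worry about how that works underneath; you just forward the iteration request.

Using iter() here keeps the code short: iter(s) simply returns the underlying iterator by calling s.__iter__(), in the same way that len(s) calls s.__len__().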
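As a small illustration of what delegation buys you, here is a sketch with a hypothetical Team class (not from the book). Once __iter__() forwards to the internal list, the object works with anything that consumes an iterable.

```python
class Team:
    """Hypothetical container that delegates iteration to an internal list."""

    def __init__(self, *members):
        self._members = list(members)

    def __iter__(self):
        # Forward the iteration request to the underlying list
        return iter(self._members)


team = Team('bob', 'ann')

as_list = list(team)        # list() drives the delegated iterator
in_order = sorted(team)     # so does sorted()
first, second = team        # and tuple unpacking
has_ann = 'ann' in team     # membership tests fall back to iteration

print(as_list, in_order, first, second, has_ann)
```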

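3. Creating New Iteration Patterns with Generators

To implement a new iteration pattern, define a generator function: any function containing a yield statement becomes a generator. Unlike a normal function, a generator only runs in response to iteration, and it can be consumed only once.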

>>> def countdown(n):
...     print('Starting to count from', n)
...     while n > 0:
...             yield n
...             n -= 1
...     print('Done!')
... 
>>> 
>>> c=countdown(3)
>>> c
<generator object countdown at 0x101aa93b8>
>>> 
>>> next(c)
Starting to count from 3
3
>>> next(c)
2
>>> next(c)
1
>>> next(c)
Done!
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

A generator can be consumed only once:

>>> n=(n*n for n in range(0, 10))
>>> 
>>> 
>>> sum(n)
285
>>> sum(n)
0
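A custom iteration pattern can be sketched as a generator that produces a range of floats. The name frange and its parameters are chosen for illustration. The example also shows that an exhausted generator yields nothing, while calling the function again gives a fresh one.

```python
def frange(start, stop, increment):
    """Yield start, start + increment, ... while the value stays below stop."""
    x = start
    while x < stop:
        yield x
        x += increment


gen = frange(0, 2, 0.5)
first_pass = list(gen)     # consumes the generator
second_pass = list(gen)    # already exhausted, so this is empty
fresh = list(frange(0, 2, 0.5))   # a new call creates a new generator

print(first_pass)
print(second_pass)
print(fresh)
```

The step 0.5 is exactly representable in binary floating point, so the values here are exact; other steps may accumulate rounding error.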

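4. Implementing the Iterator Protocol

When you want to build a custom object that supports iteration, the easiest approach is usually a generator function.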

class Node:
    def __init__(self, value):
        self._value = value
        self._children=[]

    def __repr__(self):
        return 'Node({!r})'.format(self._value)

    def add_child(self, child):
        self._children.append(child)

    def __iter__(self):
        return iter(self._children)

    def depth_first(self):
        yield self
        for c in self:
            yield from c.depth_first()

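Python's iteration protocol requires __iter__() to return a special iterator object that implements __next__() and signals the end of iteration by raising StopIteration. Implementing all of that by hand is usually tedious.

The most direct way to write this recursive depth-first traversal of a tree of nodes is to use yield and yield from, which turn depth_first() into a generator function.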
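The notes define depth_first() but do not show it running. The sketch below repeats the Node class so that it is self-contained and walks a small tree whose shape is made up for this example.

```python
class Node:
    def __init__(self, value):
        self._value = value
        self._children = []

    def __repr__(self):
        return 'Node({!r})'.format(self._value)

    def add_child(self, child):
        self._children.append(child)

    def __iter__(self):
        return iter(self._children)

    def depth_first(self):
        yield self                          # visit this node first
        for c in self:
            yield from c.depth_first()      # then each subtree in turn


# Example tree:  0 -> (1 -> (3, 4), 2 -> (5))
root = Node(0)
child1 = Node(1)
child2 = Node(2)
root.add_child(child1)
root.add_child(child2)
child1.add_child(Node(3))
child1.add_child(Node(4))
child2.add_child(Node(5))

visited = [repr(node) for node in root.depth_first()]
print(visited)
```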

5. Iterating in Reverse

To iterate over a sequence in reverse, use the built-in reversed() function.

>>> a = [1, 2, 3, 4]
>>> for x in reversed(a):
...     print(x)
...
4
3
2
1

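Reverse iteration only works if at least one of the following holds:

the object's size can be determined in advance (it supports the sequence protocol, that is, __len__() and __getitem__())
the object implements the __reversed__() special method

If neither holds, you have to convert the object to a list first.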
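The sequence-protocol case can be sketched with a hypothetical Deck class (not from the book) that defines only __len__() and __getitem__(). A generator has neither those methods nor __reversed__(), so reversed() rejects it.

```python
class Deck:
    """Hypothetical sequence: supports len() and integer indexing only."""

    def __init__(self, cards):
        self._cards = list(cards)

    def __len__(self):
        return len(self._cards)

    def __getitem__(self, index):
        return self._cards[index]


deck = Deck(['A', 'K', 'Q'])
backwards = list(reversed(deck))   # works through __len__ and __getitem__

gen = (c for c in 'AKQ')
generator_rejected = False
try:
    reversed(gen)                  # a generator has no known size
except TypeError:
    generator_rejected = True

print(backwards, generator_rejected)
```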

>>> f = open('/Users/faris/Desktop/docker.txt')
>>> 
>>> for line in reversed(list(f)):
...     print(line, end='')

Implementing __reversed__() on a custom class makes reverse iteration more efficient, because the data no longer has to be copied into a list first.

class Countdown:
    def __init__(self, start):
        self.start = start

    # Forward iteration: start, start-1, ..., 1
    def __iter__(self):
        n = self.start
        while n > 0:
            yield n
            n -= 1

    # Reverse iteration: 1, 2, ..., start (the exact mirror of __iter__)
    def __reversed__(self):
        n = 1
        while n <= self.start:
            yield n
            n += 1

>>> a=Countdown(10)
>>> [n for n in a]
[10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
>>> 
>>> [n for n in reversed(a)]
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

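6. Slicing an Iterator

Standard slicing needs to index into the object and know its length, so it cannot be applied to iterators or generators.

itertools.islice() is designed for exactly this: slicing iterators and generators.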

>>> a=[10, 20, 30]
>>> 
>>> a[1:]
[20, 30]
>>>  
>>> b=(n for n in a)
>>> 
>>> b[1:]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'generator' object is not subscriptable
>>> 
>>> import itertools
>>> b=(n for n in a)
>>> for x in itertools.islice(b, 1, 2):
...     print(x)
... 
20
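One point worth keeping in mind is that islice() consumes the underlying iterator, including the items it skips, and an iterator cannot be rewound. A minimal sketch, using a made-up naturals() generator:

```python
from itertools import islice


def naturals():
    """Endless generator: 0, 1, 2, ..."""
    n = 0
    while True:
        yield n
        n += 1


gen = naturals()
window = list(islice(gen, 10, 15))   # skips items 0-9, then takes items 10-14
after = next(gen)                    # the generator resumes where islice stopped

print(window)
print(after)
```

If the skipped items are needed later, convert the data to a list first and slice that instead.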

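7. Skipping the First Part of an Iterable

You want to iterate over an iterable, but you are not interested in the first few items and want to skip them.

itertools.dropwhile() takes a function and an iterable and returns an iterator. It discards leading items for as long as the function returns True, then yields everything that follows.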

>>> from itertools import dropwhile
>>> 
>>> with open('/etc/passwd') as f:
...     for line in dropwhile(lambda line: line.startswith('#'), f):
...         print(line, end='')

If you know exactly how many items to skip, you can use itertools.islice() from the previous section instead:

>>> from itertools import islice
>>> 
>>> a=[10, 20, 30, 40, 50]
>>> for n in islice(a, 3, None):
...     print(n)
... 
40
50

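Alternatively, a generator expression can do the filtering. Note that this version drops every line that starts with '#', not only the ones at the beginning of the file: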

>>> with open('/etc/passwd') as f:
...     lines = (line for line in f if not line.startswith('#'))
...     for line in lines:
...         print(line, end='')
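The difference between the two approaches is easier to see on in-memory data than on /etc/passwd. The sample lines below are invented for the illustration.

```python
from itertools import dropwhile

# Made-up sample input: two leading comments and one comment in the middle
lines = [
    '# header\n',
    '# more header\n',
    'root:x:0\n',
    '# inline comment\n',
    'daemon:x:1\n',
]


def is_comment(line):
    return line.startswith('#')


# dropwhile() only discards the leading run of comment lines
skipped_prefix = list(dropwhile(is_comment, lines))

# A filtering generator expression discards comment lines everywhere
filtered_all = list(line for line in lines if not is_comment(line))

print(skipped_prefix)
print(filtered_all)
```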

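8. Iterating Over Permutations and Combinations

You want to iterate over all possible permutations or combinations of the items in a collection.

8.1 itertools.permutations()

Produces every permutation of the input sequence: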

>>> from itertools import permutations
>>> 
>>> items = ['a', 'b', 'c']
>>> for p in permutations(items):
...     print(p)
... 
('a', 'b', 'c')
('a', 'c', 'b')
('b', 'a', 'c')
('b', 'c', 'a')
('c', 'a', 'b')
('c', 'b', 'a')

You can also specify the length of each permutation:

>>> from itertools import permutations
>>> 
>>> items = ['a', 'b', 'c']
>>> for p in permutations(items, 2):
...     print(p)
... 
('a', 'b')
('a', 'c')
('b', 'a')
('b', 'c')
('c', 'a')
('c', 'b')

8.2 itertools.combinations()

Produces every combination of the input sequence (order does not matter):

>>> from itertools import combinations
>>> 
>>> items = ['a', 'b', 'c']
>>> for p in combinations(items, 2):
...     print(p)
... 
('a', 'b')
('a', 'c')
('b', 'c')

8.3 itertools.combinations_with_replacement()

When forming combinations, the same item may be chosen more than once:

>>> from itertools import combinations_with_replacement
>>> 
>>> items = ['a', 'b', 'c']
>>> for p in combinations_with_replacement(items, 2):
...     print(p)
... 
('a', 'a')
('a', 'b')
('a', 'c')
('b', 'b')
('b', 'c')
('c', 'c')
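As a sanity check on the three functions, the number of results can be compared with the standard counting formulas. This sketch assumes Python 3.8 or later, where math.comb() is available.

```python
from itertools import combinations, combinations_with_replacement, permutations
from math import comb, factorial

items = ['a', 'b', 'c']
n, r = len(items), 2

n_perms = len(list(permutations(items, r)))
n_combs = len(list(combinations(items, r)))
n_cwr = len(list(combinations_with_replacement(items, r)))

# Expected counts from the standard formulas
expected_perms = factorial(n) // factorial(n - r)   # n! / (n - r)!
expected_combs = comb(n, r)                         # C(n, r)
expected_cwr = comb(n + r - 1, r)                   # C(n + r - 1, r)

print(n_perms, n_combs, n_cwr)
```

These counts grow quickly with n and r, so it is usually better to iterate over the results lazily than to build the full list as done here.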

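9. Iterating Over Index-Value Pairs of a Sequence

You want to iterate over a sequence while keeping track of the index of the item being processed.

The built-in enumerate() function handles this nicely: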

>>> my_list = ['a', 'b', 'c']
>>> 
>>> for n in enumerate(my_list):
...     print(n)
... 
(0, 'a')
(1, 'b')
(2, 'c')

You can pass a starting index as the second argument:

>>> my_list = ['a', 'b', 'c']
>>> 
>>> for n in enumerate(my_list, 2):
...     print(n)
... 
(2, 'a')
(3, 'b')
(4, 'c')

This makes it easy to pinpoint the offending line when reporting errors:

def parse_data(filename):
    with open(filename, 'rt') as f:
        for lineno, line in enumerate(f, 1):
            fields = line.split()
            try:
                count = int(fields[1])
            except ValueError as e:
                print('Line {}: Parse error: {}'.format(lineno, e))

One more thing to watch out for:

When the sequence elements are themselves tuples, unpack them like this:

>>> data = [ (1, 2), (3, 4), (5, 6), (7, 8) ]
>>> 
>>> for no, (x, y) in enumerate(data):
...     print(no, x, y)
... 
0 1 2
1 3 4
2 5 6
3 7 8
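Another common use of enumerate() with a start value of 1 is to map each word to the line numbers on which it appears. A minimal sketch over made-up in-memory text, so that no file is needed:

```python
from collections import defaultdict

# Made-up input standing in for the lines of a file
text = [
    'the quick brown fox',
    'jumps over the lazy dog',
]

word_lines = defaultdict(list)
for lineno, line in enumerate(text, 1):      # line numbers start at 1
    for word in line.split():
        word_lines[word.lower()].append(lineno)

print(word_lines['the'])
print(word_lines['fox'])
```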

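10. Flattening a Nested Sequence

You want to flatten a multi-level nested sequence into a single flat list.

A recursive generator containing a yield from statement solves this easily: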

>>> from collections.abc import Iterable
>>> 
>>> def flatten(items, ignore_types=(str, bytes)):
...     for n in items:
...         if isinstance(n, Iterable) and not isinstance(n, ignore_types):
...             # Pass ignore_types down so nested levels honor it too
...             yield from flatten(n, ignore_types)
...         else:
...             yield n
... 
>>> 
>>> items = [1, 2, [3, 4, [5, 6], 7], 8, 'faris']
>>> 
>>> print([n for n in flatten(items)])
[1, 2, 3, 4, 5, 6, 7, 8, 'faris']

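isinstance(n, Iterable) checks whether an element is iterable; if it is, yield from delegates to a recursive flatten() call. The ignore_types check keeps strings and bytes from being expanded into individual characters.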
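The ignore_types parameter can also be used to treat other container types as atoms. The sketch below repeats flatten() so that it runs on its own, forwards ignore_types on the recursive call, and keeps tuples intact in the second run. The sample data is made up.

```python
from collections.abc import Iterable


def flatten(items, ignore_types=(str, bytes)):
    for item in items:
        if isinstance(item, Iterable) and not isinstance(item, ignore_types):
            # Forward ignore_types so nested levels follow the same rule
            yield from flatten(item, ignore_types)
        else:
            yield item


nested = [1, (2, 3), [4, (5, 6)], 'ab']

default_result = list(flatten(nested))
tuples_kept = list(flatten(nested, ignore_types=(str, bytes, tuple)))

print(default_result)
print(tuples_kept)
```

Without the forwarded argument, the tuple nested inside the inner list would still be expanded, because the recursive call would fall back to the default (str, bytes).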

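11. Iterating in Sorted Order Over Merged Sorted Iterables

You have several sorted sequences and want to iterate over a single sorted sequence that merges them.

Use heapq.merge(). Because it merges lazily rather than reading everything up front, it can be used on very long sequences without much overhead.

However, every input sequence must already be sorted.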

>>> import heapq
>>> a = [1, 4, 7, 10]
>>> b = [2, 5, 6, 11]
>>> 
>>> for n in heapq.merge(a, b):
...     print(n)
... 
1
2
4
5
6
7
10
11

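If one of the sequences is not sorted, heapq.merge() does not check; it simply compares the current front items and emits the smaller one: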

>>> import heapq
>>> a = [1, 4, 7, 10]
>>> b = [ 6, 11, 2, 5]
>>> 
>>> for n in heapq.merge(a, b):
...     print(n)
... 
1
4
6
7
10
11
2
5
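The lazy behavior can be demonstrated by merging two endless iterators and taking only the first few results. The second half of the sketch shows one possible way to handle an unsorted input, namely sorting it before the merge; this is an illustration rather than the only option.

```python
import heapq
from itertools import count, islice

# Two infinite, already sorted iterators: 0, 2, 4, ... and 1, 3, 5, ...
evens = count(0, 2)
odds = count(1, 2)

# merge() pulls items only on demand, so taking six results terminates
first_six = list(islice(heapq.merge(evens, odds), 6))

# An unsorted input has to be sorted before it is passed to merge()
a = [1, 4, 7, 10]
b = [6, 11, 2, 5]
merged = list(heapq.merge(a, sorted(b)))

print(first_six)
print(merged)
```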
