Manager makes it easy to share data between processes, but there is a trap to watch for when it manages mutable types such as list and dict. Consider the following code:
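from multiprocessing import Process, Manager

def test(m):
    # m[0] hands back a copy of the stored dict, so this change is lost
    m[0]['id'] = 2

if __name__ == '__main__':
    manager = Manager()
    m = manager.list()
    m.append({'id': 1})
    p = Process(target=test, args=(m,))
    p.start()
    p.join()
    print(m[0])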
The output is:
{'id': 1}
rather than the expected {'id': 2}.
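The cause can be observed without a second process. The sketch below is only an illustration (the function name demo and its return tuple are made up for this example): it suggests that every m[0] access returns a fresh plain dict copied out of the manager process, so a change made to that copy stays local until it is assigned back.

```python
from multiprocessing import Manager

def demo():
    # Hypothetical helper, written only for this illustration.
    with Manager() as manager:
        m = manager.list()
        m.append({'id': 1})

        first = m[0]    # a plain dict copied out of the manager process
        second = m[0]   # another, independent copy
        same_object = first is second

        first['id'] = 2           # changes only the local copy
        stored = m[0]['id']       # the manager still holds the old value

        m[0] = first              # assigning back goes through the proxy
        updated = m[0]['id']

        return same_object, type(first).__name__, stored, updated

if __name__ == '__main__':
    print(demo())
```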
To get the expected result, change the code to:
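from multiprocessing import Process, Manager

def test(m):
    hack = m[0]       # fetch a local copy of the stored dict
    hack['id'] = 2    # modify the copy
    m[0] = hack       # assign it back to trigger __setitem__ on the proxy

if __name__ == '__main__':
    manager = Manager()
    m = manager.list()
    m.append({'id': 1})
    p = Process(target=test, args=(m,))
    p.start()
    p.join()
    print(m[0])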
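The puzzling steps in this code work around a subtle issue with Manager: the proxy cannot detect changes made to the mutable objects stored inside the object it refers to. It only learns of a change when its __setitem__ method is triggered.
The line m[0] = hack is there precisely to trigger the proxy object's __setitem__ method. The official Python documentation explains the issue as follows: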
If standard (non-proxy) list or dict objects are contained in a referent, modifications to those mutable values will not be propagated through the manager because the proxy has no way of knowing when the values contained within are modified. However, storing a value in a container proxy (which triggers a __setitem__ on the proxy object) does propagate through the manager and so to effectively modify such an item, one could re-assign the modified value to the container proxy.
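Since the quoted passage concerns non-proxy objects, another possible approach is to store a proxy inside the list instead of a plain dict. The sketch below assumes Python 3.6 or later, where the documentation states that shared objects can be nested. The names set_id and run are made up for this illustration. Because m[0] is then a dict proxy, the in-place assignment itself goes through __setitem__ on a proxy.

```python
from multiprocessing import Process, Manager

def set_id(shared_list):
    # shared_list[0] is a dict proxy here, not a plain dict,
    # so this assignment is sent to the manager process.
    shared_list[0]['id'] = 2

def run():
    # Hypothetical helper, written only for this illustration.
    with Manager() as manager:
        m = manager.list()
        m.append(manager.dict({'id': 1}))   # nest a proxy, not a plain dict
        p = Process(target=set_id, args=(m,))
        p.start()
        p.join()
        return m[0].copy()   # copy() returns an ordinary dict snapshot

if __name__ == '__main__':
    print(run())
```

The trade-off is that every read or write of the nested dict becomes a round trip to the manager process, so the reassignment approach may still be preferable for values that are updated rarely.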
For details, see: