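I'm working on a project that requires me to extract a large amount of information from a set of files. The file format and most of the project details are irrelevant to my question. What I can't figure out is how to share a dictionary with all of the processes in a process pool.
Here is my code (variable names changed and most of the code removed; only the relevant parts are shown):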
import json
import multiprocessing
from multiprocessing import Pool, Lock, Manager
import glob
import os

def record(thing, map):
    with mutex:
        if thing in map:
            map[thing] += 1
        else:
            map[thing] = 1

def getThing(file, n, map):
    # do stuff
    thing = file.read()
    record(thing, map)

def init(l):
    global mutex
    mutex = l

def main():
    # create a manager to manage shared dictionaries
    manager = Manager()
    # get the list of filenames to be analyzed
    fileSet1 = glob.glob("filesSet1/*")
    fileSet2 = glob.glob("fileSet2/*")
    # create a global mutex for the processes to share
    l = Lock()
    map = manager.dict()
    # create a process pool, give it the global mutex, and max cpu count-1 (manager is its own process)
    with Pool(processes=multiprocessing.cpu_count()-1, initializer=init, initargs=(l,)) as pool:
        pool.map(lambda file: getThing(file, 2, map), fileSet1)  # This is the line I need help with

main()
As far as I can tell, the lambda function should work. The line I need help with is: pool.map(lambda file: getThing(file, 2, map), fileSet1). It fails with the error "AttributeError: Can't pickle local object 'main.<locals>.<lambda>'".
Any help would be greatly appreciated!
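For context on the error: pool.map has to pickle the callable in order to send it to the worker processes, and a lambda defined inside main() cannot be pickled by the standard pickle module. One possible workaround, offered as a minimal sketch rather than a definitive fix, is to bind the extra arguments to the module-level function with functools.partial. The names get_thing, counts, and worker below are illustrative renames, and the sketch assumes each item in the list is a file path whose contents are the "thing" to be counted (the original "do stuff" step is not shown, so this is a guess).

```python
import glob
import multiprocessing
from functools import partial
from multiprocessing import Lock, Manager, Pool


def init(lock):
    # Runs once in each worker process; stores the shared lock in a module global.
    global mutex
    mutex = lock


def record(thing, counts):
    # The read-modify-write on the shared dict is not atomic, so hold the lock.
    with mutex:
        counts[thing] = counts.get(thing, 0) + 1


def get_thing(path, n, counts):
    # Assumption: each item is a file path and the "thing" is the file's contents.
    with open(path) as f:
        thing = f.read()
    record(thing, counts)


def main():
    manager = Manager()
    counts = manager.dict()  # dict proxy that can be passed to worker processes
    lock = Lock()
    files = glob.glob("filesSet1/*")
    # A partial of a module-level function can be pickled; a local lambda cannot.
    worker = partial(get_thing, n=2, counts=counts)
    # Keep at least one worker even on a single-core machine.
    workers = max(1, multiprocessing.cpu_count() - 1)
    with Pool(processes=workers, initializer=init, initargs=(lock,)) as pool:
        pool.map(worker, files)
    print(dict(counts))


if __name__ == "__main__":
    main()
```

The `if __name__ == "__main__":` guard matters on platforms that start workers by spawning a fresh interpreter, since each worker re-imports the main module. Renaming the dictionary away from `map` also avoids shadowing the built-in.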