Using the multiprocessing module within mod_wsgi is a really bad idea.
This is because it is an embedded system in which Apache and mod_wsgi
manage the processes. Once you start using the multiprocessing module,
which tries to do its own process management, it could potentially
interfere with the operation of Apache/mod_wsgi in unexpected ways.
For example, taking your code and changing it so that it does not
depend on web.py, I get:

    import multiprocessing
    import os

    def x(y):
        print os.getpid(), 'x', y
        return y

    def application(environ, start_response):
        status = '200 OK'
        output = 'Hello World!'

        response_headers = [('Content-type', 'text/plain'),
                            ('Content-Length', str(len(output)))]
        start_response(status, response_headers)

        print 'create pool'
        pool = multiprocessing.Pool(processes=1)

        print 'map call'
        result = pool.map(x, [1])

        print os.getpid(), 'doit', result

        return [output]

If I fire off a request to this, it appears to work correctly,
returning the 'Hello World!' string and logging the appropriate messages:

[Tue May 03 09:40:36 2011] [info] [client 127.0.0.1] mod_wsgi (pid=32752, process='hello-1', application='hello-1.example.com|/mptest.wsgi'): Loading WSGI script '/Library/WebServer/Sites/hello-1/htdocs/mptest.wsgi'.
[Tue May 03 09:40:36 2011] [error] create pool
[Tue May 03 09:40:36 2011] [error] map call
[Tue May 03 09:40:36 2011] [error] 32753 x 1
[Tue May 03 09:40:36 2011] [error] 32752 doit [1]

However, the process then appears to receive a signal from somewhere,
causing it to shut down:

[Tue May 03 09:40:36 2011] [info] mod_wsgi (pid=32752): Shutdown requested 'hello-1'.
[Tue May 03 09:40:41 2011] [info] mod_wsgi (pid=32752): Aborting process 'hello-1'.

The multiprocessing module does issue signals, so it may be the source of this.
One thought was that this may be occurring when the pool is destroyed
at the end of the function call, so I moved the creation of the pool to
module scope:

    import multiprocessing
    import os

    print 'create pool'
    pool = multiprocessing.Pool(processes=1)

    def x(y):
        print os.getpid(), 'x', y
        return y

    def application(environ, start_response):
        status = '200 OK'
        output = 'Hello World!'

        response_headers = [('Content-type', 'text/plain'),
                            ('Content-Length', str(len(output)))]
        start_response(status, response_headers)

        print 'map call'
        result = pool.map(x, [1])

        print os.getpid(), 'doit', result

        return [output]

This, though, will not even run:

[Tue May 03 09:47:31 2011] [info] [client 127.0.0.1] mod_wsgi (pid=32893, process='hello-1', application='hello-1.example.com|/mptest.wsgi'): Loading WSGI script '/Library/WebServer/Sites/hello-1/htdocs/mptest.wsgi'.
[Tue May 03 09:47:31 2011] [error] create pool
[Tue May 03 09:47:31 2011] [error] map call
[Tue May 03 09:47:31 2011] [error] Process PoolWorker-1:
[Tue May 03 09:47:31 2011] [error] Traceback (most recent call last):
[Tue May 03 09:47:31 2011] [error] File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/multiprocessing/process.py", line 231, in _bootstrap
[Tue May 03 09:47:31 2011] [error] self.run()
[Tue May 03 09:47:31 2011] [error] File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/multiprocessing/process.py", line 88, in run
[Tue May 03 09:47:31 2011] [error] self._target(*self._args, **self._kwargs)
[Tue May 03 09:47:31 2011] [error] File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/multiprocessing/pool.py", line 57, in worker
[Tue May 03 09:47:31 2011] [error] task = get()
[Tue May 03 09:47:31 2011] [error] File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/multiprocessing/queues.py", line 339, in get
[Tue May 03 09:47:31 2011] [error] return recv()
[Tue May 03 09:47:31 2011] [error] AttributeError: 'module' object has no attribute 'x'

The browser also then hangs at that point.
Part of the issue here may be that WSGI script files are not really
standard Python modules, in that the basename of the WSGI script file
doesn't match a module in sys.modules. If the multiprocessing module
tries to do magic with imports to find the original code to execute
in the subprocess, it isn't going to work.

Specifically, this may be related to:
http://code.google.com/p/modwsgi/wiki/IssuesWithPickleModule
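
To illustrate why the module name matters, here is a minimal sketch. It is written for Python 3, not the Python 2.6 used in the logs. It shows that pickle stores a module-level function as just a module name plus a function name, so whoever unpickles it must be able to find that module by name and then find the function inside it. The module name _example_wsgi_script and the helpers make_module() and describe_load() are made up for illustration, and this does not reproduce how mod_wsgi actually loads a script file.

```python
import pickle
import sys
import types

# Hypothetical module name; it stands in for whatever name the WSGI
# script file ends up being loaded under.
SCRIPT_NAME = "_example_wsgi_script"


def make_module(with_x):
    """Build a throwaway module, optionally defining x(), and register it."""
    module = types.ModuleType(SCRIPT_NAME)
    if with_x:
        exec("def x(y):\n    return y\n", module.__dict__)
    sys.modules[SCRIPT_NAME] = module
    return module


def describe_load(payload):
    """Try to unpickle payload; return x(1) or the name of the error raised."""
    try:
        func = pickle.loads(payload)
    except (ImportError, AttributeError) as exc:
        return type(exc).__name__
    return func(1)


# "Parent" side: x() exists, so pickling succeeds. The payload carries the
# module name and function name only, not the body of the function.
parent = make_module(with_x=True)
payload = pickle.dumps(parent.x)
names_only = SCRIPT_NAME.encode() in payload

# Same module still registered: the lookup by name works.
same_process = describe_load(payload)

# A "worker" whose module of that name has no x(): the lookup fails.
make_module(with_x=False)
missing_attribute = describe_load(payload)

# A "worker" that cannot import a module of that name at all.
del sys.modules[SCRIPT_NAME]
missing_module = describe_load(payload)

print(names_only, same_process, missing_attribute, missing_module)
```

The missing-attribute case is the same kind of failure as the AttributeError in the log above: a module was found under the expected name, but it had no x in it.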
If I attempt to move x() into being a nested function, as in:

    import multiprocessing
    import os

    print 'create pool'
    pool = multiprocessing.Pool(processes=1)

    def application(environ, start_response):
        status = '200 OK'
        output = 'Hello World!'

        response_headers = [('Content-type', 'text/plain'),
                            ('Content-Length', str(len(output)))]
        start_response(status, response_headers)

        def x(y):
            print os.getpid(), 'x', y
            return y

        print 'map call'
        result = pool.map(x, [1])

        print os.getpid(), 'doit', result

        return [output]

Then one does get pickle errors, albeit for a different reason:

[Tue May 03 09:52:59 2011] [info] [client 127.0.0.1] mod_wsgi (pid=33010, process='hello-1', application='hello-1.example.com|/mptest.wsgi'): Loading WSGI script '/Library/WebServer/Sites/hello-1/htdocs/mptest.wsgi'.
[Tue May 03 09:52:59 2011] [error] create pool
[Tue May 03 09:52:59 2011] [error] map call
[Tue May 03 09:52:59 2011] [error] Exception in thread Thread-1:
[Tue May 03 09:52:59 2011] [error] Traceback (most recent call last):
[Tue May 03 09:52:59 2011] [error] File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/threading.py", line 522, in __bootstrap_inner
[Tue May 03 09:52:59 2011] [error] self.run()
[Tue May 03 09:52:59 2011] [error] File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/threading.py", line 477, in run
[Tue May 03 09:52:59 2011] [error] self.__target(*self.__args, **self.__kwargs)
[Tue May 03 09:52:59 2011] [error] File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/multiprocessing/pool.py", line 225, in _handle_tasks
[Tue May 03 09:52:59 2011] [error] put(task)
[Tue May 03 09:52:59 2011] [error] PicklingError: Can't pickle <type 'function'>: attribute lookup __builtin__.function failed

So, it is doing pickling in some form, which isn't going to work for
code defined in the WSGI script file.
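
The nested function case can be reproduced without mod_wsgi or multiprocessing at all. The following is a small sketch for Python 3; the names application_like() and can_pickle() are invented for the example. The exact exception raised for a nested function differs between Python versions, so the sketch catches several types rather than relying on one.

```python
import pickle


def application_like():
    """Return a function defined inside another function, like the nested x()."""
    def x(y):
        return y
    return x


def can_pickle(obj):
    """Report whether pickle is able to serialise obj."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError, TypeError):
        return False
    return True


# A nested function cannot be looked up by module name plus function name,
# so pickle refuses it.
nested_ok = can_pickle(application_like())

# A builtin such as len can be found by name, so pickling it works.
builtin_ok = can_pickle(len)

print(nested_ok, builtin_ok)
```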
If you really want to pursue this, then I suggest you move this code
outside of the WSGI script file and put it in a standard module on the
Python module search path you have set up for the application.
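
As a rough sketch of what that might look like, written for Python 3: the function lives in an ordinary module file, here given the made-up name mptest_tasks, in a directory on the module search path. A temporary directory stands in for whatever directory you actually use. The sketch only checks that a function from such a module survives a pickle round trip even when the module has to be imported afresh by name. It says nothing about the process management concerns.

```python
import importlib
import os
import pickle
import shutil
import sys
import tempfile

# Hypothetical module name and contents, for illustration only.
MODULE_NAME = "mptest_tasks"
MODULE_SOURCE = "def x(y):\n    return y\n"

# Stand-in for a directory that is on the application's module search path.
module_dir = tempfile.mkdtemp()
with open(os.path.join(module_dir, MODULE_NAME + ".py"), "w") as handle:
    handle.write(MODULE_SOURCE)

sys.path.insert(0, module_dir)
importlib.invalidate_caches()

# What the WSGI script file would do: import the function from the module.
tasks = importlib.import_module(MODULE_NAME)
payload = pickle.dumps(tasks.x)

# Forget the module, as if in a process that has not imported it yet.
# Unpickling then imports it again by name from the search path.
del sys.modules[MODULE_NAME]
restored = pickle.loads(payload)
result = restored(1)

print(restored.__module__, result)

# Tidy up the temporary directory and path entry.
sys.path.remove(module_dir)
shutil.rmtree(module_dir)
```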
Overall, though, I would recommend against using the multiprocessing
module from inside of mod_wsgi.
Graham