For the next release of PyMongo, Bernie Hackett and I have updated PyMongo's C extensions to be more compatible with mod_wsgi, and improved PyMongo's docs about mod_wsgi. In the process I had to buckle down and finally grasp the relationship between mod_wsgi and C extensions. The only way I could master it was to experiment. Enjoy the fruits of my research.

Summary: Python scripts in a mod_wsgi daemon process are isolated from each other in separate Python sub interpreters, but the state of their C extensions is not isolated. If a C extension tries to import and use Python classes, it creates an unholy mix of classes from different interpreters. PyMongo fell victim to this issue when encoding and decoding BSON. We used to have a hack for this, and now we have a nice workaround. When you deploy WSGI scripts, configure mod_wsgi to isolate your scripts from each other completely, by running them in separate daemon processes. If you maintain a Python extension, our latest strategies can help you make your script compatible with mod_wsgi.

Python Runs In A Daemon Process

Graham Dumpleton, mod_wsgi's author, recommends we use daemon mode under most circumstances. So the first step in configuring mod_wsgi is to fork a daemon process with WSGIDaemonProcess:

<VirtualHost *>
    WSGIDaemonProcess my_process
</VirtualHost>

Now, I'll use WSGIScriptAlias to tell mod_wsgi where my script is, and use WSGIProcessGroup to assign the script to the daemon:

<VirtualHost *>
    WSGIDaemonProcess my_process
    WSGIScriptAlias /my_app /path/to/app.wsgi
    WSGIProcessGroup my_process
</VirtualHost>

Python Variables Are Isolated In Sub Interpreters

When the daemon runs my script, it needs a Python interpreter. By default, the daemon uses a different Python interpreter for each "resource" on my web server. A resource is the concatenation of the server name, port number, and script name, so requests to my application over port 8080 might use an interpreter named "example.com:8080|/my_app". Each distinct combination of domain name, port number, and script name is mapped to a separate interpreter, so that multiple scripts don't affect each other's state.

When an HTTP request arrives, the daemon checks if it's created an interpreter for this resource. If not, it calls the Python C API function Py_NewInterpreter. The Python docs say:

This is an (almost) totally separate environment for the execution of Python code. In particular, the new interpreter has separate, independent versions of all imported modules, including the fundamental modules builtins, __main__ and sys. The table of loaded modules (sys.modules) and the module search path (sys.path) are also separate.

Let's see this separation in action. I'll make a module with a variable:

# module.py

var = 0

My WSGI script increments the variable with each request and responds with the new value:

# app.wsgi

import module

def application(environ, start_response):
    module.var += 1

    output = 'var = %d\n' % module.var
    response_headers = [('Content-Length', str(len(output)))]
    start_response('200 OK', response_headers)
    return [output]

(Incrementing an integer is not thread-safe, but I'm ignoring thread-safety here.)

I'll map two URLs, "foo" and "bar", to the same script:

<VirtualHost *>
    WSGIDaemonProcess my_process
    WSGIProcessGroup my_process
    WSGIScriptAlias /foo /path/to/app.wsgi
    WSGIScriptAlias /bar /path/to/app.wsgi
</VirtualHost>

mod_wsgi uses different sub interpreters for the two URLs, so they have different copies of var. When I request "foo" it increments one copy, and when I request "bar" it increments the other copy:

$ curl localhost/foo
var = 1
$ curl localhost/foo
var = 2
$ curl localhost/bar
var = 1
$ curl localhost/bar
var = 2

I can use WSGIApplicationGroup to change the relationship between URLs and sub interpreters. I'll put both URLs in the same "application group," meaning the same Python sub interpreter:

<VirtualHost *>
    WSGIDaemonProcess my_process
    WSGIProcessGroup my_process
    WSGIScriptAlias /foo /path/to/app.wsgi
    WSGIScriptAlias /bar /path/to/app.wsgi
    WSGIApplicationGroup my_application_group
</VirtualHost>

Now requests to "foo" and "bar" run in the same interpreter and increment the same copy of var:

$ curl localhost/foo
var = 1
$ curl localhost/foo
var = 2
$ curl localhost/bar
var = 3
$ curl localhost/bar
var = 4

If I set the application group to %{GLOBAL}, "foo" and "bar" will run in the daemon's main interpreter, not any sub interpreter at all. We'll see momentarily why this is useful.

But C Extensions Are Not Isolated

Remember when the Python docs said that Py_NewInterpreter creates "an (almost) totally separate environment"? One reason it's not completely separate is that C extensions are shared:

Extension modules are shared between (sub-)interpreters as follows: the first time a particular extension is imported, it is initialized normally, and a (shallow) copy of its module’s dictionary is squirreled away. When the same extension is imported by another (sub-)interpreter, a new module is initialized and filled with the contents of this copy; the extension’s init function is not called.

Static Variables Are Shared

I wrote an example C extension called demo to demonstrate the issues. The code is on GitHub. Instead of declaring a global variable in a Python module, let's make one in C:

/* A global variable. */
static long var = 0;

static PyObject* inc_and_get_var(PyObject* self, PyObject* args)
{
    var++;
    return PyInt_FromLong(var);
}

I call inc_and_get_var() in my WSGI script:

output = 'var: %s\n' % demo.inc_and_get_var()

Now, I'll change my Apache configuration back, so it uses the default application groups:

<VirtualHost *>
    WSGIDaemonProcess my_process
    WSGIProcessGroup my_process
    WSGIScriptAlias /foo /path/to/app.wsgi
    WSGIScriptAlias /bar /path/to/app.wsgi
</VirtualHost>

Once again mod_wsgi uses a different interpreter for each URL. So if I were using the var declared in Python, "foo" and "bar" would increment different copies of it. But of course a static variable declared in C is shared among all interpreters in a daemon:

$ curl localhost/foo
var: 1
$ curl localhost/bar
var: 2

Instead of using a static variable, I could have put the variable in the module's dict. But as the Python doc said, that dict is copied into the new interpreter, so the interpreters still wouldn't be completely isolated.

Python Classes Only Work In The First Interpreter

The shared-state problem becomes worse if the C extension uses a class implemented in Python. What if an extension imports a Python class and later calls PyObject_IsInstance() on it? Here's some C code for a function called is_myclass():

static PyObject* MyClass;

static PyObject* is_myclass(PyObject* self, PyObject* args)
{
    int outcome;
    PyObject* obj;
    if (!PyArg_ParseTuple(args, "O", &obj)) return NULL;

    outcome = PyObject_IsInstance(obj, MyClass);
    if (outcome) { Py_RETURN_TRUE; }
    else { Py_RETURN_FALSE; }
}

PyMODINIT_FUNC
initdemo(void)
{
    PyObject* mymodule = PyImport_ImportModule("mymodule");
    MyClass = PyObject_GetAttrString(mymodule, "MyClass");
    Py_DECREF(mymodule);
    Py_InitModule("demo", Methods);
}

(Error-checking is omitted. See GitHub for the real code.)

is_myclass() works just fine in the shell:

>>> import demo, mymodule
>>> obj = mymodule.MyClass()
>>> demo.is_myclass(obj)
True

How about in mod_wsgi? I'll make my WSGI script output the result of is_myclass():

import demo
import mymodule

def application(environ, start_response):
    obj = mymodule.MyClass()
    outcome = demo.is_myclass(obj)

    output = 'outcome = %s\n' % outcome
    response_headers = [('Content-Length', str(len(output)))]
    start_response('200 OK', response_headers)
    return [output]

Then I map two URLs to this script:

<VirtualHost *>
    WSGIDaemonProcess my_process
    WSGIProcessGroup my_process
    WSGIScriptAlias /foo demo.wsgi
    WSGIScriptAlias /bar demo.wsgi
</VirtualHost>

If I request the URL "foo", everything's peachy: my script thinks obj is an instance of MyClass. But when I request "bar" it thinks the opposite:

$ curl localhost/foo
outcome = True
$ curl localhost/bar
outcome = False

From now on, "foo" returns True and "bar" returns False. But if I restart Apache and request "bar" first, followed by "foo", the outcome is reversed:

$ sudo service apache2 restart
 * Restarting web server apache2
$ curl localhost/bar
outcome = True
$ curl localhost/foo
outcome = False

Do you see why? Only the first interpreter that imports my extension runs initdemo(), imports MyClass, and assigns it to a static variable. From then on, calls to is_myclass work in the first interpreter, because the object is compared to the Python class created in the same interpreter. Calls in the other interpreter always return false.

The inverse problem happens when I instantiate a MyClass object in C:

static PyObject* create_myclass(PyObject* self, PyObject* args)
{
    return PyObject_CallObject(MyClass, NULL);
}

I'll update my Python script to check if create_myclass() returns an instance of MyClass:

outcome = isinstance(
    demo.create_myclass(),
    mymodule.MyClass)

output = 'isinstance: %s' % outcome

Again, if I request "foo" first, it returns True and "bar" returns False:

$ curl localhost/foo
isinstance: True
$ curl localhost/bar
isinstance: False

If I restart Apache and request "bar" first, it returns True from then on, and "foo" returns False.

What's going on? initdemo() caches MyClass from the interpreter that calls it, so the instances it creates act normally in that interpreter. The second Python interpreter that imports demo does not call initdemo(), so the module has no opportunity to discover that it's being used from a different interpreter. It continues making objects that only work in the first interpreter. The mod_wsgi docs call this "an unholy mixing of code and data from multiple sub interpreters."

Note that types defined in C don't suffer these ills: they're static. For example, the datetime type is defined as:

static PyTypeObject PyDateTime_DateTimeType;

Every interpreter in the daemon process agrees on the memory address of this type, so both PyObject_IsInstance and isinstance work on datetimes across interpreters.

PyMongo and mod_wsgi

PyMongo's BSON encoder and decoder are written in C, in an extension called _cbson. _cbson caches Python classes, so it's vexed by problems with PyObject_IsInstance and isinstance when running in multiple sub interpreters. Bear with me, I'm going into some detail about why PyMongo had trouble in each case.

Encoding

In PyMongo we have a class representing MongoDB ObjectIds:

class ObjectId(object):
    # Etcetera.
    pass

_cbson needs both to recognize ObjectIds and to create them, so it caches the ObjectId class when it initializes:

static PyObject* ObjectId;

PyMODINIT_FUNC
init_cbson(void)
    PyObject* module = PyImport_ImportModule("bson.objectid");
    ObjectId = PyObject_GetAttrString(module, "ObjectId");
    Py_DECREF(module);

    /** More module setup .... */
}

Let's say I'm turning a dict into BSON. I execute this Python code:

bson_document = BSON.encode({"_id": ObjectId()})

PyMongo iterates the dict, checking each value's type to decide how to encode it. Is the value an int, a string, an ObjectId, something else?

PyObject* iter = PyObject_GetIter(dict);
while ((key = PyIter_Next(iter)) != NULL) {
    PyObject* value = PyDict_GetItem(dict, key);
    if (PyObject_IsInstance(value, ObjectId)) {
        /* Encode the ObjectId as BSON .... */
    }

    /* Check for other possible types .... */

    Py_DECREF(key);
}
Py_DECREF(iter);

By now you know what's going to happen: the first interpreter that imports _cbson is the one that caches the ObjectId class, and PyObject_IsInstance works there. In other interpreters, PyObject_IsInstance can't recognize ObjectIds.

Decoding

The PyObject_IsInstance problem manifested when turning Python objects into BSON. The inverse happens when decoding BSON: _cbson churns through a BSON document reading the type code for each field:

switch (type) {
case 7:
    value = PyObject_CallFunction(state->ObjectId,
                                  "s#", buffer, 12);
    break;

The value so constructed is an ObjectId, but isinstance(value, ObjectId) is False in any interpreters besides the first one. Our users don't call isinstance, it seems, because this bug was never reported.

You Can Isolate C Extensions In Separate Daemons

The mod_wsgi docs provide no guidance for writing C extensions, they just say:

Because of the possibility that extension module writers have not written their code to take into consideration it being used from multiple sub interpreters, the safest approach is to force all WSGI applications to run within the same application group, with that preferably being the first interpreter instance created by Python.

Following Dumpleton's advice, we tell PyMongo users to always use WSGIApplicationGroup %{GLOBAL} to put their applications in the main interpreter. Since that risks interference if you run multiple applications in the same daemon process, you should run each application in a separate daemon, like this:

<VirtualHost *>
    WSGIDaemonProcess my_process
    WSGIScriptAlias /foo /path/to/app.wsgi
    <Location /foo>
        WSGIProcessGroup my_process
    </Location>

    WSGIDaemonProcess my_other_process
    WSGIScriptAlias /bar /path/to/app.wsgi
    <Location /bar>
        WSGIProcessGroup my_other_process
    </Location>

    WSGIApplicationGroup %{GLOBAL}
</VirtualHost>

I've added an example like this to PyMongo's docs.

How Should C Extensions Handle Multiple Sub Interpreters?

But some users don't read the manual, and some aren't allowed to change their Apache config. How can we write a C extension that handles multiple sub interpreters gracefully?

PyMongo's Crummy Old Hack

Through version 2.6, our BSON encoder used the following algorithm to deal with multiple sub interpreters:

  • For each value, use PyObject_IsInstance to check if it is any BSON-encodable Python type.
  • If all checks fail, log a RuntimeWarning saying, "couldn't encode—reloading python modules and trying again."
  • Re-import and re-cache all Python classes, such as ObjectId. This ensures _cbson's references to Python classes come from the current interpreter.
  • Again check if the value is encodable.
  • If not, raise InvalidBSON because this isn't a mod_wsgi problem: the application is actually trying to encode something that isn't BSON-encodable.

There were a few problems with this. First, it wrote a warning to the Apache error log whenever an application encoded BSON in a different interpreter from the last one in which it had encoded BSON. Second, it only fixed PyObject_IsInstance, not isinstance.

PyMongo's Pretty Good New Workaround

Bernie Hackett's elegant solution avoids PyObject_IsInstance entirely when encoding. He added a _type_marker field to our Python classes:

class ObjectId(object):
    _type_marker = 7

_cbson uses the type marker to decide how to encode each value:

if (PyObject_HasAttrString(value, "_type_marker")) {
    long type;
    PyObject* type_marker = PyObject_GetAttrString(
        value, "_type_marker");

    type = PyInt_AsLong(type_marker);
    Py_DECREF(type_marker);
    switch (type) {
    case 7:
        /* Encode an ObjectId .... */

Not only is the type marker robust against sub interpreter issues, but it's faster than PyObject_IsInstance. If a value has no type marker, then we check for builtin types like strings and ints.

The only BSON-encodable Python type we don't control is UUID. It's implemented in Python, but it's provided by the standard library so we can't add a type marker. Here, Bernie took two approaches. First, he checked whether we're in a sub interpreter or the main one:

/*
 * Are we in the main interpreter or a sub-interpreter?
 * Useful for deciding if we can use cached pure python
 * types in mod_wsgi.
 */
static int
_in_main_interpreter(void) {
    static PyInterpreterState* main_interpreter = NULL;
    PyInterpreterState* interpreter;

    if (main_interpreter == NULL) {
        interpreter = PyInterpreterState_Head();
        while (PyInterpreterState_Next(interpreter))
            interpreter = PyInterpreterState_Next(interpreter);

        main_interpreter = interpreter;
    }

    return (main_interpreter == PyThreadState_Get()->interp);
}

The first time _in_main_interpreter() is called, it stashes a reference to the main interpreter. From then on, it can detect if we're in a sub interpreter by comparing the current interpreter's address to the main one's.

If we're in the main interpreter, we can use our cached copy of the UUID class with PyObject_IsInstance as normal. (We're either in the global application group, or not in mod_wsgi at all.) If we're in a sub interpreter, we have to re-import UUID each time, before we pass it to PyObject_IsInstance. The performance penalty is minimal: for one thing, we only check if a value is a UUID if it's failed all other checks. Second, the speedup from _type_marker compensates for the cost of re-importing UUID.

What about decoding? How does _cbson avoid returning instances of one interpreter's classes to another interpreter? Again, if _in_main_interpreter() is true, _cbson can safely use its cached classes. If not, it re-imports ObjectId each time it needs it—same for UUID and so forth. This is cheap: my benchmark only showed it costing a few microseconds per value. After all, re-importing a module is essentially a lookup in sys.modules. Real applications are I/O bound anyway and won't notice the hit. But if you're concerned, use WSGIApplicationGroup %{GLOBAL} to run your script in the main interpreter.

Recommendations

After all the intricacies I learned, I've arrived at simple recommendations.

Deployment

Multiple sub interpreters are only an issue if you have multiple scripts using the same C extension. If so, run the scripts in separate daemon processes using WSGIDaemonProcess and WSGIProcessGroup, and assign them to the main interpreter with WSGIApplicationGroup %{GLOBAL}.

Writing C Extensions

Extension authors can't rely on users to deploy this way, so C extensions should be written to support multiple sub interpreters.

  • If your extension imports your Python classes, add type markers to them as PyMongo did and use the type markers instead of PyObject_IsInstance.
  • Alternatively, implement the classes you need in C instead of Python, so they're safe to use across interpreters, the same as datetimes are.
  • If you import third-party or standard-library Python classes, check if you're running in a sub interpreter. If so, re-import these classes on demand. It's cheaper than you think.

You might also like my article on measuring test coverage of C extensions, or the one on automatically detecting refcount errors in C extensions.