Python's swap is not atomic
I rewrote PyMongo's connection pool over the last few months. Among the concurrency issues I had to nail down was this: if one thread is resetting the connection pool while another thread is using it, how do I keep them from stepping on each other?
I thought I nailed this, but of course I didn't. There's a race condition in here:
class Pool(object):
    def __init__(self):
        self.sockets = set()

    def reset(self):
        # Close sockets before deleting them
        sockets, self.sockets = self.sockets, set()
        for sock_info in sockets: sock_info.close()
I thought that the swap would be atomic: the first thread to enter reset() would replace self.sockets with an empty set, then close all the old sockets, and all subsequent threads would find that self.sockets was empty. That turns out not to be the case.
The race condition was occasionally revealed in runs of PyMongo's huge test suite. One of the tests spins up 40 concurrent threads. Each thread queries MongoDB, calls reset(), and queries MongoDB again. Here's how the test fails:
test_disconnect (test.test_pooling.TestPooling) ... Exception in thread Thread-45:
Traceback (most recent call last):
< ... snip ... >
File "pymongo/pool.py", line 159, in reset
for sock_info in sockets: sock_info.close()
RuntimeError: Set changed size during iteration
As I said, I'd thought the swap was atomic, but in fact it takes eight bytecode instructions. That one swap line:
sockets, self.sockets = self.sockets, set()
...disassembles to:
0 LOAD_FAST 0 (self)
3 LOAD_ATTR 0 (sockets)
6 LOAD_GLOBAL 1 (set)
9 CALL_FUNCTION 0
12 ROT_TWO <- this is the swap
13 STORE_FAST 1 (sockets)
16 LOAD_FAST 0 (self)
19 STORE_ATTR 0 (sockets)
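You don't have to take my word for it: the standard library's dis module prints the bytecode for you. Here's a minimal sketch, using a standalone stand-in for the pool class rather than PyMongo's actual module, that produces a disassembly like the one above:

import dis

class Pool(object):
    def __init__(self):
        self.sockets = set()

    def reset(self):
        # The "atomic-looking" swap
        sockets, self.sockets = self.sockets, set()

# Print the bytecode for reset(). The exact opcodes and offsets vary by
# CPython version (recent versions no longer use ROT_TWO), but the swap
# is still several instructions rather than one.
dis.dis(Pool.reset)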
Say that Thread 1 is executing this function. Thread 1 loads self.sockets and the empty set onto its stack and swaps them, and before it gets to STORE_ATTR (where self.sockets is actually replaced), it gets interrupted by Thread 2. Thread 2 runs some other part of the connection pool's code, e.g.:

def return_socket(self, sock_info):
    self.sockets.add(sock_info)
This disassembles to:
24 LOAD_FAST 0 (self)
27 LOAD_ATTR 1 (sockets)
30 LOAD_ATTR 3 (add)
33 LOAD_FAST 1 (sock_info)
36 CALL_FUNCTION 1
Let's say Thread 2 reaches the LOAD_ATTR 1 bytecode. Now it has self.sockets on its stack, and it gets interrupted by Thread 1, which is still in reset(). Thread 1 replaces self.sockets with the empty set. But alas, Thread 1's "old" set of sockets and Thread 2's self.sockets are the same set. Thread 1 starts iterating over the old set of sockets, closing them:

for sock_info in sockets: sock_info.close()

...but it gets interrupted again by Thread 2, which does self.sockets.add(sock_info), increasing the set's size by one. When Thread 1 is next resumed, it tries to continue iterating, and raises the "Set changed size during iteration" exception.
Let's dive deeper for a minute. You may be thinking that in practice two Python threads wouldn't interrupt each other this often. Indeed, the interpreter executes 100 bytecodes at a time before it even thinks of switching threads.
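As an aside, you can inspect and tune how eagerly CPython preempts threads. The Python 2 calls below are the check interval the 100-bytecode figure comes from; Python 3.2 and later replaced it with a time-based switch interval. A small sketch:

import sys

# Python 2: the interpreter considers a thread switch every N bytecodes.
#   sys.getcheckinterval()     # default: 100
#   sys.setcheckinterval(10)   # consider switching more often

# Python 3.2+: the bytecode count was replaced by a time-based interval.
print(sys.getswitchinterval())   # default: 0.005 seconds
sys.setswitchinterval(0.0005)    # make preemption more frequent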
But in our case, Thread 1 is repeatedly calling socket.close(), which is written in socketmodule.c like this:
static PyObject *
sock_close(PySocketSockObject *s)
{
    SOCKET_T fd;

    if ((fd = s->sock_fd) != -1) {
        s->sock_fd = -1;
        Py_BEGIN_ALLOW_THREADS
        (void) SOCKETCLOSE(fd);
        Py_END_ALLOW_THREADS
    }
    Py_INCREF(Py_None);
    return Py_None;
}
That Py_BEGIN_ALLOW_THREADS macro releases the Global Interpreter Lock, and Py_END_ALLOW_THREADS waits to reacquire it. In a multithreaded Python program, releasing the GIL makes it very likely that another thread which is waiting for the GIL will immediately acquire it. (Notwithstanding David Beazley's talk on the GIL: he demonstrates that CPU-bound and IO-bound threads competing for the GIL on a multicore system interrupt each other too rarely, but in this case I'm only dealing with IO-bound threads.)
So calling socket.close() in a loop ensures that this thread will be constantly interrupted. The probability that some thread in return_socket() gets a reference to the set, and modifies it, interleaved with some other thread in reset() getting a reference to the same set and iterating it, is high enough to break PyMongo's unittest about 1% of the time.
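To convince myself, I find a toy script along these lines useful. It is not PyMongo's code: the ToyPool class is a made-up stand-in, and time.sleep(0) stands in for socket.close() releasing the GIL. Because the race is timing-dependent, it may take several runs (or a smaller switch interval) before a thread dies with the "Set changed size during iteration" error.

import threading
import time

class ToyPool(object):
    # A stripped-down stand-in for the buggy Pool above.
    def __init__(self):
        self.sockets = set()

    def reset(self):
        # The unsynchronized swap from the buggy version.
        sockets, self.sockets = self.sockets, set()
        for sock_info in sockets:
            # Stand-in for sock_info.close(): sleeping gives other
            # threads a chance to run mid-iteration.
            time.sleep(0)

    def return_socket(self, sock_info):
        self.sockets.add(sock_info)

pool = ToyPool()

def worker():
    for _ in range(10000):
        pool.return_socket(object())
        pool.reset()

threads = [threading.Thread(target=worker) for _ in range(40)]
for t in threads:
    t.start()
for t in threads:
    t.join()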
The solution was obvious once I understood the problem:
import threading

class Pool(object):
    def __init__(self):
        self.sockets = set()
        self.lock = threading.Lock()

    def reset(self):
        self.lock.acquire()
        try:
            # Close sockets before deleting them
            sockets, self.sockets = self.sockets, set()
        finally:
            self.lock.release()

        # Now only this thread can have a reference to this set of sockets
        for sock_info in sockets: sock_info.close()

    def return_socket(self, sock_info):
        self.lock.acquire()
        try:
            self.sockets.add(sock_info)
        finally:
            self.lock.release()
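As a side note, the acquire/try/finally dance can be written more compactly with the lock's context-manager protocol. This is an equivalent sketch of the same fix, not the exact code PyMongo shipped:

import threading

class Pool(object):
    def __init__(self):
        self.sockets = set()
        self.lock = threading.Lock()

    def reset(self):
        with self.lock:
            # Close sockets before deleting them
            sockets, self.sockets = self.sockets, set()

        # Now only this thread can have a reference to this set of sockets
        for sock_info in sockets:
            sock_info.close()

    def return_socket(self, sock_info):
        with self.lock:
            self.sockets.add(sock_info)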
Single-bytecode instructions in Python are atomic, and if you can use this atomicity to avoid mutexes then I believe you should—not only is your code faster and simpler, but you avoid the risk of deadlocks, which are the worst concurrency bugs. But not everything that looks atomic is. When in doubt, use the dis module to examine your bytecode and find out for sure.