A. Jesse Jiryu Davis

Samantha Ritter And Me At Open Source Bridge 2015

I'm so excited to tell you, my colleague Samantha Ritter and I were accepted to speak in Portland, Oregon this June.

Cat-herd's Crook: Enforcing Standards in 10 Programming Languages.

Samantha and I helped specify and test how MongoDB drivers behave in ten programming languages, and we persuaded dozens of open source programmers to implement these specifications consistently. We created a test framework that verified compliance with YAML descriptions of how drivers should act, and we communicated updates to the specs by pushing changes to the YAML files.
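To give a flavor of the approach, a spec test in YAML might look something like this (the fields below are invented for illustration, not the actual spec format):

```yaml
# Hypothetical driver spec test: every driver's test runner reads this
# file and verifies its own implementation behaves as described.
description: "insertOne writes a single document"
operation:
  name: insertOne
  arguments:
    document: {_id: 1, x: 11}
outcome:
  collection:
    data:
      - {_id: 1, x: 11}
```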

We want to show you how we herded all the cats in the same direction.

How Do Python Coroutines Work?

I'll explain asyncio's new coroutine implementation in depth, including the mystical "yield from" statement. You'll know better than any of your peers how this amazing new programming model works. Plus, in a magical and entertaining feat of daring, I plan to live-code an asynchronous coroutine implementation before your very eyes!

Announcing PyMongo 3.0.1

It's my pleasure to announce the release of PyMongo 3.0.1, a bugfix release that addresses issues discovered since PyMongo 3.0 was released a couple weeks ago. The main bugs were related to queries and cursors in complex sharding setups, but there was an unintentional change to the return value of save, GridFS file-deletion didn't work properly, passing a hint with a count didn't always work, and there were some obscure bugs and undocumented features.

For the full list of bugs fixed in PyMongo 3.0.1, please see the release in Jira.

If you are using PyMongo 3.0, please upgrade immediately.

If you are on PyMongo 2.8, read the changelog for major API changes in PyMongo 3, and test your application carefully with PyMongo 3 before deploying.

Vote For Me At Open Source Bridge!

Update: Two of these talks were accepted, thanks for your support.

This is a shameless plug. I'm not ashamed of how much I want to speak at Open Source Bridge this year—it's my favorite conference with my favorite people and I desperately want to speak there again. So vote for me! Add stars and comments to the three talks I proposed.


Cat-herd's Crook: Enforcing Standards in 10 Programming Languages.

I'm proposing this talk with my colleague and first-time speaker Samantha Ritter. We helped write and test MongoDB's specifications for driver APIs and behaviors, and we persuaded dozens of open source programmers to implement these specifications consistently. We want to show you how we herded all the cats in the same direction.

Dodge Disasters and March to Triumph as a Mentor.

If you're ambitious to advance in your career, or you care about your junior colleagues' advancement, then it is time for you to learn how to be a great mentor. Especially if you’re committed to diversity: mentorship is critical to the careers of women and minorities in tech. I have failed at mentoring, then succeeded. Learn from me and march to mentorship triumph.

How Do Python Coroutines Work?

I'll explain asyncio's new coroutine implementation in depth, including the mystical "yield from" statement. You'll know better than any of your peers how this amazing new programming model works. Plus, in a magical and entertaining feat of daring, I plan to live-code an asynchronous coroutine implementation before your very eyes!

Caution: Critical Bug In PyMongo 3, "could not find cursor in cache"

If you use multiple mongos servers in a sharded cluster, be cautious upgrading to PyMongo 3. We've just discovered a critical bug related to our new mongos load-balancing feature.

Update: PyMongo 3.0.1 was released April 21, 2015 with fixes for this and other bugs.

If you create a MongoClient instance with PyMongo 3 and pass the addresses of several mongos servers, like so:

client = MongoClient('mongodb://mongos1,mongos2')

...then the client load-balances among the lowest-latency of them. Read the load-balancing documentation for details. This works correctly except when retrieving more than 101 documents, or more than 4MB of data, from a cursor:

collection = client.db.collection
for document in collection.find():
    # ... do something with each document ...
    pass

PyMongo wrongly tries to get subsequent batches of documents from random mongos servers, instead of streaming results from the same server it chose for the initial query. The symptom is an OperationFailure with a server error message, "could not find cursor in cache":

Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/pymongo/cursor.py", line 968, in __next__
        if len(self.__data) or self._refresh():
  File "/usr/local/lib/python2.7/dist-packages/pymongo/cursor.py", line 922, in _refresh
        self.__id))
  File "/usr/local/lib/python2.7/dist-packages/pymongo/cursor.py", line 838, in __send_message
        codec_options=self.__codec_options)
  File "/usr/local/lib/python2.7/dist-packages/pymongo/helpers.py", line 110, in _unpack_response
        cursor_id)
pymongo.errors.CursorNotFound: cursor id '1025112076089406867' not valid at server

PyCon Video: "Eventually Correct: Async Testing"

Async frameworks like Tornado scramble our usual unittest strategies: how can you validate the outcome when you do not know when to expect it? Here's my PyCon 2015 talk about Tornado's testing module. You can also read my article on the topic or see a screencast which I closed-captioned.

Screencast of "Eventually Correct: Async Testing With Tornado"

In preparation for my PyCon talk today in Montréal I recorded and closed-captioned a screencast of it. This is a talk about testing asynchronous Python code with Tornado. I also wrote an article that covers this talk, plus additional information about testing with coroutines.

You can visit my "Eventually Correct" landing page for further links about Tornado, testing, and asynchronous coroutines.

Eventually Correct: Async Testing With Tornado

Async frameworks like Tornado scramble our usual unittest strategies: how can you validate the outcome when you do not know when to expect it? Tornado ships with a tornado.testing module that provides two solutions: the wait / stop pattern, and gen_test.

Wait / Stop

To begin, let us say we are writing an async application with a feature like Gmail's undo send: when I click "send", Gmail delays a few seconds before actually sending the email. It is a funny phenomenon that, during the seconds after clicking "send", I experience a special clarity about my email. It was too angry, or I forgot an attachment, or most often both. If I click the "undo" button in time, the email reverts to a draft and I can tone it down, add the attachment, and send it again.

To write an application with this feature, we will need an asynchronous "delay" function, and we must test it. If we were testing a normal blocking delay function we could use unittest.TestCase from the standard library:

import time
import unittest

from my_application import delay


class MyTestCase(unittest.TestCase):
    def test_delay(self):
        start = time.time()
        delay(1)
        duration = time.time() - start
        self.assertAlmostEqual(duration, 1, places=2)

When we run this, it prints:

Ran 1 test in 1.000s
OK

And if we replace delay(1) with delay(2) it fails as expected:

=======================================================
FAIL: test_delay (delay0.MyTestCase)
-------------------------------------------------------
Traceback (most recent call last):
  File "delay0.py", line 12, in test_delay
    self.assertAlmostEqual(duration, 1, places=2)
AssertionError: 2.000854969024658 != 1 within 2 places

-------------------------------------------------------
Ran 1 test in 2.002s
FAILED (failures=1)

Great! What about testing a delay_async(seconds, callback) function?

    def test_delay(self):
        start = time.time()
        delay_async(1, callback=)  # What goes here?
        duration = time.time() - start
        self.assertAlmostEqual(duration, 1, places=2)

An asynchronous "delay" function can't block the caller, so it must take a callback and execute it once the delay is over. (In fact we are just reimplementing Tornado's call_later, but please pretend for pedagogy's sake this is a new function that we must test.) To test our delay_async, we will try a series of testing techniques until we have effectively built Tornado's test framework from scratch—you will see why we need special test tools for async code and how Tornado's tools work.

So, we define a function done to measure the delay, and pass it as the callback to delay_async:

    def test_delay(self):
        start = time.time()

        def done():
            duration = time.time() - start
            self.assertAlmostEqual(duration, 1, places=2)

        delay_async(1, done)

If we run this:

Ran 1 test in 0.001s
OK

Success! ...right? But why does it only take a millisecond? And what happens if we delay by two seconds instead?

    def test_delay(self):
        start = time.time()

        def done():
            duration = time.time() - start
            self.assertAlmostEqual(duration, 1, places=2)

        delay_async(2, done)

Run it again:

Ran 1 test in 0.001s
OK

Something is very wrong here. The test appears to pass instantly, regardless of the argument to delay_async, because we neither start the event loop nor wait for it to complete. We have to actually pause the test until the callback has executed:

    def test_delay(self):
        start = time.time()
        io_loop = IOLoop.instance()

        def done():
            duration = time.time() - start
            self.assertAlmostEqual(duration, 1, places=2)
            io_loop.stop()

        delay_async(1, done)
        io_loop.start()

Now if we run the test with a delay of one second:

Ran 1 test in 1.002s
OK

That looks better. And if we delay for two seconds?

ERROR:tornado.application:Exception in callback
Traceback (most recent call last):
  File "site-packages/tornado/ioloop.py", line 568, in _run_callback
    ret = callback()
  File "site-packages/tornado/stack_context.py", line 275, in null_wrapper
    return fn(*args, **kwargs)
  File "delay3.py", line 16, in done
    self.assertAlmostEqual(duration, 1, places=2)
  File "unittest/case.py", line 845, in assertAlmostEqual
    raise self.failureException(msg)
AssertionError: 2.001540184020996 != 1 within 2 places

The test appears to fail, as expected, but there are a few problems. First, notice that it is not the unittest that prints the traceback: it is Tornado's application logger. We do not get the unittest's characteristic output. Second, the process is now hung and remains so until I type Control-C. Why?

The bug is here:

        def done():
            duration = time.time() - start
            self.assertAlmostEqual(duration, 1, places=2)
            io_loop.stop()

Since the failed assertion raises an exception, we never reach the call to io_loop.stop(), so the loop continues running and the process does not exit. We need to register an exception handler. Exception handling with callbacks is convoluted; we have to use a stack context to install a handler with Tornado:

from tornado.stack_context import ExceptionStackContext

class MyTestCase(unittest.TestCase):
    def test_delay(self):
        start = time.time()
        io_loop = IOLoop.instance()

        def done():
            duration = time.time() - start
            self.assertAlmostEqual(duration, 1, places=2)
            io_loop.stop()

        self.failure = None

        def handle_exception(typ, value, tb):
            io_loop.stop()
            self.failure = value
            return True  # Stop propagation.

        with ExceptionStackContext(handle_exception):
            delay_async(2, callback=done)

        io_loop.start()
        if self.failure:
            raise self.failure

The loop can now be stopped two ways: if the test passes, then done stops the loop as before. If it fails, handle_exception stores the error and stops the loop. At the end, if an error was stored we re-raise it to make the test fail:

=======================================================
FAIL: test_delay (delay4.MyTestCase)
-------------------------------------------------------
Traceback (most recent call last):
  File "delay4.py", line 31, in test_delay
    raise self.failure
  File "tornado/ioloop.py", line 568, in _run_callback
    ret = callback()
  File "tornado/stack_context.py", line 343, in wrapped
    raise_exc_info(exc)
  File "<string>", line 3, in raise_exc_info
  File "tornado/stack_context.py", line 314, in wrapped
    ret = fn(*args, **kwargs)
  File "delay4.py", line 17, in done
    self.assertAlmostEqual(duration, 1, places=2)
AssertionError: 2.0015950202941895 != 1 within 2 places
-------------------------------------------------------
Ran 1 test in 2.004s
FAILED (failures=1)

Now the test ends promptly, whether it succeeds or fails, with unittest's typical output.

This is a lot of tricky code to write just to test a trivial delay function, and it seems hard to get right each time. What does Tornado provide for us? Its AsyncTestCase gives us stop and wait methods to control the event loop. If we then move the duration-testing outside the callback we radically simplify our test:

from tornado import testing

class MyTestCase(testing.AsyncTestCase):
    def test_delay(self):
        start = time.time()
        delay_async(1, callback=self.stop)
        self.wait()
        duration = time.time() - start
        self.assertAlmostEqual(duration, 1, places=2)
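Under the hood, stop and wait are roughly the pattern we built by hand above. A sketch (illustrative only, not Tornado's actual code):

```python
class AsyncTestCaseSketch:
    """Roughly what AsyncTestCase's stop/wait pair does. A sketch,
    not Tornado's implementation."""
    def __init__(self, io_loop):
        self.io_loop = io_loop
        self._stopped = False

    def stop(self, *args):
        # Record that the awaited event happened, and halt the loop.
        self._stopped = True
        self.io_loop.stop()

    def wait(self):
        # If stop() already ran, return at once; otherwise block in
        # the event loop until some callback calls stop().
        if not self._stopped:
            self.io_loop.start()
        self._stopped = False
```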

gen_test

But modern async code is not primarily written with callbacks: these days we use coroutines. Let us begin a new example test, one that uses Motor, my asynchronous MongoDB driver for Tornado. Although Motor supports the old callback style, it encourages you to use coroutines and "yield" statements, so we can write some Motor code to demonstrate Tornado coroutines and unit testing.

To begin, say we want to execute find_one and test its return value:

from motor import MotorClient
from tornado import testing

class MyTestCase(testing.AsyncTestCase):
    def setUp(self):
        super().setUp()
        self.client = MotorClient()

    def test_find_one(self):
        collection = self.client.test.collection
        document = yield collection.find_one({'_id': 1})
        self.assertEqual({'_id': 1, 'key': 'value'}, document)

Notice the "yield" statement: whenever you call a Motor method that does I/O, you must use "yield" to pause the current function and wait for the returned Future object to be resolved to a value. Including a yield statement makes this function a generator. But now there is a problem:

TypeError: Generator test methods should be decorated with tornado.testing.gen_test

Tornado smartly warns us that our test method is merely a generator—we must decorate it with gen_test. Otherwise the test method simply stops at the first yield, and never reaches the assert. It needs a coroutine driver to run it to completion:

from tornado.testing import gen_test

class MyTestCase(testing.AsyncTestCase):
    # ... same setup ...
    @gen_test
    def test_find_one(self):
        collection = self.client.test.collection
        document = yield collection.find_one({'_id': 1})
        self.assertEqual({'_id': 1, 'key': 'value'}, document)

But now when I run the test, it fails:

AssertionError: {'key': 'value', '_id': 1} != None

We need to insert some data in setUp so that find_one can find it! Since Motor is asynchronous, we cannot call its insert method directly from setUp; we must run the insertion in a coroutine as well:

from tornado import gen, testing

class MyTestCase(testing.AsyncTestCase):
    def setUp(self):
        super().setUp()
        self.client = MotorClient()
        self.setup_coro()

    @gen.coroutine
    def setup_coro(self):
        collection = self.client.test.collection

        # Clean up from prior runs:
        yield collection.remove()

        yield collection.insert({'_id': 0})
        yield collection.insert({'_id': 1, 'key': 'value'})
        yield collection.insert({'_id': 2})

Now when I run the test:

AssertionError: {'key': 'value', '_id': 1} != None

It still fails! When I check in the mongo shell whether my data was inserted, only two of the three expected documents are there:

> db.collection.find()
{ "_id" : 0 }
{ "_id" : 1, "key" : "value" }

Why is it incomplete? Furthermore, since the document I actually query is there, why did the test fail?

When I called self.setup_coro() in setUp, I launched it as a concurrent coroutine. It began running, but I did not wait for it to complete before beginning the test, so the test may reach its find_one statement before the second document is inserted. Furthermore, test_find_one can fail quickly enough that setup_coro does not insert its third document before the whole test suite finishes, stopping the event loop and preventing the final document from ever being inserted.

Clearly I must wait for the setup coroutine to complete before beginning the test. Tornado's run_sync method is designed for uses like this:

class MyTestCase(testing.AsyncTestCase):
    def setUp(self):
        super().setUp()
        self.client = MotorClient()
        self.io_loop.run_sync(self.setup_coro)

With my setup coroutine correctly executed, now test_find_one passes.
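Conceptually, gen_test and run_sync do the same job: advance the generator, and resolve each yielded future before resuming it. A stripped-down sketch of such a driver (using concurrent.futures.Future for illustration; Tornado's real runner resumes the generator via IOLoop callbacks rather than blocking):

```python
from concurrent.futures import Future

def drive(gen):
    # Minimal coroutine driver: run the generator to completion,
    # resolving each yielded future to its value.
    try:
        future = next(gen)            # run up to the first yield
        while True:
            result = future.result()  # block until the future resolves
            future = gen.send(result) # resume the generator with the value
    except StopIteration:
        pass
```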

Further Study

Now we have seen two techniques that make async testing with Tornado as convenient and reliable as standard unittests. To learn more, see my page of links related to this article.

Plus, stay tuned for the next book in the Architecture of Open Source Applications series. It will be called "500 Lines or Less", and my chapter is devoted to the implementation of coroutines in asyncio and Python 3.

Announcing PyMongo 3

PyMongo 3.0 is a partial rewrite of the Python driver for MongoDB. More than six years after the first release of the driver, this is the biggest release in PyMongo's history. Bernie Hackett, Luke Lovett, Anna Herlihy, and I are proud of its many improvements and eager for you to try it out. I will shoehorn the major improvements into four shoes: conformance, responsiveness, robustness, and modernity.

(This article will be cross-posted on the MongoDB Blog.)


Conformance

The motivation for PyMongo's overhaul is to supersede or remove its many idiosyncratic APIs. We want you to have a clean interface that is easy to learn and closely matches the interfaces of our other drivers.

CRUD API

Mainly, "conformance" means we have implemented the same interface for create, read, update, and delete operations as the other drivers have, as standardized in Craig Wilson's CRUD API Spec. The familiar old methods work the same in PyMongo 3, but they are deprecated:

  • save
  • insert
  • update
  • remove
  • find_and_modify

These methods were vaguely named. For example, update updates or replaces some or all matching documents depending on its arguments. The arguments to save and remove are likewise finicky, and the many options for find_and_modify are intimidating. Other MongoDB drivers do not have exactly the same arguments in the same order for all these methods. If you or other developers on your team are using a driver from a different language, it makes life a lot easier to have consistent interfaces.

The new CRUD API names its methods like update_one, insert_many, find_one_and_delete: they say what they mean and mean what they say. Even better, all MongoDB drivers have exactly the same methods with the same arguments. See the spec for details.

One Client Class

In the past we had three client classes: Connection for any one server, and ReplicaSetConnection to connect to a replica set. We also had a MasterSlaveConnection that could distribute reads to slaves in a master-slave set. In November 2012 we created new classes, MongoClient and MongoReplicaSetClient, with better default settings, so now PyMongo had five clients! Even more confusingly, MongoClient could connect to a set of mongos servers and do hot failover.

As I wrote earlier, the fine distinctions between the client classes baffled users. And the set of clients we provided did not conform with other drivers. But since PyMongo is among the most-used of all Python libraries we waited long, and thought hard, before making major changes.

The day has come. MongoClient is now the one and only client class for a single server, a set of mongoses, or a replica set. It includes the functionality that had been split into MongoReplicaSetClient: it can connect to a replica set, discover all its members, and monitor the set for stepdowns, elections, and reconfigs. MongoClient now also supports the full ReadPreference API. MongoReplicaSetClient lives on for a time, for compatibility's sake, but new code should use MongoClient exclusively. The obsolete Connection, ReplicaSetConnection, and MasterSlaveConnection are removed.

The options you pass to MongoClient in the URI now completely control the client's behavior:

>>> # Connect to one standalone, mongos, or replica set member.
>>> client = MongoClient('mongodb://server')
>>>
>>> # Connect to a replica set.
>>> client = MongoClient(
...     'mongodb://member1,member2/?replicaSet=my_rs')
>>>
>>> # Load-balance among mongoses.
>>> client = MongoClient('mongodb://mongos1,mongos2')

This is exciting because PyMongo applications are now so easy to deploy: your code simply loads a MongoDB URI from an environment variable or config file and passes it to a MongoClient. Code and configuration are cleanly separated. You can move smoothly from your laptop to a test server to the cloud, simply by changing the URI.

Non-Conforming Features

PyMongo 2 had some quirky features it did not share with other drivers. For one, we had a copy_database method that only one other driver had, and which almost no one used. It was hard to maintain and we believe you want us to focus on the features you use, so we removed it.

A more pernicious misfeature was the start_request method. It bound a thread to a socket, which hurt performance without actually guaranteeing monotonic write consistency. It was overwhelmingly misused, too: new PyMongo users naturally called start_request before starting a request, but in truth the feature had nothing to do with its name. For the history and details, including some entertaining (in retrospect) tales of Python threadlocal bugs, see my article on the removal of start_request.

Finally, the Python team rewrote our distributed-systems internals to conform to the new standards we have specified for all our drivers. But if you are a Python programmer you may care only a little that the new code conforms to a spec; it is more interesting to you that the new code is responsive and robust.

Responsiveness

PyMongo 3's MongoClient can connect to a single server, a replica set, or a set of mongoses. It finds servers and reacts to changing conditions according to the Server Discovery And Monitoring spec, and it chooses which server to use for each operation according to the Server Selection Spec. David Golden and I explained these specs in general in the linked articles, but I can describe PyMongo’s implementation here.

Replica Set Discovery And Monitoring

In PyMongo 2, MongoReplicaSetClient used a single background thread to monitor all replica set members in series. So a slow or unresponsive member could block the thread for some time before the thread moved on to discover information about the other members, like their network latencies or which member is primary. If your application was waiting for that information—say, to write to the new primary after an election—these delays caused unneeded seconds of downtime.

When PyMongo 3's new MongoClient connects to a replica set it starts one thread per mongod server. The threads fan out to connect to all members of the set in parallel, and they start additional threads as they discover more members. As soon as any thread discovers the primary, your application is unblocked, even while the monitor threads collect more information about the set. This new design improves PyMongo's response time tremendously. If some members are slow or down, or you have many members in your set, PyMongo's discovery is still just as fast.
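In miniature, the fan-out looks like this (check_server and on_primary are invented stand-ins, not PyMongo internals; the real monitors also loop forever and discover new members along the way):

```python
import threading

def discover(seeds, check_server, on_primary):
    # Start one monitor thread per seed address. Whichever thread
    # finds the primary first unblocks the application via on_primary,
    # while the other threads keep gathering information.
    def monitor(host):
        if check_server(host) == 'primary':
            on_primary(host)
    for host in seeds:
        threading.Thread(target=monitor, args=(host,), daemon=True).start()
```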

I explained the new design in Server Discovery And Monitoring In Next Generation MongoDB Drivers, and I'll actually demonstrate it in my MongoDB World talk, Drivers And High Availability: Deep Dive.

Mongos Load-Balancing

Our multi-mongos behavior is improved, too. A MongoClient can connect to a set of mongos servers:

>>> # Two mongoses.
>>> client = MongoClient('mongodb://mongos1,mongos2')

The behavior in PyMongo 2 was "high availability": the client connected to the lowest-latency mongos in the list, and used it until a network error prompted it to re-evaluate their latencies and reconnect to one of them. If the driver chose unwisely at first, it stayed pinned to a higher-latency mongos for some time. In PyMongo 3, the background threads monitor the client's network latency to all the mongoses continuously, and the client distributes operations evenly among those with the lowest latency. See mongos Load Balancing for more information.
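The selection rule can be sketched like this (the 15 ms window echoes the server selection spec's localThresholdMS default; the function and names here are illustrative, not PyMongo's API):

```python
import random

def select_mongos(latencies_ms, local_threshold_ms=15):
    # Keep every mongos whose round-trip time is within
    # local_threshold_ms of the fastest, then spread load by
    # choosing among those candidates at random.
    fastest = min(latencies_ms.values())
    candidates = [host for host, rtt in latencies_ms.items()
                  if rtt - fastest <= local_threshold_ms]
    return random.choice(candidates)
```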

Throughput

Besides PyMongo's improved responsiveness to changing conditions in your deployment, its throughput is better too. We have written a faster and more memory-efficient pure-Python BSON module, which is particularly important for PyPy, and made substantial optimizations in our C extensions.

Robustness

Disconnected Startup

The first change you may notice is that MongoClient's constructor no longer blocks while connecting. It does not raise ConnectionFailure if it cannot connect:

>>> client = MongoClient('mongodb://no-host.com')
>>> client
MongoClient('no-host.com', 27017)

The constructor returns immediately and launches the connection process on background threads. Of course, foreground operations might time out:

>>> client.db.collection.find_one()
AutoReconnect: No servers found yet

Meanwhile, the client's background threads keep trying to reach the server. This is a big win for web applications that use PyMongo—in a crisis, your app servers might be restarted while your MongoDB servers are unreachable. Your applications should not throw an exception at startup, when they construct the client object. In PyMongo 3 the client can now start up disconnected; it tries to reach your servers until it succeeds.

On the other hand, if you wrote code like this to check whether mongod is up:

>>> try:
...     MongoClient()
...     print("it's working")
... except pymongo.errors.ConnectionFailure:
...     print("please start mongod")
...

This will no longer work, since the constructor never throws ConnectionFailure. Instead, choose how long to wait before giving up by setting serverSelectionTimeoutMS:

>>> client = MongoClient(serverSelectionTimeoutMS=500)
>>> try:
...     client.admin.command('ping')
...     print("it's working")
... except pymongo.errors.ConnectionFailure:
...     print("please start mongod")

One Monitor Thread Per Server

Even during regular operations, connections may hang up or time out, and servers go down for periods; monitoring each on a separate thread keeps PyMongo abreast of changes before they cause errors. You will see fewer network exceptions than with PyMongo 2, and the new driver will recover much faster from the unexpected.

Thread Safety

Another source of fragility in PyMongo 2 was APIs that were not designed for multithreading. Too many of PyMongo's options could be changed at runtime. For example, if you created a database handle:

>>> db = client.test

...and changed the handle's read preference on a thread, the change appeared in all threads:

>>> def thread_fn():
...     db.read_preference = ReadPreference.SECONDARY

Making these options mutable encouraged such mistakes, so we made them immutable. Now you configure handles to databases and collections using thread-safe APIs:

>>> def thread_fn():
...     my_db = client.get_database(
...         'test',
...         read_preference=ReadPreference.SECONDARY)

Modernity

Last, and most satisfying to the team, we have completed our transition to modern Python.

While PyMongo 2 already supported the latest version of Python 3, it did so tortuously by executing auto2to3 on its source at install time. This made it too hard for the open source community to contribute to our code, and it led to some absurdly obscure bugs. We have updated to a single code base that is compatible with Python 2 and 3. We had to drop support for the ancient Pythons 2.4 and 2.5; we were encouraged by recent download statistics to believe that these zombie Python versions are finally at rest.

Motor

Motor, my async driver for Tornado and MongoDB, has not yet been updated to wrap PyMongo 3. The current release, Motor 0.4, wraps PyMongo 2.8. Motor's still compatible with the latest MongoDB server version, but it lacks the new PyMongo 3 features—for example, it doesn't have the new CRUD API, and it still monitors replica set members serially instead of in parallel. The next release, Motor 0.5, still won't wrap PyMongo 3, because Motor 0.5 will focus on asyncio support instead. It won't be until version 0.6 that I update Motor with the latest PyMongo changes.

Talk Python To Me Podcast: Python and MongoDB

I was honored to be Michael Kennedy's guest for the second episode of his new podcast. We talked about my career as a Python programmer and how I came to work for MongoDB. We discussed PyMongo, Motor, and Monary.

You should subscribe to Michael's podcast. His conversation with Nicola Iarocci in the previous episode was great, too, and the upcoming interviews promise to be informative.