Motor

Today is a good day: I've published a beta of Motor, my async Python driver for MongoDB. This version is the biggest upgrade yet. Help me beta-test it! Install with:

python -m pip install --pre motor==0.5b0

Motor 0.5 still depends on PyMongo 2.8.0 exactly. That PyMongo version is outdated, I know, but I've decided not to tackle that issue right now.

You'll forgive me, because this Motor release is huge:

asyncio

Motor can now integrate with asyncio, as an alternative to Tornado. My gratitude to Rémi Jolin, Andrew Svetlov, and Nikolay Novik for their huge contributions to Motor's asyncio integration.

The Tornado and asyncio APIs are kindred. Here is Motor with Tornado:

# Tornado API
from tornado import gen, ioloop
from motor.motor_tornado import MotorClient

@gen.coroutine
def f():
    result = yield client.db.collection.insert({'_id': 1})
    print(result)

client = MotorClient()
ioloop.IOLoop.current().run_sync(f)

And here's the new asyncio integration:

import asyncio
from motor.motor_asyncio import AsyncIOMotorClient

@asyncio.coroutine
def f():
    result = yield from client.db.collection.insert({'_id': 1})
    print(result)

client = AsyncIOMotorClient()
asyncio.get_event_loop().run_until_complete(f())

Unlike Tornado, asyncio does not include an HTTP implementation, much less a web framework. For those features, use Andrew Svetlov's aiohttp package. I wrote you a tiny example web application with Motor and aiohttp.

aggregate

MotorCollection.aggregate now returns a cursor by default, and the cursor is returned immediately without a yield. The old syntax is no longer supported:

# Motor 0.4 and older, no longer supported.
cursor = yield collection.aggregate(pipeline, cursor={})
while (yield cursor.fetch_next):
    doc = cursor.next_object()
    print(doc)

In Motor 0.5, simply do:

# Motor 0.5: no "cursor={}", no "yield".
cursor = collection.aggregate(pipeline)
while (yield cursor.fetch_next):
    doc = cursor.next_object()
    print(doc)

In asyncio this uses yield from instead:

# Motor 0.5 with asyncio.
cursor = collection.aggregate(pipeline)
while (yield from cursor.fetch_next):
    doc = cursor.next_object()
    print(doc)

Python 3.5

Motor is now compatible with Python 3.5, which required some effort. It was hard because Motor doesn't just work with your coroutines, it uses coroutines internally to implement some of its own features, like MotorClient.open and MotorGridFS.put. I had a method for writing coroutines that worked in Python 2.6 through 3.4, but 3.5 finally broke it. There is no single way to return a value from a Python 3.5 native coroutine or a Python 2 generator-based coroutine, so all Motor internal coroutines that return values were rewritten with callbacks. (See commit message dc19418c for an explanation.)

async and await

This is the payoff for my Python 3.5 effort. Motor works with native coroutines, written with the async and await syntax:

async def f():
    await collection.insert({'_id': 1})

Cursors from MotorCollection.find, MotorCollection.aggregate, or MotorGridFS.find can be iterated elegantly and very efficiently in native coroutines with async for:

async def f():
    async for doc in collection.find():
        print(doc)

How efficient is this? For a collection with 10,000 documents, this old-style code takes 0.14 seconds on my system:

# Motor 0.5 with Tornado.
@gen.coroutine
def f():
    cursor = collection.find()
    while (yield cursor.fetch_next):
        doc = cursor.next_object()
        print(doc)

The following code, which simply replaces gen.coroutine and yield with async and await, performs about the same:

# Motor 0.5 with Tornado, using async and await.
async def f():
    cursor = collection.find()
    while (await cursor.fetch_next):
        doc = cursor.next_object()
        print(doc)

But with async for it takes 0.04 seconds, three times faster!

# Motor 0.5 with Tornado, using async for.
async def f():
    cursor = collection.find()
    async for doc in cursor:
        print(doc)

However, MotorCursor's to_list still reigns:

# Motor 0.5 with Tornado, using to_list.
async def f():
    cursor = collection.find()
    docs = await cursor.to_list(length=100)
    while docs:
        for doc in docs:
            print(doc)
        docs = await cursor.to_list(length=100)

The function with to_list is twice as fast as async for, but it's ungraceful and requires you to choose a chunk size. I think that async for is stylish, and fast enough for most uses.


Try Me!

I haven't always published betas before Motor releases, but this time is different. The asyncio integration is brand new. And since it required pervasive refactoring of Motor's core, the existing Tornado integration is rewritten as well. Python 3.5 support required yet another internal overhaul. I'm anxious to get early reports of all my new code in the wild.

Additionally, the change to aggregate is an API break. (There are also two subtler changes, see the changelog.) So I'm giving you a chance to opt in explicitly with pip install --pre before I make Motor 0.5 official.

So please: try it out! Install the beta:

python -m pip install --pre motor==0.5b0

Test your application with the new code. If you find issues, file a bug and I'll respond promptly. And if the beta goes smoothly, don't be silent!—tweet at me @jessejiryudavis and tell me! It's the only way I'll know the beta is working for you.