Motor

Motor (yes, that's my non-blocking MongoDB driver for Tornado) has three methods for iterating a cursor: to_list, each, and next_object. I chose these three methods to match the Node.js driver's methods, but in Python they all have problems.

I'm writing to announce an improvement I made to next_object and to ask you for suggestions for further improvement.

Update: Here are the improvements I made to the API in response to your critique.

to_list

MotorCursor.to_list is clearly the most convenient: it buffers up all the results in memory and passes them to the callback:

@gen.engine
def f():
    results = yield motor.Op(collection.find().to_list)
    print results

But it's dangerous, because you don't know for certain how big the results will be unless you set an explicit limit. In the docs I exhort you to set a limit before calling to_list. Should I raise an exception if you don't, or just let the user beware?
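For illustration, the stricter of those two options could look roughly like the sketch below. GuardedCursor is a hypothetical stand-in written for this example, not Motor's actual MotorCursor, and it calls its callback synchronously instead of doing real I/O.

```python
class GuardedCursor(object):
    """Hypothetical stand-in for MotorCursor; not Motor's real behavior."""

    def __init__(self, documents):
        self._documents = list(documents)
        self._limit = 0  # 0 means "no limit", as in MongoDB

    def limit(self, n):
        self._limit = n
        return self  # allow chaining: cursor.limit(3).to_list(callback)

    def to_list(self, callback):
        # Refuse to buffer a result set of unknown size.
        if not self._limit:
            raise ValueError("set a limit before calling to_list")
        callback(self._documents[:self._limit], None)


results = []
cursor = GuardedCursor(range(100)).limit(3)
cursor.to_list(lambda docs, error: results.append(docs))
```

The trade-off is that a guard like this turns a silent memory risk into a loud failure, at the cost of annoying anyone who knows their result set is small.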

each

MotorCursor's each takes a callback which is executed once for every document. This actually looks fairly elegant in Node.js, but because Python's anonymous functions (lambdas) are limited to a single expression, the callback has to be a named function defined ahead of time. The result looks like ass in Python, with control jumping forward and backward in the code:

def each(document, error):
    if error:
        raise error
    elif document:
        print document
    else:
        # Iteration complete
        print 'done'

collection.find().each(callback=each)

Python's generators allow us to do inline callbacks with tornado.gen, which makes up for the lack of anonymous functions. each doesn't work with the generator style, though, so I don't think many people will use each.
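To show the general idea behind inline callbacks, here is a toy driver that resumes a generator each time a callback fires. It is only a sketch of the technique: run and fetch_answer are names invented for this example, and this is not how tornado.gen is actually implemented (no error handling, and the callback fires synchronously).

```python
def run(generator_function):
    """Toy driver: the generator yields functions that accept a callback."""
    generator = generator_function()

    def step(value=None):
        try:
            # Resume the generator, handing it the last callback result.
            task = generator.send(value)
        except StopIteration:
            return  # the generator finished
        # Start the next operation; step() runs again when it completes.
        task(step)

    step()


def fetch_answer(callback):
    # Stand-in for an asynchronous operation.
    callback(42)


output = []


def f():
    result = yield fetch_answer  # reads like a blocking call
    output.append(result)


run(f)
```

The body of f reads top to bottom, which is exactly what each, with its separate callback function, can't offer.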

next_object

Since tornado.gen is the most straightforward way to write Tornado apps, I designed next_object for you to use with tornado.gen, like this:

@gen.engine
def f():
    cursor = collection.find()
    while cursor.alive:
        document = yield motor.Op(cursor.next_object)
        print document

    print 'done'

This loop looks pretty nice, right? The improvement I just committed is that next_object prefetches the next batch whenever needed to ensure that alive is correct—that is, alive is True if the cursor has more documents, False otherwise.

Problem is, cursor.alive being True doesn't guarantee that next_object will return a document. Before the first batch arrives, alive is True even if find matched no documents at all, and in that case the first call to next_object returns None. So a proper loop is more like:

@gen.engine
def f():
    cursor = collection.find()
    while cursor.alive:
        document = yield motor.Op(cursor.next_object)
        if document:
            print document
        else:
            # No results at all
            break

This is looking less nice. A blocking driver could have reasonable solutions like making cursor.alive actually fetch the first batch of results and return False if there are none. But a non-blocking driver needs to take a callback for every method that does I/O. I'm considering introducing a MotorCursor.has_next method that takes a callback:

@gen.engine
def f():
    cursor = collection.find()
    # has_next is the proposed method; it doesn't exist in Motor yet
    while (yield motor.Op(cursor.has_next)):
        # Now we know for sure that document isn't None
        document = yield motor.Op(cursor.next_object)
        print document

This will be a core idiom for Motor applications; it should be as easy as possible to use.
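To make the proposed semantics concrete, here is a minimal sketch of how has_next could behave. SketchCursor and drain are hypothetical names for this example, the "batches" stand in for server round trips, and callbacks fire synchronously, so none of this is Motor's real code.

```python
class SketchCursor(object):
    """Hypothetical stand-in for MotorCursor with a has_next method."""

    def __init__(self, batches):
        self._batches = list(batches)  # batches still "on the server"
        self._buffer = []              # documents already fetched
        self.started = False           # has the first batch been requested?

    @property
    def alive(self):
        # True before the first fetch, even if the query matched nothing.
        return (not self.started) or bool(self._buffer) or bool(self._batches)

    def _fetch(self):
        self.started = True
        if self._batches:
            self._buffer.extend(self._batches.pop(0))

    def has_next(self, callback):
        # Fetch until a document is buffered or the cursor is exhausted.
        while not self._buffer and (not self.started or self._batches):
            self._fetch()
        callback(bool(self._buffer), None)

    def next_object(self, callback):
        def on_has_next(has_more, error):
            callback(self._buffer.pop(0) if has_more else None, error)
        self.has_next(on_has_next)


def drain(cursor):
    """Collect every document; works only because callbacks are synchronous."""
    documents = []
    state = {}

    def store(key):
        def callback(result, error):
            if error:
                raise error
            state[key] = result
        return callback

    while True:
        cursor.has_next(store('more'))
        if not state['more']:
            break
        cursor.next_object(store('document'))
        documents.append(state['document'])
    return documents


empty = SketchCursor([])
alive_before_fetch = empty.alive   # True, although there are no documents
no_documents = drain(empty)        # the loop body never runs
some_documents = drain(SketchCursor([[{'_id': 1}, {'_id': 2}], [{'_id': 3}]]))
```

Unlike alive, has_next can't answer until the first batch has arrived, which is why it has to take a callback.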

What do you think?

A. Jesse Jiryu Davis

Motor: Iterating Over Results
