Motor (yes, that's my non-blocking MongoDB driver for Tornado) has three methods for iterating a cursor:
to_list, each, and next_object. I chose these three methods to match the Node.js driver's methods, but in Python they all have problems.
I'm writing to announce an improvement I made to
next_object and to ask you for suggestions for further improvement.
Update: Here are the improvements I made to the API in response to your critique.
MotorCursor.to_list is clearly the most convenient: it buffers up all the results in memory and passes them to the callback:

    from tornado import gen
    import motor

    # collection is assumed to be a Motor collection
    @gen.engine
    def f():
        results = yield motor.Op(collection.find().to_list)
        print results

But it's dangerous, because you don't know for certain how big the results will be unless you set an explicit limit. In the docs I exhort you to set a limit before calling
to_list. Should I raise an exception if you don't, or just let the user beware?
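For the sake of discussion, here's a rough sketch of what the raise-an-exception option might look like. ToyCursor is a hypothetical in-memory stand-in that I made up for illustration, not Motor's actual MotorCursor, and it calls the callback immediately instead of doing real I/O:

```python
class ToyCursor(object):
    """Hypothetical in-memory stand-in for a cursor; not Motor's real API."""

    def __init__(self, documents):
        self._documents = list(documents)
        self._limit = None

    def limit(self, n):
        # Return self so calls can be chained, like find().limit(n)
        self._limit = n
        return self

    def to_list(self, callback):
        if self._limit is None:
            # The strict option: refuse to buffer an unbounded result set
            raise ValueError('set a limit before calling to_list')

        # A real driver would do I/O here and call back later
        callback(self._documents[:self._limit], None)


results = []


def on_list(documents, error):
    results.extend(documents)


ToyCursor(range(100)).limit(3).to_list(callback=on_list)
print(results)  # [0, 1, 2]

try:
    ToyCursor(range(100)).to_list(callback=on_list)
except ValueError as e:
    print(e)  # set a limit before calling to_list
```

The strict version fails fast and loudly, at the cost of annoying anyone who knows their result set is small.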
each takes a callback that is executed once for every document. This actually looks fairly elegant in Node.js, but because Python doesn't do multi-statement anonymous functions it looks like ass in Python, with control jumping forward and backward in the code:

    def each(document, error):
        if error:
            raise error
        elif document:
            print document
        else:
            # Iteration complete
            print 'done'

    collection.find().each(callback=each)

Python's generators allow us to do inline callbacks with
tornado.gen, which makes up for the lack of anonymous functions.
each doesn't work with the generator style, though, so I don't think many people will use it. Since
tornado.gen is the most straightforward way to write Tornado apps, I designed
next_object for you to use with
tornado.gen, like this:

    @gen.engine
    def f():
        cursor = collection.find()
        while cursor.alive:
            document = yield motor.Op(cursor.next_object)
            print document

        print 'done'

This loop looks pretty nice, right? The improvement I just committed is that
next_object prefetches the next batch whenever needed to ensure that
alive is correct: that is, alive is
True if the cursor has more documents, and False otherwise.
Problem is, just because alive is
True doesn't truly guarantee that
next_object will actually return a document. The first call returns None if
find matched no documents at all, so a proper loop is more like:

    @gen.engine
    def f():
        cursor = collection.find()
        while cursor.alive:
            document = yield motor.Op(cursor.next_object)
            if document:
                print document
            else:
                # No results at all
                break

This is looking less nice. A blocking driver could have reasonable solutions like making
cursor.alive actually fetch the first batch of results and return
False if there are none. But a non-blocking driver needs to take a callback for every method that does I/O. I'm considering introducing a
MotorCursor.has_next method that takes a callback:

    @gen.engine
    def f():
        cursor = collection.find()
        while (yield motor.Op(cursor.has_next)):
            # Now we know for sure that document isn't None
            document = yield motor.Op(cursor.next_object)
            print document

This will be a core idiom for Motor applications; it should be as easy as possible to use.
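To make the proposal concrete, here's a toy model of how has_next could behave. Everything in it is an assumption for illustration: ToyCursor is a hypothetical in-memory stand-in rather than Motor's MotorCursor, its callbacks fire immediately instead of after real I/O, and run is a tiny synchronous substitute for tornado.gen, so there's no motor.Op wrapper:

```python
class ToyCursor(object):
    """Hypothetical in-memory stand-in for a cursor; not Motor's real API."""

    def __init__(self, documents):
        self._pending = list(documents)  # pretend these live on the server
        self._fetched = False

    @property
    def alive(self):
        # Before the first fetch we can't know the result set is empty
        return not self._fetched or bool(self._pending)

    def has_next(self, callback):
        # A real driver would fetch a batch here if its buffer were empty
        self._fetched = True
        callback(bool(self._pending), None)

    def next_object(self, callback):
        self._fetched = True
        document = self._pending.pop(0) if self._pending else None
        callback(document, None)


def run(generator):
    """Drive a generator that yields functions taking callback(result, error)."""
    box = []

    def callback(result, error):
        if error:
            raise error
        box.append(result)

    try:
        method = next(generator)
        while True:
            method(callback)
            # Resume the generator with the result the callback received
            method = generator.send(box.pop())
    except StopIteration:
        pass


def collect_all(cursor, seen):
    while (yield cursor.has_next):
        # Now we know for sure that document isn't None
        document = yield cursor.next_object
        seen.append(document)


empty = ToyCursor([])
print(empty.alive)  # True, even though nothing matched

seen = []
run(collect_all(empty, seen))
print(seen)  # []

seen = []
run(collect_all(ToyCursor([{'_id': 1}, {'_id': 2}]), seen))
print(seen)  # [{'_id': 1}, {'_id': 2}]
```

With has_next the loop body never has to check for None, even when the query matches nothing.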
What do you think?