A. Jesse Jiryu Davis

Motor 0.2 Released

Version 0.2 of Motor, the asynchronous MongoDB driver for Python and Tornado, has been released.

This release is compatible with MongoDB 2.6 and PyMongo 2.7. It dramatically improves interoperability with Tornado coroutines, includes support for non-blocking DNS, and adds numerous smaller features.

Links:

If you encounter any issues, please file them in our bug tracker.

That's it! With the Motor release behind me, I'm looking forward to enjoying PyCon, and talking about async at 3:15pm tomorrow.

Announcing Motor 0.2 release candidate

I'm excited to offer you Motor 0.2, release candidate zero. Motor is my non-blocking driver for MongoDB and Tornado.

The changes from Motor 0.1 to 0.2 are epochal. They were motivated primarily by three events:

  • Motor wraps PyMongo, and PyMongo has improved substantially.
  • MongoDB 2.6 is nearly done, and Motor has added features to support it.
  • Tornado's support for coroutines and for non-blocking DNS has improved, and Motor 0.2 takes advantage of this.

Please read the changelog before upgrading. There are backwards-breaking API changes; you must update your code. I tried to make the instructions clear and the immediate effort small. A summary of the changes is in my post, "the road to 0.2".

Once you're done reading, upgrade:

pip install pymongo==2.7
pip install https://github.com/mongodb/motor/archive/0.2rc0.zip

The owner's manual is on ReadTheDocs. At the time of this writing, Motor 0.2's docs are in the "latest" branch:

http://motor.readthedocs.org/en/latest/

...and Motor 0.1's docs are in "stable":

http://motor.readthedocs.org/en/stable/

Enjoy! If you find a bug or want a feature, report it. If I don't hear of any bugs in the next week I'll make the release official.

In any case, tweet me if you're building something nifty with Motor. I want to hear from you.

PyMongo 2.7 Has Shipped

[Image: Amethystine scrub python. Source: inrideo on Flickr]

I announce with satisfaction that we've released PyMongo 2.7, the successor to PyMongo 2.6.3. The bulk of the driver's changes are to support MongoDB 2.6, which is currently a release candidate. The newest MongoDB has an enhanced wire protocol and some big new features, so PyMongo 2.7 is focused on supporting it. However, the driver still supports server versions as old as 1.8.

Read my prior post for a full list of the features and improvements in PyMongo. Since I wrote that, we've fixed some compatibility issues with MongoDB 2.6, dealt with recent changes to the nose and setuptools packages, and made a couple of memory optimizations.

Motor 0.2 is about to ship, as well. I'll give the details in my next post.

What's next for PyMongo? We now embark on a partial rewrite, which will become PyMongo 3.0. The next-generation driver will delete many deprecated APIs: safe will disappear, since it was deprecated in favor of w=1 years ago. Connection will walk off into the sunset, giving way to MongoClient. We'll make a faster and more thread-safe core for PyMongo, and we'll expose a clean API so Motor and ODMs can wrap PyMongo more neatly.
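
To make the deprecations concrete, here's a hedged sketch of the old and new spellings in PyMongo 2.x terms; treat it as illustrative, not as the finished 3.0 API:

from pymongo import MongoClient

# Deprecated spelling, slated for removal:
#     from pymongo import Connection
#     connection = Connection('localhost', 27017)
#     connection.test.collection.insert({'x': 1}, safe=True)

# Its replacement: MongoClient acknowledges writes by default, and write
# concern is expressed with options like w=1 rather than safe=True.
client = MongoClient('localhost', 27017)
client.test.collection.insert({'x': 1}, w=1)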

We'll discard PyMongo's current C extension for BSON-handling. We'll replace it with libbson, a common codec that our C team is building. If you're handling BSON in PyPy, we aim to give you a much faster pure-Python codec there, too.

An Enlightening Failure

This year I plan to rewrite PyMongo's BSON decoder. The decoder is written in C, and it's pretty fast, but I had a radical idea for how to make it faster. That idea turned out to be wrong, although it took me a long time to discover that.

Discovering I'm wrong is the best way to learn. The second-best way is by writing. So I'll multiply the two by writing a story about my wrong idea.

The Story

Currently, when PyMongo decodes a buffer of BSON documents, it creates a Python dict (hashtable) for each BSON document. It returns the dicts in a list.
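
For concreteness, here's a hedged example of that current behavior, using the bson package that ships with PyMongo (the sample documents are made up):

import bson

# Two BSON documents concatenated in one buffer, as they arrive off the wire.
data = bson.BSON.encode({'x': 1}) + bson.BSON.encode({'y': 2})

# decode_all parses the whole buffer eagerly and returns a list of dicts.
print bson.decode_all(data)   # [{u'x': 1}, {u'y': 2}]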

My radical idea was to make a maximally lazy decoder. I wouldn't decode all the documents at once; I would decode each document just in time, as you iterate. Even more radically, I wouldn't convert each document into a dict. Instead, each document would only know its offset in the BSON buffer. When you access a field in the document, like this:

document["fieldname"]

...I wouldn't do a hashtable lookup anymore. I'd do a linear search through the BSON. I thought this approach might be faster, since the linear search would usually be fast, and I'd avoid the overhead of creating the hashtable. If a document was frequently accessed or had many fields, I'd eventually "inflate" it into a dict.
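
Here's a rough pure-Python sketch of the idea. The real prototype was in C and worked directly on the BSON buffer; the class name, threshold, and field representation below are invented for illustration:

class LazyDocument(object):
    INFLATE_AFTER = 10  # arbitrary threshold for this sketch

    def __init__(self, fields):
        # 'fields' stands in for the document's slice of the BSON buffer;
        # here it's simply a list of (name, value) pairs.
        self._fields = fields
        self._dict = None
        self._accesses = 0

    def __getitem__(self, key):
        if self._dict is not None:
            return self._dict[key]

        self._accesses += 1
        if self._accesses > self.INFLATE_AFTER or len(self._fields) > 50:
            # Frequently accessed or wide document: inflate to a real dict.
            self._dict = dict(self._fields)
            return self._dict[key]

        # The common case: a linear search, no hashtable at all.
        for name, value in self._fields:
            if name == key:
                return value
        raise KeyError(key)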

I coded up a prototype in C, benchmarked it, and it was eight times faster than the current code. I rejoiced, and began to develop it into a full-featured decoder.

At some point I applied our unicode tests to my decoder, and I realized I was using PyString_FromString to decode strings, when I should have been using PyUnicode_DecodeUTF8. (I was targeting only Python 2 at this point.) I added the call to PyUnicode_DecodeUTF8, and my decoder started passing our unicode tests. I continued adding features.

The next day I benchmarked again, and my code was no longer any faster than the current decoder. I didn't know which change had caused the slowdown, so I learned how to use callgrind and tried all sorts of things and went a little crazy. Eventually I used git bisect, and I was enlightened: my prototype had only been fast as long as it didn't decode UTF-8 properly. Once I had fixed that, I had the same speed as the current PyMongo.

Lessons Learned

  1. The cost of PyMongo's BSON decoding is typically dominated by UTF-8 decoding. There's no way to avoid it, and it's already optimized like crazy.
  2. Python's dict is really fast for PyMongo's kind of workload. It's not worth trying to beat it.
  3. When I care about speed, I need to run my benchmarks on each commit. I should use git bisect as the first resort, not the last.
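
One way to make git bisect routine is to script the benchmark and let git drive the search; the script name and the "good" revision here are hypothetical:

$ git bisect start
$ git bisect bad                # the current, slow commit
$ git bisect good some_old_tag  # a commit known to be fast
$ git bisect run ./bench.sh     # bench.sh exits non-zero when the benchmark regresses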

This is disappointing, but I've learned a ton about the Python C API, BSON, and callgrind. On my next attempt to rewrite the decoder, I won't forget my hard-won lessons.

Testing Network Errors With MongoDB

Someone asked on Twitter today for a way to trigger a connection failure between MongoDB and the client. This would be terribly useful when you're testing your application's handling of network hiccups.

You have options: you could use mongobridge to proxy between the client and the server, and at just the right moment, kill mongobridge.

Or you could use packet-filtering tools to accomplish the same thing: iptables on Linux, or ipfw or pfctl on Mac and BSD. You could use one of these tools to block MongoDB's port at the proper moment and unblock it afterward.
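
For example, on Linux something like this would block and then unblock MongoDB's default port (run it as root, and adjust the port and chain to your setup):

$ iptables -A INPUT -p tcp --dport 27017 -j DROP    # drop incoming MongoDB traffic
$ iptables -D INPUT -p tcp --dport 27017 -j DROP    # remove the rule afterward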

There's yet another option, not widely known, that you might find simpler: use a MongoDB "failpoint" to break your connection.

Failpoints are our internal mechanism for triggering faults in MongoDB so we can test their consequences. Read about them on Kristina's blog. They're not meant for public consumption, so you didn't hear about them from me.

The first step is to start MongoDB with the special command-line argument:

mongod --setParameter enableTestCommands=1

Next, log in with the mongo shell and tell the server to abort the next two network operations:

> db.adminCommand({
...   configureFailPoint: 'throwSockExcep',
...   mode: {times: 2}
... })
2014-03-20T20:31:42.162-0400 trying reconnect to 127.0.0.1:27017 (127.0.0.1) failed

The server obeys you instantly, before it even replies, so the command itself appears to fail. But fear not: you've simply seen the first of the two network errors you asked for. You can trigger the next error with any operation:

> db.collection.count()
2014-03-20T20:31:48.485-0400 trying reconnect to 127.0.0.1:27017 (127.0.0.1) failed

The third operation succeeds:

> db.collection.count()
2014-03-20T21:07:38.742-0400 trying reconnect to 127.0.0.1:27017 (127.0.0.1) failed
2014-03-20T21:07:38.742-0400 reconnect 127.0.0.1:27017 (127.0.0.1) ok
1

There's a final "failed" message that I don't understand, but the shell reconnects and the command returns the answer, "1".

You could use this failpoint when testing a driver or an application. If you don't know exactly how many operations you need to break, you could set times to 50 and, at the end of your test, continue attempting to reconnect until you succeed.
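
As a sketch, here's how a PyMongo-based test might drive the failpoint. It assumes mongod was started with enableTestCommands=1, and the retry loop is only illustrative:

from pymongo import MongoClient
from pymongo.errors import AutoReconnect

client = MongoClient('localhost', 27017)

try:
    # Break the next two network operations.
    client.admin.command('configureFailPoint', 'throwSockExcep',
                         mode={'times': 2})
except AutoReconnect:
    # The server cuts the connection before replying, so configuring the
    # failpoint itself counts as the first of the two errors.
    pass

for attempt in range(50):
    try:
        client.test.collection.count()  # exercise your reconnect handling
        break
    except AutoReconnect:
        pass  # expected while the failpoint is still active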

Ugly, perhaps, but if you want a simple way to cause a network error, this could be a reasonable approach.

Waiting For Multiple Events With Tornado

Recently I saw a question on Stack Overflow about waiting for multiple events with a Tornado coroutine, until one of the events completes. The inquirer wanted to do something like this:

result = yield Any([future1, future2, future3])

If the middle future has resolved and the other two are still pending, the result should look like:

[None, "<some result>", None]

Tornado doesn't provide a class like Any. How would you implement one?

You could make a class that inherits from Future, and wraps a list of futures. The class waits until one of its futures resolves, then gives you the list of results:

from tornado.concurrent import Future


class Any(Future):
    def __init__(self, futures):
        super(Any, self).__init__()
        self.futures = futures
        for future in futures:
            # done_callback is defined just below.
            future.add_done_callback(self.done_callback)

    def done_callback(self, future):
        """Called when any future resolves."""
        try:
            self.set_result(self.make_result())
        except Exception as e:
            self.set_exception(e)

    def make_result(self):
        """A list of results.

        Includes None for each pending future, and a result for each
        resolved future. Raises an exception for the first future
        that has an exception.
        """
        return [f.result() if f.done() else None
                for f in self.futures]

    def clear(self):
        """Break reference cycle with any pending futures."""
        self.futures = None

Here's an example use of Any:

import time

from tornado import gen
from tornado.ioloop import IOLoop


@gen.coroutine
def delayed_msg(seconds, msg):
    yield gen.Task(IOLoop.current().add_timeout,
                   time.time() + seconds)
    raise gen.Return(msg)


@gen.coroutine
def f():
    start = time.time()
    future1 = delayed_msg(2, '2')
    future2 = delayed_msg(3, '3')
    future3 = delayed_msg(1, '1')

    # future3 will resolve first.
    results = yield Any([future1, future2, future3])
    end = time.time()
    print "finished in %.1f sec: %r" % (end - start, results)

    # Wait for any of the remaining futures.
    results = yield Any([future1, future2])
    end = time.time()
    print "finished in %.1f sec: %r" % (end - start, results)

IOLoop.current().run_sync(f)

As expected, this prints:

finished in 1.0 sec: [None, None, '1']
finished in 2.0 sec: ['2', None]

But you can see there are some complications with this approach. For one thing, if you want to wait for the rest of the futures after the first one resolves, it's complicated to construct the list of still-pending futures. I suppose you could do:

futures = [future1, future2, future3]
results = yield Any([f for f in futures
                     if not f.done()])

Not pretty. And not correct, either! There's a race condition: if a future is resolved in between consecutive executions of this code, you may never receive its result. On the first call, you get the result of some other future that resolves faster, but by the time you're constructing the list to pass to the second Any, your future is now "done" and you omit it from the list.

Another complication is the reference cycle: Any refers to each future, which refers to a callback which refers back to Any. For prompt garbage collection, you should call clear() on Any before it goes out of scope. This is very awkward.

Additionally, you can't distinguish between a pending future and a future that resolved to None. You'd need a special sentinel value distinct from None to represent a pending future.
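
If you went down that road, the fix would look something like this hedged variant of the first Any class, with a module-level sentinel (the names are invented):

_PENDING = object()  # a sentinel no real result will ever equal

class AnyWithSentinel(Any):
    def make_result(self):
        # Mark still-pending futures with the sentinel instead of None.
        return [f.result() if f.done() else _PENDING
                for f in self.futures]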

The final complication is the worst. If multiple futures are resolved and some of them have exceptions, there's no obvious way for Any to communicate all that information to you. Mixing exceptions and results in a list would be perverse.

Fortunately, there's a better way. We can make Any return just the first future that resolves, instead of a list of results:

class Any(Future):
    def __init__(self, futures):
        super(Any, self).__init__()
        for future in futures:
            future.add_done_callback(self.done_callback)

    def done_callback(self, future):
        self.set_result(future)

The reference cycle is gone, and the exception-handling question is answered: The Any class returns the whole future to you, instead of its result or exception. You can inspect it as you like.

It's also easy to wait for the remaining futures after some are resolved:

@gen.coroutine
def f():
    start = time.time()
    future1 = delayed_msg(2, '2')
    future2 = delayed_msg(3, '3')
    future3 = delayed_msg(1, '1')

    futures = set([future1, future2, future3])
    while futures:
        resolved = yield Any(futures)
        end = time.time()
        print "finished in %.1f sec: %r" % (
            end - start, resolved.result())
        futures.remove(resolved)

As desired, this prints:

finished in 1.0 sec: '1'
finished in 2.0 sec: '2'
finished in 3.0 sec: '3'

There's no race condition now. You can't miss a result, because you don't remove a future from the set unless you've received its result.

To test the exception-handling behavior, let's make a function that raises an exception after a delay:

@gen.coroutine
def delayed_exception(seconds, msg):
    yield gen.Task(IOLoop.current().add_timeout,
                   time.time() + seconds)
    raise Exception(msg)

Now, instead of returning a result, one of our futures will raise an exception:

@gen.coroutine
def f():
    start = time.time()
    future1 = delayed_msg(2, '2')
    # Exception!
    future2 = delayed_exception(3, '3')
    future3 = delayed_msg(1, '1')

    futures = set([future1, future2, future3])
    while futures:
        resolved = yield Any(futures)
        end = time.time()
        try:
            outcome = resolved.result()
        except Exception as e:
            outcome = e

        print "finished in %.1f sec: %r" % (
            end - start, outcome)
        futures.remove(resolved)

Now, the script prints:

finished in 1.0 sec: '1'
finished in 2.0 sec: '2'
finished in 3.0 sec: Exception('3',)

It took a bit of thinking, but our final Any class is simple. It lets you launch many concurrent operations and process them in the order they complete. Not bad.

How Thoroughly Are You Testing Your C Extensions?

You probably know how to find Python code that isn't exercised by your tests. Install coverage and run:

$ coverage run --source=SOURCEDIR setup.py test

Then, for a beautiful coverage report:

$ coverage html

But what about your C extensions? They're harder to write than Python, so you'd better make sure they're thoroughly tested. On Linux, you can use gcov. First, recompile your extension with the coverage hooks:

$ export CFLAGS="-coverage"
$ python setup.py build_ext --inplace

In your build directory (named like build/temp.linux-x86_64-2.7) you'll now see some files with the ".gcno" extension. These are gcov's data files. Run your tests again and the directory will fill up with ".gcda" files that contain statistics about which parts of your C code were run.

You have a number of ways to view this coverage information. I use Eclipse with the gcov plugin installed. (Eclipse CDT includes it by default.) Delightfully, Eclipse on my Mac understands coverage files generated on a Linux virtual machine, with no hassle at all.

lcov can make you some nice HTML reports. Run it like so:

$ lcov --capture --directory . --output-file coverage.info
$ genhtml coverage.info --output-directory out

Here's a portion of its report for PyMongo's BSON decoder:

[Image: lcov coverage report table for PyMongo's C extensions]

Our C code coverage is significantly lower than our Python coverage. This is mainly because such a large portion of the C code is devoted to error handling: it checks for every possible error, but we only trigger a subset of all possible errors in our tests.

A trivial example is in _write_regex_to_buffer, when we ensure the buffer is large enough to hold 4 more bytes. We check that realloc, if it was called, succeeded:

[Image: lcov-annotated source for _write_regex_to_buffer, showing the untested "No memory" error path]

We don't run out of memory during our tests, so these two lines of error-handling are never run. A more realistic failure is in decode_all:

[Image: lcov-annotated source for decode_all]

This is the error handler that runs when a message is shorter than five bytes. Evidently the size check runs 56,883 times during our tests, but this particular error never occurs, so the error handler isn't tested. This is the sort of insight that'd be onerous to attain without a tool like gcov.

Try it for yourself and see: are you testing your C code as thoroughly as your Python?


You might also like my article on automatically detecting refcount errors in C extensions, or the one on making C extensions compatible with mod_wsgi.

Begging

This May I'll spend four days homeless, with a Zen teacher named Genro and a small group of fellow Buddhists. We'll live, sleep, and meditate on the streets together and eat at soup kitchens. I think the retreat has a triple purpose: First, briefly abandoning the comfort and certainty of my regular life helps me practice non-attachment, the same as it helped the first Buddhist monks. Second, it gives me a taste of what it's like to be homeless, so I can better understand the homeless people I meet in NYC. And finally, it's an opportunity to raise money for homeless services.

People often ask me whether, by doing street retreat, we're competing with homeless people for scarce resources. I think not—we stay at the back of the line in soup kitchens, in case there's not enough food, and we sleep on sidewalks instead of in shelters. We beg for a few dollars on the street, but we donate thousands of dollars.

The rule is that I must raise $500 by May 8. The money will be distributed among the organizations that help us while we're on the street, and it will support the social service activities of the Hudson River Zen Center. I have to beg for the money—I'm not allowed to just donate $500 of my own.

So I'm begging you: Will you please donate?

(Tax-deductible. If you want a receipt, let me know.)

Update: I've now raised my minimum, but if you'd still like to donate, please do!

Stealing

My mother recently discovered some stories about my grandfather Milton Rubin's arbitration career. He was the impartial chairman of arbitrations between labor unions and employers, so he was called "Mr. Impartial":

As Milt told it: On one occasion, he was asked to preside over the discharge of Ramon, a dress cutter. It seems there was a big family reunion in San Juan, and that Ramon's wife had been substantially inflating accounts of his success and their family status in New York. As far as anyone back home knew, Ramon was a virtual industrial captain of (as it is known) the Rag Trade. Desiring to dress the part of her fictionalized grandeur for the triumphant return, Ramon's wife prevailed upon him to steal a designer dress from the stock, and he complied. He was discovered, discharged, and the case came before Milt.

At arbitration, the Union ignored the usual litigation process and turned directly to the President of the company at the start of the hearing, making an impassioned plea: "Ramon has been with you for thirty years—he's a good man, he's been a good and loyal worker. He made a mistake. Please, please don't throw him on the street."

As Milt describes it, the president was moved, but obviously conflicted. He turned to Milt and asked: "Mr. Impartial, what should I do?" Milt responded: "Don't ask me what you should do—I have a different responsibility; this is a theft case, the parties have asked for a decision on just cause." While everyone in the room sat, the president got up, went to the windows and stood looking out over 34th Street for the longest time. Finally, he turned around and said, "OK, OK, he can come back to work." The people were jubilant.

When the room had cleared out, the president turned to Milt and asked, "What do you think, Mr. Impartial, did I do the right thing?" Milt turned his palms up and said: "It's a very difficult problem you had—he stole." The president stood up and once more looked out the windows: "Yes, Mr. Impartial," he said softly, "and I steal a little, too."


From a National Academy of Arbitrators History Committee Report, 2007. As told to NAA Past President Richard Bloch. Compiled by Herb Marx.