A. Jesse Jiryu Davis

Let Us Now Praise ResourceWarnings

Poisonous snake warning sign

[Source]

Luckily, Pythons aren't poisonous.

A couple years ago when I began using Python 3, its new ResourceWarnings infuriated me and I ranted against them. Python core developer Nick Coghlan patiently corrected me, and I wrote a followup, "Mollified About ResourceWarnings".

And now, a ResourceWarning has saved my tuchus.

A few months ago I was fixing a bug in Motor, my asynchronous driver for MongoDB. Motor has a copy_database method which I'll summarize thus:

@gen.coroutine
def copy_database(self, source, target):
    pool, socket = None, None
    try:
        pool = self.get_pool()
        socket = pool.get_socket()
        # ... several operations with the socket ...
    finally:
        if pool and socket:
            pool.return_socket(socket)

The bug occurred when the source database was password-protected. The get_socket call didn't ensure it was authenticated before it attempted to copy the database. I fixed the bug like so:

@gen.coroutine
def copy_database(self, source, target):
    pool, socket = None, None
    try:
        member = self.get_cluster_member()
        socket = self.get_authenticated_socket_from_member(member)
        # ... several operations with the socket ...
    finally:
        if pool and socket:
            pool.return_socket(socket)

Whoops. I fixed the authentication bug, but introduced a socket leak. Since pool is now always None, the return_socket call in the finally clause never runs. In this example the bug is obvious, but the real method is 60 lines long, just long enough for me not to see the mismatch between its first and final lines.

I blithely released the bug in Motor 0.2.

Apparently my users don't call copy_database much, since no one reported the socket leak. I'm not surprised: Motor is optimized for high-concurrency web applications, not for administrative scripts that copy databases around. If you want to copy a database you'd use the regular driver, PyMongo, instead. And so the bug lurked for three months.

This weekend I teased Motor apart, into two modules: a "core" module that talks to MongoDB, and a "framework" module that uses Tornado for asynchronous I/O. Once I had separated the two aspects of Motor, I made a second "framework" module that uses Python 3.4's new asyncio framework instead of Tornado. copy_database was among the first methods I tested in the new Motor-on-asyncio. It's relatively complex so I used it to give my new code a workout.

copy_database worked with asyncio! But I wasn't ready to celebrate yet:

ResourceWarning: unclosed <socket.socket fd=9, laddr=('127.0.0.1', 54065), raddr=('127.0.0.1', 27017)>

That damn ResourceWarning. I did a bit of binary-searching through my test code until I found it: I wasn't returning the socket in copy_database. The fix is obvious:

@gen.coroutine
def copy_database(self, source, target):
    member, socket = None, None
    try:
        member = self.get_cluster_member()
        socket = self.get_authenticated_socket_from_member(member)
        # ... several operations with the socket ...
    finally:
        if socket:
            member.pool.return_socket(socket)

I've released this fix today in Motor 0.3.2.

One lesson learned is: I was foolish when I made my code "robust" against unexpected conditions. The earlier code returned the socket only if both pool and socket were set. But if socket isn't None, pool shouldn't be, either, so checking socket alone should be sufficient. This simpler code, which only handles the case I expect to arise, would have failed immediately when I introduced the bug. The misguided robustness of my earlier code masked my bug for months.

Another lesson is: I finally understand the value of ResourceWarnings. They force me to decide when costly objects are deallocated, and they warn me if I mess it up. I'm reviewing my test procedures to ensure that ResourceWarnings are displayed. Ideally, a ResourceWarning should be converted to an exception that causes my unittests to fail. Do you know how to make that happen?
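One possible answer, offered as a sketch rather than a settled recipe: an unclosed socket's warning is emitted from its finalizer, and exceptions raised inside finalizers are printed rather than propagated, so simply turning the warning into an error may not fail a test. An alternative is to record warnings while the code under test runs and assert that none of them is a ResourceWarning. The helper name resource_warnings_from and the two sample functions are hypothetical, and the approach assumes CPython, where reference counting and gc.collect() finalize a leaked socket promptly.

```python
import gc
import socket
import warnings


def resource_warnings_from(func):
    """Call func and return the ResourceWarnings recorded while it ran.

    Hypothetical helper: it also forces a garbage collection before the
    recording stops, so leaked objects are finalized inside the context.
    """
    with warnings.catch_warnings(record=True) as caught:
        # "always" defeats the default filter that hides ResourceWarning.
        warnings.simplefilter("always", ResourceWarning)
        func()
        gc.collect()
    return [w for w in caught if issubclass(w.category, ResourceWarning)]


def leaky():
    # Sample code under test: the socket is never closed.
    sock = socket.socket()


def tidy():
    # Sample code under test: the socket is closed before it goes away.
    sock = socket.socket()
    sock.close()
```

In a unittest, a check such as self.assertEqual([], resource_warnings_from(tidy)) would then fail for any function that behaves like leaky.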

Motor 0.3.2 Released

Motor

Today I released version 0.3.2 of Motor, the asynchronous MongoDB driver for Python and Tornado. This release is compatible with MongoDB 2.2, 2.4, and 2.6. It requires PyMongo 2.7.1.

This release fixes a socket leak in the "copy_database" method that has been present since Motor 0.2. Evidently Motor users don't call "copy_database" much. I've written about the bug and lessons learned in "Let Us Now Praise ResourceWarnings".

Get the latest version with pip install --upgrade motor. The documentation is on ReadTheDocs. If you encounter any issues, please file them in Jira.

Toro 0.6 Released

Toro

I've just released version 0.6 of Toro. Toro provides semaphores, queues, and so on, for advanced control flows with Tornado coroutines. Get it with "pip install --upgrade toro". Toro's documentation, with plenty of examples, is on ReadTheDocs.

There is one bugfix in this release. A floating point maxsize had been treated as infinite. So if you did this:

q = toro.Queue(maxsize=1.3)

...then the queue would never be full. In the newest version of Toro, a maxsize of 1.3 now acts like a maxsize of 2.

Shouldn't Toro just require that maxsize be an integer? Well, the Python standard Queue allows a floating-point number. So when Vajrasky Kok noticed that asyncio's Queue treats a floating-point maxsize as infinity, he proposed a fix that handles floats the same as the standard Queue does. (That asyncio bug was my fault, too.)

Once Guido van Rossum accepted that fix, I updated Toro to comply with the other two Queues.
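A small sketch of the standard-library behavior that the fix matches, using Python 3's queue.Queue rather than Toro itself. That Toro 0.6 behaves identically is assumed from the description above, not verified here.

```python
import queue

# A floating-point maxsize is accepted by the standard Queue.
q = queue.Queue(maxsize=1.3)

q.put_nowait("first")
full_after_one = q.full()   # one item is still below 1.3

q.put_nowait("second")
full_after_two = q.full()   # two items reach the limit

try:
    q.put_nowait("third")
    third_accepted = True
except queue.Full:
    # The queue refuses a third item, as if maxsize were 2.
    third_accepted = False
```

The comparison is simply "number of items >= maxsize", which is why 1.3 ends up behaving like 2.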

Motor 0.3.1 Released

Motor

Today I released version 0.3.1 of Motor, the asynchronous MongoDB driver for Python and Tornado. This release is compatible with MongoDB 2.2, 2.4, and 2.6. It requires PyMongo 2.7.1.

There are no new features. Changes:

Get the latest version with pip install --upgrade motor. The documentation is on ReadTheDocs. If you encounter any issues, please file them in Jira.

Meanwhile, I'm prototyping asyncio support alongside Tornado for Motor's next major release.

Resources For Writing About Programming

Today was Open Source Bridge's unconference day. I led a session about improving our writing skills. I wanted to gather more ideas to supplement my talk and my article on "Writing an Excellent Programming Blog". A half-dozen smart people showed up with tips and links. Here are my notes.

Some examples of unusually well-written programming books:

  • Programming Pearls by Jon Bentley. The author describes it as "a collection of essays about a glamorous aspect of software: programming pearls whose origins lie beyond solid engineering, in the realm of insight and creativity."
  • MongoDB: The Definitive Guide by Kristina Chodorow. As I said before, Kristina has a warm and funny style that sustains you through any dry topic.
  • The Mythical Man-Month by Fred Brooks. Apart from its sexism, the book is full of wisdom and unusually fine prose.
  • Expert C Programming: Deep C Secrets by Peter van der Linden. His funny anecdotes about ancient C programmers might outweigh the actual coding instruction.

Non-programming books: I like to occasionally reread Strunk & White. The simple style it advocates and exemplifies is particularly well-suited for describing complex topics. I've also read E. B. White's The Points Of My Compass to renew his influence on me.

More great programming blogs:

  • Liz Keogh. In particular, Pixie-Driven Development is an innovative way to explain object-oriented design.
  • Joel On Software seems able to promulgate new ideas that catch fire regularly.
  • Coding Horror is informative and entertaining (although many say his "What Can Men Do?" article is a misstep).
  • XKCD's What-If is the most fun thing to read. What about it is replicable for people who write about programming?
  • Julia Evans, whom I praised in my talk, was mentioned again as an exemplary blogger.
  • Reginald Braithwaite writes about JavaScript and CoffeeScript.

Techniques for improving:

  • When you encounter effective writing, think critically about what makes it so. Can you emulate it?
  • When you read ineffective writing, ask why it doesn't work. How can it be improved?
  • Get reviews. Share a draft of your article with an expert, to ensure accuracy, and a non-expert who can tell you what is unclear. Have a process for improving drafts before they are published.
  • Consider the audience. Imagine a target reader as someone whom you guide along a path from one landmark to the next. Stop to describe each notable point along the way.
  • Help yourself to get started. Write an outline, or begin by brain-dumping, or casually describe to a friend the article you want to write.
  • Reuse content. If you've written a detailed email, or a comment block, or a talk, adapt it for an article.
  • Set yourself constraints. If you have trouble beginning to write, or finishing an article, you can set yourself a deadline. Setting a maximum length for your article can ensure you finish it, and can make the task less formidable and help you begin.
  • Read a lot. Read a variety of styles and choose which to imitate. You'll tend to write like what you're reading; read the authors you aspire to resemble.

Uncategorized:

  • Know the policies of the site you're publishing on. Wikipedia has very detailed policies, for example. Even if you're publishing on your own site, the aggregators of which your site is a member have policies you need to be aware of.
  • Code examples. Ideally, provide both inline snippets and a link to a full, executable example. Trim and rewrite code snippets mercilessly for clarity.

Write An Excellent Programming Blog

Vermeer, Lady Writing a Letter with her Maid

I want you to write. Not just code. Also words.

If you're a member of the open source community, you can help us by writing about programming, just as much as by actually programming. And writing helps you, too: you can become better known and promote your ideas.

Even more importantly, writing is thinking. There is no more thorough way to understand than to explain in writing.

Why?

It doesn't matter how narrow your expertise is. If you know better than anyone how to parse New York City subway schedules, I want you to write about it. If you've taught your cat to care for a Tamagotchi, I definitely want you to write about it. Whatever your expertise is, show me what you know. Then when I have a question about your specialty, I'll know to come to you for help.

For example, the author of HappyBase, a Python driver for HBase, emailed me for advice when he began his project. He knew from my blog that I work on a couple MongoDB drivers, and he had very sophisticated questions for me about connection pooling. Working with him was stimulating, and it was a very efficient way for me to contribute to a popular project.

Being known in your community as an expert or as a cogent explainer helps you. You're more likely to get patches accepted by projects, get talks accepted by conferences, get users, get a job.

Writing an explanation of a bug requires you to think it through, better than any other technique. My faith in writing-as-thinking is so fervent that when I see a tricky bug my first step is to start an article. That is what I did when I hit a bug in PyMongo's connection pool last year. It turns out that in Python 2.6, assigning to a threadlocal is not thread-safe. I am not nearly smart enough, dear reader, to discover such an intricate race condition unless I consolidate each step of the discovery by explaining it in writing.

What?

I notice roughly five formats among the best articles by programmers: stories, opinions, how-tos, how things work, and reviews. If you want to write but you haven't chosen a topic, or don't know how to approach it, this will get you started.

Story

"I'm going to tell you a story about Foo, how it taught me Bar, and led to Baz. First this happened, then that happened. And that's the story of Foo."

Ideas:

We are innately interested in stories about people. You need not be confessional or icky. If I get to know you a little through your blog, your technical articles feel warmer, and I remember who you are.

Opinion

"Thesis. Points of evidence. Response to likely objections. Restatement of thesis." Just like we learned in high school. The most important thing is that you don't simply have an opinion, but you have a compelling and interesting argument to support it.

Avoid useless controversy like "product Foo is bad" or "Bar is better than Foo." You have nothing to gain by attacking others. Mr. Miyagi says: Karate for defense only.

Ideas:

How-To

"Doing Foo is important under the given conditions. I'm going to show you how to Foo. Do this, then do that. There, now I've shown you how to Foo. You should go out and do Foo."

A how-to must be motivated: you must begin by telling your reader when and why it is important to know.

Ideas:

How Something Works

"Do you wonder how Foo works? I'm going to show you how Foo is implemented. It does this and that. Now I've shown you how it works."

Almost every technology I hold in awe has been easier to understand, when I read its source code, than I feared it would be. Writing an explanation of it is a good excuse to dive in and find out.

A "how something works" article need not be motivated. Sure, some readers might want to know how something works in order to use it better. But there are people like me who want to know how almost everything works, and we are your audience for this article. People who don't want to know can move along.

Ideas:

Reviews

"I read, saw, played, or used something. This is what it is. This is what my experience was like. The thing has these strengths and weaknesses. In conclusion, it's best when evaluated by certain criteria."

It's tempting to evaluate a book, movie, game, or project on a good–bad axis, but this isn't very useful. Mostly describe and analyze instead of evaluating. Tell me what the thing is good for.

Ideas:

  • Technical books! They need not be recent, though staying abreast of publications in your area is good. Reading a book with the intent to review it makes you read more closely, and teaches you about good technical writing too. Here's my review of O'Reilly's "Building Node Applications with MongoDB and Backbone".
  • Others' software projects. But be gentle.
  • Games, movies, music, books that are not about programming: Writing reviews is excellent practice, and warms your blog.

How do you find an audience?

Please don't care about reach. Reach is not the aim of writing about programming. Seth Godin: "Most of the time, you'll aim to delight the masses and you'll fail. So much easier to aim for the smallest possible audience, not the largest, to build long-term value among a trusted, delighted tribe, to create work that matters and stands the test of time."

Since you don't care about reach, don't waste time on SEO. Your goal is not to get lots of hits. You aren't BuzzFeed. Random visitors aren't valuable to you, because you don't show ads. Instead, your goal is to attract specialists in your field so you can share ideas with them. Luckily, you can reach these specialists without much effort.

First, find the aggregator your community reads. I write a lot about Python, and the Planet Python aggregator is by far my best channel for distributing articles. Any article I put in the "Python" category is included in its feed, and that guarantees a few hundred visits. Write planet@python.org to get included. (And be patient, it's run by volunteers with busy lives.) If you write about another language or technology with a large community, find its aggregators and request to be included in them.

Tweeting your articles has some value. More valuable is posting your article on a relevant subreddit. And use your own home page to send visitors to your best articles, don't just display the most recent ones there.

You can try your luck on Hacker News, but I have concluded that no one serious goes there, so I do not either.

How do you improve?

Write. Emulate the best bloggers and the best articles.

Don't emulate Daring Fireball, GigaOM, or TechCrunch. These are industry news sites, ultimately devoted to one question: which companies' stocks will go up, and which will go down? This is not your expertise. Besides, it's boring. Emulate writers like these instead:

Glyph Lefkowitz. In "Unyielding", he argues that async's greatest strength is not that it is efficient, but that it makes race conditions easier to avoid.

Kristina Chodorow. "Stock Option Basics" tells a personal story about quitting a startup, and describes better than most writers how stock options work. In "Why Command Helpers Suck" she publicly and convincingly criticizes some MongoDB design decisions because they discourage users from advancing their understanding. Her article on MongoDB unittesting is very specialized, but I've referred to it many times since it fills a woeful gap in the MongoDB developer docs. Kristina's funny and amiable voice works well on her blog and in O'Reilly's "MongoDB: The Definitive Guide".

Armin Ronacher. His "Exec in Python" article is long, incredibly thorough, and timeless. It takes an hour or more to read. It took him days to write, I'm sure, and he continued correcting it after publication. It brings insights into the Python 2 and 3 runtimes, how web2py works, and builds an argument for how it should work instead.

Julia Evans is loose and exuberant in her blog, the same as when she speaks. But don't be fooled, her argument for anonymizing conference proposals is meticulous. So is her guide to applying machine learning to business problems.

Graham Dumpleton's magisterial series on Python decorators obsoletes all other writing on that topic.

At Open Source Bridge a group of writers made a list of additional writers to emulate, and tips for improving our writing about programming.

How do you make the time?

Writing doesn't have to be a regular time-suck, because there is no need to publish regularly or often. Most of your value is in infrequent, deep articles. Furthermore, there's no rush to write an article now, because the best subjects are evergreen. Patrick McKenzie: "You can, and should, make the strategic decision that you'll primarily write things which retain their value. It takes approximately the same amount of work to create great writing which lasts versus creating great writing which ages quickly."

(That's my second link to the same essay, it's that good.)

So take your time. When you have a good idea or an unusual experience, you'll be moved to write.

Conclusion

You know something very specific about programming that's worth explaining. Or you have a new way to explain a common topic. Either way, I want you to share your knowledge in writing. Explaining will deepen your understanding as nothing else can. If you don't know what to write about, riff off the ideas I suggested, or get inspired by great blogs. Craft articles of lasting value.

Goodbye MongoDB World, Hello Open Source Bridge

Talking at Mongodb World

Today I talked to a small audience at MongoDB World in New York City. I described how I made a visualization of global weather data using MongoDB, Python, Monary, and Matplotlib. The visualization looks like this:

Contour map of global temperature using Matplotlib

My talk was one of a three-part series called "The Weather of the Century". My colleague Randall Hunt gave a schema design talk about weather data, and André Spiegel, who was the brains behind the whole idea, analyzed the performance of various MongoDB configurations executing various operations on terabytes of data.

In the next week I plan to write an article on how I generated this visualization. I also want to write an introduction to Monary: it's a little-known, highly specialized MongoDB driver capable of shocking throughput.

Tomorrow I'm flying to Portland, Oregon to catch the second half of Open Source Bridge. I'll exhort the audience to write an excellent programming blog. I hope they take my advice!

Rules of Thumb for Methods and Functions

Le Pouce, sculpture in Paris [Source]

The Python team at MongoDB is partially rewriting PyMongo. The next version, 3.0, aims to be faster, more flexible, and more maintainable than the current 2.x series. There is nothing like the satisfaction of pulling out the weeds and making a fresh patch of ground for new code.

A design flaw in the current PyMongo is that a large number of instance methods have return values and side effects. For example, MongoClient has a private _check_response_to_last_error method. It takes a binary message from the server and returns a parsed version of it. But depending on what errors it finds in the server message, the method sometimes clears the client's connection pool, or changes all threads' socket affinities, or wipes its cached information about who the primary server is. Just looking at the method's signature doesn't tell me all the things it could do: since it's an instance method of MongoClient, it could change any part of the MongoClient's state.

This gets gnarly, quickly.

In most cases these mixed methods did one thing at first: they only returned a value, or only changed state. And then we had to fix something and the easiest way was to add a side-effect or add a return value. And so the road to hell was paved.

I want to minimize the temptation for these mixed methods in PyMongo 3. My main strategy is to minimize methods, period. My rules of thumb are these:

  • If it accesses private instance variables, it's an instance method. Everything else can and should be a function.
  • When a method is necessary, it should set a private variable, or it should have a return value. Not both.
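A minimal sketch of what these two rules might look like in practice. Every name here (parse_reply, is_not_master, Client, forget_primary) is hypothetical and invented for illustration; none of it is PyMongo's actual code.

```python
def parse_reply(message):
    """Plain function: it needs no instance state, so it is not a method."""
    return dict(field.split("=", 1) for field in message.split(";") if field)


def is_not_master(reply):
    """Plain function: inspects a parsed reply and returns a bool."""
    return reply.get("err") == "not master"


class Client:
    def __init__(self, primary):
        self._primary = primary

    def primary(self):
        """Reads private state and returns a value; changes nothing."""
        return self._primary

    def forget_primary(self):
        """Changes private state; returns nothing."""
        self._primary = None


# The caller combines the pieces, so each side effect is visible at the call site.
client = Client("db1:27017")
reply = parse_reply("ok=0;err=not master")
if is_not_master(reply):
    client.forget_primary()
```

The design choice is that parsing never touches the client, and the one method that mutates the client says so in its name and returns nothing.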

No rule should be followed without exception, of course. And there will be a handful of exceptions to these rules. But on the whole I think this limits the risk and complexity of methods in PyMongo. What do you think?

Street Retreat 2014 Recap

Tompkins Square Park

Although I did my street retreat a month ago, before my Taipei trip, I'm just now finding time to write about it.

A street retreat is a Zen practice in which a group of people join to live on the streets together for a few days. We leave our money, phones, and everything behind. We sleep on the sidewalk, meditate in parks, and eat at soup kitchens. It's not exactly homelessness, but it's a taste of it, and it's the best way I know to genuinely meet homeless people. It's also a way to raise a few thousand dollars to give to homeless services. And the retreat is a chance to practice like the first Buddhist monks: They were homeless too.

If you want to know more about street retreats read my invitation for the most recent one, or my recap of last year.

This year's retreat was marked by great abundance. At our opening council in Washington Square Park we had twelve people, the biggest group I've seen. And over the course of the retreat we kept adding more. We had Batman with us—he's a long-time student of Zen teacher Bernie Glassman, and he was homeless for decades. He knows everything about the streets and everyone on the streets. On the morning of our second day, he ran into a friend at the Bowery Mission. Her name was Fatima and she was a dynamo, a master of panhandling. Each day she gave us tips on how to beg, and demonstrated her techniques. "Romeo and Juliet!" she yelled hoarsely at any couple who walked by. If she got them to laugh, then she asked for money.

Normally a group on a street retreat likes to blend in with the homeless population, but Fatima wasn't playing by our rules. She announced us to the Bowery Mission staff and explained what we were doing on the street. I'm not sure how it happened, but she obliged us to work a volunteer shift at the Mission. I and some of the men in our group worked in the Mission's clothing-distribution room. A resident named Lucky was running the show. That day the Mission offered showers to homeless men. As men arrived from the showers they brought a list of items they needed. We fulfilled the orders for shirts, pants, and so on. We saw that the Mission had a pile of sleeping bags and we asked for five of them, which Lucky gave us.

That night when we unrolled the sleeping bags they were revealed to contain street survival kits: shirt, toothbrush, hand sanitizer, scripture.

I ate and slept better than I ever have on a retreat. Fatima found us a place to sleep, under a construction scaffold on Mott Street. Batman picked up a case of discarded juice from Organic Avenue, still cold and fresh, and distributed it among us. "These go for eight dollars each!" A security guard came on duty around midnight; Batman instantly befriended him. They talked half the night. "I'm your security guard," he announced. "This place is your home. Go ahead and sleep, I'll keep an eye out for you." It rained hard at night but the scaffold kept us dry.

Somewhere along the way Fatima ran into a friend, Julius, who joined our retreat too. He was a black man my age, with a monkish vibe. He had a meditation practice of his own, and he was in the habit of sleeping on the A train with an old homeless man who needed someone to watch over him.

One evening we begged our dinner from the Union Square farmer's market. The vendors gave us a quiche, chèvre, feta, a big bag of mustard greens, many loaves of bread. A group at the Square called Occupy Kitchen gave us a tray full of pasta, a bag of salmon salad, plates and forks. It rained on us in spurts as we sat in the square chanting sutras. A few local panhandlers joined for the ceremony and talked with us as we shared the food.

This retreat our group was porous, welcoming. There were too many of us, and we were too mostly white, to blend in, so street people came up and asked us what we were doing. Some joined us for a bit. Even our security guard joined on the final morning, staying with us long enough to participate in the closing ceremony.

This was my fourth retreat. Each time, it's less of an adventure and more of a practice. I did my first street retreat to have an adventure and survive it. Now, I continue to do it as a practice, to train myself, to refresh the lessons retreat teaches me: to be humble, to be generous, and to gratefully receive generosity.

Refactoring Tornado Coroutines

Tornado [Source]

Sometimes writing callback-style asynchronous code with Tornado is a pain. But the real hurt comes when you want to refactor your async code into reusable subroutines. Tornado's coroutines make refactoring easy. I'll explain the rules.

(This article updates my old "Refactoring Tornado Code With gen.engine". The updated code here demonstrates the current syntax for Tornado 3 and Motor 0.3.)

For Example

I'll use this blog to illustrate. I built it with Motor-Blog, a trivial blog platform on top of Motor, my asynchronous MongoDB driver for Tornado.

When you came here, Motor-Blog did three or four MongoDB queries to render this page.

1: Find the blog post at this URL and show you this content.

2 and 3: Find the next and previous posts to render the navigation links at the bottom.

Maybe 4: If the list of categories on the left has changed since it was last cached, fetch the list.

Let's go through each query and see how Tornado coroutines make life easier.

Fetching One Post

In Tornado, fetching one post takes a little more work than with blocking-style code:

import motor
import tornado.web

db = motor.MotorClient().my_blog_db

class PostHandler(tornado.web.RequestHandler):
    @tornado.web.asynchronous
    def get(self, slug):
        db.posts.find_one({'slug': slug}, callback=self._found_post)

    def _found_post(self, post, error):
        if error:
            raise tornado.web.HTTPError(500, str(error))
        elif not post:
            raise tornado.web.HTTPError(404)
        else:
            self.render('post.html', post=post)

Not so bad. But is it better with a coroutine?

from tornado import gen

class PostHandler(tornado.web.RequestHandler):
    @gen.coroutine
    def get(self, slug):
        post = yield db.posts.find_one({'slug': slug})
        if not post:
            raise tornado.web.HTTPError(404)

        self.render('post.html', post=post)

Much better. If you don't pass a callback to find_one, then it returns a Future instance. A Future is nothing special, it's just a little object that represents an unresolved value. Some time hence, Motor will resolve the Future with a value or an exception. To wait for the Future to be resolved, yield it.

The yield statement makes this function a generator. gen.coroutine is a brilliant invention that runs the generator until it's complete. Each time the generator yields a Future, gen.coroutine schedules the generator to be resumed when the Future is resolved. Read the source code of the Runner class for details, it's exhilarating. Or just enjoy the glow of putting all your logic in a single function again, without defining any callbacks.
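To make the idea concrete, here is a toy driver in the same spirit. It is a sketch only, not Tornado's Runner: the decorator name toy_coroutine is made up, it uses the standard library's concurrent.futures.Future instead of Tornado's Future, and it relies on Python 3's ability to return a value from a generator.

```python
from concurrent.futures import Future


def toy_coroutine(gen_func):
    """Toy stand-in for a coroutine decorator: drives a generator that yields Futures."""
    def wrapper(*args, **kwargs):
        outcome = Future()              # resolved when the generator finishes
        gen = gen_func(*args, **kwargs)

        def step(value=None, exc=None):
            try:
                # Resume the generator with a result, or raise the error inside it.
                yielded = gen.throw(exc) if exc is not None else gen.send(value)
            except StopIteration as stop:
                outcome.set_result(stop.value)
                return
            except Exception as error:
                outcome.set_exception(error)
                return
            # The generator yielded another Future: resume when it resolves.
            yielded.add_done_callback(resume)

        def resume(future):
            error = future.exception()
            if error is not None:
                step(exc=error)
            else:
                step(value=future.result())

        step()
        return outcome
    return wrapper


pending = Future()  # stands in for the Future a driver call would return


@toy_coroutine
def fetch_title():
    post = yield pending
    return post["title"].upper()


result = fetch_title()
started_done = result.done()            # the coroutine is paused at the yield
pending.set_result({"title": "hello"})  # resolving the Future resumes it
```

Because errors are thrown back into the generator at the yield, an ordinary try/except around the yield works the way it would in blocking code.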

Even better, you get normal exception handling: if find_one gets a network error or some other failure, it raises an exception. Tornado knows how to turn an exception into an HTTP 500, so we no longer need special code for errors.

This coroutine is much more readable than a callback, but it doesn't look any nicer than multithreaded code. It will start to shine when you need to parallelize some tasks.

Fetching Next And Previous

Once Motor-Blog finds the current post, it gets the next and previous posts so it can display their titles. Since the two queries are independent we can save a few milliseconds by doing them in parallel. How does this look with callbacks?

@tornado.asynchronous
def get(self, slug):
    db.posts.find_one({'slug': slug}, callback=self._found_post)

def _found_post(self, post, error):
    if error:
        raise tornado.web.HTTPError(500, str(error))
    elif not post:
        raise tornado.web.HTTPError(404)
    else:
        _id = post['_id']
        self.post = post

        # Two queries in parallel.
        # Find the previously published post.
        db.posts.find_one(
            {'pub_date': {'$lt': post['pub_date']}},
            sort=[('pub_date', -1)],
            callback=self._found_prev)

        # Find the subsequently published post.
        db.posts.find_one(
            {'pub_date': {'$gt': post['pub_date']}},
            sort=[('pub_date', 1)],
            callback=self._found_next)

def _found_prev(self, prev_post, error):
    if error:
        raise tornado.web.HTTPError(500, str(error))
    else:
        self.prev_post = prev_post
        # next_post exists (possibly as None) only once its callback has run.
        if hasattr(self, 'next_post'):
            # Done
            self._render()

def _found_next(self, next_post, error):
    if error:
        raise tornado.web.HTTPError(500, str(error))
    else:
        self.next_post = next_post
        # prev_post exists (possibly as None) only once its callback has run.
        if hasattr(self, 'prev_post'):
            # Done
            self._render()

def _render(self):
    self.render(
        'post.html',
        post=self.post,
        prev_post=self.prev_post,
        next_post=self.next_post)

This is completely disgusting and it makes me want to give up on async. We need special logic in each callback to determine if the other callback has already run or not. All that boilerplate can't be factored out. Will a coroutine help?

@gen.coroutine
def get(self, slug):
    post = yield db.posts.find_one({'slug': slug})
    if not post:
        raise tornado.web.HTTPError(404)
    else:
        future_0 = db.posts.find_one(
            {'pub_date': {'$lt': post['pub_date']}},
            sort=[('pub_date', -1)])

        future_1 = db.posts.find_one(
            {'pub_date': {'$gt': post['pub_date']}},
            sort=[('pub_date', 1)])

        prev_post, next_post = yield [future_0, future_1]
        self.render(
            'post.html',
            post=post,
            prev_post=prev_post,
            next_post=next_post)

Yielding a list of Futures tells the coroutine to wait until they are all resolved.
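
For a rough picture of what "wait until they are all resolved" involves, here is a sketch that combines several Futures into one. It uses concurrent.futures.Future from the standard library as a stand-in for Tornado's Future, and wait_all is a hypothetical helper, not a Tornado function. It also assumes every Future succeeds; real code must propagate exceptions too.

```python
from concurrent.futures import Future


def wait_all(futures):
    """Return a Future resolved with a list of results, in the input order."""
    combined = Future()
    results = [None] * len(futures)
    remaining = len(futures)
    if remaining == 0:
        combined.set_result(results)
        return combined

    def make_callback(index):
        def on_done(future):
            nonlocal remaining
            results[index] = future.result()
            remaining -= 1
            if remaining == 0:
                # The last Future to finish resolves the combined one.
                combined.set_result(results)
        return on_done

    for index, future in enumerate(futures):
        future.add_done_callback(make_callback(index))
    return combined


prev_future, next_future = Future(), Future()
both = wait_all([prev_future, next_future])

next_future.set_result({'title': 'Next'})  # the second query answers first
assert not both.done()                     # still waiting on the other one

prev_future.set_result({'title': 'Previous'})
prev_post, next_post = both.result()       # results keep the input order
```

This is the same bookkeeping the two callbacks did by hand, written once and reused.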

Now our single get function is just as nice as it would be with blocking code. In fact, the parallel fetch is far easier than if you were multithreading instead of using Tornado. But what about factoring out a common subroutine that request handlers can share?

Fetching Categories

Every page on my blog needs to show the category list on the left side. Each request handler could just include this in its get method:

categories = yield db.categories.find().sort('name').to_list(10)

But that's terrible engineering. Here's how to factor it into a coroutine:

@gen.coroutine
def get_categories(db):
    categories = yield db.categories.find().sort('name').to_list(10)
    raise gen.Return(categories)

This coroutine does not have to be part of a request handler—it stands on its own at the module scope.

The raise gen.Return() statement is the weirdest syntax in this example. It's an artifact of Python 2, in which generators aren't allowed to return values. To hack around this limitation, Tornado coroutines raise a special kind of exception called a Return. The coroutine catches this exception and treats it like a returned value. In Python 3.3 and later, a simple return categories accomplishes the same result.
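
The trick is easy to reproduce in miniature. In this sketch, Return and run_to_completion are toy versions written for illustration, not Tornado's own classes, and the generators never actually wait on anything. It shows that a value smuggled out on an exception and a value returned from a Python 3 generator reach the driver the same way.

```python
class Return(Exception):
    """Toy version of the idea: an exception that carries a result."""

    def __init__(self, value=None):
        super().__init__()
        self.value = value


def run_to_completion(generator):
    """Drive a generator to its end and hand back its final value."""
    try:
        while True:
            next(generator)
    except Return as returned:
        # Python 2 style: the value arrives on the special exception.
        return returned.value
    except StopIteration as stop:
        # Python 3 style: "return x" in a generator sets StopIteration.value.
        return stop.value


def python2_style():
    yield
    raise Return(['mongodb', 'python'])


def python3_style():
    yield
    return ['mongodb', 'python']


old_result = run_to_completion(python2_style())
new_result = run_to_completion(python3_style())
assert old_result == new_result == ['mongodb', 'python']
```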

To call my new coroutine from a request handler, I do:

class PostHandler(tornado.web.RequestHandler):
    @gen.coroutine
    def get(self, slug):
        categories = yield get_categories(db)
        # ... get the current, previous, and
        # next posts as usual, then ...
        self.render(
            'post.html',
            post=post,
            prev_post=prev_post,
            next_post=next_post,
            categories=categories)

Since get_categories is a coroutine now, calling it returns a Future. To wait for get_categories to complete, the caller can yield the Future. Once get_categories completes, the Future it returned is resolved, so the caller resumes. It's almost like a regular function call!

Now that I've factored out get_categories, it's easy to add more logic to it. This is nice because I want to cache the categories between page views. get_categories can be updated very simply to use a cache:

categories = None

@gen.coroutine
def get_categories(db):
    global categories
    if not categories:
        categories = yield db.categories.find().sort('name').to_list(10)

    raise gen.Return(categories)

(Note for nerds: I invalidate the cache whenever a post with a new category is added. The "new category" event is saved to a capped collection in MongoDB, which all the Tornado servers are always tailing. This is a simple way to use MongoDB as an event queue, which the multiple Tornado processes use to communicate with each other.)

Conclusion

Tornado's excellent documentation shows briefly how a method that makes a few async calls can be simplified using gen.coroutine, but the power really comes when you need to factor out a common subroutine. There are only three steps:

  1. Decorate the subroutine with @gen.coroutine.
  2. In Python 2, the subroutine returns its result with raise gen.Return(result).
  3. Call the subroutine from another coroutine like result = yield subroutine().

That's all there is to it. Tornado's coroutines make asynchronous code efficient, clean—even beautiful.
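
The same three steps have a close analogue in Python 3's native coroutines, where async def plays the part of the decorator, return replaces raise gen.Return, and await replaces yield. The sketch below uses the standard library's asyncio rather than Tornado or Motor, and find_categories is a hypothetical stand-in for the database query. Treat it as an illustration of the pattern, not as Motor-Blog's code.

```python
import asyncio

query_count = 0
_categories = None  # module-level cache, as in the example above


async def find_categories():
    """Hypothetical stand-in for the database query."""
    global query_count
    query_count += 1
    await asyncio.sleep(0)  # give up control, as a real network call would
    return ['mongodb', 'python']


async def get_categories():
    # Steps 1 and 2: "async def" marks the coroutine, and it simply returns.
    global _categories
    if _categories is None:
        _categories = await find_categories()
    return _categories


async def handler():
    # Step 3: call the subroutine from another coroutine and wait for it.
    return await get_categories()


async def main():
    first = await handler()
    second = await handler()  # served from the cache
    return first, second


first, second = asyncio.run(main())
assert first == second == ['mongodb', 'python']
assert query_count == 1
```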