A. Jesse Jiryu Davis

Testing Network Errors With MongoDB

Someone asked on Twitter today for a way to trigger a connection failure between MongoDB and the client. This would be terribly useful when you're testing your application's handling of network hiccups.

You have options: you could use mongobridge to proxy between the client and the server, and at just the right moment, kill mongobridge.

Or you could use packet-filtering tools to accomplish the same: iptables on Linux and ipfw or pfctl on Mac and BSD. You could use one of these tools to block MongoDB's port at the proper moment, and unblock it afterward.

There's yet another option, not widely known, that you might find simpler: use a MongoDB "failpoint" to break your connection.

Failpoints are our internal mechanism for triggering faults in MongoDB so we can test their consequences. Read about them on Kristina's blog. They're not meant for public consumption, so you didn't hear about them from me.

The first step is to start MongoDB with the special command-line argument:

mongod --setParameter enableTestCommands=1

Next, log in with the mongo shell and tell the server to abort the next two network operations:

> db.adminCommand({
...   configureFailPoint: 'throwSockExcep',
...   mode: {times: 2}
... })
2014-03-20T20:31:42.162-0400 trying reconnect to 127.0.0.1:27017 (127.0.0.1) failed

The server obeys you instantly, before it even replies, so the command itself appears to fail. But fear not: you've simply seen the first of the two network errors you asked for. You can trigger the next error with any operation:

> db.collection.count()
2014-03-20T20:31:48.485-0400 trying reconnect to 127.0.0.1:27017 (127.0.0.1) failed

The third operation succeeds:

> db.collection.count()
2014-03-20T21:07:38.742-0400 trying reconnect to 127.0.0.1:27017 (127.0.0.1) failed
2014-03-20T21:07:38.742-0400 reconnect 127.0.0.1:27017 (127.0.0.1) ok
1

There's a final "failed" message that I don't understand, but the shell reconnects and the command returns the answer, "1".

You could use this failpoint when testing a driver or an application. If you don't know exactly how many operations you need to break, you could set times to 50 and, at the end of your test, continue attempting to reconnect until you succeed.
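That retry-until-success idea can be sketched with a plain loop. This is only a sketch: `operation` is a stand-in for whatever driver call your test makes, and `retry_until_success` is my name, not a real PyMongo API.

```python
import time


def retry_until_success(operation, max_attempts=60, delay=0.5):
    """Call operation() until it stops raising, then return its result.

    Useful at the end of a test, after a failpoint has broken an
    unknown number of operations.
    """
    for attempt in range(max_attempts):
        try:
            return operation()
        except Exception:
            # Still broken; wait a moment and try again.
            time.sleep(delay)
    raise RuntimeError('server never recovered')
```

For example, `retry_until_success(db.collection.count)` would spin until the failpoint's budget of broken operations is exhausted.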

Ugly, perhaps, but if you want a simple way to cause a network error, this could be a reasonable approach.

Waiting For Multiple Events With Tornado

Recently I saw a question on Stack Overflow about waiting for multiple events with a Tornado coroutine, until one of the events completes. The inquirer wanted to do something like this:

result = yield Any([future1, future2, future3])

If the middle future has resolved and the other two are still pending, the result should look like:

[None, "<some result>", None]

Tornado doesn't provide a class like Any. How would you implement one?

You could make a class that inherits from Future, and wraps a list of futures. The class waits until one of its futures resolves, then gives you the list of results:

from tornado.concurrent import Future


class Any(Future):
    def __init__(self, futures):
        super(Any, self).__init__()
        self.futures = futures
        for future in futures:
            # done_callback is defined just below.
            future.add_done_callback(self.done_callback)

    def done_callback(self, future):
        """Called when any future resolves."""
        try:
            self.set_result(self.make_result())
        except Exception as e:
            self.set_exception(e)

    def make_result(self):
        """A list of results.

        Includes None for each pending future, and a result for each
        resolved future. Raises an exception for the first future
        that has an exception.
        """
        return [f.result() if f.done() else None
                for f in self.futures]

    def clear(self):
        """Break reference cycle with any pending futures."""
        self.futures = None

Here's an example use of Any:

import time

from tornado import gen
from tornado.ioloop import IOLoop


@gen.coroutine
def delayed_msg(seconds, msg):
    yield gen.Task(IOLoop.current().add_timeout,
                   time.time() + seconds)
    raise gen.Return(msg)


@gen.coroutine
def f():
    start = time.time()
    future1 = delayed_msg(2, '2')
    future2 = delayed_msg(3, '3')
    future3 = delayed_msg(1, '1')

    # future3 will resolve first.
    results = yield Any([future1, future2, future3])
    end = time.time()
    print "finished in %.1f sec: %r" % (end - start, results)

    # Wait for any of the remaining futures.
    results = yield Any([future1, future2])
    end = time.time()
    print "finished in %.1f sec: %r" % (end - start, results)

IOLoop.current().run_sync(f)

As expected, this prints:

finished in 1.0 sec: [None, None, '1']
finished in 2.0 sec: ['2', None]

But you can see there are some complications with this approach. For one thing, if you want to wait for the rest of the futures after the first one resolves, it's complicated to construct the list of still-pending futures. I suppose you could do:

futures = [future1, future2, future3]
results = yield Any(f for f in futures
                    if not f.done())

Not pretty. And not correct, either! There's a race condition: if a future is resolved in between consecutive executions of this code, you may never receive its result. On the first call, you get the result of some other future that resolves faster, but by the time you're constructing the list to pass to the second Any, your future is now "done" and you omit it from the list.
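You can see the lost-result hazard with a pair of stub futures. This sketch fakes the timing with a minimal future-like class of my own, rather than real Tornado futures:

```python
class StubFuture(object):
    """Just enough of a future to demonstrate the race."""
    def __init__(self):
        self._done = False
        self._result = None

    def set_result(self, value):
        self._done = True
        self._result = value

    def done(self):
        return self._done

    def result(self):
        return self._result


fast, slow = StubFuture(), StubFuture()
fast.set_result('fast')  # The first Any resolves with this result.

# Before we rebuild the list for the second Any, 'slow' resolves too.
slow.set_result('slow')

# Filtering on done() now drops 'slow': its result is never delivered.
still_pending = [f for f in (fast, slow) if not f.done()]
print(still_pending)  # []
```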

Another complication is the reference cycle: Any refers to each future, which refers to a callback which refers back to Any. For prompt garbage collection, you should call clear() on Any before it goes out of scope. This is very awkward.

Additionally, you can't distinguish between a pending future and a future that resolved to None. You'd need a special sentinel value, distinct from None, to represent a pending future.
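A sentinel version might look like this hypothetical sketch; `PENDING` and this standalone `make_result` are my names, not part of the class above:

```python
# A unique object that can't be mistaken for any real result.
PENDING = object()


def make_result(futures):
    """Like Any.make_result, but marks pending futures with PENDING."""
    return [f.result() if f.done() else PENDING
            for f in futures]
```

Now a slot holding None means "this future really resolved to None," and a slot holding PENDING means "still waiting."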

The final complication is the worst. If multiple futures are resolved and some of them have exceptions, there's no obvious way for Any to communicate all that information to you. Mixing exceptions and results in a list would be perverse.

Fortunately, there's a better way. We can make Any return just the first future that resolves, instead of a list of results:

class Any(Future):
    def __init__(self, futures):
        super(Any, self).__init__()
        for future in futures:
            future.add_done_callback(self.done_callback)

    def done_callback(self, future):
        self.set_result(future)

The reference cycle is gone, and the exception-handling question is answered: The Any class returns the whole future to you, instead of its result or exception. You can inspect it as you like.
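The inspect-the-resolved-future idiom isn't specific to Tornado. Here's a sketch of the same pattern using the standard library's concurrent.futures (Python 3; available to Python 2 via the futures backport) as a stand-in:

```python
from concurrent.futures import ThreadPoolExecutor


def describe(future):
    """Return the future's result, or the exception it raised."""
    try:
        return future.result()
    except Exception as e:
        return e


with ThreadPoolExecutor(max_workers=2) as executor:
    ok = executor.submit(lambda: 42)
    bad = executor.submit(lambda: 1 // 0)
    print(describe(ok))   # 42
    print(describe(bad))  # prints the ZeroDivisionError
```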

It's also easy to wait for the remaining futures after some are resolved:

@gen.coroutine
def f():
    start = time.time()
    future1 = delayed_msg(2, '2')
    future2 = delayed_msg(3, '3')
    future3 = delayed_msg(1, '1')

    futures = set([future1, future2, future3])
    while futures:
        resolved = yield Any(futures)
        end = time.time()
        print "finished in %.1f sec: %r" % (
            end - start, resolved.result())
        futures.remove(resolved)

As desired, this prints:

finished in 1.0 sec: '1'
finished in 2.0 sec: '2'
finished in 3.0 sec: '3'

There's no race condition now. You can't miss a result, because you don't remove a future from the list unless you've received its result.

To test the exception-handling behavior, let's make a function that raises an exception after a delay:

@gen.coroutine
def delayed_exception(seconds, msg):
    yield gen.Task(IOLoop.current().add_timeout,
                   time.time() + seconds)
    raise Exception(msg)

Now, instead of returning a result, one of our futures will raise an exception:

@gen.coroutine
def f():
    start = time.time()
    future1 = delayed_msg(2, '2')
    # Exception!
    future2 = delayed_exception(3, '3')
    future3 = delayed_msg(1, '1')

    futures = set([future1, future2, future3])
    while futures:
        resolved = yield Any(futures)
        end = time.time()
        try:
            outcome = resolved.result()
        except Exception as e:
            outcome = e

        print "finished in %.1f sec: %r" % (
            end - start, outcome)
        futures.remove(resolved)

Now, the script prints:

finished in 1.0 sec: '1'
finished in 2.0 sec: '2'
finished in 3.0 sec: Exception('3',)

It took a bit of thinking, but our final Any class is simple. It lets you launch many concurrent operations and process them in the order they complete. Not bad.

How Thoroughly Are You Testing Your C Extensions?

You probably know how to find Python code that isn't exercised by your tests. Install coverage and run:

$ coverage run --source=SOURCEDIR setup.py test

Then, for a beautiful coverage report:

$ coverage html

But what about your C extensions? They're harder to write than Python, so you better make sure they're thoroughly tested. On Linux, you can use gcov. First, recompile your extension with the coverage hooks:

$ export CFLAGS="-coverage"
$ python setup.py build_ext --inplace

In your build directory (named like build/temp.linux-x86_64-2.7) you'll now see some files with the ".gcno" extension. These are gcov's data files. Run your tests again and the directory will fill up with ".gcda" files that contain statistics about which parts of your C code were run.

You have a number of ways to view this coverage information. I use Eclipse with the gcov plugin installed. (Eclipse CDT includes it by default.) Delightfully, Eclipse on my Mac understands coverage files generated on a Linux virtual machine, with no hassle at all.

lcov can make you some nice HTML reports. Run it like so:

$ lcov --capture --directory . --output-file coverage.info
$ genhtml coverage.info --output-directory out

Here's a portion of its report for PyMongo's BSON decoder:

lcov table

Our C code coverage is significantly lower than our Python coverage. This is mainly because such a large portion of the C code is devoted to error handling: it checks for every possible error, but we only trigger a subset of all possible errors in our tests.

A trivial example is in _write_regex_to_buffer, when we ensure the buffer is large enough to hold 4 more bytes. We check that realloc, if it was called, succeeded:

lcov source: No Memory

We don't run out of memory during our tests, so these two lines of error-handling are never run. A more realistic failure is in decode_all:

lcov source

This is the error handler that runs when a message is shorter than five bytes. Evidently the size check runs 56,883 times during our tests, but this particular error never occurs so the error-handler isn't tested. This is the sort of insight that'd be onerous to attain without a tool like gcov.

Try it for yourself and see: are you testing your C code as thoroughly as your Python?


You might also like my article on automatically detecting refcount errors in C extensions, or the one on making C extensions compatible with mod_wsgi.

Begging

This May I'll spend four days homeless, with a Zen teacher named Genro and a small group of fellow Buddhists. We'll live, sleep, and meditate on the streets together and eat at soup kitchens. I think the retreat has a triple purpose: First, briefly abandoning the comfort and certainty of my regular life helps me practice non-attachment, the same as it helped the first Buddhist monks. Second, it gives me a taste of what it's like to be homeless, so I can better understand the homeless people I meet in NYC. And finally, it's an opportunity to raise money for homeless services.

People often ask me whether, by doing street retreat, we're competing with homeless people for scarce resources. I think not—we stay at the back of the line in soup kitchens, in case there's not enough food, and we sleep on sidewalks instead of in shelters. We beg for a few dollars on the street, but we donate thousands of dollars.

The rule is that I must raise $500 by May 8. The money will be distributed among the organizations that help us while we're on the street, and it will support the social service activities of the Hudson River Zen Center. I have to beg for the money—I'm not allowed to just donate $500 of my own.

So I'm begging you: Will you please donate?

(Tax-deductible. If you want a receipt, let me know.)

Update: I've now raised my minimum, but if you'd still like to donate, please do!

Stealing

My mother recently discovered some stories about my grandfather Milton Rubin's arbitration career. He was the impartial chairman of arbitrations between labor unions and employers, so he was called "Mr. Impartial":

As Milt told it: On one occasion, he was asked to preside over the discharge of Ramon, a dress cutter. It seems there was a big family reunion in San Juan, and that Ramon's wife had been substantially inflating accounts of his success, and their family status in New York. As far as anyone back home knew, Ramon was a virtual industrial captain of (as it is known) the Rag Trade. Desiring that, for the triumphant return, she dressed the part of her fictionalized grandeur, Ramon's wife prevailed upon him to steal a designer dress from the stock, and he complied. He was discovered, discharged, and the case came before Milt.

At arbitration, the Union ignored the usual litigation process and turned directly to the President of the company at the start of the hearing, making an impassioned plea: "Ramon has been with you for thirty years—he's a good man, he's been a good and loyal worker. He made a mistake. Please, please don't throw him on the street."

As Milt describes it, the president was moved, but obviously conflicted. He turned to Milt and asked: "Mr. Impartial, what should I do?" Milt responded: "Don't ask me what you should do—I have a different responsibility; this is a theft case, the parties have asked for a decision on just cause." While everyone in the room sat, the president got up, went to the windows and stood looking out over 34th Street for the longest time. Finally, he turned around and said, "OK, OK, he can come back to work." The people were jubilant.

When the room had cleared out, the president turned to Milt and asked, "What do you think, Mr. Impartial, did I do the right thing?" Milt turned his palms up and said: "It's a very difficult problem you had—he stole." The president stood up and once more looked out the windows: "Yes, Mr. Impartial," he said softly, "and I steal a little, too."


From a National Academy of Arbitrators History Committee Report, 2007. As told to NAA Past President Richard Bloch. Compiled by Herb Marx.

Invitation to a Zen Street Retreat, May 2014

Street Retreat

Roshi Genro Gauntt, of The Zen Peacemakers and Hudson River Zen Center, invites you to join a street retreat in New York City, May 8–11, 2014. We will sleep, meditate, and live on the streets together. It’s a chance to practice with a Zen teacher and a devoted group, to make friends with homeless people, and to feel the liberation of having nothing, like the first Buddhist monks. It's a plunge into the unknown. The barest poke at renunciation.

Logistics

The retreat starts on Thursday, May 8th at 3pm and will end on Sunday the 11th by noon. Partial participation is not an option. You can only join for the entire retreat. Our group will be together almost all of the time. We will conduct daily meditation, liturgy, and council.

Bring only a poncho, a blanket, a water bottle, and a photo ID. No money, credit cards, phone, change of clothes, books, toiletries, etc. Bring your prescription medicine if needed, of course.

Raising a Mala

We will be supported throughout by social service agencies and charities. Since we are homeless by choice, we want to make donations to those who will be supporting our lives. Raise $500 by begging from your family, friends, and associates, or just on the street. You may not use your own money. To sincerely engage in this experience we need to humble ourselves at the outset, attempt to explain to others our reasons for participating, and beg for their support. This is a hugely challenging and ultimately hugely rewarding experience. When we are sincere and truly speak from the heart, it’s no problem.

One-third of the funds will support the wide-ranging social service activities of the Hudson River Zen Center, and we as a group will decide at the end of the retreat where the other two-thirds of the offerings should go.

To inquire about joining this retreat, please email me, A. Jesse Jiryu Davis, jesse@emptysquare.net.

Announcing PyMongo 2.7 release candidate

Leaf

Yesterday afternoon Bernie Hackett and I shipped a release candidate for PyMongo 2.7, with substantial contributions from Amalia Hawkins and Kyle Erf. This version supports new features in the upcoming MongoDB 2.6, and includes major internal improvements in the driver code. We rarely make RCs before releases, but given the scope of changes it seems wise.

Install the RC like so:

pip install \
  https://github.com/mongodb/mongo-python-driver/archive/2.7rc0.tar.gz

Please tell us if you find bugs.

MongoDB 2.6 support

For the first time in years, the MongoDB wire protocol is changing. Bernie Hackett updated PyMongo to support the new protocol, while maintaining backwards compatibility with old servers. He also added support for MongoDB's new parallelCollectionScan command, which scans a whole collection with multiple cursors in parallel.

Amalia Hawkins wrote a feature for setting a server-side timeout for long-running operations with the max_time_ms method:

from pymongo.errors import ExecutionTimeout

try:
    for doc in collection.find().max_time_ms(1000):
        pass
except ExecutionTimeout:
    print "Aborted after one second."

She also added support for the new aggregation operator, $out, which creates a collection directly from an aggregation pipeline. While she was at it, she made PyMongo log a warning whenever your read preference is "secondary" but a command has to run on the primary:

>>> client = MongoReplicaSetClient(
...     'localhost',
...     replicaSet='repl0',
...     read_preference=ReadPreference.SECONDARY)
>>> client.db.command({'reIndex': 'collection'})
UserWarning: reindex does not support SECONDARY read preference
and will be routed to the primary instead.
{'ok': 1}

Bulk write API

Bernie added a bulk write API. It's now possible to specify a series of inserts, updates, upserts, replaces, and removes, then execute them all at once:

bulk = db.collection.initialize_ordered_bulk_op()
bulk.insert({'_id': 1})
bulk.insert({'_id': 2})
bulk.find({'_id': 1}).update({'$set': {'foo': 'bar'}})
bulk.find({'_id': 3}).remove()
result = bulk.execute()

PyMongo collects the operations into a minimal set of messages to the server. Compared to the old style, bulk operations have lower network costs. You can use PyMongo's bulk API with any version of MongoDB, but you only get the network advantage when talking to MongoDB 2.6.

Improved C code

After great effort, I understand why our C extensions didn't like running in mod_wsgi. I wrote an explanation that's more detailed than you want to read. But even better, Bernie fixed our C code so mod_wsgi no longer slows it down or makes it log weird warnings. Finally, I put clear configuration instructions in the PyMongo docs.

Bernie fixed all remaining platform-specific C code. Now you can run PyMongo with its C extensions on ARM, for example if you talk to MongoDB from a Raspberry Pi.

Thundering herd

I overhauled MongoClient so its concurrency control is closer to what I did for MongoReplicaSetClient in the last release. With the new MongoClient, a heavily multithreaded Python application will be much more robust in the face of network hiccups or downed MongoDB servers. You can read details in the bug report.

GridFS cursor

We had several feature requests for querying GridFS with PyMongo, so Kyle Erf implemented a GridFS cursor:

>>> fs = gridfs.GridFS(client.db)
>>> # Find large files:
...
>>> fs.find({'length': {'$gt': 1024}}).count()
42
>>> # Find files whose names start with "Kyle":
...
>>> pattern = bson.Regex('kyle.*', 'i')
>>> cursor = fs.find({'filename': pattern})
>>> for grid_out_file in cursor:
...     print grid_out_file.filename
...
Kyle
Kyle1
Kyle Erf

You can browse all 53 new features and fixes in our tracker.

Enjoy!

Escaping Callback Hell

Orpheus, Antonio Canova

Even though he was the most charming singer in the world, Orpheus couldn't rescue Eurydice from hell. As he was leading her out of Hades he turned to call back to her, and lost her forever.

But you can rescue your Python programs from callback hell! In the coming month I plan to release Motor 0.2 with full support for Tornado coroutines. You'll get an asynchronous interface to MongoDB that uses coroutines and Futures, and there won't be a callback in sight. (Unless you want them.)

Josh Austin at MaaSive.net is using a development version of Motor in production for the sake of its cleaner API. I wouldn't recommend using unreleased code, but Josh has written a superb article justifying his choice. He details the evolution of Motor's asynchronous API and shows how the latest style is simpler and less error-prone.

Go read his article, and escape Orpheus's tragic fate.

GreenletProfiler, A Fast Python Profiler For Gevent

If you use Gevent, you know it's great for concurrency, but alas, none of the Python performance profilers work on Gevent applications. So I'm taking matters into my own hands. I'll show you how both cProfile and Yappi stumble on programs that use greenlets, and I'll demonstrate GreenletProfiler, my solution.

cProfile Gets Confused by Greenlets

I'll write a script that spawns two greenlets, then I'll profile the script to look for the functions that cost the most. In my script, the foo greenlet spins 20 million times. Every million iterations, it yields to Gevent's scheduler (the "hub"). The bar greenlet does the same, but it spins only half as many times.

import cProfile
import gevent
import lsprofcalltree

MILLION = 1000 * 1000

def foo():
    for i in range(20 * MILLION):
        if not i % MILLION:
            # Yield to the Gevent hub.
            gevent.sleep(0)

def bar():
    for i in range(10 * MILLION):
        if not i % MILLION:
            gevent.sleep(0)

profile = cProfile.Profile()
profile.enable()

foo_greenlet = gevent.spawn(foo)
bar_greenlet = gevent.spawn(bar)
foo_greenlet.join()
bar_greenlet.join()

profile.disable()
stats = lsprofcalltree.KCacheGrind(profile)
stats.output(open('cProfile.callgrind', 'w'))

Let's pretend I'm a total idiot and I don't know why this program is slow. I profile it with cProfile, and convert its output with lsprofcalltree so I can view the profile in KCacheGrind. cProfile is evidently confused: it thinks bar took twice as long as foo, although the opposite is true:

CProfile bar vs foo

cProfile also fails to count the calls to sleep. I'm not sure why cProfile's befuddlement manifests in this particular way. If you understand it, please explain it to me in the comments. But it's not surprising that cProfile doesn't understand my script: cProfile is built to trace a single thread, so it assumes that if one function is called, and then a second, the first must have called the second. Greenlets defeat this assumption because the call stack can change entirely between one function call and the next.

Yappi Stumbles Over Greenlets

Next let's try Yappi, the excellent profiling package by Sumer Cip. Yappi has two big advantages over cProfile: it's built to trace multithreaded programs, and it can measure CPU time instead of wall-clock time. So maybe Yappi will do better than cProfile on my script? I run Yappi like so:

import yappi

yappi.set_clock_type('cpu')
yappi.start(builtins=True)

foo_greenlet = gevent.spawn(foo)
bar_greenlet = gevent.spawn(bar)
foo_greenlet.join()
bar_greenlet.join()

yappi.stop()
stats = yappi.get_func_stats()
stats.save('yappi.callgrind', type='callgrind')

Yappi thinks that when foo and bar call gevent.sleep, they indirectly call Greenlet.run, and eventually call themselves:

Yappi call graph

This is true in some philosophical sense. When my greenlets sleep, they indirectly cause each other to be scheduled by the Gevent hub. But it's wrong to say they actually call themselves recursively, and it confuses Yappi's cost measurements: Yappi attributes most of the CPU cost of the program to Gevent's internal Waiter.get function. Yappi also, for some reason, thinks that sleep is called only once each by foo and bar, though it knows it was called 30 times in total.

Yappi costs

GreenletProfiler Groks Greenlets

Since Yappi is so great for multithreaded programs, I used it as my starting point for GreenletProfiler. Yappi's core tracing code is in C, for speed. The C code has a notion of a "context" which is associated with each thread. I added a hook to Yappi that lets me associate contexts with greenlets instead of threads. And voilà, the profiler understands my script! foo and bar are correctly measured as two-thirds and one-third of the script's total cost:

GreenletProfiler costs

Unlike Yappi, GreenletProfiler also knows that foo calls sleep 20 times and bar calls sleep 10 times:

GreenletProfiler call graph

Finally, I know which functions to optimize because I have an accurate view of how my script executes.

Conclusion

I can't take much credit for GreenletProfiler, because I stand on the shoulders of giants. Specifically I am standing on the shoulders of Sumer Cip, Yappi's author. But I hope it's useful to you. Install it with pip install GreenletProfiler, profile your greenletted program, and let me know how GreenletProfiler works for you.

The Time Capsule

Pioneer plaque

This winter, my Zen group is studying the Diamond Sutra. The text is a staticky, garbled transmission from the Buddhists of India more than two thousand years ago. But we have managed to receive it, thanks to its authors' longing to communicate with us, and the faithful effort of the generations since.

The Sutra is a dialog between the Buddha and his student Subhuti. The two monks spout contradictions, and delightedly agree with each other. For example, in Paul Harrison's translation, here is how Subhuti recounts the teachings of the Buddha, who is called the Tathagata here:

That man whom the Tathagata has described as full-bodied and big-bodied has, Lord, been described by the Tathagata as bodiless. That is why he is called full-bodied and big-bodied.

Subhuti is merely contradictory, but the Buddha is incomprehensible. Referring to himself in the third person, he says:

The Tathagata has preached, Subhuti, that the so-called 'dispositions of a field' are dispositionless. That is why they are called 'dispositions of a field.'

Some sections of the Diamond Sutra have defeated the best scholars of our day. Edward Conze writes,

The second part of the Sutra presents the commentator with exceptional and so far insuperable difficulties. It is not impossible that one day someone may succeed in offering a satisfactory explanation. None has yet been found.... Our bewilderments are perhaps due to invincible obtuseness. It is equally possible that they derive from the state of the text which has been transmitted to us. Far from representing a coherent whole, the second part of the Diamond Sutra may very well be no more than a chance medley of stray sayings.

And yet, there is a meaning the Sutra wants very badly to transmit to the future—to us. Subhuti is worried we won't get the message, but the Buddha reassures him:

The Venerable Subhuti said this to the Lord, "Can it be, Lord, that there will be any living beings at a future time, when the final five hundred years come to pass, who, when the words of such discourses as these are being spoken, will conceive the idea that they are the truth?" The Lord said, "...there will be bodhisattvas and mahasattvas at a future time, when in the final five hundred years the destruction of the true dharma is coming to pass, who will be endowed with moral conduct, good qualities, and insight."

In order to communicate, the Sutra must survive, and a large portion of the text is devoted to its survival. The sutra offers boundless merit to "someone who after copying it would learn it, memorize it, recite it, master it, and elucidate it in full for others." Copying and memorizing the sutra were particularly important in ancient India, where paper deteriorated quickly. Just like a chain letter that survives because it promises good luck to those who forward it, the Sutra is marked with the attributes that helped it reproduce and multiply.

The Sutra's authors longed to speak to us across the distance. It reminds me of the gold plaques attached to the Pioneer probes when they were launched in the 1970s. They're illustrated with symbols which, it is hoped, smart aliens can use to understand who we were and where the probes came from. Two million years from now, if a spacefaring folk arises near Aldebaran in time to intercept one of the probes, how much will they comprehend? Maybe after long study, they'll realize that the series of dots along the bottom of the plaque indicate the sizes of the planets in our solar system. Some Aldebaranian grad student's career will be made when it discovers that the starburst pattern on the plaque is a map of pulsars. But the drawing of a man and woman will never communicate anything—scholars will wonder if it is their invincible obtuseness, or the state in which the plaque has arrived, that prevents a satisfactory explanation for the squiggly lines.

The Diamond Sutra is the same. There are some important things it wanted to tell us, which we will never understand. And yet, a lot of it makes plenty of sense. "How," Subhuti asks, "should one who has set out on the bodhisattva path take his stand, how should he proceed, how should he control the mind?" The Buddha responds plainly, telling him to be unattached to concepts: "Anybody to whom the idea of a living being occurs, or the idea of a soul or the idea of a person occurs, should not be called a bodhisattva." We should devote ourselves to wisdom and compassion, without getting hung up on philosophical questions about who we are or whom we're helping. To the extent that we follow that path, the transmission has been received.