Yes, Every MongoDB Driver Supports Every Command
This post is in response to a persistent form of question I receive about MongoDB drivers: "Does driver X support feature Y?" The answer is nearly always "yes," but you can't know that unless you understand MongoDB commands.
There are only five kinds of operations a MongoDB driver can perform on the server: insert, update, remove, query, and commands.
Almost two years ago my colleague Kristina wrote about "Why Command Helpers Suck," and she is still right: if you only use the convenience methods without understanding the unifying concept of a "command," you're unnecessarily tied to a particular driver's API, and you don't know how MongoDB really works.
So let's do a pop quiz:
- Which MongoDB drivers support the Aggregation Framework?
- Which support the "group" operation?
- Which drivers are compatible with MongoDB's mapreduce feature?
- Which drivers let you run "count" or "distinct" on a collection?
If you answered, "all of them," you're right—every driver supports commands, and all the features I asked about are commands.
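To make that concrete, here is a minimal sketch of how a few of those features look as plain command documents. The command_doc helper is hypothetical (it is not part of any driver), the pipeline stage is made up for illustration, and the standard library's OrderedDict stands in for a driver's ordered-document class so the snippet runs without a driver installed. The one rule it encodes is that the command name must be the first key.

```python
from collections import OrderedDict


def command_doc(name, target, options=()):
    """Build a command document whose first key is the command name.

    `options` is a sequence of (key, value) pairs appended in order.
    This helper is hypothetical; it only illustrates the document shape.
    """
    doc = OrderedDict([(name, target)])
    doc.update(options)
    return doc


# Three of the quiz features, expressed as command documents.
count_cmd = command_doc('count', 'test_collection')
distinct_cmd = command_doc('distinct', 'test_collection', [('key', 'my_key')])
aggregate_cmd = command_doc(
    'aggregate', 'test_collection',
    [('pipeline', [{'$match': {'my_key': 1}}]),  # illustrative pipeline
     ('cursor', {})])  # assumption: newer servers expect a 'cursor' option

for cmd in (count_cmd, distinct_cmd, aggregate_cmd):
    print(list(cmd.items()))
```

Documents shaped like these can be handed to a driver's generic command method. The exact options differ between server versions, so treat the command reference as the authority.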
Let's consider three MongoDB drivers for Python and show examples of using the distinct command in each.
PyMongo
PyMongo has two convenience methods for distinct. One is on the Collection class, the other on Cursor:
>>> from pymongo import MongoClient
>>> db = MongoClient().test
>>> db.test_collection.distinct('my_key')
[1.0, 2.0, 3.0]
>>> db.test_collection.find().distinct('my_key')
[1.0, 2.0, 3.0]
But this all boils down to the same MongoDB command. We can look up its arguments in the MongoDB Command Reference and see that distinct takes the form:
{ distinct: collection, key: <field>, query: <query> }
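As a rough sketch of that form, the function below builds the document and leaves out query when none is given. The name distinct_command and the sample filter are my own illustrations, not anything a driver provides, and OrderedDict is used only to keep the keys in order without requiring a driver.

```python
from collections import OrderedDict


def distinct_command(collection, key, query=None):
    """Build a distinct command document (hypothetical helper).

    The 'distinct' key comes first because the server reads the first
    key of the document as the command name.
    """
    cmd = OrderedDict([('distinct', collection), ('key', key)])
    if query is not None:
        cmd['query'] = query  # optional filter on the documents considered
    return cmd


# Without a query: just the collection and the key.
print(list(distinct_command('test_collection', 'my_key').items()))

# With an illustrative query restricting which documents are examined.
filtered = distinct_command('test_collection', 'my_key',
                            {'my_key': {'$gt': 1}})
print(list(filtered.items()))
```

The resulting mapping is what you would pass to a driver's generic command method.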
So let's use PyMongo's generic command method to run distinct directly. We'll pass the collection and key arguments and omit query. We need to use PyMongo's SON class to ensure we pass the arguments in the right order:
>>> from bson import SON
>>> db.command(SON([('distinct', 'test_collection'), ('key', 'my_key')]))
{u'ok': 1.0,
 u'stats': {u'cursor': u'BasicCursor',
            u'n': 3,
            u'nscanned': 3,
            u'nscannedObjects': 3,
            u'timems': 0},
 u'values': [1.0, 2.0, 3.0]}
The answer is in values.
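If a driver hands you the raw reply like this, pulling out the result is a two-step job: confirm that ok is truthy, then read values. Below is a small sketch of that, assuming a reply shaped like the one above. The distinct_values function and the CommandError class are hypothetical names of mine. PyMongo's command method checks ok for you by default, so a helper like this mainly matters with lower-level drivers.

```python
class CommandError(Exception):
    """Hypothetical error type for a command reply where ok is not truthy."""


def distinct_values(reply):
    """Return the 'values' list from a distinct command reply.

    Raises CommandError if the server reported failure. Failed replies
    usually carry an 'errmsg' field describing what went wrong.
    """
    if not reply.get('ok'):
        raise CommandError(reply.get('errmsg', 'command failed'))
    return reply['values']


# A reply shaped like the one shown above (stats omitted for brevity).
reply = {'ok': 1.0, 'values': [1.0, 2.0, 3.0]}
print(distinct_values(reply))  # prints [1.0, 2.0, 3.0]
```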
Motor
My async driver for Tornado and MongoDB, called Motor, supports similar conveniences for distinct. It has both the MotorCollection.distinct method:
>>> from tornado.ioloop import IOLoop
>>> from tornado import gen
>>> import motor
>>> from motor import MotorConnection
>>> db = MotorConnection().open_sync().test
>>> @gen.engine
... def f():
...     print (yield motor.Op(db.test_collection.distinct, 'my_key'))
...     IOLoop.instance().stop()
...
>>> f()
>>> IOLoop.instance().start()
[1.0, 2.0, 3.0]
... and MotorCursor.distinct:
>>> @gen.engine
... def f():
...     print (yield motor.Op(db.test_collection.find().distinct, 'my_key'))
...     IOLoop.instance().stop()
...
>>> f()
>>> IOLoop.instance().start()
[1.0, 2.0, 3.0]
Again, these are just convenient alternatives to using MotorDatabase.command:
>>> @gen.engine
... def f():
...     print (yield motor.Op(db.command,
...         SON([('distinct', 'test_collection'), ('key', 'my_key')])))
...     IOLoop.instance().stop()
...
>>> f()
>>> IOLoop.instance().start()
{u'ok': 1.0,
 u'stats': {u'cursor': u'BasicCursor',
            u'n': 3,
            u'nscanned': 3,
            u'nscannedObjects': 3,
            u'timems': 0},
 u'values': [1.0, 2.0, 3.0]}
AsyncMongo
AsyncMongo is another driver for Tornado and MongoDB. Its interface isn't nearly as rich as Motor's, so I often hear questions like, "Does AsyncMongo support distinct? Does it support aggregate? What about group?" In fact, it's those questions that prompted this post. And of course the answer is yes, AsyncMongo supports all commands:
>>> from tornado.ioloop import IOLoop
>>> from tornado import gen
>>> from bson import SON
>>> import asyncmongo
>>> db = asyncmongo.Client(
...     pool_id='mydb', host='127.0.0.1', port=27017,
...     maxcached=10, maxconnections=50, dbname='test')
>>> @gen.engine
... def f():
...     results = yield gen.Task(db.command,
...         SON([('distinct', 'test_collection'), ('key', 'my_key')]))
...     print results.args[0]
...     IOLoop.instance().stop()
...
>>> f()
>>> IOLoop.instance().start()
{u'ok': 1.0,
 u'stats': {u'cursor': u'BasicCursor',
            u'n': 3,
            u'nscanned': 3,
            u'nscannedObjects': 3,
            u'timems': 0},
 u'values': [1.0, 2.0, 3.0]}
Exceptions
There are some areas where drivers really differ, like Replica Set support, or Read Preferences. 10gen's drivers are much more consistent than third-party drivers. But if the underlying operation is a command, then all drivers are essentially the same.
So Go Learn How To Run Commands
So the next time you're about to ask, "Does driver X support feature Y?" first check if Y is a command by looking for it in the command reference. Chances are it's there, and if so, you know how to run it.