Announcing PyMongo 2.8 Release Candidate
We've just tagged a release candidate of PyMongo, the standard MongoDB driver for Python. You can install it like:
pip install git+git://github.com/mongodb/mongo-python-driver.git@2.8rc0
Most of the changes between PyMongo 2.8 and the previous release, 2.7.2, are for compatibility with the upcoming MongoDB 2.8 release. (By coincidence, PyMongo and MongoDB are at the same version number right now.)
Compatibility
SCRAM-SHA-1 authentication
MongoDB 2.8 adds support for SCRAM-SHA-1 authentication and makes it the new default, replacing our inferior old protocol MONGODB-CR ("MongoDB Challenge-Response"). PyMongo's maintainer Bernie Hackett added support for the new protocol. PyMongo and MongoDB work together to make this change seamless: you can upgrade PyMongo first, then your MongoDB servers, and authentication will keep working with your existing passwords. When you choose to, you can upgrade how your passwords are hashed within the database itself—we'll document how to do that when we release MongoDB 2.8.
SCRAM-SHA-1 is more secure than MONGODB-CR, but it's also slower: the new protocol requires the client to do 10,000 iterations of SHA-1 by default, instead of one iteration of MD5. This has two implications for you.
First, you must create one MongoClient or MongoReplicaSetClient instance when your application starts up, and keep using it for your application's lifetime. For example, consider this little Flask app:
from pymongo import MongoClient
from flask import Flask
# This is the right thing to do:
db = MongoClient('mongodb://user:password@host').test
app = Flask(__name__)
@app.route('/')
def home():
    doc = db.collection.find_one()
    return repr(doc)
app.run()
That's the right way to build your app, because it lets PyMongo reuse connections to MongoDB and maintain a connection pool.
But time and again I see people write request handlers like this:
@app.route('/')
def home():
    # Wrong!!
    db = MongoClient('mongodb://user:password@host').test
    doc = db.collection.find_one()
    return repr(doc)
When you create a new MongoClient for each request like this, it requires PyMongo to set up a new TCP connection to MongoDB for every request to your application, and then shut it down after each request. This already hurts your performance.
But if you're using authentication and you upgrade to PyMongo 2.8 and MongoDB 2.8, you'll also pay for SHA-1 hashing with every request. So if you aren't yet following my recommendation and reusing one client throughout your application, fix your code now.
Second, you should install backports.pbkdf2—it speeds up the hash computation, especially on Python older than 2.7.8, or on Python 3 before Python 3.4.
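To get a feel for the cost, here is a rough sketch that times the key-derivation step on its own. SCRAM-SHA-1 derives its salted password with PBKDF2 over HMAC-SHA-1, which the standard library exposes as hashlib.pbkdf2_hmac on Python 2.7.8+ and 3.4+. The password and salt below are made up for illustration, and the sketch skips everything else a real SCRAM exchange does, so treat the timing as a ballpark, not a measurement of PyMongo itself.

```python
import hashlib
import os
import time

# Hypothetical inputs for illustration only; in a real SCRAM exchange
# the salt and iteration count come from the server.
password = b'example-password'
salt = os.urandom(16)
iterations = 10000  # the default iteration count mentioned above

start = time.time()
# PBKDF2 with HMAC-SHA-1 is the key-derivation step SCRAM-SHA-1 uses.
salted_password = hashlib.pbkdf2_hmac('sha1', password, salt, iterations)
elapsed = time.time() - start

print('derived %d bytes in %.1f ms' % (len(salted_password), elapsed * 1000))
```

Whatever number you see is paid once per new authenticated connection, which is why reusing one client matters.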
I've updated PyMongo's copy_database so you can use SCRAM-SHA-1 authentication to copy between servers. More information about SCRAM-SHA-1 is in PyMongo's latest auth documentation.
count with hint
Starting in MongoDB 2.6 the "count" command can take a hint that tells it which index to use, by name. In PyMongo 2.8 Bernie added support for count with hint:
from pymongo import ASCENDING
collection.create_index([('field', ASCENDING)], name='my_index')
collection.find({
    'field': {'$gt': 10}
}).hint('my_index').count()
This will work with MongoDB 2.6, and in MongoDB 2.8 count supports hints by index specs, not just index names:
collection.find({
    'field': {'$gt': 10}
}).hint([('field', ASCENDING)]).count()
PyMongo improvements
SON performance
Don Mitchell from EdX generously offered us a patch that improves the performance of SON, PyMongo's implementation of an ordered dict. His patch avoids unnecessary copies of field names in many of SON's methods.
socketKeepAlive
In some network setups, users need to set the SO_KEEPALIVE flag on PyMongo's TCP connections to MongoDB, so Bernie added a socketKeepAlive option to MongoClient and MongoReplicaSetClient.
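For readers who haven't met SO_KEEPALIVE before, this is roughly what the flag looks like at the socket level. It's a sketch with the standard library's socket module, not PyMongo's actual code, and it doesn't connect anywhere.

```python
import socket

# A plain TCP socket; nothing is connected in this sketch.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
    # Ask the OS to send periodic keepalive probes on this connection.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Read the flag back; any nonzero value means it is enabled.
    keepalive_on = sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
finally:
    sock.close()

print(keepalive_on)
```

With PyMongo 2.8 you don't touch sockets yourself: you pass the socketKeepAlive option to MongoClient or MongoReplicaSetClient and the driver sets the flag on its own connections.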
Deprecation warnings
Soon we'll release a PyMongo 3.0 that removes many obsolete features from PyMongo and gives you a cleaner, safer, faster new API. But we want to make the upgrade as smooth as possible for you. To begin with, I documented our compatibility policy. I explained how to test your code to make sure you use no deprecated features of PyMongo.
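One way to check for deprecated calls, sketched here with only the standard library, is to turn DeprecationWarning into an error so any use of a deprecated feature fails loudly. The old_api function is a hypothetical stand-in for a deprecated driver method, and uses_deprecated is a helper invented for this example.

```python
import warnings


def old_api():
    # Hypothetical stand-in for a deprecated driver method.
    warnings.warn('old_api is deprecated', DeprecationWarning, stacklevel=2)
    return 42


def uses_deprecated(func):
    """Return True if calling func triggers a DeprecationWarning."""
    with warnings.catch_warnings():
        # Inside this block, DeprecationWarning is raised as an exception.
        warnings.simplefilter('error', DeprecationWarning)
        try:
            func()
        except DeprecationWarning:
            return True
    return False


print(uses_deprecated(old_api))
print(uses_deprecated(lambda: 42))
```

You can get the same effect for a whole test run, without changing code, by starting the interpreter with python -W error::DeprecationWarning.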
Second, I deprecated some features that will be removed in PyMongo 3.0:
start_request is deprecated and will be removed in PyMongo 3.0, because it's not the right way to ensure consistency, and it doesn't work with sharding in MongoDB 2.8. Further justifications can be found here.
MasterSlaveConnection is deprecated and will be removed, since master-slave setups are themselves obsolete. Replica sets are superior to master-slave, especially now that replica sets can have more than 12 members. Anyway, even if you still have a master-slave setup, PyMongo's MasterSlaveConnection wasn't very useful.
And finally, copy_database is deprecated. We asked customers if they used it and the answer was no: people use the mongo shell for copying databases, not PyMongo. For the sake of backwards compatibility I upgraded PyMongo's copy_database to support SCRAM-SHA-1 anyway, but in PyMongo 3.0 we plan to remove it. Let me know in the comments if you think this is the wrong decision.
Bugs
The only notable bugfix in PyMongo 2.8 is the delightfully silly mod_wsgi error I wrote about last month. But if you find any new bugs, please let us know by opening an issue in Jira. I promise we'll handle it promptly.