I’m the referee for a road rally. You have to drive from New York to San Francisco in 48 hours, but—here’s the catch—I’m going to start my stopwatch before you can look at the map. Worse, there are hundreds of other drivers who need the map, and only one driver can use it at a time. If you’re unlucky, you could spend the whole 48 hours waiting in line.
Sound fair? Not to me. But this was how my library, Motor, worked with asyncio. In this article, the third of my four-part series about Python’s
getaddrinfo on Mac, I’ll tell you how Guido van Rossum, Yury Selivanov, and I fixed asyncio so it could referee a fair race.
An Unfair Stopwatch
Motor is my async Python driver for MongoDB. Back in December, a data scientist at the Washington Post reported that on his Mac, Motor timed out trying to connect to MongoDB, even if MongoDB was running on the local machine. The cause is this: his script had begun to download hundreds of remote feeds, and each of those downloads required a DNS lookup. On Mac OS X, Python only permits one call to
getaddrinfo at a time.
It’s like my unfair road rally: Motor starts a 20-second timer, then calls asyncio’s
create_connection. Now asyncio needs the
getaddrinfo lock, but there are hundreds of tasks in line ahead of it. By the time it gets the lock, resolves “localhost” and starts to open a socket, the timeout has ended and Motor cancels the task.
Fixing The Rules
I proposed three solutions to the asyncio team, none perfect. Guido responded with two more:
- Modify asyncio so if you pass it something that looks like a numerical address it skips calling getaddrinfo.
The idea here is for Motor to run
getaddrinfo itself. Then it starts the timer and passes the IP address to asyncio. Now the race is fair: Motor only counts how long asyncio spends actually connecting.
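Here is a minimal sketch of that "resolve first, then start the stopwatch" idea. It is not Motor's actual code: the function name `connect_with_fair_timeout` is hypothetical, and it uses modern `async`/`await` syntax rather than the `yield from` style of the time. It relies only on `loop.getaddrinfo`, `loop.create_connection`, and `asyncio.wait_for`.

```python
import asyncio
import socket


async def connect_with_fair_timeout(host, port, timeout):
    """Hypothetical helper: time only the connect, not the DNS lookup."""
    loop = asyncio.get_running_loop()

    # Step 1: resolve the name *before* starting the stopwatch. This call
    # may wait in line for the getaddrinfo lock, but that wait is untimed.
    infos = await loop.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    family, _type, _proto, _canonname, sockaddr = infos[0]
    ip = sockaddr[0]

    # Step 2: start the timer only now, and hand asyncio a numeric address.
    return await asyncio.wait_for(
        loop.create_connection(asyncio.Protocol, ip, port, family=family),
        timeout,
    )
```

This only makes the race fair if asyncio does not get back in line for the lock when it is handed a numeric address, which is exactly the change Guido proposed.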
Guido’s other idea seemed daunting:
- Do the research to prove that
getaddrinfo on OS X is thread-safe and submit a patch that avoids the
getaddrinfo lock on those versions of OS X.
I decided to leave the archeological research for another day when I was feeling more Indiana Jonesy. I could modify asyncio right away.
Fixing The Rules Isn’t Simple
Guido’s initial proposal sounded easy enough. If Motor has resolved a host to the IP address “220.127.116.11”, and executes this:
yield from loop.create_connection(Protocol, '220.127.116.11', 80)
… then asyncio should see that “220.127.116.11” is already an IP address, and skip the
getaddrinfo call. Instead, asyncio should choose the proper address family, socket type, protocol, and so on, as if it had called
getaddrinfo, but without ever waiting in line.
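To see the shape of the answer a simulated response has to reproduce, you can ask the real `getaddrinfo` about a numeric address. This is only an illustration of the target, not the fix itself; the loopback address and port are arbitrary choices of mine.

```python
import socket

# AI_NUMERICHOST tells getaddrinfo the host is already numeric, so no DNS
# lookup happens; the call only fills in the rest of the answer.
infos = socket.getaddrinfo(
    '127.0.0.1', 80,
    type=socket.SOCK_STREAM,
    flags=socket.AI_NUMERICHOST,
)

# Each entry is a 5-tuple: family, socket type, protocol, canonical name,
# and the socket address that connect() will eventually receive.
for family, socket_type, proto, canonname, sockaddr in infos:
    print(family, socket_type, proto, repr(canonname), sockaddr)
```

Every field in that tuple is something the pure-Python version has to work out for itself.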
It would be as if your co-pilot showed up with a route already planned. You wouldn’t get in line to use the map; you’d jump in your rally car and start driving.
I set off to write some Python 3 code that recognizes an IP address and constructs a fake
getaddrinfo response. A useful module called
ipaddress was added to the standard library in Python 3.3, so implementing recognition went swimmingly:
try:
    addr = ipaddress.IPv4Address(host)
except ValueError:
    try:
        # partition() returns a 3-tuple; keep only the part before '%'.
        addr = ipaddress.IPv6Address(host.partition('%')[0])
    except ValueError:
        # Host isn't an IP address, can't skip getaddrinfo.
        return None
The partition call is needed to remove the IP address’s zone index if it has one. For example the IPv6 address for “localhost” might be “fe80::1%lo0”, which specifies the “loopback 0” interface. Yury, Guido, and I had never heard of zone indexes before, but we figured it out and carried on.
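As a self-contained sketch of the recognition step, the snippet can be wrapped in a function. The name `recognize_ip` and its return shape are my own for illustration, not what was merged into asyncio.

```python
import ipaddress
import socket


def recognize_ip(host):
    """Return (family, address) if host is a numeric IP, else None."""
    try:
        return socket.AF_INET, ipaddress.IPv4Address(host)
    except ValueError:
        pass

    # Drop a zone index such as "%lo0" before parsing as IPv6.
    bare_host = host.partition('%')[0]
    try:
        return socket.AF_INET6, ipaddress.IPv6Address(bare_host)
    except ValueError:
        # Not an IP address: the caller cannot skip getaddrinfo.
        return None
```

With this, `recognize_ip('fe80::1%lo0')` yields an IPv6 result, while `recognize_ip('localhost')` returns `None` and the caller falls back to a real lookup.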
Recognizing an IP address isn’t enough, because converting a host name to an IP address isn’t all that
getaddrinfo does. What about the other parameters: family, socket type, protocol, flags? Each of these inputs to
getaddrinfo influences its return value. I needed to reproduce this logic accurately in pure Python, without getting in line to use the actual getaddrinfo. For example,
getaddrinfo infers the protocol from the socket type:
SOCK_STREAM implies TCP,
SOCK_DGRAM implies UDP. So I tried the obvious code:
if socket_type == SOCK_STREAM:
    proto = IPPROTO_TCP
elif socket_type == SOCK_DGRAM:
    proto = IPPROTO_UDP
But on Linux, and Linux only, the socket type is a bitmask that can be combined with the flags SOCK_NONBLOCK and
SOCK_CLOEXEC. So the check
socket_type == SOCK_STREAM is wrong on Linux. I had to test a bitwise “and” instead:
if socket_type & SOCK_STREAM:
    proto = IPPROTO_TCP
elif socket_type & SOCK_DGRAM:
    proto = IPPROTO_UDP
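Another way to handle the bitmask, shown here only as a sketch and not as the code we merged, is to strip the Linux-only flags first and then compare with plain equality. The helper name `infer_proto` is hypothetical, and the `getattr` fallbacks are there because `SOCK_NONBLOCK` and `SOCK_CLOEXEC` are not defined on every platform.

```python
import socket

# These flags exist only on some platforms (notably Linux); fall back to 0
# elsewhere so that masking them out is a no-op.
_TYPE_FLAGS = (getattr(socket, 'SOCK_NONBLOCK', 0)
               | getattr(socket, 'SOCK_CLOEXEC', 0))


def infer_proto(socket_type):
    """Guess the protocol from a socket type that may carry extra flags."""
    base_type = socket_type & ~_TYPE_FLAGS
    if base_type == socket.SOCK_STREAM:
        return socket.IPPROTO_TCP
    if base_type == socket.SOCK_DGRAM:
        return socket.IPPROTO_UDP
    # Any other socket type: no protocol can be inferred.
    return None
```

Masking before comparing also keeps other socket types, such as `SOCK_RAW`, from being mistaken for a stream or datagram socket.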
Again and again, I thought I had finished, but the socket-programming API has innumerable corners and platform differences. Although Guido’s idea was simple—teach asyncio to recognize an already-resolved IP address and simulate a
getaddrinfo response—it required 15 revisions before I’d straightened out the kinks and satisfied Yury and Guido.
On the 16th revision, we merged. The fix will ship with Python 3.4.5, 3.5.2, and 3.6.0.
This has certainly been educational.
For me too. :)
This Is Not The End
So I gave you a way to skip the line. If you planned your route beforehand, you can get right in your car and start driving.
And yet: as you peel out from the starting line and accelerate toward San Francisco, you feel a pang. You glance in the rearview mirror and see all those other drivers, the ones who didn’t come prepared, waiting in line to use the map. Wasn’t there some better way? Couldn’t they all just…share the map?
They can. Stay tuned for the final installment of this series about Python’s
getaddrinfo on Mac.
- The asyncio pull request to allow skipping getaddrinfo.
- Mailing list discussion with the asyncio team.
- The Motor bug report.
- This four-part series about getaddrinfo on Mac.