How To Play Foul With getaddrinfo()
I'm the referee for a road rally. You have to drive from New York to San Francisco in 48 hours, but—here's the catch—I'm going to start my stopwatch before you can look at the map. Worse, there are hundreds of other drivers who need the map, and only one driver can use it at a time. If you're unlucky, you could spend the whole 48 hours waiting in line.
Sound fair? Not to me. But this was how my library, Motor, worked with asyncio. In this article, the third of my four-part series about Python's getaddrinfo on Mac, I'll tell you how Guido van Rossum, Yury Selivanov, and I fixed asyncio so it could referee a fair race.
An Unfair Stopwatch
Motor is my async Python driver for MongoDB. Back in December, a data scientist at the Washington Post reported that on his Mac, Motor timed out trying to connect to MongoDB, even when MongoDB was running on the local machine. The cause was this: his script had begun to download hundreds of remote feeds, and each of those downloads required a DNS lookup. On Mac OS X, Python only permits one call to getaddrinfo at a time.
It's like my unfair road rally: Motor starts a 20-second timer, then calls asyncio's create_connection. Now asyncio needs to acquire the getaddrinfo lock, but there are hundreds of tasks in line ahead of it. By the time it gets the lock, resolves "localhost", and starts to open a socket, the timeout has expired and Motor cancels the task.
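The unfair stopwatch can be reproduced in miniature. The sketch below is a simulation, not Motor's or asyncio's actual code: an asyncio.Lock stands in for Python's getaddrinfo lock, the sleep stands in for a DNS lookup, and the names fake_lookup, timed_connect, and race are hypothetical. It uses the async/await syntax of newer Python versions rather than yield from.

```python
import asyncio


async def fake_lookup(lock, seconds=0.01):
    # Stand-in for getaddrinfo: only one caller may hold the lock at a time.
    async with lock:
        await asyncio.sleep(seconds)


async def timed_connect(lock, timeout):
    # The stopwatch starts before the lookup, so time spent waiting
    # in line for the lock counts against the timeout.
    try:
        await asyncio.wait_for(fake_lookup(lock), timeout)
    except asyncio.TimeoutError:
        return "timed out"
    return "connected"


async def race(drivers_ahead, timeout=0.1):
    lock = asyncio.Lock()
    # Other tasks get in line for the lock first.
    others = [asyncio.ensure_future(fake_lookup(lock))
              for _ in range(drivers_ahead)]
    await asyncio.sleep(0)  # let the other tasks start and queue up
    result = await timed_connect(lock, timeout)
    await asyncio.gather(*others)
    return result


if __name__ == "__main__":
    print(asyncio.run(race(drivers_ahead=0)))
    print(asyncio.run(race(drivers_ahead=30)))
```

With nobody ahead in line, the simulated lookup finishes well inside the timeout. With 30 tasks ahead, each holding the lock for about 0.01 seconds, the timer runs out while the task is still queuing, even though its own lookup would have been quick.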
Fixing The Rules
I proposed three solutions to the asyncio team, none perfect. Guido responded with two more:
- Modify asyncio so if you pass it something that looks like a numerical address it skips calling getaddrinfo.
The idea here is for Motor to run getaddrinfo itself. Then it starts the timer and passes the IP address to asyncio. Now the race is fair: Motor only counts how long asyncio spends actually connecting.
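As a rough illustration of that ordering, here is a minimal sketch. It is not Motor's actual code: connect_fairly is a hypothetical helper, it tries only the first resolved address, and it uses newer async/await syntax. The event loop's getaddrinfo and create_connection methods are the real asyncio APIs.

```python
import asyncio
import socket


async def connect_fairly(host, port, timeout):
    loop = asyncio.get_running_loop()
    # Untimed step: resolve the host name. This may wait in line
    # for getaddrinfo, but the stopwatch has not started yet.
    infos = await loop.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # the first item of sockaddr is the IP address.
    ip = infos[0][4][0]
    # Timed step: only the connection attempt counts against the timeout.
    transport, _protocol = await asyncio.wait_for(
        loop.create_connection(asyncio.Protocol, ip, port), timeout)
    return ip, transport
```

The timeout now measures only the work done after the address is known, which is the fairness the proposal was after.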
Guido's other idea seemed daunting:
- Do the research to prove that getaddrinfo on OS X is thread-safe and submit a patch that avoids the getaddrinfo lock on those versions of OS X.
I decided to leave the archeological research for another day when I was feeling more Indiana Jonesy. I could modify asyncio right away.
Fixing The Rules Isn't Simple
Guido's initial proposal sounded easy enough. If Motor has resolved a host to the IP address "1.2.3.4", and executes this:
yield from loop.create_connection(Protocol, '1.2.3.4', 80)
... then asyncio should see that "1.2.3.4" is already an IP address, and skip the getaddrinfo call. Instead, asyncio should choose the proper address family, socket type, protocol, and so on, as if it had called getaddrinfo, but without ever waiting in line.
It would be as if your co-pilot showed up with a route already planned. You wouldn't get in line to use the map; you'd jump in your rally car and start driving.
I set off to write some Python 3 code that recognizes an IP address and constructs a fake getaddrinfo response. A useful module called ipaddress was added to the standard library in Python 3.3, so implementing recognition went swimmingly:
    try:
        addr = ipaddress.IPv4Address(host)
    except ValueError:
        try:
            addr = ipaddress.IPv6Address(host.partition('%')[0])
        except ValueError:
            # Host isn't an IP address, can't skip getaddrinfo.
            return None
That partition call is needed to remove the IP address's zone index if it has one. For example, the IPv6 address for "localhost" might be "fe80::1%lo0", which specifies the "loopback 0" interface. Yury, Guido, and I had never heard of zone indexes before, but we figured it out and carried on.
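For readers who want to try the recognition step on its own, here is a self-contained sketch of the same idea. The function name ip_family is hypothetical, and returning the address family is an illustrative choice rather than necessarily what the merged asyncio code does.

```python
import ipaddress
import socket


def ip_family(host):
    """Return the address family if host is an IP literal, else None."""
    try:
        ipaddress.IPv4Address(host)
        return socket.AF_INET
    except ValueError:
        pass
    try:
        # Drop a zone index such as "%lo0" before parsing.
        ipaddress.IPv6Address(host.partition('%')[0])
        return socket.AF_INET6
    except ValueError:
        # Not an IP address, so getaddrinfo can't be skipped.
        return None
```

Calling ip_family('1.2.3.4') or ip_family('fe80::1%lo0') identifies the family without any lookup, while a name like 'localhost' yields None and would still need getaddrinfo.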
Recognizing an IP address isn't enough, because converting a host name to an IP address isn't all that getaddrinfo does. What about the other parameters: family, socket type, protocol, flags? Each of these inputs to getaddrinfo influences its return value. I needed to reproduce this logic accurately in pure Python, without getting in line to use the actual getaddrinfo call.
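To make the target concrete, the sketch below builds one hand-made entry and compares it with what the real socket.getaddrinfo returns for a numeric host. It covers only the simplest case, an IPv4 literal with a TCP stream socket. The names fake_addrinfo and real_addrinfo are hypothetical, and this is not the code that was merged into asyncio. The AI_NUMERICHOST flag tells getaddrinfo not to attempt a name lookup.

```python
import socket


def fake_addrinfo(ip, port):
    # Assumes ip is an IPv4 literal and the caller wants a TCP stream socket.
    # An entry is (family, type, proto, canonname, sockaddr).
    return (socket.AF_INET, socket.SOCK_STREAM, socket.IPPROTO_TCP,
            '', (ip, port))


def real_addrinfo(ip, port):
    # Ask the real getaddrinfo about the same numeric address.
    return socket.getaddrinfo(ip, port, family=socket.AF_INET,
                              type=socket.SOCK_STREAM,
                              flags=socket.AI_NUMERICHOST)[0]
```

Every other combination of family, socket type, protocol, and flags needs the same kind of side-by-side comparison, which is where the work piles up.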
Consider how getaddrinfo infers the protocol from the socket type: SOCK_STREAM implies TCP, SOCK_DGRAM implies UDP. So I tried the obvious code:
    if socket_type == SOCK_STREAM:
        proto = IPPROTO_TCP
    elif socket_type == SOCK_DGRAM:
        proto = IPPROTO_UDP
But on Linux, and Linux only, the socket type is a bitmask that can be combined with the flags SOCK_NONBLOCK and SOCK_CLOEXEC. So the check socket_type == SOCK_STREAM is wrong on Linux. I had to test a bitwise "and" instead:
    if socket_type & SOCK_STREAM:
        proto = IPPROTO_TCP
    elif socket_type & SOCK_DGRAM:
        proto = IPPROTO_UDP
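Here is a small runnable version of that check, wrapped in a hypothetical function named infer_proto so it can be tried with flag bits set. The getattr fallbacks are an assumption added so the sketch also runs on platforms where the socket module does not define SOCK_NONBLOCK or SOCK_CLOEXEC.

```python
import socket

# These flags are not defined on every platform; fall back to 0
# where they are missing so the sketch still runs.
SOCK_NONBLOCK = getattr(socket, 'SOCK_NONBLOCK', 0)
SOCK_CLOEXEC = getattr(socket, 'SOCK_CLOEXEC', 0)


def infer_proto(socket_type):
    # Bitwise test, as in the snippet above, so extra flag bits are ignored.
    if socket_type & socket.SOCK_STREAM:
        return socket.IPPROTO_TCP
    elif socket_type & socket.SOCK_DGRAM:
        return socket.IPPROTO_UDP
    return None
```

For example, infer_proto(socket.SOCK_STREAM | SOCK_NONBLOCK) still picks TCP, where an equality test would fail on Linux. The sketch only considers stream and datagram sockets; other socket types, such as SOCK_RAW, are not handled by it.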
Again and again, I thought I had finished, but the socket-programming API has innumerable corners and platform differences. Although Guido's idea was simple (teach asyncio to recognize an already-resolved IP address and simulate a getaddrinfo response), it required 15 revisions before I'd straightened out the kinks and satisfied Yury and Guido.
On the 16th revision, we merged. The fix will ship with Python 3.4.5, 3.5.2, and 3.6.0.
I wrote,
This has certainly been educational.
Yury replied,
For me too. :)
This Is Not The End
So I gave you a way to skip the line. If you planned your route beforehand, you can get right in your car and start driving.
And yet: as you peel out from the starting line and accelerate toward San Francisco, you feel a pang. You glance in the rearview mirror and see all those other drivers, the ones who didn't come prepared, waiting in line to use the map. Wasn't there some better way? Couldn't they all just...share the map?
They can. Stay tuned for the final installment of this series about Python's getaddrinfo on Mac.
Links:
- The asyncio pull request to allow skipping getaddrinfo.
- Mailing list discussion with the asyncio team.
- The Motor bug report.
- This four-part series about getaddrinfo on Mac.