Fast, easy, event-driven servers in Python with Eventlet

Event-driven web servers are presently all the rage. But what does event-driven mean? To understand why event-driven servers are advantageous we need to understand processes, threads, and preemption. If you don't care for the technical background, feel free to skip over it.

Multitasking

Any operating system consists of a kernel that implements some basic functionality. One of the fundamental reasons for having an operating system is to allow multiple processes to exist at once. This doesn't necessarily mean the processes are all executing at the same time. What is a process, anyway? A process is an executing copy of a program. For example, the web browser Google Chrome starts an additional process for each tab you open. This means that if something goes wrong (a program crash) in one tab, the others should not be affected. Almost everything you do on your computer involves interacting with a process.

But how can you have multiple processes running at once? Modern computers certainly do have CPUs with multiple cores, each one capable of executing a separate process. But multitasking was popular long before modern CPUs, and even modern CPUs only have four or so cores. The average desktop computer has more than one hundred processes running on it at any point in time. The kernel of the operating system uses a strategy of timesharing the available CPU among multiple processes. In essence, it switches between all of the executing processes quickly to give the appearance of them all running simultaneously. This is great if you are listening to music and browsing the web at the same time. The kernel is able to switch back and forth between the web browser and your music player quickly. Browsing the web is still smooth, and the music doesn't skip.

All of this switching back and forth comes at a cost. Each time a switch is made, the kernel of the operating system has to take control of the CPU. It then decides which process gets to run next. Then, it saves the state of the current process for later resumption. Finally, it restores the process that is next in line to run. None of these operations are cost-free, and it all adds up to a measurable overhead. A tremendous amount of work has been put into open source kernels like those of the BSD projects and Linux to minimize this overhead.

In the world of servers, rather than desktops, a different factor than the apparent smoothness of running applications dominates which process gets to run next. In deciding, the kernel is primarily looking at who is ready to do IO (input and output). For example, consider an HTTP server that is serving an on-disk file to a browser. When the server gets the request, it is a relatively small amount of information: just the HTTP request headers explaining what resource is being requested. The first part of the reply from the server is also pretty small: just the response headers. Reading and writing this data can happen quickly. But actually serving the file may take a longer period of time. First the process has to open the file and wait for the disk to be ready to read it. This amount of time depends entirely on the speed of the hard disk. Then, the process has to stream the bytes from the file back over the connection to the browser. If the file is small, this may happen all at once. If the file is large, this may involve multiple cycles of the process alternately reading from the disk and writing to the stream. No matter what, the web server spends most of its time just waiting to do input and output.

How can a web server handle multiple requests at the same time? The design of web servers like Apache is to start a new process for each new connection. The kernel can then switch between the processes each time one becomes ready to do something: in the previous example, reading from disk or writing to the stream. This results in a simple, easy-to-implement design. However, starting a new process is not free, and we already discussed that having the kernel switch between running processes is quite expensive. The cost of starting a process can be mitigated by using a thread instead. A thread is essentially a new process that shares many resources, including all of its memory, with the process that started it. But even if creating a new thread were free, this does nothing to remove the kernel's overhead of switching between them.

In order to allow server processes to implement more effective multi-connection handling strategies, the kernel can expose basic information about each connection. Using that information, the server can answer questions like "Is the connection ready for me to send data?" or "Does the connection have data ready for me to receive?". Since the server process has this information, it can intelligently read and write from connections as they become ready. With this, the server doesn't need to start additional processes or rely on the kernel's expensive mechanisms for switching between them. Simply put, the job of scheduling what happens next has shifted from the kernel to the web server. This pattern is called event-driven programming. Web servers like Nginx use this model to great success. Python frameworks such as Tornado have brilliantly exposed this model, which allows event-driven web servers to be written in Python rather than in lower-level languages like C.

For the reader looking for a very technical explanation of event-driven IO, consult the man pages for select, epoll (Linux), and kqueue (BSD).
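To see the readiness-based model at its most basic, here is a minimal sketch using the standard library's select module, the simplest of the mechanisms named above. It uses a connected pair of sockets to stand in for a real client connection; the variable names are mine, not from this article.

```python
import select
import socket

# A connected pair of sockets stands in for a server-client connection
a, b = socket.socketpair()
b.sendall(b"hello")

# Ask the kernel which sockets are ready to read, waiting at most 1 second
readable, writable, errored = select.select([a], [], [], 1.0)

if a in readable:
    # The kernel reported readiness, so this recv will not block
    data = a.recv(1024)
    print(data)

a.close()
b.close()
```

An event-driven server is essentially this pattern in a loop: wait on every open connection at once, then service only the ones the kernel reports as ready.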

The problem with event-driven programming is that it winds up being a very non-intuitive programming style. Most programs end up making extensive use of callbacks or state machines. This isn't inherently bad, but understanding and maintaining such software often ends up more time intensive than for a program written in a sequential manner.

The Eventlet framework is able to sidestep these problems by using a different model of threading inside the Python interpreter. To the kernel, the Python interpreter appears as a single process with many open connections, analogous to the way Nginx operates. Inside the Python interpreter, however, separate threads of execution are available. Each thread of execution is implemented in as lightweight a manner as possible. Additionally, the interpreter does not rapidly switch between threads; a thread's execution is only paused when it would be waiting to do IO. Eventlet achieves this by hooking into the standard Python modules through a process known as monkey patching. Whenever a thread calls a function to do input or output, it is actually calling an Eventlet function. If the connection the thread is reading or writing is ready, everything goes on as normal. If not, the framework switches to another thread and lets it run. This process goes on indefinitely. The result is that programs can be written as traditional sequential programs and still take advantage of event-driven IO.
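Monkey patching itself is not specific to Eventlet; it just means replacing a module's functions at runtime so every caller transparently gets the new version. The toy below patches time.sleep to show the mechanism. The replacement function here is my own illustration; Eventlet's monkey_patch() swaps in cooperative versions of modules like socket and time.

```python
import time

calls = []
original_sleep = time.sleep

def patched_sleep(seconds):
    # Instead of blocking, just record the request. Under Eventlet,
    # the patched function would switch to another green thread here.
    calls.append(seconds)

# Monkey patch: any code that calls time.sleep now gets our version
time.sleep = patched_sleep

time.sleep(5)   # returns immediately instead of blocking
print(calls)    # [5]

# Restore the real function
time.sleep = original_sleep
```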

It's worth mentioning at this point that the traditional CPython interpreter does not provide these lightweight threads natively. If your environment comes preloaded with a Python interpreter, it's probably CPython. This isn't really a problem, as Eventlet builds its threads on top of the greenlet library (itself derived from Stackless Python), which works with standard CPython and is compatible with all the existing Python modules you know and love.

Basic Eventlet usage

Let's jump directly into an Eventlet example.

import eventlet

def connection_handler(fd):
    print "client connected"
    while True:
        # pass through every non-eof line
        x = fd.readline()
        if not x: break
        fd.write(x)
        fd.flush()
        print "echoed", x,
    print "client disconnected"

server = eventlet.listen(('0.0.0.0', 6000))
print "server socket listening on port 6000"

while True:
    try:
        new_sock, address = server.accept()
        print "accepted", address
        eventlet.spawn_n(connection_handler, new_sock.makefile('rw'))
    except (SystemExit, KeyboardInterrupt):
        break

This program accepts new TCP connections on port 6000. On each connection it reads lines of input and simply writes them back to the client until the client disconnects. Let's figure out how it does that.

Listening for connections

In the example program, this line opens the listening socket

server = eventlet.listen(('0.0.0.0', 6000))

The function eventlet.listen is a convenience function for opening TCP listening sockets. It's passed a tuple consisting of an IP address and a port. The first element is the IP address as a string; in this case it's '0.0.0.0', which means to listen on any and all IP addresses the machine has. The second element is an integer, the port number to listen on. Once the socket is open, no actual IO is done on it. It exists only to listen for new connections.
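For reference, eventlet.listen is roughly equivalent to the following sequence of calls from the standard library's socket module. This is a sketch of mine, not Eventlet's actual implementation, which also returns its own green socket type.

```python
import socket

def plain_listen(addr):
    # Roughly what a listen() convenience function does with an (ip, port) tuple
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(addr)   # claim the address and port
    sock.listen(50)   # start queueing incoming connections
    return sock

server = plain_listen(('127.0.0.1', 0))  # port 0 asks the OS to pick a free port
print(server.getsockname())
server.close()
```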

Accepting connections

Inside of a forever loop, the following line accepts connections

new_sock, address = server.accept()

This line waits forever for a connection to happen on the listening port. But wait, didn't I say that servers using the Eventlet framework can serve multiple requests simultaneously? They can; that's the magic of it. Inside the call to accept() the framework actually just does some internal bookkeeping. If a connection does occur on the socket, the framework resumes this thread of execution. Other than that, it's off switching between as many threads as are actually running. As a programmer you needn't worry about any of this.

Once the call finally returns, it returns a two-tuple consisting of the newly opened socket and another tuple holding the IP address and port of the remote client.

Starting a new thread

Now that the program has a brand new connection, it's time to start up a brand new thread to handle it. The call to eventlet.spawn_n() does this. The first argument is a callable object. In this case it's a function, but it can be any object implementing the __call__ method. The remaining arguments are passed to the callable. Here, the makefile() method is invoked on the new socket so that it has the same semantics as a file object from the open() function. The new thread is started, and connection_handler is called and passed this file object. It's a pretty simple function that just echoes back the input.
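Because spawn_n accepts any callable, a class implementing __call__ works just as well as a plain function. This hypothetical handler class is my own sketch; an instance of it could be passed to eventlet.spawn_n in place of connection_handler.

```python
class ConnectionHandler(object):
    """A callable object carrying its own state; works wherever a function does."""

    def __init__(self, greeting):
        self.greeting = greeting

    def __call__(self, fd):
        # Invoked like a function: handler(fd)
        fd.write(self.greeting)
        fd.flush()

handler = ConnectionHandler("welcome\n")
print(callable(handler))   # True
```

The advantage over a bare function is that each instance can hold per-handler configuration, here the greeting string, without resorting to globals.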

Example run

In one terminal you can start the server. In another you can use the nc command to start netcat. This nifty program opens a TCP connection, takes input from standard input, and writes received bytes to standard output.

Server Terminal

ericu@eric-phenom-linux:~/tmp$ python echoserver.py 
server socket listening on port 6000
accepted ('127.0.0.1', 44363)
client connected
echoed foo
echoed bar
echoed qux
client disconnected

Client Terminal

ericu@eric-phenom-linux:~$ nc 127.0.0.1 6000
foo
foo
bar
bar
qux
qux
^C
ericu@eric-phenom-linux:~$

It's really that simple

That's about all there is to writing server applications. There really isn't much else to worry about: there are no events to receive, no worrying about what order things happen in, and so on.

I highly recommend you read each example available in the official documentation. If you want to write an HTTP backend, Eventlet already includes everything you need to run a WSGI application. WSGI is the preferred method for writing web applications in Python, not just when using Eventlet. To learn more about WSGI, take a look at PEP 333.
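At its core, a WSGI application is just a callable that takes an environ dict and a start_response function, per PEP 333. A minimal sketch:

```python
def application(environ, start_response):
    # environ describes the request; start_response sends status and headers
    body = "Hello from %s\n" % environ.get('PATH_INFO', '/')
    start_response('200 OK', [('Content-Type', 'text/plain')])
    # The return value is an iterable of body chunks
    return [body]
```

Under Eventlet, something like eventlet.wsgi.server(eventlet.listen(('0.0.0.0', 8080)), application) would serve this callable, with each request handled in its own green thread.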

Error Handling

One great advantage of the multithreaded approach of Eventlet is what happens when something goes horribly, awfully wrong in a thread: the framework kills that thread and cleans up any associated descriptors. It's business as usual for the rest of the threads, including the main one that accepts connections. Take a look at this example.

import eventlet

def connection_handler(sock):
    # read the requested path, terminated by a null byte
    filepath = ''
    while '\x00' not in filepath:
        filepath += sock.recv(1)

    filepath = filepath.split('\x00')[0]

    print 'client requested: %s' % filepath
    # send the file's contents back; open() raises IOError if it doesn't exist
    with open(filepath) as fin:
        sock.sendall(fin.read())

    sock.close()

server = eventlet.listen(('0.0.0.0', 6000))

while True:
    try:
        new_sock, address = server.accept()
        print "accepted", address
        eventlet.spawn_n(connection_handler, new_sock)
    except (SystemExit, KeyboardInterrupt):
        break

This program accepts connections, reads until it finds a null byte, and then tries to open that string as a file. If it's successful, it writes the file back to the client and closes the connection. But if the file doesn't exist, open() throws an exception. For this example I have a file on my hard disk at /tmp/y. First I'll request /tmp/x, which does not exist. Then I'll request /tmp/y from the same server.

Server console

ericu@eric-phenom-linux:~/tmp$ python fileserver.py 
accepted ('127.0.0.1', 45475)
client requested: /tmp/x
Traceback (most recent call last):
  File "/home/ericu/local/lib/python2.7/site-packages/eventlet-0.13.0-py2.7.egg/eventlet/hubs/hub.py", line 346, in fire_timers
    timer()
  File "/home/ericu/local/lib/python2.7/site-packages/eventlet-0.13.0-py2.7.egg/eventlet/hubs/timer.py", line 56, in __call__
    cb(*args, **kw)
  File "fileserver.py", line 11, in connection_handler
    with open(filepath) as fin:
IOError: [Errno 2] No such file or directory: '/tmp/x'
accepted ('127.0.0.1', 45513)
client requested: /tmp/y

Client console

ericu@eric-phenom-linux:~$ echo -en '/tmp/x\0' | nc 127.0.0.1 6000
^C
ericu@eric-phenom-linux:~$ echo -en '/tmp/y\0' | nc 127.0.0.1 6000
foo bar qux
ericu@eric-phenom-linux:~$ 

As you can see, a nice messy exception traceback is printed when /tmp/x is requested. But the server keeps running just fine and is able to handle the request for /tmp/y with no problem. This is because when the exception is thrown, only that thread dies. This is no excuse for sloppy programming, but overall I find it very nice to have this kind of built-in safety.
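If you'd rather not rely on the framework's traceback-and-kill behavior, a handler can be wrapped so each connection fails cleanly and closes its socket. The wrapper below is my own sketch, not part of Eventlet's API.

```python
def safely(handler, sock):
    # Run one connection's handler; a failure only affects this connection
    try:
        handler(sock)
    except Exception as exc:
        print("handler failed: %s" % exc)
    finally:
        sock.close()

# In the example server, the spawn line would become:
#   eventlet.spawn_n(safely, connection_handler, new_sock)
```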

Synchronization between threads

Any programmer reading this who is experienced with multithreaded programming practices such as pthreads is thinking "This guy is nuts! Threads for everything? I'm going to have mutexes everywhere!". It's a perfectly logical response. However, traditional multithreaded programs allow a thread to be interrupted at any point in time. This leads to most code looking something like this

void * workerThread(void * someSharedData)
{
    /* every touch of shared data must be guarded against preemption */
    mutex_lock();
    do_processing(someSharedData);
    mutex_unlock();
    return NULL;
}

These practices are absolutely necessary in such an environment. However, in the Eventlet framework a thread cannot be arbitrarily interrupted. Instead, a thread is only interrupted when it goes to do IO. In other words, code like this example is perfectly safe.

sharedData = []

def worker(sock):
    # each connection sends a single byte indicating which factorial to compute
    factorial = ord(sock.recv(1))
    result = 1
    for i in range(1, factorial + 1):
        result *= i
    sock.sendall(str(result))
    sharedData.append((factorial, result))

The list sharedData is global across any thread running worker(). However, the thread cannot be interrupted while executing sharedData.append(). This isn't to say that all code in Eventlet is somehow inherently thread safe. If you read or write to a socket in the middle of operations on a shared data structure, you absolutely need to use synchronization primitives. Eventlet provides them in eventlet.semaphore.
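As a sketch of when a lock is needed: if a read-modify-write of shared state spans an IO call, another green thread can run at that yield point, so the whole section must be guarded. Eventlet's primitive lives in eventlet.semaphore; the example below uses the stdlib threading.Lock instead, which exposes the same acquire/release interface, so it runs without Eventlet installed. The function and names are my own illustration.

```python
import threading

shared_totals = {}
lock = threading.Lock()

def record(sock, key, value):
    # The read-modify-write below is interleaved with IO; under Eventlet
    # the write to the socket is a point where another green thread could
    # run, so the entire section is guarded by the lock.
    with lock:
        previous = shared_totals.get(key, 0)
        sock.write("previous total: %d\n" % previous)  # possible yield point
        shared_totals[key] = previous + value
```

Without the lock, two green threads recording under the same key could both read the same previous value across the socket write and lose one update.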


Copyright Eric Urban 2013, or the respective entity where indicated