@brson
Last active January 4, 2016 13:29

Network Communication And Serialization In Rust

Computers surround us. While some devices are useful on their own, many are unable to function unless networked to other, remote systems. Accordingly, network communication is an obvious use case where Rust's value proposition can be demonstrated (XXX this isn't making the case that Rust is good at networking - what is the value proposition re: networking + Rust. It may be better to emphasize that Rust has particular strengths for networking applications, or not try to make this point at all). While my previous post in this series focused on file I/O, this post will explore the standard library's API for communication over TCP/IP, providing code for a primitive server that responds to requests from a web browser, as a nod to utility and expedience (XXX I don't understand the 'utility and expedience' point). As a further complement to I/O, Rust's serialization API will be introduced. Together, these two pieces help programmers compose robust, networked applications and services that communicate with structured data.

This post assumes a basic acquaintance with Rust and its semantics. The Rust Tutorial is a good place to start if you'd like to get up to speed. The code in this example is written for the 0.9 release of Rust's compiler and standard library from early January 2014.

A Simple TCP Server

The code below shows a simple server that binds, listens for data, immediately writes a response and then terminates. The server does not parse the request, and what it writes back is a bare-bones, valid HTTP response.

use std::str::from_utf8;
use std::io::{Acceptor, Listener};
use std::io::net::tcp::TcpListener;

fn main() {
    // HTTP separates header lines with CRLF ("\r\n"), and a blank line ends the headers.
    let resp_bytes = bytes!("HTTP/1.0 200 OK\r\nContent-Type: text/html\r\n\r\n<html><head><title>hello, world!</title></head><body>foo</body></html>");

    // `expect` fails the task with the given message if the value is `None`.
    let addr =
        FromStr::from_str("127.0.0.1:8080").expect("malformed address");
    let listener = TcpListener::bind(addr).expect("failed to bind");
    let mut acceptor = listener.listen();

    // The block scopes `stream`, so its destructor runs (closing the
    // connection) at the closing brace, before the request is printed.
    let req = {
        let mut stream = acceptor.accept()
            .expect("error on accept");
        let mut buf = [0u8, ..1024];
        stream.read(buf);
        stream.write(resp_bytes);
        from_utf8(buf).to_owned()
    };
    println!("{:s}", req);
}

The topmost block imports required functionality from the standard library, followed by the main() function. First, a resp_bytes value is created with the help of the bytes! macro (which turns a string literal into a static byte slice at compile time). Next, a SocketAddr value is created for 127.0.0.1, port 8080 (stored in addr). SocketAddr implements the FromStr trait, and the compiler infers that SocketAddr is the type to parse because addr is later passed to TcpListener::bind. The addr is then used to initialize a TcpListener.

XXX: Hyperlinks to rustdocs for these types (and others used in the article) would be awesome.

In the previous two calls, expect is invoked because they both return Option<T> values. expect is a convenience method used when receiving Option<T> values where it's preferable to fail if the value is None (you expect a Some(T)).
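As a rough illustration of the same idea, here is a minimal sketch written against current stable Rust rather than the 0.9 API used in this post. The helper names parse_addr and port_or_default are hypothetical, introduced only for this example. It shows the two usual ways to consume an Option: failing with expect, or supplying a fallback.

```rust
use std::net::SocketAddr;

/// Hypothetical helper: parse "ip:port" text, yielding `None` when it is malformed.
/// (In current Rust, `str::parse` returns a `Result`; `.ok()` converts it to an `Option`.)
pub fn parse_addr(text: &str) -> Option<SocketAddr> {
    text.parse::<SocketAddr>().ok()
}

/// Hypothetical helper: handle `None` explicitly instead of failing.
pub fn port_or_default(text: &str, default: u16) -> u16 {
    match parse_addr(text) {
        Some(addr) => addr.port(),
        None => default,
    }
}
```

Calling parse_addr("127.0.0.1:8080").expect("malformed address") mirrors the post's usage: it yields the address, or aborts with that message if parsing failed. port_or_default is the alternative when failing is not acceptable.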

The next block of code is a scope, beginning with let req = {, that encapsulates the lifetime of a single client's connection. In this example, the application exits after a single request is processed. A real application would probably enter a loop at this point to handle incoming traffic. The TCP stream connecting the server to the client is initialized with the call to accept (again, expect comes in handy).

XXX: Above, an explanation for the curious expression block would be helpful. 'the lifetime of a single client's connection' is not clear. What's a 'lifetime'?

With the stream initialized, a mutable byte buffer, buf, initialized to zeros, is created on the stack to contain data from the client. The length of the buffer (or an explicit slice pointing at a section of it) represents the maximum amount of data a call to read will pull in. Immediately after reading, the resp_bytes value constructed above is sent back to the client with the call to write. The scope then ends by returning a string, parsed from the bytes read into buf. from_utf8 can detect the end of the string (many parsing/conversion operations would require a length argument (XXX: why? is this true in Rust in general, in a more full-featured servo, in comparisons to other languages)). The stream's destructor will be invoked at the end of this scope, closing the TCP connection with the client.

The program ends by printing the returned req string to stdout. As with the end of the scope encapsulating the client's connection, leaving main() destroys the acceptor value and ends its underlying binding to 127.0.0.1, port 8080.
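For readers on a current toolchain, the walkthrough above might look roughly like the following. This is a sketch against today's stable std::net and std::io, not the 0.9 API the post targets. The names RESPONSE, handle_client, and serve_once are assumptions made for this example. Unlike the original, it uses the byte count returned by read to slice the buffer before converting it to text.

```rust
use std::io::{self, Read, Write};
use std::net::TcpListener;

/// Canned reply; header lines end in CRLF and a blank line ends the headers.
pub const RESPONSE: &[u8] = b"HTTP/1.0 200 OK\r\nContent-Type: text/html\r\n\r\n<html><head><title>hello, world!</title></head><body>foo</body></html>";

/// Read up to 1024 bytes from the client, send the canned reply,
/// and return whatever was read as (lossily decoded) text.
pub fn handle_client<S: Read + Write>(stream: &mut S) -> io::Result<String> {
    let mut buf = [0u8; 1024];
    // `read` reports how many bytes it actually filled in.
    let n = stream.read(&mut buf)?;
    stream.write_all(RESPONSE)?;
    Ok(String::from_utf8_lossy(&buf[..n]).into_owned())
}

/// Accept a single connection and handle it. The stream is dropped when this
/// function returns, which closes the connection to the client.
pub fn serve_once(listener: &TcpListener) -> io::Result<String> {
    let (mut stream, _peer) = listener.accept()?;
    handle_client(&mut stream)
}
```

A caller could bind with TcpListener::bind("127.0.0.1:8080")?, pass the listener to serve_once, and print the returned request. Keeping handle_client generic over Read + Write is a design choice for this sketch: it lets the logic be exercised with an in-memory stream instead of a real socket.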

Handling Errors In IO With Conditions

The above example, while succinct, leaves out a key aspect of safe applications: error handling. In the IO portion of Rust's stdlib, this is currently accomplished via Conditions (see below for why this will soon be changing). Much like a try/catch block in other languages, Conditions provide a means to designate a handler for when a certain class of error, raised within the callee(s), is encountered. Here is the previous example with integrated condition handling:

use std::str::from_utf8;
use std::io::{Acceptor, Listener};
use std::io::net::tcp::TcpListener;
use std::io::io_error;

fn main() {
    let resp_bytes = bytes!("HTTP/1.0 200 OK\r\nContent-Type: text/html\r\n\r\n<html><head><title>hello, world!</title></head><body>foo</body></html>");

    // XXX: There's a `from_str` free function now that can be used to make this look nicer.
    let addr =
        FromStr::from_str("127.0.0.1:8080").expect("malformed address");
    io_error::cond.trap(|e| {
        // XXX: This handles the condition in a way that is exactly the same as if the condition hadn't been handled.
        // It may be illustrative to *not* fail, though the code will be more unwieldy (throw some profuse apologies
        // in about how conditions kinda suck).
        fail!("handle or fail, as needed. Here's the error: {:?}", e);
        // ...
    }).inside(|| {
        let listener = TcpListener::bind(addr).unwrap();
        let mut acceptor = listener.listen();
        let req = {
            let mut stream = acceptor.accept().unwrap();
            let mut buf = [0u8, ..1024];
            stream.read(buf);
            stream.write(resp_bytes);
            from_utf8(buf).to_owned()
        };
        println!("{:s}", req);
    });
}

Unlike the previous example, this program brings in the condition we expect to handle via use std::io::io_error. In contrast to the try/catch pattern common in other languages, Rust Conditions declare the condition-type-to-be-handled (io_error in this case) up-front, along with the handler (the block within cond.trap). The wrapped code is within the closure passed to inside.

Rust's condition system is meant to integrate with tasks and be handled within the current unit of execution. At run time, if a condition is raised, the code looks in task-local storage to see if a handler for the particular condition is registered. If so, it evaluates the handler and, if the condition's trap handler is designed to substitute a new value, it will do so. If no handler is registered within the current task, then the code fail!s. This means that conditions must be handled within their current task.
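To make the lookup described above more concrete, here is a toy model of the mechanism in current stable Rust. It is not the 0.9 condition implementation: threads stand in for tasks, thread-local storage stands in for task-local storage, and every name (IoProblem, trap, raise) is hypothetical. The handler substitutes a usize value for the failed operation.

```rust
use std::cell::RefCell;

/// Hypothetical stand-in for the error value carried by a condition.
#[derive(Debug)]
pub struct IoProblem(pub String);

type Handler = Box<dyn Fn(&IoProblem) -> usize>;

thread_local! {
    // Handlers registered on the current thread, innermost last.
    static HANDLERS: RefCell<Vec<Handler>> = RefCell::new(Vec::new());
}

/// Unregisters the innermost handler when dropped, even if the body panics.
struct PopOnDrop;

impl Drop for PopOnDrop {
    fn drop(&mut self) {
        HANDLERS.with(|h| {
            h.borrow_mut().pop();
        });
    }
}

/// Run `body` with `handler` registered for the current thread.
pub fn trap<H, B, R>(handler: H, body: B) -> R
where
    H: Fn(&IoProblem) -> usize + 'static,
    B: FnOnce() -> R,
{
    HANDLERS.with(|h| h.borrow_mut().push(Box::new(handler)));
    let _guard = PopOnDrop;
    body()
}

/// Raise a condition: the innermost handler supplies a substitute value;
/// with no handler registered on this thread, panic.
pub fn raise(problem: IoProblem) -> usize {
    HANDLERS.with(|h| {
        let handlers = h.borrow();
        match handlers.last() {
            Some(handler) => handler(&problem),
            None => panic!("unhandled condition: {:?}", problem),
        }
    })
}
```

With this model, trap(|_| 0, || raise(IoProblem("read failed".to_string()))) evaluates to the substituted value, while calling raise outside any trap panics. A handler registered on one thread is invisible to another, which mirrors the rule that conditions must be handled within their own task.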

If a line in the program was changed, like so:

    // XXX: Again, just `from_str`.
    // won't be able to bind to this port unless running with
    // elevated privileges
    let addr =
        FromStr::from_str("127.0.0.1:25").expect("malformed address");

The following error would occur at run time:

task '<main>' failed at 'handle or fail, as needed. Here's the error: std::io::IoError{kind: PermissionDenied, desc: "permission denied", detail: None}', example.rs:12

Alas, in keeping with the reality that Rust is still evolving, conditions are on their way out (at least from the stdlib). There is a pull request open that will set the stage for the removal of conditions from the library, replaced with a Result<TOk, TErr>-based scheme, partnered with language/lint support for catching places where possibly-erroring API calls go unhandled.
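For a sense of what a Result-based scheme looks like in practice, here is a small sketch using the std::io and std::net APIs of current stable Rust, not the pull request mentioned above. The function names explain and bind_or_explain are hypothetical. The caller receives the error as a value and decides what to do with it, rather than registering a handler up front.

```rust
use std::io;
use std::net::TcpListener;

/// Hypothetical helper: turn a bind failure into a readable message.
pub fn explain(addr: &str, err: &io::Error) -> String {
    match err.kind() {
        // The case from the port 25 example: binding needs elevated privileges.
        io::ErrorKind::PermissionDenied => format!("not permitted to bind {}", addr),
        io::ErrorKind::AddrInUse => format!("{} is already in use", addr),
        _ => format!("could not bind {}: {}", addr, err),
    }
}

/// Hypothetical helper: bind, returning either the listener or an explanation.
pub fn bind_or_explain(addr: &str) -> Result<TcpListener, String> {
    TcpListener::bind(addr).map_err(|e| explain(addr, &e))
}
```

A caller might write match bind_or_explain("127.0.0.1:25") { Ok(listener) => ..., Err(msg) => eprintln!("{}", msg) }, handling the failure at the call site instead of failing the whole task.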

Variation: Building The Response With JSON Serialization

Building upon the previous example, a means to easily return structured data to the client would be quite useful. Luckily, Rust ships a generalized serialization scheme in extra::serialize. Additionally, a reference implementation for JSON is provided in extra::json.

Instead of reworking the previous example, a get_resp_bytes function is shown, using serialization to build a response.

extern mod extra;
use std::io::Decorator;
use std::io::mem::MemWriter;
use extra::serialize::{Encodable};
use extra::json;

#[deriving(Encodable)]
struct Echo { msg: ~str }

fn get_resp_bytes() -> ~[u8] {
    let echo = Echo { msg: ~"Hi Hi Hi" };
    let mut mem_buf = MemWriter::new();
    // The encoder mutably borrows `mem_buf`. This block ends that borrow
    // so that `mem_buf` can be consumed by `inner()` below.
    {
        let mut encoder = json::Encoder::new(&mut mem_buf as &mut std::io::Writer);
        echo.encode(&mut encoder);
    }
    let json_bytes = mem_buf.inner();
    let headers = bytes!("HTTP/1.0 200 OK\r\nContent-Type: application/json\r\n\r\n");
    let mut bytes: ~[u8] = ~[];
    bytes.push_all_move(headers.to_owned());
    bytes.push_all_move(json_bytes);
    bytes
}

fn main() {
    let resp_bytes = get_resp_bytes();
    // ...
}

Rust's approach to serialization of values is simple and elegant. If your type consists solely of fields that implement Decodable and/or Encodable, you can get serialization support "for free" with a single attribute annotation (#[deriving(Encodable)], above). All of Rust's primitive values (numbers, strings, etc) implement these traits, as well as fundamental container types like vectors and HashMap<K, V>.

In the above example, an extern mod extra; statement appears at the top of the code (Rust's convention for linking an external library). Unlike libstd, libextra isn't linked by default despite being an official Rust library. Linking it allows use'ing the serialize and json modules. Also, the Decorator trait and MemWriter type are necessary to get the bytes from serialized Rust values (the inner method is a part of Decorator/MemWriter).

A simple Echo struct is declared for this example. Applying the #[deriving(Encodable)] attribute is our "for free" serialization support. deriving is a shortcut that has the compiler generate implementations of common traits for a type whose fields already implement them. More information about deriving can be found in the Rust Manual.

Within get_resp_bytes, an Echo instance is created and stored in echo. Following that, a MemWriter is created and stored in mem_buf. The MemWriter acts as a std::io::Writer implementation used by the json::Encoder instance in the example. Depending on how your code is structured, this could be written so that the encoder writes directly into the TcpStream representing the client connection. The rest of the function is concerned with building a byte vector consisting of some canned HTTP response headers, plus the serialized JSON. In this case, since the size of the byte array to be returned isn't known, a growable vector (~[]) is used. Currently, this is the only way to safely build and return vector values of unknown size from functions.
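The shape of get_resp_bytes carries over to current stable Rust, as the sketch below suggests. It deliberately avoids any serialization library: the json_string and to_json helpers are hand-rolled and hypothetical, standing in for what extra::json and #[deriving(Encodable)] provide in the post, and they cover only this one struct. Vec<u8> plays the role of the growable ~[u8] vector.

```rust
/// Same shape as the post's struct, with an owned `String` in place of `~str`.
pub struct Echo {
    pub msg: String,
}

/// Hypothetical helper: write `s` as a JSON string literal, escaping
/// quotes, backslashes, and control characters.
fn json_string(s: &str) -> String {
    let mut out = String::with_capacity(s.len() + 2);
    out.push('"');
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out.push('"');
    out
}

impl Echo {
    /// Hand-written encoding; a derive would normally generate this.
    pub fn to_json(&self) -> String {
        format!("{{\"msg\":{}}}", json_string(&self.msg))
    }
}

/// Build the full response: canned headers followed by the JSON body.
pub fn get_resp_bytes(echo: &Echo) -> Vec<u8> {
    let body = echo.to_json();
    // The final size isn't known up front, so a growable vector is used.
    let mut bytes: Vec<u8> = Vec::new();
    bytes.extend_from_slice(b"HTTP/1.0 200 OK\r\nContent-Type: application/json\r\n\r\n");
    bytes.extend_from_slice(body.as_bytes());
    bytes
}
```

In a real project a serialization crate would replace the hand-written helpers; they are spelled out here only so the example stands on its own.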

XXX: An explanation of bytes!

Learning More About IO And The Rust Runtime

All of the fundamental traits used in this post (Reader, Writer, Decorator, etc.) are part of the top-level API specified in std::io. The official documentation has a number of examples to get a curious developer started on the path to productivity.

Furthermore, this post in no way endorses crafting your own HTTP servers in Rust, as the community already has a superb solution in rust-http.

The official Rust documentation hosts a guide to the runtime, meant to inform developers about the benefits and trade-offs of the provided M:N and 1:1 implementations, in addition to what drove the design and how to opt out.
