Rust vs. Go

This post is a head-to-head comparison of Rust vs. Go for NTPsec’s purposes. Read it bearing in mind that the NTP codebase has an unusual combination of traits - not as hard-core a systems-programming problem as an OS kernel, but with some critical regions that are soft realtime.

Comparisons on individual axes

This is an outside view, written after I had programming experience in Go but before I had any in Rust. The Rust parts of this section are therefore theory extracted from the documentation. The section after this one includes a report from practice.

Learning curve

NTP’s potential contributor base is mainly C programmers. Thus, advantage goes to the language with the easier learning curve for that population.

Go was actually designed to be an easy tool upgrade for C coders and succeeds at that objective. I was able to become fluent in Go in four days from a standing start; four days into Rust I was still struggling to even fully understand the data-ownership and object systems, let alone apply them. This is +1 for Go.

It doesn’t help Rust that it would be a difficult upgrade from 'anywhere', not just C. The amount of complexity and ritual required by Rust’s ownership system is high and there is no other language I know of that is really good preparation for it.

Translation distance

We have 62KLOC to move to either language. That’s a lot, and puts a premium on ease of hand translation. Go gets +1 here, if only because adding Rust’s ownership/lifetime annotations to C would be a huge job.

This could reverse, however. When we’re ready to move, one of the first things to try will be automated translation to Rust via Corrode. If that works well it will be a big advantage, like a +3, for Rust. But it could reverse again if Go’s nigh-undocumented c2go turns out to be useful.

Concurrency

(Note: this matters to NTP because, though the main algorithms have an intrinsically serial rendezvous, we have a need to do DNS lookups with potentially arbitrary delays which must therefore run asynchronously.)

Rust supports an equivalent of conventional mutexes locking shared state via its data-ownership system. It also has an implementation of CSP channels. Go has CSP channels as a core primitive, with also the ability to set up conventional mutex locking through library functions.

Rust’s native shared-state/mutex system looks fussy and overcomplicated compared to CSP, and its set of primitives is a known defect attractor in any language. While I give the Rust designers credit for course-correcting by including CSP, Go got this right the first time and their result is better integrated into the core language. This is +1 for Go.

Embedded deployment

The zero-overhead abstractions of Rust are obviously better matched to deployment on constrained and embedded systems than are Go’s runtime and fat binaries. A clear +1 for Rust here.

It is worth noting, however, that Google’s business case for fixing this in order to use Go as an Android development platform is strong, and the Go devteam has articulated this as a future direction. While no language with GC will ever be as well matched to embedded as Rust, Rust’s advantage here may be partly unstable if Google exerts sufficient cleverness, and the Go team has a lot of heavy-caliber cleverness to deploy.

Latency and soft-realtime performance

Again, zero-overhead abstractions and no stop-the-world GC pauses give Rust a clear +1.

It is worth noting that Go came nearest disqualifying itself entirely here. If it were not possible to lock out Go’s GC in critical regions Rust would win by default. If NTP’s challenges tilted even a little more towards hard real time, or its critical regions were less confined, Rust would also win by default.

Security and safety

Both Rust and Go have strong type systems intended to foreclose the kinds of buffer overruns and stale-reference bugs that are endemic in C. They take different approaches, with Go leaning on GC and Rust more on its RAII-like model of provable correctness.

Rust gets a +1 here, because exposed invariants that you can reason about are better than relying on a runtime that you cannot readily audit in practice (even though as open source it is possible in principle).

The view from inside Rust

My chosen learning project in Rust was to write a simple IRC server. As a service daemon with no real-time requirements written around a state machine of the kind I can code practically in my sleep, I thought this would make a good warmup for NTP.

I wrote the following summary four days into the attempt…

In practice, I found Rust painful to the point of near unusability. The learning curve was far worse than I expected; it took me those four days of struggling with inadequate documentation to write 67 lines of wrapper code for the server.

Even things that should be dirt-simple in Rust, like string concatenation, are unreasonably difficult. The language demands a huge amount of fussy, obscure ritual before you can get anything done.

The contrast with Go is extreme. By four days in of exploring Go I had mastered most of the language, had a working program and tests, and was adding features to taste.

Then I found out that a feature absolutely critical for writing network servers is plain missing from Rust. Contemplate this bug report: Is there some API like "select/poll/epoll_wait"? and get a load of this answer:

We do not currently have an epoll/select abstraction. The current answer is "spawn a task per socket".

Upon further investigation I found that there are no proposals to actually fix this problem in the core language. The comments acknowledge that there are a welter of half-solutions in third-party crates but describe no consensus about which to adopt.

After publishing a version of this critique on my personal blog I learned a good deal more from various thoughtful members of the Rust community (as well as attracting a lot of flamage from some less thoughtful members).

At the present time, the following problems - acknowledged by those more thoughtful Rustaceans - make Rust unsuitable as an implementation language for NTP:

The language is not yet mature enough to present a core API that can be expected to be stable over 10-year timescales.
As a special case of the previous point, primitives required for NTP such as select/epoll are not yet a stable part of the language. While implementations exist in the crate system, there is not yet any guarantee that any of the alternatives will be maintained over 10-year timescales.
The language documentation describes the Rust compiler well, but fails to bridge the explanatory gap to parts of the crate system vitally needed to use the language as a production tool.
More generally, a lot of crucial information about the language in practice is not yet documented at all; you have to be part of the swarm around the development group and follow their communications channels to learn it.

These problems seem to reflect a social/philosophical confusion in the Rust community about whether and when they will "bless" crates, designating them as part of a core API with stability guarantees. Rust has a Conway’s Law problem: the decentralized structure of the crate system tends to discourage making this kind of commitment at all.

Go has a not entirely dissimilar package system, but the Go designers made an early choices to define a set of core modules much more extensive than Rust’s - one which does include, in particular, all the primitives required to implement network service daemons.

Additionally, the core Rust language has a serious learning-curve problem that present documentation and tutorials don’t address effectively enough. Relatedly, the friction cost of important features like the borrow checker is pretty high. This would translate into continuing barriers to entry for NTP developers.

There’s some question as to whether this game is worth the candle. Rust’s theory and its focus on zero-overhead abstractions make sense for hard-realtime applications or an OS kernel; it is much less clear that this is the right emphasis for writing network service daemons, even daemons with soft-realtime critical regions like NTP implementations.

Conclusion

For comparison, I switched from Rust to Go and, in the same amount of time I had spent struggling to make even a <100 LOC partial implementation work, was able to write and test the entire exterior of an IRC server - all the socket-fu and concurrency handling - leaving only the IRC-protocol state machine to be done.

The lesson is extremely clear. Go is a better fit for our requirements than Rust. It’s not even close.

UPDATE: I do not mean to imply that I think Rust is generally useless. It’s still immature, but I think a mature version could be quite effective for some niches I have already implied: hard realtime, OS kernels, and especially firmware with very high assurance requirements such as aerospace and medical that justify the high effort.