Having seen a lot of positive buzz around the Rust language, I decided to look into it myself aswell. So, past couple of weeks I have been reading the second edition of The Rust Programming Language book which is an introductory to the language. This blog post is a round-up of my first impressions about the language and its features, and it also serves as a quick intro for others interested in it. In my day job I work mostly with C and C++ so those are my main reference points when learning a new language.
In the first chapter of the book the main idea of Rust is coined as follows.
Rust is a programming language that’s focused on safety, speed, and concurrency. Its design lets you create programs that have the performance and control of a low-level language, but with the powerful abstractions of a high-level language.
Coming from the embedded world I especially like the fact that Rust has a strong focus on safety, performance and concurrency. Many other modern languages have chosen an approach with garbage collection and many also require a runtime which can make them unsuitable for low-level programming. This is probably one of the main reasons that languages such as C and C++ are still very widely used.
Rust is a compiled language and the binaries are compiled directly to target architecture. Rust also does the safety checks during compilation so the added safety does not have runtime performance penalty, which makes it quite unique.
Ownership is a key concept in Rust. Each value has a variable that is owner of that data, and there can only be one owner at a time. If the value is moved to a new variable, the ownership is transferred and the old variable becomes invalid. Values can also be borrowed using references but there can only be one mutable reference at a time. These rules are enforced at compile time by a borrow checker.
In contrast, C/C++ do not have a concept of ownership similar to Rust’s. For instance, in C++ there can be multiple pointers and references to same data simultaneously, and it is the programmer’s responsibility to make sure that there are, for instance, no data races. At first it might feel that the borrow checker “just get’s in the way” because it produces a lot of compiler errors (at least before you get used to the ownership concept), but in the end it provides valuable safe-guards against many common issues.
Developers who are familiar with languages like C++, Python and Java should also be familiar with exceptions as an error handling mechanism. When an error occurs, an exception is thrown and the call stack is unwind until the exception is caught. If the exception is not caught at all, the program is terminated. This scheme allows some parts of the code, for instance libraries, to easily let the application logic to handle errors in a meaningful way. One drawback however, especially in C++, is that it is hard to tell which exceptions can be thrown and also the error handling code can be far away from the part that triggered the error.
The error handling approach in Rust is not to use exceptions. Instead, Rust has two different mechanisms for error handling. If the error is unrecoverable, the program will panic (i.e. terminate). Alternatively, for recoverable errors Result<T,E> type is used. With this approach the compiler can enforce that the error variant is always handled when the return value is obtained from the Result. This is also much more robust than encoding the failure information to the returned value (e.g. returning -1). Moreover, Rust also provides syntax to easily propagate errors from functions with ? operator.
In the example above, if either open or metadata call fails, the returned error is automatically propagated without the need to explicitly read and return it. This removes unnecessary boilerplate code and makes the resulting code simpler and cleaner.
Enums on steroids
I have strong C/C++ background so for me enumerations are just a collection of explicitly named constants that use integral type as underlying type. In Rust enums are much more than that, and they are better described as a type that represents one or many variants. These variants can also include data, and what’s more each variant can also have different data.
This makes the enum type suitable not only to enumerate the possible variants but also to model their data into the same type. Examples can be found from the Rust by example website.
It’s a match!
Another cool feature in Rust are patterns. Patterns and pattern matching are used in many places but probably the most powerful is the match operator. It can be thought of as, well, switch-case on steroids. Instead of matching simple values, it is possible to have complex patterns and have additional match guards as well. Also, the compiler enforces that all enum variants are always handled. Here’s an example:
The first case (called a match arm in Rust) matches if the first field in the tuple is 2. The second arm captures first and last field (so that they can be used in the match body) and matches if the first field is larger than 2. Finally the third arm has a wildcard and will mach all remaining cases. This also satisfies the requirement that all cases need to be handled. This is just a simple example and there is a whole chapter dedicated for patterns in the Rust book. Basically, the very flexible enums combined with pattern matching provides a very expressive way to model and manipulate data.
Type inference and coercions
Rust is a statically typed language which means that the compiler must know all the types during compilation. In other statically typed languages like C this means that the programmer must explicitly tell what types variables, parameters and return values are. In Rust a technique called type inference is used to deduce the types. So, even though the language is statically typed, most of the time programmer does not need to write the type explicitly. The compiler is able to figure it out from the context.
Rust also uses deref coercion which is easily demonstrated with a simple example.
First atomically reference counted String object is created (String wrapped in Arc). Then a function that takes a reference to string is called with this object. First the object inside Arc is explicitly accessed with * operator. Deref coercion allows us to just type &s because it is clear that we intend to use the value inside the Arc. Overall, the coercions and inference allow to simplify the code in situations where the compiler can figure out proper types without explicit annotations. Modern C++ has auto keyword, but Rust takes this concept further.
One of the selling points of Rust are zero-cost abstractions. In practice this means an ability to use high-level concepts such as closures and iterator adapters without any performance costs. Also, when using these high level aspects the produced code is as fast or faster than hand coded implementation.
Here’s an example that shows functional programming in Rust:
The code calculates occurences of characters from input string, and generates HashMap with char-count pairs. The example above outputs:
There’s no null
One interesting idea in Rust is that the language does not have a concept of null. At first thought this might seem very strange since we’re all so used to it, but it does remove a myriad of common bugs altogether. In other languages like C/C++ pointers have to be explicitly checked for null or the application will crash and burn if the null object is dereferenced.
Usually null value is used to indicate that some resource is not valid, and this information is indicated by a special null value. The problem is that the null condition needs to be manually checked which is error prone. The concept of something not necessarily being valid is still needed in Rust. It is just implemented differently.
Instead of indicating the invalid state of an object by special value, Rust implements this by wrapping the actual object in Option<T> or Result<T,E> type. This way when the concrete type is accessed the compiler will enforce that also the fail case is handled or the code won’t compile.
One of the big goals, and also one of the chapters in the Rust Book, is fearless concurrency. As one would expect, Rust provides the familiar primitives like threads, mutexes and also channels for message passing. However, the lifetime model and ownership rules makes working with concurrency much safer than in many other languages.
Rust is, for instance, able to detect data races between threads. That is, when multiple threads try to modify same data without proper mutual exclusion. The ownership system and the way mutexes are implemented also guarantees that locks are always acquired and released. Also the trait system is used to “mark” types that are thread-safe which means that the program will not compile if unsafe methods are used in a threaded context.
Convention over configuration
Rust uses a convention over configuration design paradigm which aims to decrease the number of decisions developers need to make. This paradigm is used heavily on code and test organization. Code modules, source files, unit tests and integrations tests are organized in a certain way and the compiler is able to find these items without explicit configuration.
One of the main benefits of this paradigm, besides the reduced configuration, is that different projects are organized in a similar manner which makes navigating other project’s sources easier. This is certainly not the case with C++.
Useful compiler error messages
If you have ever made a mistake with C++ templates (for instance the STL), you probably know what useful error message does not look like. You’ll most likely get couple of screenfuls of incomprehensible text. The error messages can be cryptic even for the most simplest of errors.
My experience with Rust so far is that the compiler errors are really helpful and often point out the exact error and even suggest the fix. For instance, if the file_len example earlier had an extra ; in the Ok(File::open(“test.txt”)?.metadata()?.len()); line, the compiler error is:
“Consider removing this semicolon” and it is even pointed out in the code. Pretty useful. This is important because especially in the beginning you are likely to get errors often (at least I did) due to the many checks Rust does. This of course is a good thing because the checks point out potential bugs in the code. But it certainly helps a lot when the compiler points you out to right direction.
Strict types and casts
In C and C++ the compiler converts numeric types automatically. So a C/C++ developer would expect that assigning a uint8 to uint32 should just work.
Well, Rust is more strict about types and the code above would produce an error. The programmer must explicitly cast the value to correct type let b: i32 = a as i32; .
Object-oriented or not?
Inheritance sets Rust apart from many common high-level languages. That is because Rust does not have it even though it does have other object-oriented features like encapsulation. On the other hand, Rust has a feature called traits to describe a common behavior same way as interfaces are used in other languages. Also trait objects can be used in a same way as polymorphic types in languages with inheritance.
Whether Rust is categorized as object-oriented depends on the definition. Object-oriented patterns can still be implemented in Rust, but probably the approach needs to be a bit different from the “textbook” implementation. There is a complete chapter dedicated to this discussion in the Rust Book.
There is so much more that could be written about Rust. For instance, support for unit and integration testing or the cargo tool and crates.io. Though, the main idea here was to briefly introduce some aspects about the language and standard library that I find especially interesting. If this got you interested, I would really recommend the Rust Book. It’s easy to follow, covers the language comprehensively and best of all is completely free.
From what I have learned so far, Rust seems to address many of the common challenges in programming. Best of all, these additional checks and guarantees are done during compilation so they do not have a run-time penalty. The language is also compiled directly to machine code which makes it suitable for embedded targets (where garbage collection or runtimes would be unsuitable). Overall my first impression is that a lot has been done right in this language. Can’t wait to dive into deeper!