Comparison of two methods to append to a list in Python

A colleague recently told me that the append method on a Python list is more efficient than using the += operator but provided no justification. Curious, I investigated whether this was true.

String validation as an example

Consider the following function as an example. It checks whether a string is semantically correct, i.e. whether it satisfies some set of requirements that are dictated by the needs of the application. In this example, the requirements are

  1. The string cannot be longer than 10 characters

  2. The string cannot contain a number

import re

def validate(input: str) -> str:
    reasons = []

    if len(input) > 10:
        reasons += ["Input is too long"]

    if bool(re.search(r"\d", input)):
        reasons += ["Input contains a number"]

    if reasons:
        raise Exception(" | ".join(reasons))

    return input

If either of these conditions is violated, an exception is raised whose message describes which conditions were violated. The reasons list contains zero, one, or two elements, and it is built by appending to the list using the += operator. I could instead have used the append method of list.

+= vs. append

To test the speed of the two methods, I use the timeit module from the Python standard library in the program below. The test consists of the following:

  1. Create an empty list

  2. Append error strings to the list one at a time

import string
import timeit

reasons = [
    "This is an error message",
    "This is another error message",
    "Let's add another for good measure",
]
def test_plus_equals():
    result = []
    for reason in reasons:
        result += [reason]

def test_append():
    result = []
    for reason in reasons:
        result.append(reason)

number = 1000000
repeat = 5
results_plus_equals = min(
    timeit.repeat(
        "test_plus_equals()",
        number=number,
        repeat=repeat,
        setup="from __main__ import test_plus_equals"
    )
)

results_append = min(
    timeit.repeat(
        "test_append()",
        number=number,
        repeat=repeat,
        setup="from __main__ import test_append")
)

if __name__ == "__main__":
    print("1E+6 loops per test")
    print(f"+= (best of five tests):\t{results_plus_equals:0.4f} s")
    print(f"append (best of five tests):\t{results_append:0.4f} s")

Results

1E+6 loops per test
+= (best of five tests):     0.2392 s
append (best of five tests): 0.2241 s

In this test, the append method of Python's list does appear to be faster, by about 6-7%: append took about 0.224 microseconds per loop, whereas the += operator took about 0.239 microseconds.

The advantage of the append method is probably only noticeable if you need to append to a list many millions of times per second.

Out of the Tar Pit: a Summary

Out of the Tar Pit is a 2006 paper by Ben Moseley and Peter Marks about the causes and effects of complexity in software systems. The thesis of the paper is stated already in the second sentence of the paper:

The biggest problem in the development and maintenance of large-scale software systems is complexity — large systems are hard to understand.

As implied by the authors, complexity is a property of a software system that represents the degree of difficulty that is experienced when trying to understand the system. State—in particular mutable state—is the primary cause of complexity. Additional causes are code volume and control flow, but these are of secondary importance.

Complexity

Of the four properties described by Brooks in the paper entitled No Silver Bullet that make building software hard (complexity, conformity, changeability, invisibility), the authors state that complexity is the only meaningful one:

Complexity is the root cause of the vast majority of problems with software today. Unreliability, late delivery, lack of security — often even poor performance in large-scale systems can all be seen as deriving ultimately from unmanageable complexity.

By complexity, the authors mean "that which makes large systems hard to understand," not the field of computer science that is concerned with the resources that are consumed by an algorithm.

Approaches to Understanding

To better establish their definition of complexity, the authors explore the ways in which developers attempt to understand a system. There are two main ways:

  1. Testing: this is a way to understand the system from the outside.
  2. Informal Reasoning: this is a way to understand the system from the inside.

Of the two, informal reasoning is the most important by far. This is because — as we shall see below — there are inherent limits to what can be achieved by testing, and because informal reasoning (by virtue of being an inherent part of the development process) is always used. The other justification is that improvements in informal reasoning will lead to less errors being created whilst all that improvements in testing can do is to lead to more errors being detected.

The primary problem with testing is that a test will only tell you about the behavior of a system subject to the particular range of inputs used by the test. A test will tell you absolutely nothing about the system's behavior under a different set of inputs. In large systems, the set of all possible inputs is too large to fully explore with testing.

Have you performed the right tests? The only certain answer you will ever get to this question is an answer in the negative — when the system breaks.

Informal reasoning, on the other hand, is what a developer uses when building a mental model of how the system works while reading the code. Because informal reasoning is the most important way to understand a system, simplicity, which keeps such reasoning tractable, is a vital characteristic of well-functioning, large-scale systems.

Causes of Complexity

State

The presence of state (particularly mutable state) makes programs difficult to understand. The authors offer the following example to explain the problems of state:

Anyone who has ever telephoned a support desk for a software system and been told to “try it again”, or “reload the document”, or “restart the program”, or “reboot your computer” or “re-install the program” or even “re-install the operating system and then the program” has direct experience of the problems that state causes for writing reliable, understandable software.

State makes testing difficult by making flakiness more likely. (Flakiness describes tests that fail randomly for no apparent reason.) This fact and the large number of possible inputs to a program combine together horribly (emphasis the authors').

In addition, state complicates informal reasoning by hindering the developer from understanding the system "from the inside." It contaminates a system in the sense that even mostly stateless systems become difficult to understand when coupled to components with mutable state.

Control

The authors claim that the next most important barrier to understanding is control.

Control is basically about the order in which things happen. The problem with control is that very often we do not want to have to be concerned with this.

Complexity caused by control depends very much on the choice of language; some languages make control flow explicit, whereas other, more declarative languages make it implicit. Having to deal explicitly with control creates complexity.

The same is true of concurrency. Explicit concurrency in particular makes both testing and informal reasoning about programs hard.

Code volume

Increasing the amount of code does increase complexity, but effective management of state and control marginalizes its impact.

There are other causes of complexity besides the three listed above, but they all reduce to three basic principles:

  1. Complexity breeds complexity
  2. Simplicity is hard
  3. Power corrupts

The last principle states that mistakes and poor decisions will be made when a language allows it. For this reason, restrictive, declarative languages and tools should be preferred.

Classical approaches to managing complexity

To better understand the ways in which programmers manage complexity, the authors explore three major styles of programming:

  1. Imperative (more precisely, object-oriented)
  2. Functional
  3. Logic

Object-orientation

Object-oriented programming (OOP) is one of the dominant styles of programming today for computers based on the von Neumann architecture, and it is presumably inspired largely by that architecture's state-based form of computation.

OOP enforces integrity constraints on data by combining an object's state with a set of procedures to access and modify it. This characteristic is known as encapsulation. Problems may arise when multiple procedures contend for access to the same state.

OOP also views objects as being uniquely identifiable, regardless of the object's attributes. In other words, two objects with the exact same set of attributes and values are considered distinct. This property is known as intensional identity and contrasts with extensional identity, in which things are considered the same if their attributes are the same.
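As a rough illustration of the difference (my own example in Rust, not the authors'), the comparison below is extensional, while distinguishing the two values by their addresses is closer to the intensional view:

#[derive(PartialEq)]
struct Account {
    id: u32,
    balance: i64,
}

fn main() {
    let a = Account { id: 1, balance: 100 };
    let b = Account { id: 1, balance: 100 };

    // Extensional view: equal, because all of their attributes are equal.
    assert!(a == b);

    // Intensional view: two distinct objects, because their addresses differ.
    assert!(!std::ptr::eq(&a, &b));
}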

For these two reasons, OOP is not suitable for avoiding the problems of complexity:

The bottom line is that all forms of OOP rely on state (contained within objects) and in general all behaviour is affected by this state. As a result of this, OOP suffers directly from the problems associated with state described above, and as such we believe that it does not provide an adequate foundation for avoiding complexity.

Functional programming

Modern functional programming (FP) languages can be classified as pure (e.g. Haskell) and impure (e.g. the ML family of languages).

The primary strength of functional programming is that by avoiding state (and side-effects) the entire system gains the property of referential transparency - which implies that when supplied with a given set of arguments a function will always return exactly the same result (speaking loosely we could say that it will always behave in the same way)...

It is this cast iron guarantee of referential transparency that obliterates one of the two crucial weaknesses of testing as discussed above. As a result, even though the other weakness of testing remains (testing for one set of inputs says nothing at all about behaviour with another set of inputs), testing does become far more effective if a system has been developed in a functional style.

Informal reasoning is also more effective in the functional approach to programming. Because referential transparency is enforced, mutable state is generally avoided. However, in spite of these properties, nothing in FP prevents someone from effectively simulating mutable state, so some care must still be taken.
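As a small illustration (my own, not from the paper), the first function below is referentially transparent, while the second effectively simulates mutable state and is not:

// Referentially transparent: the result depends only on the arguments.
fn add(a: i32, b: i32) -> i32 {
    a + b
}

// Not referentially transparent: the same arguments can produce different
// results because the hidden counter changes between calls.
struct Adder {
    calls: i32,
}

impl Adder {
    fn add(&mut self, a: i32, b: i32) -> i32 {
        self.calls += 1;
        a + b + self.calls
    }
}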

The authors concede that by sacrificing state in FP, one does lose a degree of modularity.

Working within a stateful framework it is possible to add state to any component without adjusting the components which invoke it. Working within a functional framework the same effect can only be achieved by adjusting every single component that invokes it to carry the additional information around.

However,

The trade-off is between complexity (with the ability to take a shortcut when making some specific types of change) and simplicity (with huge improvements in both testing and reasoning). As with the discipline of (static) typing, it is trading a one-off up-front cost for continuing future gains and safety (“one-off” because each piece of code is written once but is read, reasoned about and tested on a continuing basis).

FP remains relatively unpopular despite its advantages. The authors state that the reason is that problems arise when programmers attempt to apply it to problems that require mutable state.

Logic programming

Logic programming is like FP in the sense that it is declarative: it emphasizes what needs to be done, not how it is done. The primary example of a logic programming language is Prolog.

Pure logic programming is the approach of doing nothing more than making statements about the problem (and desired solutions). This is done by stating a set of axioms which describe the problem and the attributes required of something for it to be considered a solution. The ideal of logic programming is that there should be an infrastructure which can take the raw axioms and use them to find or check solutions. All solutions are formal logical consequences of the axioms supplied, and “running” the system is equivalent to the construction of a formal proof of each solution.

Pure logic programming does not suffer from the same problems of state and control as OOP. However, it appears that real logic programming languages need to make some pragmatic tradeoffs in their implementations, which introduces small amounts of state and control.

Accidents and Essence

The authors define two different types of complexity:

  • Essential complexity is inherent in, and the essence of, the problem as seen by the user.
  • Accidental complexity is all the rest - complexity with which the development team would not have to deal in the ideal world.

It is important to understand the degree of strictness in the definition of essential complexity. Only complexity related to the problem domain of the user falls into this category. Everything related to the implementation details - bytes, transistors, operating systems, programming languages, etc. - is accidental complexity.

We hence see essential complexity as "the complexity with which the team will have to be concerned, even in the ideal world"... Note that the "have to" part of this observation is critical — if there is any possible way that the team could produce a system that the users will consider correct without having to be concerned with a given type of complexity then that complexity is not essential.

The authors disagree with Brooks's assertion that most complexity in software is essential.

Complexity itself is not an inherent (or essential) property of software (it is perfectly possible to write software which is simple and yet is still software), and further, much complexity that we do see in existing software is not essential (to the problem).

Functional relational programming

The second part of the paper concerns itself with exploring a real-world implementation of a complexity-minimizing system known as functional relational programming, or FRP. It is based on two major paradigms:

  • the relational data model for handling state
  • pure functional programming for handling logic

Throughout these final sections, the role of the relational model was the primary topic, whereas the role of functional programming within the system was given relatively little importance. I found a few points interesting here:

  1. The authors focus primarily on the pure form of the relational model, citing a work of Codd's that criticizes "impure" but pragmatic implementations, such as SQL. However, as far as I can tell, there are extremely few real-world implementations of a pure relational database. (I only managed to find one, RelDB, from a page explaining why and how SQL does not strictly follow the relational model.)
  2. In spite of the paper's insistence on the elimination of mutable state, it seems like the authors ignore the point that their implementation of essential state using the relational model is mutable. I know of no way that FRP can update the relational variables in the system's essential state without mutating them.

The following quotes indicate that the authors seem to understand that their system is based on mutable state, which is why I am puzzled that it is part of their proposed solution.

Specifically, all input must be converted into relational assignments (which replace the old relvar values in the essential state with new ones), and all output (and side-effects) must be driven from changes to the values of relvars (primarily derived relvars). - Section 9.1.4

Finally, the ability to access arbitrary historical relvar values would obviously be a useful extension in some scenarios. - Section 9.1.5

I was actually a bit relieved when a colleague pointed out to me that Rich Hickey expresses similar criticism about FRP in his talk Deconstructing the Database because it gave me some confidence that my suspicions may be correct.

Summary

"Out of the Tar Pit" has significantly changed the way I think about solving problems with software engineering. In particular, it has made me reevaluate the signifance of domain specific languages (DSLs), especially those that constrain the freedom of the programmer to work only within the problem's immediate domain. I am much more suspect of the tendency of programmers to focus on the flow of control in a program, and I now go to great lengths to avoid mutable state in my own work.

I invested the most thought in the final sections of the paper, only to conclude that their content, in my opinion, is somewhat incomplete. Though the relational model is indeed powerful and elegant, it has very few pure, real-world implementations and seems to admit mutable state. This last bit I find difficult to reconcile with the main thesis of the paper, which is that mutable state is the primary cause of complexity in systems.

I described "Out of the Tar Pit" to my former manager as one of those papers where if you read it, you can't help but agree with its message because it puts into words what you already know by heart. He replied that this is true only if you've worked in programming long enough to suffer from the effects of unmanageable complexity. Otherwise, you have no idea what the authors are talking about.

Variable locations in Rust during copy and move

Ownership is a well-known concept in Rust and is a major reason for the language's memory safety. Consider the following example of ownership from The Rust Programming Language:

let s1 = String::from("hello");
let s2 = s1;

println!("{}, world!", s1);

Compiling this code will produce an error because the String value that is bound to the variable s1 is moved to s2. After a move, you may no longer use the value bound to s1. (You may, as we will see, re-use its memory by binding a new value of the same type to it.)
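If both bindings are needed, the heap data can be duplicated explicitly instead of moved. A minimal sketch:

let s1 = String::from("hello");
let s2 = s1.clone(); // Deep copy of the heap data; s1 remains valid.

println!("{}, world!", s1);
println!("{}, world!", s2);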

String is a container type, which means that it contains both metadata on the stack and a pointer to data on the heap. Simple types such as i32, on the other hand, are normally stored entirely on the stack. Types such as these implement the Copy trait. This means that variable reassignment does not produce a compile-time error like it does in the example above:

let x = 42;
let y = x;

// The value bound to x is Copy, so no error will be raised.
println!("{}", x);

String is not Copy because it contains a pointer to data on the heap. When a variable that is bound to a String is dropped, its data is automatically freed from the heap. If a String were Copy, the compiler would have to determine whether the heap data is still pointed to by another variable to avoid a double free error.
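To illustrate with types of my own (not from the book), a struct made only of stack-only fields can derive Copy, whereas adding a String field makes that impossible:

#[derive(Clone, Copy)]
struct Point {
    x: i32,
    y: i32,
}

// This would not compile: String is not Copy, so the struct cannot derive Copy.
// #[derive(Clone, Copy)]
// struct Labeled {
//     label: String,
//     value: i32,
// }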

Memory layout in copies and moves

What's happening inside a program's memory during copies and moves? Consider first a move:

let mut x = String::from("foo");
println!("Memory address of x: {:p}", &x);

// Move occurs here
let y = x;
println!("Memory address of y: {:p}", &y);

// Printing the memory address of x is an error because its value was moved.
// println!("Memory address of x: {:p}", &x)

// Assign new string to x
x = String::from("bar");
println!("Memory address of x: {:p}", &x);

In the above code snippet, addresses of variables are printed by using the pointer formatter :p. I create a String value and bind it to the variable x. Then, the value is moved to the variable y which prevents me from using x again in the println! macro.

However, I can assign a new String value to x by reusing its memory on the stack. This does not change its memory address as seen in the output below:

Memory address of x: 0x7ffded2aa3d0
Memory address of y: 0x7ffded2aa3f0
Memory address of x: 0x7ffded2aa3d0  # Memory address of x is unchanged after reassignment

A copy is similar. In the snippet below, an integer that is originally bound to x is copied to y, which means that I can still refer to x in the println! macro.

let mut x = 42;
println!("Memory address of x: {:p}", &x);

// Copy occurs here
let y = x;
println!("Memory address of y: {:p}", &y);

// Printing the memory address of x is not an error because its value was copied.
println!("Memory address of x: {:p}", &x);

// Assign new integer to x
x = 0;
println!("Memory address of x: {:p}", &x);

Neither the copy operation nor the reassignment changes the memory address of x, as seen in the program's output:

Memory address of x: 0x7ffee579d544
Memory address of y: 0x7ffee579d548
Memory address of x: 0x7ffee579d544  # Memory address of x is unchanged after copy
Memory address of x: 0x7ffee579d544  # Memory address of x is unchanged after reassignment

Summary

Move semantics on container types are one of the reasons for Rust's memory safety. Nothing mysterious happens in memory when a value is moved from one variable to another. The original stack memory still exists; its use is simply disallowed by the compiler until a new value is assigned to it.

The complete program from this post may be found here: https://gist.github.com/kmdouglass/e596d0934e15f6b3a96c1eca6f6cd999

A simple plugin interface for the Rust FFI

In a recent post I explored how to pass complex datatypes through the Rust FFI. (The FFI is the foreign function interface, a part of the Rust language for calling code written in other languages.)

I am exploring the Rust FFI because I want to use it in a small web application that I am writing and that will be used to interact with hardware peripherals connected to a system on a chip (SoC). One use-case that I have in mind is to monitor readings from moisture and temperature sensors implanted in the soil of my houseplants. In many cases the general purpose input/output (GPIO) pins of a SoC are controlled through a C library such as WiringPi for the Raspberry Pi, which means my monitoring system needs to interface with C libraries such as this one.

In this post I will describe my current understanding for how best to integrate C-language plugins with a Rust application. I have omitted all application-specific logic from the example and will instead focus on the design of the plugin interface itself.

You may find the source code for this post here. I was heavily inspired by both the Rust FFI Omnibus and The (unofficial) Rust FFI Guide. This post was written using version 1.35.0 of the Rust compiler.

The C plugin

I wrote a very simple C library for this demonstration; it is located in the ffi-test folder of the example repository. It consists of two source files (ffi-test.c and ffi-test.h) and a Makefile. The plugin's interface is defined as usual in the header file:

#ifndef FFI_TEST_H
#define FFI_TEST_H

#include <stdlib.h>

struct object;

struct object* init(void);
void free_object(struct object*);
int get_api_version(void);
int get_info(const struct object*);
void set_info(struct object*, int);
size_t sizeof_obj(void);

#endif /* FFI_TEST_H */

In particular, there is an opaque struct that is declared by the line struct object;. (An opaque struct is a struct whose definition is hidden from the public API; the definition is provided in the file ffi-test.c.) This object will hold the data for our plugin, but, because it is opaque, we will only be able to interact with it through functions such as init, free_object, etc. that are provided by the API.

To build the C library on UNIX-like systems, simply execute the make command from within the ffi-test folder.

$ make
gcc -c -o libffi-test.o -fpic ffi-test.c -Wall -Werror
gcc -shared -o libffi-test.so libffi-test.o

The Rust-C plugin interface

The Rust code is contained in one source file, src/main.rs. The design pattern contained within consists of three kinds of objects:

  • type definitions for the functions in the C library
  • a VTable struct that holds the external function types
  • a Plugin struct that holds the plugin's library, the VTable, and a raw pointer to the object provided by the C library

Let's take a look at each of these abstractions.

External function types

The external function types are defined as follows:

#[repr(C)]
struct Object {
    _private: [u8; 0],
}
type FreeObject = extern "C" fn(*mut Object);
type Init = extern "C" fn() -> *mut Object;
type GetApiVersion = extern "C" fn() -> c_int;
type GetInfo = extern "C" fn(*const Object) -> c_int;
type SetInfo = extern "C" fn(*mut Object, c_int);

The opaque struct from the C library is represented as a Rust struct with a single, private field containing an empty array. This is currently the recommended way to represent opaque structs in the Rust FFI. Following the struct definition are the type definitions for the foreign functions.

type FreeObject = extern "C" fn(*mut Object);
type Init = extern "C" fn() -> *mut Object;
// ...

For example, the Init type represents a foreign C function that takes no arguments and returns a mutable raw pointer to an Object instance. This function type therefore represents the Object constructor in Rust.

The VTable

The VTable serves as a way to collect the types associated with the C library functions into one place. Furthermore, I added a version number to its name, making it VTableV0, so that I can easily follow changes to the C API while maintaining backwards compatibility.

By looking at its definition, you can see that it contains a few RawSymbol instances:

struct VTableV0 {
    free_object: RawSymbol<FreeObject>,
    get_info: RawSymbol<GetInfo>,
    set_info: RawSymbol<SetInfo>,
}

A RawSymbol is a name that I gave to Unix-specific symbols from the libloading Rust library. (See the use statements at the top of the source code file.) I am not storing plain Symbols from that library inside the VTable because the lifetime constraints associated with plain Symbols and their corresponding Library do not allow me to take ownership of them inside the struct. (You can find a few attempts in the commit history of this repository where I tried to own plain Symbols; none of these attempts would compile.)

Had I used plain Symbols instead, I would have had to look up the symbols inside the C library each time I wanted to call them.
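For comparison, here is a rough sketch of what a per-call lookup might look like with plain Symbols, reusing the SetInfo type alias from above (this function is hypothetical and not part of the repository):

// Sketch only: with plain Symbols, every call pays for a symbol lookup.
unsafe fn set_info_without_vtable(library: &Library, object: *mut Object, value: c_int) {
    let set_info: Symbol<SetInfo> = library.get(b"set_info\0").unwrap();
    (*set_info)(object, value);
}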

The way to obtain RawSymbols is to use the into_raw method of a plain Symbol. You can find an example of this inside the VTable's constructor:

unsafe fn new(library: &Library) -> VTableV0 {
    println!("Loading API version 0...");
    let free_object: Symbol<FreeObject> = library.get(b"free_object\0").unwrap();
    let free_object = free_object.into_raw();
    // ...

First, the free_object Symbol is loaded from the library with its get() method; on the following line it is converted to a RawSymbol so that it can be stored inside the VTableV0 struct that is returned by the constructor. The whole function is marked as unsafe because of the multiple calls to the get method.
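The remaining symbols are handled the same way. A sketch of how the rest of the constructor might look (a reconstruction, not necessarily the repository's exact code):

unsafe fn new(library: &Library) -> VTableV0 {
    println!("Loading API version 0...");
    let free_object: Symbol<FreeObject> = library.get(b"free_object\0").unwrap();
    let get_info: Symbol<GetInfo> = library.get(b"get_info\0").unwrap();
    let set_info: Symbol<SetInfo> = library.get(b"set_info\0").unwrap();

    // Convert each Symbol into a RawSymbol so the VTable can own it.
    VTableV0 {
        free_object: free_object.into_raw(),
        get_info: get_info.into_raw(),
        set_info: set_info.into_raw(),
    }
}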

The Plugin

Finally we reach the top of the hierarchy of the components that comprise this design, the Plugin struct. Its implementation follows:

struct Plugin {
    #[allow(dead_code)]
    library: Library,
    object: *mut Object,
    vtable: VTableV0,
}

impl Plugin {
    unsafe fn new(library_name: &OsStr) -> Plugin {
        let library = Library::new(library_name).unwrap();
        let get_api_version: Symbol<GetApiVersion> = library.get(b"get_api_version\0").unwrap();
        let vtable = match get_api_version() {
            0 => VTableV0::new(&library),
            _ => panic!("Unrecognized C API version number."),
        };

        let init: Symbol<Init> = library.get(b"init\0").unwrap();
        let object: *mut Object = init();

        Plugin {
            library: library,
            object: object,
            vtable: vtable,
        }
    }
}

impl Drop for Plugin {
    fn drop(&mut self) {
        (self.vtable.free_object)(self.object);
    }
}

The interesting parts here are the Plugin's constructor new and the implementation of the Drop trait. After loading the library, the constructor calls the C library function that returns its API version; if the version matches one for which we have a VTable, then we create the new VTable. Next, we instantiate an Object by calling its constructor to obtain a raw pointer to it.

let init: Symbol<Init> = library.get(b"init\0").unwrap();
let object: *mut Object = init();

The constructor packs the library, the VTable, and the object pointer into a new Plugin struct and returns it.

The Drop trait implementation is used to automatically free the memory that has been allocated when the pointer held by the Plugin struct goes out-of-scope. It does this by calling the free_object method in the VTable.

impl Drop for Plugin {
    fn drop(&mut self) {
        (self.vtable.free_object)(self.object);
    }
}

Running the example

To run the example, run the following commands from the root directory of the example repository.

$ cargo build
Compiling rust-libloading v0.1.0 (/home/kmdouglass/src/rust-libloading-example)
 Finished dev [unoptimized + debuginfo] target(s) in 0.27s
$ cargo run
 Finished dev [unoptimized + debuginfo] target(s) in 0.01s
  Running `target/debug/rust-libloading`
Loading API version 0...
Original value: 0
New value: 42

The main method of the Rust code creates the plugin, prints the default value of the data held by the object (which is instantiated by the C library), and then mutates the data to the value 42. It then prints this value, demonstrating that the FFI calls work.
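The main function itself is not reproduced here; a rough sketch of what it might look like, based on the description above (the library path is my assumption):

fn main() {
    // Path to the shared library built earlier; adjust as needed.
    let plugin = unsafe { Plugin::new(OsStr::new("./ffi-test/libffi-test.so")) };

    println!("Original value: {}", (plugin.vtable.get_info)(plugin.object));

    (plugin.vtable.set_info)(plugin.object, 42);
    println!("New value: {}", (plugin.vtable.get_info)(plugin.object));
}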

Discussion

The most difficult part of developing this design was finding a way to own the symbols exposed by the plugin library. For me, it was not completely evident from the libloading documentation that this was the purpose of the into_raw method on a Symbol.

What I like about this design is that the whole plugin interface fits nicely within a simple hierarchy with a collection of foreign method types at its base. It also supports changes to the C API because a new VTable can be created each time the API changes.

One current disadvantage of the design is that free_object is exposed through the VTable. I think that this opens the possibility for a double-free error. One way to prevent this is to hide the free_object method, loading its corresponding symbol only when the drop method is called.
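A sketch of that alternative, which looks the symbol up only at drop time (assuming the Plugin struct keeps its Library as above):

impl Drop for Plugin {
    fn drop(&mut self) {
        // Look up free_object only when it is actually needed, so it never has
        // to be stored in the VTable.
        let free_object: Symbol<FreeObject> =
            unsafe { self.library.get(b"free_object\0").unwrap() };
        (*free_object)(self.object);
    }
}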

Another disadvantage of this design is that it relies on the particular C API exposed by the library. C programmers have a large amount of freedom in designing APIs for their libraries. They are not forced to use opaque structs or to version their APIs. As a result, I don't believe that the plugin design presented here can be completely generalized to any C library.

The Plugin struct is almost certainly not thread safe. To make it thread safe, it may be necessary to wrap the raw pointer in a Mutex. It may even be simpler to wrap the entire struct in a Mutex.

Finally, owning raw symbols is not platform independent. You can see at the top of the Rust source code that I am importing the Symbol object specific to UNIX systems. One would need to change this if it was intended to work on Windows.

Summary

  • I presented a design pattern for managing C-language plugins in Rust.
  • The design pattern consists of a collection of foreign function types gathered into a VTable. The VTable is part of a larger Plugin struct, which also owns a pointer to the opaque data type exposed by the library as well as the plugin library itself.
  • The trick to owning symbols (instead of looking them up in the library each time you want to use them) is to use the into_raw method that is implemented on libloading's Symbol.
  • This design cannot be completely generalized to any C library, but should provide a good starting point to work with FFI plugins in Rust.

Complex data types and the Rust FFI

The Rust Foreign Function Interface (FFI, for short) is a feature of Rust that enables the sharing of data and functions between parts of code that have been written in different languages. I am interested in the FFI because many libraries used in embedded systems are written in C, and I would like to leverage them for my embedded work with Rust.

I quickly learned from my initial experiments with the Rust FFI that one of its challenges is casting data types into a form that may be consumed by other languages. This challenge is not unique to the Rust FFI, and there are numerous reasons for it. For one, different languages use different mechanisms to lay out data in the computer's memory. What's more, names for functions and data types are often mangled, which means that the symbol in a library that maps to a function may be different from the name that you give to the function in your code. As far as I know, C compilers do not mangle symbol names, and partly for this reason the C language is often used as an intermediary language in FFIs. Converting Rust to C is therefore an important skill when using Rust for multi-language work.

There are a few good resources on the internet about using the Rust FFI to expose functions written in Rust to other languages. However, I found little information about passing data types between languages. To help remedy this situation, I describe in this post a simple Rust library that I wrote to explore how to pass complex data types from Rust to C.

An example FFI project

I created the Rust library in the typical way by first starting a new project with Cargo:

$ cargo new --lib rstruct
$ cd rstruct

Inside, I modified the contents of Cargo.toml to the following:

[package]
name = "rstruct"
version = "0.1.0"
authors = ["Kyle M. Douglass"]
edition = "2018"

[lib]
crate-type = ["cdylib"]

[dependencies]

The only lines that I added were [lib] and crate-type = ["cdylib"]. As described in the 2018 edition guide, this type of crate produces a binary that has no Rust-specific information in it and is intended for use through a C FFI.

Next, I opened the src/lib.rs source file, removed the auto-generated content, and added the following source.

use std::boxed::Box;
use std::ffi::CString;
use std::os::raw::c_char;

#[repr(C)]
pub struct RStruct {
    name: *const c_char,
    value: Value,
}

#[repr(C)]
pub enum Value {
    _Int(i32),
    _Float(f64),
}

#[no_mangle]
pub extern "C" fn data_new() -> *mut RStruct {
    println!("{}", "Inside data_new().".to_string());

    Box::into_raw(Box::new(RStruct {
        name: CString::new("my_rstruct")
            .expect("Error: CString::new()")
            .into_raw(),
        value: Value::_Int(42),
    }))
}

#[no_mangle]
pub extern "C" fn data_free(ptr: *mut RStruct) {
    if ptr.is_null() {
        return;
    }
    unsafe {
        Box::from_raw(ptr);
    }
}

Roughly speaking, this simple library does two things. First, it defines two data types, a struct called RStruct and a Rust enum (not to be confused with a C enum!) called Value. Second, it exposes two functions that may be used to access instances of these data types from C: data_new() and data_free().

Let's take closer look now at what the code is doing.

Defining structs and enums for use in C

I want to expose instances of the RStruct type to C code. The definition of RStruct is

#[repr(C)]
pub struct RStruct {
    name: *const c_char,
    value: Value,
}

The first line here is #[repr(C)]. This is an attribute that modifies the layout of the struct in memory to "do what C does." As described in the Rustonomicon,

The order, size, and alignment of fields is exactly what you would expect from C or C++. Any type you expect to pass through an FFI boundary should have repr(C), as C is the lingua-franca of the programming world.

Next, we define a public struct just as we would if we were writing typical Rust code. In this example, the struct has two fields. The first is a field called name, which has a type *const c_char. The Rust data types String and &str cannot be interpreted in C, so instead we define the data type as a raw pointer to a c_char. (In this case, * is not dereferencing the pointer but is part of the type name *const T.)

The second field is an enum whose definition follows:

#[repr(C)]
pub enum Value {
    _Int(i32),
    _Float(f64),
}

Again we use #[repr(C)] to indicate that we want the enum to be laid out in memory in the same manner as in C. The enum Value has two variants, _Int and _Float, that contain an i32 and an f64 value, respectively. If you're familiar with C, then you may have already noticed that C enums are different from Rust enums in that they do not hold any data themselves. How this minor annoyance is solved will be seen later when we generate the C header for this library.

The data types i32 and f64 are easily translated into C's equivalent numeric data types, so there is no need to do anything special with them.

Instantiating and freeing Memory

Following the data type definitions, there are two functions that are exposed through the FFI boundary, one for instantiating an RStruct and one for freeing the memory associated with an RStruct. The method for instantiation is first:

#[no_mangle]
pub extern "C" fn data_new() -> *mut RStruct {
    println!("{}", "Inside data_new().".to_string());

    Box::into_raw(Box::new(RStruct {
        name: CString::new("my_rstruct")
            .expect("Error: CString::new()")
            .into_raw(),
        value: Value::_Int(42),
    }))
}

The first line contains an attribute called #[no_mangle]. As defined in the Book:

Mangling is when a compiler changes the name we’ve given a function to a different name that contains more information for other parts of the compilation process to consume but is less human readable.

Placing the #[no_mangle] attribute before the function definition ensures that the function name matches that of the corresponding symbol in the library.

Next is the function definition pub extern "C" fn data_new() -> *mut RStruct. Let's break this down into parts to understand it better:

  • pub : The function will be callable from outside the library
  • extern "C" : This keyword serves two different purposes in Rust, both related to FFI. In my case, I use it to specify that the function should be exposed with the C application binary interface (ABI).
  • fn data_new() : This is just the usual fn keyword and the name of the function
  • -> *mut RStruct : Here I specify that the function will return a mutable, raw pointer to an RStruct instance.

The purpose of this function is to create a RStruct instance and return a pointer to it. The RStruct is created just as we would any other struct in Rust, with the exception of the name field:

RStruct {
    name: CString::new("my_rstruct")
        .expect("Error: CString::new()")
        .into_raw(),
    value: Value::_Int(42),
}

The CString is first created with the new() constructor and contains the value "my_rstruct". After unpacking the result with expect(), I call the into_raw() method to create a raw pointer to the C string whose ownership will be passed off to the calling C code. (If I had used as_ptr() instead, the pointer would have been dropped immediately after the function call because the CString would have been deallocated.) The value field is instantiated as it would be in normal Rust.

What is perhaps new in this method is the Box type that wraps the RStruct instance.

Box::into_raw(Box::new( ... ))

A Box is one of Rust's smart pointers that is used to allocate memory for a data type on the heap. When Box::new() is called it creates a pointer to the newly created RStruct instance. Normally, this pointer would be dropped and the memory automatically deallocated when the data_new() function returns. However, the Box::into_raw() function serves the same purpose here as the corresponding function for CString: it hands off ownership of the pointer to the calling code so that the memory is not deallocated.

There is a rule-of-thumb that memory allocated by Rust should be freed by Rust. For this reason, we provide the data_free() method that C code may use to deallocate the memory that is allocated by data_new().

#[no_mangle]
pub extern "C" fn data_free(ptr: *mut RStruct) {
    if ptr.is_null() {
        return;
    }
    unsafe {
        Box::from_raw(ptr);
    }
}

This function accepts a mutable pointer to an RStruct. First, it checks whether the pointer is null and if it is, the function returns without doing anything. Assuming that the pointer is not null, the Box is reconstructed from it inside an unsafe block because from_raw() is unsafe. Importantly, this new Box pointer will go out of scope at the end of the function so that it will automatically be dropped when the function returns.
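One subtle point: the name buffer created with CString::into_raw() is not reclaimed when the Box is dropped, because name is only a raw pointer. A sketch of a data_free() that also retakes the string (my own variation, not the code above):

#[no_mangle]
pub extern "C" fn data_free(ptr: *mut RStruct) {
    if ptr.is_null() {
        return;
    }
    unsafe {
        let data = Box::from_raw(ptr);
        // Retake ownership of the C string so that its memory is freed as well.
        let _ = CString::from_raw(data.name as *mut c_char);
    }
}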

Building the library is simple. I run cargo build --release to build a release version. The library itself will be found at target/release/librstruct.so. On Linux, one can verify that it contains the data_new() and data_free() methods by displaying its symbols with the nm -g command:

$ nm -g target/release/librstruct.so
# snip
00000000000046c0 T data_free
00000000000044e0 T data_new
# snip

Generating the header for the library

Now that I have a shared library, I want to access the functions that it exposes from C. To do this, I first need a header file that I can use to import the library's declarations into the C code. Moreover, generating the header can help in understanding how Rust translates its data types to C.

I will use cbindgen to automatically generate the header. cbindgen is installed with the command

$ cargo install cbindgen

cbindgen is highly configurable, but for the project described here I only need its most basic functionality. Assuming that I am in the root directory of my Rust project, I generate the header rstruct.h with the following

$ cbindgen --lang C -o rstruct.h .

After running cbindgen there is a new file called rstruct.h in the project folder. Here are its contents:

#include <stdarg.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

typedef enum {
  _Int,
  _Float,
} Value_Tag;

typedef struct {
  int32_t _0;
} _Int_Body;

typedef struct {
  double _0;
} _Float_Body;

typedef struct {
  Value_Tag tag;
  union {
    _Int_Body _int;
    _Float_Body _float;
  };
} Value;

typedef struct {
  const char *name;
  Value value;
} RStruct;

void data_free(RStruct *ptr);

RStruct *data_new(void);

First, you can see the enum that contains the variants of the Value data type that is stored in the RStruct and that was defined in Rust. The name of this new type is Value_Tag, and it is used to identify which variant a Value currently holds.

typedef struct {
  Value_Tag tag;
  union {
    _Int_Body _int;
    _Float_Body _float;
  };
} Value;

A Value is just another struct that contains a Value_Tag field to identify which variant of the enum it is holding and a union field that holds the actual value.

The important thing to understand here is that cbindgen effectively uses nested C data types to represent complex Rust data structures. In particular, Rust enums are a combination of C structs, enums, and unions.
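For contrast, the Rust side never sees the tag and union separately; the same Value is consumed with an ordinary match (a sketch, not part of the library above):

fn describe(value: &Value) -> String {
    match value {
        Value::_Int(i) => format!("integer variant: {}", i),
        Value::_Float(f) => format!("float variant: {}", f),
    }
}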

Calling the library from C

With everything in place, it's now time to write the C program. My example C program looks like the following:

#include <dlfcn.h>
#include <stdio.h>
#include <stdlib.h>

#include "rstruct.h"

int main() {
  void* handle;
  RStruct* (*data_new)(void);
  void (*data_free)(RStruct*);
  char* error;

  printf("Loading librstruct.so...\n");
  handle = dlopen(
    "librstruct.so",
    RTLD_LAZY
  );
  if (!handle) {
    fprintf(stderr, "%s\n", dlerror());
    exit(EXIT_FAILURE);
  }
  printf("Done.\n\n");

  dlerror();

  data_new = (RStruct* (*)(void)) dlsym(handle, "data_new");
  error = dlerror();
  if (error != NULL) {
    fprintf(stderr, "%s\n", error);
    exit(EXIT_FAILURE);
  }

  dlerror();

  data_free = (void (*)(RStruct*)) dlsym(handle, "data_free");
  error = dlerror();
  if (error != NULL) {
    fprintf(stderr, "%s\n", error);
    exit(EXIT_FAILURE);
  }

  printf("Calling data_new() from main.c...\n");
  RStruct* data = (*data_new)();

  printf("\nBack inside main.c. Printing results...\n");
  printf("Name: %s\nValue: %d\n", data->name, data->value._int._0);

  printf("\nFreeing the RStruct data...\n");
  (*data_free)(data);

  dlclose(handle);
  return EXIT_SUCCESS;
}

This code is based on the example in the dlopen() man pages. In particular, the library file is opened and a handle attached to it here:

handle = dlopen(
  "librstruct.so",
  RTLD_LAZY
);

A function pointer to data_new() is created with dlsym(), and we use the function to create the new RStruct instance with the lines

data_new = (RStruct* (*)(void)) dlsym(handle, "data_new");
// snip
RStruct* data = (*data_new)();

Finally, the data is freed by creating another function pointer to data_free() and calling it.

data_free = (void (*)(RStruct*)) dlsym(handle, "data_free");
// snip
(*data_free)(data);

Running the program

I wrote a small Makefile to handle compilation of the C and Rust programs while I wrote this post. I won't include it here because it distracts from the main message about the Rust FFI. Instead, I will describe how to compile the program from the command line.

I first placed the librstruct.so, rstruct.h, and main.c programs into the following directory structure:

$ tree
.
├── include
│   └── rstruct.h
├── lib
│   └── librstruct.so
└── src
    └── main.c

Next, I compiled the main binary with gcc.

$ gcc -Wall -g -Iinclude -c -o main.o main.c
$ gcc -Wall -g -o main main.o -ldl

(-ldl is used to link against libdl for dynamically loading the library from C.) After compilation I run the main binary. To make it work, I set the LD_LIBRARY_PATH environment variable so that the program knows to look inside the lib directory for the librstruct.so library.

$ LD_LIBRARY_PATH=lib ./main
Loading librstruct.so...
Done.

Calling data_new() from main.c...
Inside data_new().

Back inside main.c. Printing results...
Name: my_rstruct
Value: 42

Freeing the RStruct data...

Nice! From the output you can see the print statements that I placed inside both the Rust and C code to indicate where the program was as it was running. In summary, the program performs the following sequence of events:

  • The main binary is run
  • The librstruct.so library is opened and pointers to the data_new() and data_free() functions are created
  • data_new() is called, creating our Rust datatype on the heap and returning a pointer to it in the C code
  • Information about the data type is printed from C
  • data_free() is called, freeing the memory from back inside Rust

Summary

And that's it! I hope you enjoyed this post. It took me several days of reading and trial-and-error to learn about this feature of Rust. The topics covered here were

  • the Rust FFI and its purpose
  • creating a complex data type (a Rust enum nested inside a Rust struct) and exporting it through the FFI
  • Box and CString Rust data types
  • cbindgen for automatically creating header files from Rust code
  • using the Rust library from inside C

A simple UNIX socket listener in Rust

I decided that I wanted to learn a new programming language in 2019. After a bit of research, I settled upon Rust due to its speed, novel ideas about memory safety, and focus on two areas that I am interested in: embedded systems and WebAssembly. While I think that The Book is the best place to get started learning the language, nothing is really a substitute for writing code.

With that in mind, I developed an idea for a starting project: a background daemon for Linux systems like the Raspberry Pi that controls and reads data from the system's peripherals. The design of this project is inspired by Docker: a daemon process does most of the heavy work while a command line tool communicates with the daemon over a UNIX socket (typically a file located at /var/run/docker.sock). The purpose of this post is to demonstrate the most basic realization of this: reading text from a UNIX socket in Rust. And to emphasize that the UNIX socket is used for communication between two separate processes, we will send messages from Bash to Rust.

Keep in mind that this is my first-ever Rust program, so it may not be completely idiomatic Rust. The following was compiled with rustc 1.32.0 (9fda7c223 2019-01-16).

To begin, I created a new Rust project with cargo.

$ cargo new rust-uds
$ cd rust-uds

Next, I opened the file that cargo automatically generated in src/main.rs, removed the auto-generated content, and added the following code, which is largely based on the example provided in the Rust documentation but with a few key differences:

use std::io::{BufRead, BufReader};
use std::os::unix::net::{UnixStream,UnixListener};
use std::thread;

fn handle_client(stream: UnixStream) {
    let stream = BufReader::new(stream);
    for line in stream.lines() {
        println!("{}", line.unwrap());
    }
}

fn main() {
    let listener = UnixListener::bind("/tmp/rust-uds.sock").unwrap();

    for stream in listener.incoming() {
        match stream {
            Ok(stream) => {
                thread::spawn(|| handle_client(stream));
            }
            Err(err) => {
                println!("Error: {}", err);
                break;
            }
        }
    }
}

Explanation

The first three lines import the necessary modules for this code example.

use std::io::{BufRead, BufReader};
use std::os::unix::net::{UnixStream,UnixListener};
use std::thread;

BufRead is a trait that enables extra ways of reading data sources; in this case, it has an internal buffer for reading the socket line-by-line. BufReader is a struct that actually implements the functionality in BufRead. UnixStream and UnixListener are structs that provide the functionality for handling the UNIX socket, and the std::thread module is used to spawn threads.

The next set of lines defines a function named handle_client() that is called whenever new data arrives in the stream. The explanation for this is best left until after the main() function.

The first line in the main() function creates the UnixListener struct and binds it to the listener variable.

let listener = UnixListener::bind("/tmp/rust-uds.sock").unwrap();

The bind() function takes a string argument that is a path to the socket file, and unwrap() extracts the value from the Result that is returned by bind(), panicking on error. (This is a pattern that is discouraged in Rust but is OK for quick prototypes because it simplifies the error handling.)
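For reference, here is a sketch of the same main() written with the ? operator instead of unwrap():

fn main() -> std::io::Result<()> {
    let listener = UnixListener::bind("/tmp/rust-uds.sock")?;

    for stream in listener.incoming() {
        // Propagate any connection error instead of printing it and breaking.
        let stream = stream?;
        thread::spawn(|| handle_client(stream));
    }

    Ok(())
}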

After creating the listener, listener.incoming() returns an iterator over the incoming connections to the socket. The connections are looped over in an infinite for loop; I believe that this is more-or-less the same as a generator in Python which never raises a StopIteration exception.

Next, the Result of each incoming stream is matched; if there is an error, it is printed and the loop is exited:

Err(err) => {
    println!("Error: {}", err);
    break;
}

However, if the Result of the connection is Ok, then a new thread is spawned to handle the new stream:

Ok(stream) => {
    thread::spawn(|| handle_client(stream));
}

Finally, the client handler is called for each connection.

fn handle_client(stream: UnixStream) {
    let stream = BufReader::new(stream);
    for line in stream.lines() {
        println!("{}", line.unwrap());
    }
}

The handler in this case is fairly straight-forward. It shadows the original stream variable by binding it to a version of itself that has been converted to a BufReader. Finally, it loops over the lines() iterator, which blocks until a new line appears in the stream.

Sending messages

As an example, let's send messages to the Rust program via Bash using the OpenBSD version of netcat. (The OpenBSD version seems to be the default on Ubuntu-based systems.) This should underscore the fact that the UNIX socket is really being used to communicate between two different processes.

First, compile and run the Rust program to start the socket listener:

$ cargo run --release
   Compiling rust-uds v0.1.0 (/home/kmd/src/rust-uds)
    Finished release [optimized] target(s) in 1.59s
     Running `target/release/rust-uds`

Open up a new terminal. You should see the socket file /tmp/rust-uds.sock:

$ ls /tmp | grep rust
rust-uds.sock

Now let's send messages to the rust program. Use the following netcat command to open a connection to the socket.

$ nc -U /tmp/rust-uds.sock

The -U is necessary to indicate to netcat that this is a UNIX stream socket. Now, start typing text into the same window. Every time you press ENTER, you should see the same text appear in the terminal window in which the Rust program is running. Press CTRL-C to exit the Rust socket listener. If you re-run the program, delete the old socket first: rm /tmp/rust-uds.sock
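Although Bash and netcat make the two-process point nicely, the client could just as well be another Rust program. A minimal sketch:

use std::io::Write;
use std::os::unix::net::UnixStream;

fn main() -> std::io::Result<()> {
    // Connect to the socket created by the listener and send one line.
    let mut stream = UnixStream::connect("/tmp/rust-uds.sock")?;
    stream.write_all(b"hello from another process\n")?;
    Ok(())
}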

Summary

  • Use a UnixListener struct to create a UNIX socket and listen to it for connections.
  • For each new connection, spawn a new thread and read the stream with a BufReader.
  • Print each new line in the stream by iterating over the lines() iterator of the BufReader.
  • Send commands to your Rust program from bash with nc -U "$PATH_TO_SOCKET".

Linking shared libraries in Micro-Manager for Linux

Lately I have been working on a new Video4Linux2 device adapter for Micro-Manager. I encountered the following error after adding some functionality that introduced two Boost libraries as new dependencies in my project.

Traceback (most recent call last):
  File "RPiV4L2.py", line 17, in <module>
    mmc.loadDevice("camera", "RPiV4L2", "RPiV4L2")
  File "/home/micro-manager/app/lib/micro-manager/MMCorePy.py", line 3515, in loadDevice
    return _MMCorePy.CMMCore_loadDevice(self, label, moduleName, deviceName)
MMCorePy.CMMError: Failed to load device "RPiV4L2" from adapter module "RPiV4L2" [ Failed to load device adapter "RPiV4L2" [ Failed to load module "/home/micro-manager/app/lib/micro-manager/libmmgr_dal_RPiV4L2.so.0" [ /home/micro-manager/app/lib/micro-manager/libmmgr_dal_RPiV4L2.so.0: undefined symbol: _ZN5boost10filesystem6detail13dir_itr_closeERPvS3_ ] ] ]

I received this error when I tried to load the device in a Python script. At first I was puzzled because the code compiled without problems, but I soon found that the solution was simple.

The key part of the message is undefined symbol: _ZN5boost10filesystem6detail13dir_itr_closeERPvS3_. To troubleshoot this, I first demangled the symbol name by entering it at https://demangler.com/. I discovered that the symbol was referring to the function boost::filesystem::detail::dir_itr_close(void*&, void*&). I had added both the Boost filesystem and Boost regex libraries to this device adapter as dependencies, so it was not surprising that either of their names appeared in the error message.

Next, I used the ldd program to check which libraries my device adapter was linked against. (libmmgr_dal_RPiV4L2.so.0 is the name of the device adapter's library file.)

$ ldd libmmgr_dal_RPiV4L2.so.0
     linux-vdso.so.1 (0x7ec21000)
     libdl.so.2 => /lib/arm-linux-gnueabihf/libdl.so.2 (0x76e9c000)
     libstdc++.so.6 => /usr/lib/arm-linux-gnueabihf/libstdc++.so.6 (0x76d54000)
     libm.so.6 => /lib/arm-linux-gnueabihf/libm.so.6 (0x76cdc000)
     libc.so.6 => /lib/arm-linux-gnueabihf/libc.so.6 (0x76bee000)
     libgcc_s.so.1 => /lib/arm-linux-gnueabihf/libgcc_s.so.1 (0x76bc1000)
     libicudata.so.57 => /usr/lib/arm-linux-gnueabihf/libicudata.so.57 (0x75334000)
     libicui18n.so.57 => /usr/lib/arm-linux-gnueabihf/libicui18n.so.57 (0x75187000)
     libicuuc.so.57 => /usr/lib/arm-linux-gnueabihf/libicuuc.so.57 (0x7505e000)
     librt.so.1 => /lib/arm-linux-gnueabihf/librt.so.1 (0x75048000)
     libpthread.so.0 => /lib/arm-linux-gnueabihf/libpthread.so.0 (0x75024000)
     /lib/ld-linux-armhf.so.3 (0x76fb4000)

Neither libboost_filesystem nor libboost_regex are listed, so I knew that they were not linked with the device adapter.

There is a Makefile.am included in the directory of every device adapter in the Micro-Manager project. This file is used by Autotools to define how the device adapter should be compiled and linked. Here is what my Makefile.am looked like:

AM_CXXFLAGS = $(MMDEVAPI_CXXFLAGS)
deviceadapter_LTLIBRARIES = libmmgr_dal_RPiV4L2.la
libmmgr_dal_RPiV4L2_la_SOURCES = RPiV4L2.cpp RPiV4L2.h refactor.h ../../MMDevice/MMDevice.h ../../MMDevice/DeviceBase.h
libmmgr_dal_RPiV4L2_la_LIBADD = $(MMDEVAPI_LIBADD)
libmmgr_dal_RPiV4L2_la_LDFLAGS = $(MMDEVAPI_LDFLAGS)

After experimenting a bit, I discovered that I could instruct the linker to link against shared libraries by adding them to the libmmgr_dal_RPiV4L2_la_LDFLAGS variable with the -l flag. The resulting line now looks like:

libmmgr_dal_RPiV4L2_la_LDFLAGS = $(MMDEVAPI_LDFLAGS) -lboost_regex -lboost_filesystem

Finally, running ldd on the rebuilt device adapter now shows these two libraries:

$ ldd libmmgr_dal_RPiV4L2.so.0
     linux-vdso.so.1 (0x7ed74000)
     libboost_regex.so.1.62.0 => /usr/lib/arm-linux-gnueabihf/libboost_regex.so.1.62.0 (0x76e3c000)
     libboost_filesystem.so.1.62.0 => /usr/lib/arm-linux-gnueabihf/libboost_filesystem.so.1.62.0 (0x76e1b000)
     ...

Create a custom Raspbian image with pi-gen: part 2

In my previous post, I discussed how to set up user accounts and locales in a custom Raspbian image using pi-gen. In this follow-up post, I will discuss the main problems that I want to solve: automatically configuring the wireless network and ssh on a new Raspberry Pi without a terminal attached directly to the Pi.

Set up the wireless network

The WPA supplicant

The Pi's wireless credentials are configured in stage 2 in the file stage2/02-net-tweaks/files/wpa_supplicant.conf. Here's how it looks by default:

ctrl_interface=DIR=/var/run/wpa_supplicant GROUP=netdev
update_config=1

According to the blogs Learn Think Solve Create and the Raspberry Spy, the first thing we should do is add our country code to the top of this file with the line country=CH. (Use your own country code for this.) Next, we want to enter the details for our wireless network, which includes its name and the password. For security reasons that I hope are obvious, we should not store the password in this file. Instead, we create a hash of the password and put that inside the file. The command to create the password hash is

wpa_passphrase ESSID PASSWORD > psk

where ESSID is our wireless network's name. Note that I also typed a space before the wpa_passphrase command to prevent the password from being written to my .bash_history file. Now, we copy and paste the contents of the file psk into wpa_supplicant.conf and remove the comment that contains the actual password:

country=CH
ctrl_interface=DIR=/var/run/wpa_supplicant GROUP=netdev
update_config=1
network={
        ssid=YOUR_ESSID_HERE
        psk=YOUR_PSK_HASH_HERE
}
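If you would rather not copy and paste by hand, the hash can be appended to the configuration file in one step. This is just a sketch, assuming the command is run from the directory containing wpa_supplicant.conf; the grep strips the commented plaintext line, and the leading space again keeps the password out of .bash_history:

 wpa_passphrase "YOUR_ESSID" "YOUR_PASSWORD" | grep -v '#psk' >> wpa_supplicant.conf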

Configure the wireless network interfaces

After having configured the supplicant, we next move on to configuring the network interfaces used by Raspbian. The appropriate file is found in stage1/02-net-tweaks/files/interfaces. In my post Connecting a Raspberry Pi to a Home Linux Network I described how to set up the network interfaces by editing /etc/network/interfaces. Much of the information presented in that post has now been superseded in Raspbian by the DHCP daemon. For now, we will use the interfaces file to instruct our Pi to use DHCP and will use /etc/dhcpcd.conf at a later time to set up a static IP address when provisioning the Pi.

We first need to make a few changes so that the interfaces file is aware of the credentials in the wpa supplicant configuration file. According to the blog kerneldriver, we need to modify the /etc/network/interfaces file as follows:

auto lo

iface lo inet loopback
iface eth0 inet dhcp

auto wlan0
iface wlan0 inet manual
     wpa-roam /etc/wpa_supplicant/wpa_supplicant.conf
iface default inet dhcp

In the first modification, I specify that I want the wireless interface wlan0 started automatically with auto wlan0. Next, I specify that the wlan0 interface should use the manual inet address family with the line iface wlan0 inet manual.

According to the man pages, "[the manual] method may be used to define interfaces for which no configuration is done by default." After this, we use the wpa-roam command to specify the location of the wpa_supplicant.conf file that we previously modified. The wireless ESSID and password are therefore not defined in interfaces; instead, they are referenced from wpa_supplicant.conf.

If you noticed that wpa-roam doesn't appear as an option in the documentation for the interfaces file and wondered why, it's because other programs such as wpasupplicant may provide additional options to the interfaces file. A similar command is wpa-conf, but I do not yet fully understand the difference between the two.

Following the wpa-roam command, we configure the default options for all networks in our wpa_supplicant.conf file with the line iface default inet dhcp. At this point, we save the setup of the static IP address for a later time.

For more information, see the interfaces man page for Debian Stretch.

Change the hostname

Our Pi's hostname may be changed from the default (raspberrypi) by modifying the line in stage1/02-net-tweaks/files/hostname. See RFC 1178 for tips on choosing a hostname.
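For example, a one-liner from the pi-gen root directory will do it (mypi is just a placeholder for whatever hostname you choose):

echo "mypi" > stage1/02-net-tweaks/files/hostname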

In addition to modifying the hostname file, we need to update stage1/02-net-tweaks/00-patches/01-hosts.diff and change raspberrypi to the new hostname:

Index: jessie-stage1/rootfs/etc/hosts
===================================================================
--- jessie-stage1.orig/rootfs/etc/hosts
+++ jessie-stage1/rootfs/etc/hosts
@@ -3,3 +3,4 @@
 ff02::1                ip6-allnodes
 ff02::2                ip6-allrouters

+127.0.1.1      NEW_HOSTNAME_HERE

Set the DNS servers

DNS servers are configured in export-image/02-network/files/resolv.conf. By default, mine was already configured to use one of Google's DNS servers (8.8.8.8). I added a secondary Google DNS address as well:

nameserver 8.8.8.8
nameserver 8.8.4.4

Enable SSH

Enabling SSH is simple. Open stage2/01-sys-tweaks/01-run.sh and change the line systemctl disable ssh to systemctl enable ssh.
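If you prefer to script this edit rather than make it by hand, something like the following sed one-liner, run from the pi-gen root directory, should do the trick:

sed -i 's/systemctl disable ssh/systemctl enable ssh/' stage2/01-sys-tweaks/01-run.sh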

(I later learned that we can also enable ssh on a headless pi by adding an empty file named ssh to the boot partition of a standard Raspbian image. See here for more details: https://www.raspberrypi.org/documentation/remote-access/ssh/)

Configuring SSH keys (or not)

I decided after writing much of this tutorial that pi-gen was not necessarily the best tool for adding my public SSH keys. So long as I have network access and SSH enabled, I can easily add my keys using ssh-copy-id. Furthermore, after following this tutorial, there still remains a lot of setup and customization steps. These can more easily be performed manually or by server automation tools like Fabric or Ansible.

Therefore, I think that at this point we can stop with our customization of the image with pi-gen and move to a different tool. We have a basic Raspbian image that is already configured for our home network and that serves as a starting point for more complete customization.

Conclusion

This tutorial and my previous post demonstrated how to create a custom Raspbian image that is pre-configured for

  • our home wireless network
  • our locale information
  • ssh

Of course, we can do much, much more with pi-gen, but other tools exist for the purpose of configuring a server. These tutorials at least allow you to set up a new Raspberry Pi without having to manually configure its most basic functionality. Happy Pi'ing!

Create a custom Raspbian image with pi-gen: part 1

Docker has been an amazing tool for improving my development efficiency on the Raspberry Pi. For example, I recently used it to cross-compile a large C++ and Python library for the Pi's ARM architecture on my x86_64 laptop. However, in that post I took it for granted that I had already set up my Raspberry Pi with user accounts, packages, ssh keys, etc. Performing these steps manually on a fresh install of the Pi's Raspbian operating system can become tedious, especially because ssh needs to be manually enabled before doing any remote work.

Fortunately, the Raspberry Pi developers have provided us with pi-gen, a useful collection of Shell scripts and a Docker container for creating custom Raspbian images. In this post, I will summarize the steps that I take in using pi-gen to create my own, personalized Raspbian image.

After I wrote this post, I found a set of posts at Learn Think Solve Create that describe many of the tasks I explain here. Be sure to check them out for another take on modifying Raspbian images.

Clone the pi-gen repository

This is as easy as cloning the git repository.

git clone git@github.com:RPi-Distro/pi-gen.git

Alternatively, you can use the https address instead of ssh, which is https://github.com/RPi-Distro/pi-gen.git.

From now on, all directories in this post will be relative to the root pi-gen directory.

Build the official Raspbian images

By default, the pi-gen repository will build the official Raspbian images. Doing this once before making any modifications is probably a good idea; if you can't build the official images, how will you be able to build a custom image?

There are two main scripts that you can use to do this: build.sh and build-docker.sh. build.sh requires you to install the packages that are listed in the repository's README.md file, whereas build-docker.sh requires only that you have Docker already installed on your computer. I'm going to be using the Docker-based build script for the rest of this post. If you don't have Docker installed on your system, you can follow the instructions here for the community edition.

Name your image

First we need to give a name to our image, even if we use the default build. To do this, we assign a name to a variable called IMG_NAME inside a file called config that is located inside the root pi-gen folder.

echo "IMG_NAME=my_name" > config

Build the default image

Once we've named our image, we can go ahead and run the build script.

./build-docker.sh

Be prepared to wait a while when running this script; the full build took well over half an hour on my laptop with the Docker volume located on an SSD. It also consumed several GB of space on the SSD.

Resuming a failed build

The first time I used pi-gen the build failed twice. Once, it hung without doing anything for several minutes, so I canceled it with a Ctrl-C command. The other time I encountered a hash error when installing a Debian package.

We can resume a failed build from the point of failure by assigning the value 1 to the CONTINUE variable when calling build-docker.sh again.

CONTINUE=1 ./build-docker.sh

If we don't want to run previously built stages, we can simply place a file inside the corresponding folder named SKIP. For example, if our build fails at stage2, we can place SKIP files inside the stage0 and stage1 folders, then rerun the build-docker.sh script with CONTINUE=1.
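For example, if the build failed during stage2, the following commands skip the first two stages and resume from there:

touch stage0/SKIP stage1/SKIP
CONTINUE=1 ./build-docker.sh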

Unfortunately, I have sometimes noticed that I have to also rebuild the stage prior to the one where the build failed. In the worst case, I had to rebuild all the stages because the fixes I applied to a file in stage2 were not accounted for when I tried to skip building stages 0 and 1. YMMV with this; I have no idea how well the SKIP mechanism works for the normal build.sh script.

After a successful build, we can find our custom images located inside the deploy folder of the pi-gen directory. These may then be written onto a SD card and used as a standard Raspbian image.

We can ensure that the build container is preserved even after successful builds using

PRESERVE_CONTAINER=1 ./build-docker.sh

Custom Raspbian images

Now that we've got the default build working, let's start by customizing the build process. For this post, I have the following goals:

  • Build only the lite version of the Raspbian images
  • Add a custom user account and delete the default pi account
  • Set the Pi's locale information

In a follow-up post, I will discuss the following:

  • Set up the WiFi for a home network
  • Set up ssh so that we can log on to the Pi remotely on its first startup

Building just Raspbian Lite

Raspbian Lite is a minimal Raspbian image without the X windows server and speciality modules that would otherwise make Raspbian more user friendly. It's an ideal starting point for projects that are highly specialized, require only a few packages, and do not require a GUI.

pi-gen creates Raspbian images in sequential steps called stages. At the time of this writing, there were five stages, with stages 2, 4, and 5 producing images of the operating system. Building everything from stage 0 up to and including stage 2 produces a Raspbian Lite image. We can speed up the build process and save hard drive space by disabling all the later stages.

To disable the build for a particular stage, we add an empty file called SKIP inside the corresponding stage folder of the pi-gen root directory, just as we did above when skipping previously built stages. We also disable the explicit creation of images by adding an empty file called SKIP_IMAGES to stages 4 and 5. (We don't need to add a SKIP_IMAGES file to the stage3 folder because no image is produced at this stage.)

touch ./stage3/SKIP ./stage4/SKIP ./stage5/SKIP
touch ./stage4/SKIP_IMAGES ./stage5/SKIP_IMAGES

Now, when we run build-docker.sh, pi-gen will only build and produce one image for Raspbian Lite in the deploy directory.

Add a custom user account

The default user in Raspbian is called pi. This account is created in stage1 in the script stage1/01-sys-tweaks/00-run.sh. This account is not very secure because it and its password, raspberry, are the well-known defaults in Raspbian. Let's go ahead and change them.

The relevant lines in the script look like this:

on_chroot << EOF
if ! id -u pi >/dev/null 2>&1; then
     adduser --disabled-password --gecos "" pi
fi
echo "pi:raspberry" | chpasswd
echo "root:root" | chpasswd
EOF

The user pi is created with the line adduser --disabled-password --gecos "" pi if it doesn't already exist. According to the adduser man pages, the --disabled-password flag prevents the program passwd from setting the account's password when adduser is run, but remote logins without password authentication to the pi account are still allowed. The --gecos "" flag simply adds an empty string to the /etc/passwd file for the pi account.

After the user is created, raspberry is set as pi's password and root is set as the root password in the lines echo "pi:raspberry" | chpasswd and echo "root:root" | chpasswd.

Let's start by modifying the pi account. For the sake of this example, let's change its name to alphapi. For the password, we will generate a temporary, random password and write it to a file in the deploy directory. We'll do the same for root. The modifications look like the following:

user_passwd=$(< /dev/urandom tr -dc _A-Z-a-z-0-9 | head -c${1:-8})
root_passwd=$(< /dev/urandom tr -dc _A-Z-a-z-0-9 | head -c${1:-8})

# Write passwords to a file.
cat <<EOF > /pi-gen/deploy/users
${user_passwd}
${root_passwd}
EOF

on_chroot << EOF
if ! id -u alphapi >/dev/null 2>&1; then
     adduser --disabled-password --gecos "" alphapi
fi
echo "alphapi:${user_passwd}" | chpasswd
echo "root:${root_passwd}" | chpasswd
EOF

The first two lines create random alphanumeric passwords for the users alphapi and root. They should be changed immediately when the image is first run.

user_passwd=$(< /dev/urandom tr -dc _A-Z-a-z-0-9 | head -c${1:-8})
root_passwd=$(< /dev/urandom tr -dc _A-Z-a-z-0-9 | head -c${1:-8})

This way of password generation works by reading random bytes from /dev/urandom and redirecting them to the standard input of the tr command, which filters the input so only alphanumeric characters remain. Next, the output is piped to the head command, which outputs only the first eight alphanumeric characters produced in this fashion.

The passwords are then written to a file named users inside the deploy directory where the outputs will eventually be placed.

# Write passwords to a file.
cat <<EOF > /pi-gen/deploy/users
${user_passwd}
${root_passwd}
EOF

The remaining parts of the script are more-or-less the same as before, except I changed pi to alphapi and used variable substitution for the passwords.

Running ./build-docker.sh at this point will raise an error in stage2 because it is at this stage that the user pi is added to the various groups on the system. We therefore need to open stage2/01-sys-tweaks/01-run.sh and modify the following lines, replacing pi with alphapi.

for GRP in adm dialout cdrom audio users sudo video games plugdev input gpio spi i2c netdev; do
    adduser alphapi $GRP
done

Set the locale information

The locale information used by your operating system may be modified as follows. Open stage0/01-locale/00-debconf. I personally changed every occurrence of en_GB.UTF-8 to en_US.UTF-8, but you can set your locale accordingly.

# Locales to be generated:
# Choices: All locales, aa_DJ ISO-8859-1, aa_DJ.UTF-8 UTF-8, ...
locales locales/locales_to_be_generated multiselect en_US.UTF-8 UTF-8
# Default locale for the system environment:
# Choices: None, C.UTF-8, en_US.UTF-8
locales locales/default_environment_locale   select  en_US.UTF-8

Next, we open stage2/01-sys-tweaks/00-debconf. I currently live in Europe, so I made the following changes:

tzdata        tzdata/Areas    select  Europe

I also made the following changes to switch from the default British English to American English:

keyboard-configuration keyboard-configuration/xkb-keymap select us
keyboard-configuration keyboard-configuration/fvariant  select  English (US) - English (US\, international with dead keys)

Note that the comment in 00-debconf above the keyboard-configuration/xkb-keymap line erroneously states that American English is an option, but it's not. You need to change it from "gb" to "us" if you want the American layout.

Using the custom image

With all these changes, we can build our new image by running ./build-docker.sh and, if successful, find a .zip file inside the deploy directory with the image name and date.

To use this image, we unzip the file to extract the .img file inside it. Next, we need to copy it onto an SD card that will plug into the Pi. My laptop has an SD card reader/writer; I find its Linux device name by running lsblk before and after plugging in the card. (The device that appears in the output of lsblk only after plugging in the card is the one we want, which is /dev/mmcblk0 on my laptop.) Once I know the device name, I use the Linux dd command to copy the contents of the image onto the card. (Be sure to change /dev/mmcblk0 to match the name that your system gives to your SD card device.)

sudo dd if=2018-07-21-my_name-lite.img of=/dev/mmcblk0 bs=4096; sync

Please be EXTREMELY careful that you get the device name right. It's not very difficult to write the contents of the image file over your root partition or other important data.

After writing the image, we can plug the SD card into our pi, boot it up, and try logging in as alphapi with the random password that was created in the users file. Be sure at this point to change your user's and root's password. We can also verify that the keyboard was set to US English by typing Shift-3 and observing whether we get a hashtag (#) symbol and not the symbol for the British pound currency.

In a follow-up post, I will describe how to set up the network and SSH so I can continue to configure my Raspberry Pi without ever needing a terminal.

How I built a cross-compilation workflow for the Raspberry Pi

Some of you may know I tinker with the Raspberry Pi in my free time and that one of my current projects is to build a lensless microscope with the Pi as the brains. To control the microscope, I decided a while ago that I would use Micro-Manager, an open-source software package for microscope control. I made this decision for a few reasons:

  1. I already knew the Micro-Manager codebase since I use it frequently at work.
  2. The Micro-Manager core provides a device-independent interface to hardware.
  3. I've contributed to the project in the past and feel a sense of loyalty to the project and the people involved. Expanding Micro-Manager into embedded microscopy would be a great way for me to give back to the community.

Building Micro-Manager from source code presents its own set of challenges. After ensuring that you have the correct build environment, you need to actually compile it, and here's where things get tricky in Raspberry Pi development. The Pi has an ARM processor, whereas most laptops and workstations use an x86_64 processor. This means that code compiled on a typical desktop PC will not work on the Pi. As I showed in my earlier post, you can compile the code directly on the Pi to circumvent this, but this unfortunately is quite cumbersome because the code base and dependencies are quite large. (They are nearly 8 GB in total.) Furthermore, compiling the project on the Pi is slow and requires connecting to it via ssh or working directly on a TV screen or monitor.

These problems extend beyond Micro-Manager to other large-scale projects that require code compilation for a specific processor architecture. In this post, I'll describe the workflow that I developed for cross-compiling projects for the Raspberry Pi.

Previous attempts

Prior to the workflow that is the main topic of this post, I managed to cross-compile Micro-Manager using a chroot environment and the QEMU emulator. chroot is a Linux command that changes the apparent root (or '/') directory for a running process. With this approach, I mount an image of the Raspbian operating system that contains the gcc and g++ compilers and libraries for the ARM architecture. Then, I chroot into the image and run a setup script that builds the software. During execution of this script, the QEMU static libraries run the ARM compilers from within the chroot environment to build the project. The compiled code remains inside the image, which I then burn onto a micro SD card to insert into the Pi. I uploaded a gist of the bash script which orchestrates all this, and my inspiration for this approach came from a great series of blog posts from Disconnected Systems.

Ultimately this approach is a huge amount of work. As you can see in the gist, it's fairly complicated bash scripting that's not easy to debug. Furthermore, the setup script that is run inside the image needs to do a lot of work beyond cross-compiling, like setting up the user, permissions, network, etc. Debugging the final product is also a challenge because you need to verify that it's working on the Pi, which requires burning the image to a micro SD card.

Cross-compiling with Docker

After a bit of research, I decided I would try instead to use Docker for cross-compilation and deployment to the Pi. I had just started using Docker at work to build reproducible environments for scientific computing research. In particular, and unlike my chroot script, a Docker container that builds the project can run on nearly any system that has Docker installed. Furthermore, deploying updates can be done on any Raspberry Pi that's running Docker.

I liked the idea of a portable cross-compilation workflow, so I dove into the Docker documentation and managed to get everything working in a few weeks of tinkering at home.

An overview of Docker

You can find many resources online about Docker, so I won't go into the details here. The main thing you need to know is that Docker is a system for creating, running, and sharing containers, which are something like lightweight virtual machines. Containers solve the problem in software development of how to build and deploy programs that have a complex set of dependencies. They do this by isolating the environment in which a program runs from the rest of the operating system. For example, if you have a computer that has a certain version of gcc (the GNU C compiler) installed, but your application requires a different version, then you can install the required gcc along with your application inside a container and it will not interfere with the version of gcc that belongs to your operating system. This also means that you can send your container to any machine that has Docker installed and it should just run without any setup.
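As a quick illustration of this isolation, you can run a compiler from one of the official gcc images on Docker Hub without touching whatever gcc is installed on your host (assuming the gcc:7 tag is still available):

$ gcc --version                          # the host's compiler
$ docker run --rm gcc:7 gcc --version    # a different gcc inside a throwaway container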

Other important things to know about Docker are:

  • There are two main types of objects: images and containers. Images are sort of like blueprints that define what is inside a container, whereas containers are like the actual buildings specified by the blueprints. There can be many containers that come from a single image.
  • Containers are meant to be immutable. When you stop them and restart them, they always restart in the same state as when they were first created.
  • Since containers are immutable, some of your application data may need to be placed in a volume, which is either another container or a folder on the host system. A volume gets connected to your application container and exists even when your application container is not running.

The cross-compilation workflow

Now that we have established the essential background to this project, let's look at the cross-compilation workflow. Below is a picture that provides a sense of the entire process, moving in general from left-to-right.

The cross-compilation workflow

The process involves two Docker containers: one for building Micro-Manager and the other for running the application. The build dependencies and the QEMU emulator are both located inside the build container, having been specified when its image was created. These allow us to compile Micro-Manager for the ARM architecture. The source code is connected to the build container as a bind mount, which is a folder from the host workstation that is mounted inside the build container when it is run.

Once the libraries are compiled, they are installed into a folder inside the bind mount so that the host system will have access to them after the build container closes. Next, the compiled libraries are copied directly into an image that defines the application container. This image defines only the essential run-time requirements for running Micro-Manager and nothing else. The application image is stored on the registry server which I set up on my local network. This makes it easy for the Raspberry Pi to download the latest image and run the Micro-Manager application container whenever I make changes.

An important aspect of this workflow is how the data is passed between the systems and containers. Unlike what you will find in many introductory tutorials on Docker, I do not add the Micro-Manager source code directly to the build image/containers but instead use a bind mount. The reason for this is that the source code and 3rd party libraries are quite large, about 8 GB in total. By using a bind mount, I avoid needless copying of this data. Another reason for using a bind mount is that the source code will change frequently during development. If I add the source code to the image, then I will have to recreate the image every time the source code changes.

Once the libraries are built, I directly copy them into the application image because they are much, much smaller than the source code. I also want the code stored directly in the image so that the application image is all the Raspberry Pi needs to run the program. The image is stored in my local Docker registry server so that once I push an updated image to the server, the Raspberry Pi can download it and use it immediately.

Step 0: Prerequisites

I am going to assume that you already have installed Docker. (If not, follow these directions.) I am also going to assume that you are somewhat familiar with how to work on a Linux system. The Raspberry Pi runs Linux, so you probably wouldn't be here if you didn't already know at least a little.

For this article, I am working with these versions of Docker and Ubuntu on my host workstation:

kmdouglass@xxxxx:~$ uname -a
Linux xxxxx 4.13.0-39-generic #44~16.04.1-Ubuntu SMP Thu Apr 5 16:43:10 UTC 2018 x86_64 x86_64
x86_64 GNU/Linux

kmdouglass@xxxxx:~$ docker version
Client:
 Version:      18.03.1-ce
 API version:  1.37
 Go version:   go1.9.5
 Git commit:   9ee9f40
 Built:        Thu Apr 26 07:17:20 2018
 OS/Arch:      linux/amd64
 Experimental: false
 Orchestrator: swarm

Server:
 Engine:
  Version:      18.03.1-ce
  API version:  1.37 (minimum version 1.12)
  Go version:   go1.9.5
  Git commit:   9ee9f40
  Built:        Thu Apr 26 07:15:30 2018
  OS/Arch:      linux/amd64
  Experimental: false

Finally, below is how my project directory structure is laid out:

kmdouglass@xxxxx:~/src/alphapi/docker$ tree -L 2
.
└── rpi-micromanager
    ├── 2.0-python
    │   ├── build
    │   └── Dockerfile
    └── build
        ├── build
        ├── Dockerfile
        ├── run
        └── setup

I have two folders: build, which contains the files for the build container, and 2.0-python, which contains the files for creating the Micro-Manager application container. (In my case, I am going to build the Python wrapper for Micro-Manager 2.0.) Inside each folder are the scripts and Dockerfiles that execute the various steps of the workflow.

The final prerequisite is to register QEMU with the Docker build agent. First, install a few packages for QEMU. On Ubuntu, this looks like

$ sudo apt update
$ sudo apt install qemu qemu-user-static qemu-user binfmt-support

Finally, register the build agent with the command:

$ docker run --rm --privileged multiarch/qemu-user-static:register --reset

Step 1: Create the build image

Inside the build folder, I have a file called Dockerfile. Here are its contents.

# Copyright (C) 2018 Kyle M. Douglass
#
# Defines a build environment for Micro-Manager on the Raspberry Pi.
#
# Usage: docker build \
#          -t NAME:TAG \
#         .
#

FROM resin/raspberrypi3-debian:stretch
MAINTAINER Kyle M. Douglass <kyle.m.douglass@gmail.com>

RUN [ "cross-build-start" ]

# Get the build dependencies.
RUN apt-get update && apt-get -y install --no-install-recommends \
autoconf \
automake \
build-essential \
git \
libatlas-base-dev \
libboost-dev \
libboost-all-dev \
libtool \
patch \
pkg-config \
python3-dev \
python3-pip \
python3-setuptools \
python3-wheel \
swig \
&& apt-get clean && rm -rf /var/lib/apt/lists/* \
&& pip3 install numpy

RUN [ "cross-build-end" ]

# Set up the mount point for the source files and setup script.
ADD setup /micro-manager/
VOLUME /micro-manager/src

WORKDIR /micro-manager/src
ENTRYPOINT [ "/sbin/tini", "-s", "--" ]
CMD [ "/micro-manager/setup" ]

A Dockerfile defines the steps in building an image -- in this case, the build image. Let's break this file down into pieces. In the first two lines that follow the comments, I specify that my image is based on the resin/raspberrypi3-debian:stretch image and that I am the maintainer.

FROM resin/raspberrypi3-debian:stretch
MAINTAINER Kyle M. Douglass <kyle.m.douglass@gmail.com>

Images from Resin are freely available and already have the QEMU emulator installed. Next, I specify what commands should be run for the ARM architecture. Any commands located between RUN [ "cross-build-start" ] and RUN [ "cross-build-end" ] will be run using the emulator. Inside these two commands, I install the build dependencies for Micro-Manager using apt-get and pip. (These are just standard commands for installing software on Debian/Ubuntu Linux machines and from PyPI, respectively.)

After the installation of the requirements completes, I add the setup script to the folder /micro-manager inside the image with the ADD setup /micro-manager/ command. The setup script contains the commands that will actually compile Micro-Manager. I then define a mount point for the source code with VOLUME /micro-manager/src. It's important to realize here that you do not mount volumes inside images, you mount volumes inside containers. This command is just telling the image to expect a folder to be mounted at this location when the container is run.

The last three lines set the working directory, the entrypoint and the default container command, respectively.

WORKDIR /micro-manager/src
ENTRYPOINT [ "/sbin/tini", "-s", "--" ]
CMD [ "/micro-manager/setup" ]

This specific entrypoint tells Docker that any containers built from this image should first run Tini, which is a lightweight init system for Docker containers. If you do not specify Tini as the entry point, then it will not be able to reap zombies. (I don't know what this means exactly, but it sounds cool and you can read about it here: https://github.com/krallin/tini)

By default, the container will run the setup script, but, since I used the CMD directive, this can be overridden in case we need to perform some manual steps. Roughly speaking, you can think of the entrypoint as the command that cannot be overridden and the CMD command as the one that can be. In other words, Tini will always be executed when containers created from this image are launched, whereas you can choose not to run the setup script and instead enter the container through a Bash shell, for example.
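For example, to poke around inside the build container interactively instead of running the setup script, you can pass /bin/bash as the command at the end of docker run; this is a sketch that uses the same bind mount as the run script shown later:

docker run -it --rm \
       -v /path/to/source:/micro-manager/src \
       localhost:5000/rpi-micromanager:build \
       /bin/bash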

To build the image, I use the following build script located in the same directory as the Dockerfile for convenience.

#!/bin/bash
# Copyright (C) 2018 Kyle M. Douglass
#
# Usage: ./build
#

docker build \
       -t localhost:5000/rpi-micromanager:build \
       .

By using the -t localhost:5000/rpi-micromanager:build argument, I am giving the image a name of rpi-micromanager, a tag of build, and specifying that I will eventually host this image on my local registry server (localhost) on port 5000.

In case you are wondering about the contents of the setup script, don't worry. I'll explain it in the next section.

Step 2: Compile Micro-Manager

After the image is built, I create a container and use it to compile Micro-Manager. For this, I use the run script in the build directory.

#!/bin/bash
# Copyright (C) 2018 Kyle M. Douglass
#
# Usage: ./run DIR CONFIGURE
#
# DIR is the parent folder containing the micro-manager Git
# repository, the 3rdpartypublic Subversion repository, and any
# additional build resources.
#
# If CONFIGURE=true, the build system is remade and the configure
# script is rerun before running 'make' and 'make install'. If
# CONFIGURE=false, only 'make' and 'make install' are run.
#
# The compiled program files are stored in a bind mount volume so that
# they may be copied into the deployment container.
#

src_dir=$1
cmd="/micro-manager/setup $2"

# Remove the build artifacts from previous builds.
if [ "$2" == true ] || [ "$2" == false ]; then
    rm -rf ${src_dir}/build || true
fi

docker run --rm \
       -v ${src_dir}:/micro-manager/src \
       --name mm-build \
       localhost:5000/rpi-micromanager:build \
       ${cmd}

The script takes two arguments. The first is the path to the folder containing all the source code (see below for details). The second argument is either true or false. (It can actually be anything, but the script will only compile Micro-Manager if true or false is provided.) If true, the full build process is run, including setting up the configure script; if false, only make and make install are run, which should recompile and install only recently updated files.

The run script uses the -v argument to docker run to mount the source directory into the container at the point specified by the VOLUME command in the Dockerfile. The directory layout on my host file system for the source directory looks like this:

kmdouglass@xxxxx:/media/kmdouglass/Data/micro-manager$ tree -L 1
.
├── 3rdpartypublic
├── micro-manager
└── patches

The patches folder is not strictly necessary and is only there to fix a bug in the WieneckeSinske device adapter. (This bug may be fixed by now.) 3rdpartypublic is the large Subversion repository of all the software required to build Micro-Manager, and micro-manager is the cloned GitHub repository. Prior to building, I check out the mm2 branch because I am interested in developing my application for Micro-Manager 2.0.
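Concretely, switching the cloned repository to that branch before a build looks something like this (using the source directory from the tree output above):

cd /media/kmdouglass/Data/micro-manager/micro-manager
git checkout mm2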

The setup script that is run inside the container and mentioned in the previous section looks like this.

#!/bin/bash
#
# # Copyright (C) 2018 Kyle M. Douglass
#
# Builds Micro-Manager.
#
# Usage: ./setup CONFIGURE
#
# If CONFIGURE=true, the build system is remade and the configure
# script is rerun before running 'make' and 'make install'. If
# CONFIGURE=false, only 'make' and 'make install' are run.
#
# Kyle M. Douglass, 2018
#

# Move into the source directory.
cd micro-manager

# Undo any previous patches.
git checkout -- DeviceAdapters/WieneckeSinske/CAN29.cpp
git checkout -- DeviceAdapters/WieneckeSinske/WieneckeSinske.cpp

# Patch the broken WieneckeSinske device adapter.
patch DeviceAdapters/WieneckeSinske/CAN29.cpp < ../patches/CAN29.cpp.diff \
&& patch DeviceAdapters/WieneckeSinske/WieneckeSinske.cpp < ../patches/WieneckeSinske.cpp.diff

# Compile MM2.
if [ "$1" = true ]; then
    # Remake the entire build system, then compile from scratch.
    ./autogen.sh
    PYTHON="/usr/bin/python3" ./configure \
        --prefix="/micro-manager/src/build" \
        --with-python="/usr/include/python3.5" \
        --with-boost-libdir="/usr/lib/arm-linux-gnueabihf" \
        --with-boost="/usr/include/boost" \
        --disable-java-app \
        --disable-install-dependency-jars \
        --with-java="no"
    make
    make install
    chmod -R a+w /micro-manager/src/build
elif [ "$1" = false ]; then
    # Only recompile changed source files.
    make
    make install
    chmod -R a+w /micro-manager/src/build
else
    echo "$1 : Unrecognized argument."
    echo "Pass \"true\" to run the full build process."
    echo "Pass \"false\" to run only \"make\" and \"make install\"."
fi

Most important in this script is the call to configure. You can see that the compiled libraries and Python wrapper will be written to the build folder inside the mounted directory. This gives the host file system access to the compiled artifacts after the container has stopped.

Step 3: Build the application image

Once the libraries are compiled, we can add them to an application image that contains only the essentials for running Micro-Manager.

For this, I use a separate Dockerfile inside the 2.0-python directory.

# Copyright (C) 2018 Kyle M. Douglass
#
# Builds the Micro-Manager 2.0 Python wrapper for the Raspberry Pi.
#
# Usage: docker build \
#          -t NAME:TAG \
#          .
#

FROM resin/raspberrypi3-debian:stretch
MAINTAINER Kyle M. Douglass <kyle.m.douglass@gmail.com>

RUN [ "cross-build-start" ]

# Install the run-time dependencies.
RUN apt-get update && apt-get -y install --no-install-recommends \
    libatlas-base-dev \
    libboost-all-dev \
    python3-pip \
    python3-setuptools \
    python3-wheel \
    && pip3 install numpy \
    && apt-get clean && rm -rf /var/lib/apt/lists/*

# Copy in the Micro-Manager source files.
RUN useradd -ms /bin/bash micro-manager
WORKDIR /home/micro-manager/app
COPY --chown=micro-manager:micro-manager . .

RUN [ "cross-build-end" ]

# Final environment configuration.
USER micro-manager:micro-manager
ENV PYTHONPATH /home/micro-manager/app/lib/micro-manager
ENTRYPOINT ["/sbin/tini", "-s", "--"]
CMD ["/usr/bin/python3"]

As before, I use a clean resin base image. However, this time I only install the essential software to run Micro-Manager.

After apt-getting and pip-installing everything, I create a new user called micro-manager and a new folder called app inside this user's home directory.

# Copy in the Micro-Manager source files.
RUN useradd -ms /bin/bash micro-manager
WORKDIR /home/micro-manager/app

Next, I directly copy the compiled libraries into the image with the COPY command.

COPY --chown=micro-manager:micro-manager . .

The two periods (.) mean that I copy the current host directory's contents into the container's current working directory (/home/micro-manager/app). What is the current host directory? Well, as I explain below, I actually run this Dockerfile from inside the build folder that was created to hold the compiled libraries in the previous step. But first, I'll end my explanation of the Dockerfile by saying that I switch the USER so that I do not run the container as root, add the library to the PYTHONPATH environment variable, and set the default command to the python3 interpreter.

To build this image, I use the following build script.

#!/bin/bash
# Copyright (C) 2018 Kyle M. Douglass
#
# Usage: ./build DIR
#
# DIR is the root directory containing the Micro-Manager build
# artifacts. These artifacts will be added to the Docker image.
#

src_dir=$1

cp Dockerfile ${src_dir}
cd ${src_dir}

docker build \
       -t localhost:5000/rpi-micromanager:2.0-python \
       .

This script takes one argument, which is the build directory containing the compiled source code. The script first copies the Dockerfile into this directory and then changes into it with the cd command. (This explains the two periods (.) in the COPY command in the Dockerfile.)

Finally, I build the image and give it a name of localhost:5000/rpi-micromanager:2.0-python.

Step 4: Add the image to the local registry server

Now we need a way to get the image from the workstation onto the Raspberry Pi. Of course, I could manually transfer the file with a USB stick or possibly use ssh, but what if I have multiple Pi's? This process could become cumbersome. Docker provides a few ways to push and pull images across a network. The most obvious is Dockerhub, a site for freely sharing images. For the moment I don't want to use Dockerhub, though, because I have not yet checked all the software licenses and am unsure as to what my rights are for putting an image with Micro-Manager software on a public repository.

A better option, especially for testing, is to use a local registry server. This server operates only on my home network and already allows my workstation and Pi's to communicate with one another. Following the official registry documentation and this blog post by Zachary Keeton, I managed to set up the registry as follows.

Host setup

First, we need to set up a transport layer security (TLS) certificate. It's possible to run the server without one if you don't expect your network to be attacked, but it's good practice, so let's create one.

To do this, I edit the /etc/ssl/openssl.cnf file and add the following to the top of the [ v3_ca ] section:

subjectAltName = IP:192.168.XXX.XXX

where the IP address is the address of the workstation on the network. Next, I actually create the certificate. I make a directory called certs inside my workstation home directory and then use openssl to make the certificate. During the prompts, I press ENTER at every step except the FQDN (fully qualified domain name). For the FQDN, I enter the same IP address as above.

mkdir certs
openssl req -newkey rsa:4096 -nodes -sha256 \
-keyout certs/domain.key -x509 -days 365 \
-config /etc/ssl/openssl.cnf -out certs/domain.crt

I had to add the -config /etc/ssl/openssl.cnf argument for the subject alternative name to be added to the certificate. This part was tricky, because if this argument is not included, then the key generation step will use some other .cnf file (I am not sure which). This results in the following SAN error when attempting to connect to the registry:

cannot validate certificate for 192.168.XXX.XXX because it doesn't contain any IP SANs

After the domain.key and domain.crt files have been created, I run the official registry server container. (See how handy Docker containers are? There's no messy installation beyond grabbing the container.)

docker run -d -p 5000:5000 \
  --restart=always \
  --name registry \
  -v $(pwd)/certs:/certs \
  -e REGISTRY_HTTP_TLS_CERTIFICATE=/certs/domain.crt \
  -e REGISTRY_HTTP_TLS_KEY=/certs/domain.key \
  registry:2

If the registry:2 image is not already downloaded, then it will be downloaded automatically when the container is run. Note that the -p 5000:5000 argument indicates that the server is using port 5000 both on the host system and inside the container. Note also that the certs directory is relative to the current directory because I use the $(pwd) command. You can change this to an absolute path if you wish on your setup.

Let's go ahead and push the application image to the server now that it's running.

docker push localhost:5000/rpi-micromanager:2.0-python
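To check that the push succeeded, we can query the registry's HTTP API from the workstation; this sketch assumes the certificate created earlier is still in ~/certs and should return a list containing rpi-micromanager:

curl --cacert ~/certs/domain.crt https://192.168.XXX.XXX:5000/v2/_catalog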

Set up the Pi

Now, startup the Pi. I will assume that you have already installed Docker on it and know how to communicate with it via ssh and copy files to it using scp.

I copy the certificate from the host with scp.

sudo mkdir -p /etc/docker/certs.d/192.168.XXX.XXX:5000/
sudo scp kmdouglass@192.168.XXX.XXX:/home/kmdouglass/certs/domain.crt /etc/docker/certs.d/192.168.XXX.XXX:5000/ca.crt

The IP address that I am using is the one to the machine where the registry server is running. After this step, I make the operating system trust the certificate.

sudo scp kmdouglass@192.168.XXX.XXX:/home/kmdouglass/certs/domain.crt /usr/local/share/ca-certificates/192.168.XXX.XXX.crt
sudo update-ca-certificates

Finally, I restart the Docker daemon.

sudo service docker restart

If everything is working, then I should be able to pull the image from the registry server on my network.

docker pull 192.168.XXX.XXX:5000/rpi-micromanager:2.0-python

Step 5: Run Micro-Manager!

And now the moment of truth: running the application container. Since it's set up to run Python automatically, I use a pretty simple docker run command.

docker run -it --rm \
     --name micro-manager \
     192.168.XXX.XXX:5000/rpi-micromanager:2.0-python

I verify that the Micro-Manager Python wrapper is working by trying to import it and run a few basic commands.

>>> import MMCorePy
>>> mmc = MMCorePy.CMMCore()
>>> mmc.getVersionInfo()

If these work without error, then congratulations! You're now ready to start building your embedded microscopy system ;)

Step 6: Running the whole process

The beauty of having scripted all these steps is that the full workflow may be executed quite simply. From the host system's build folder, run:

kmdouglass@xxxxx:~/src/alphapi/docker/rpi-micromanager/build$ ./build
kmdouglass@xxxxx:~/src/alphapi/docker/rpi-micromanager/build$ ./run /path/to/source true

From the 2.0-python folder:

kmdouglass@xxxxx:~/src/alphapi/docker/rpi-micromanager/2.0-python$ ./build /path/to/source/artifacts
kmdouglass@xxxxx:~$ docker push localhost:5000/rpi-micromanager:2.0-python

And from the Raspberry Pi:

pi@yyyyy:~$ docker pull 192.168.XXX.XXX:5000/rpi-micromanager:2.0-python
pi@yyyyy:~$ docker run -it --rm \
                   --name micro-manager \
                   192.168.XXX.XXX:5000/rpi-micromanager:2.0-python

Hopefully this is enough to get you started building Micro-Manager for the Raspberry Pi with Docker. Though I focused on Micro-Manager, the workflow should be generally applicable to any large scale project in which you want to isolate the build environment from the host machine.

If you have any questions, just leave them in the comments. Happy programming!