Rust Kernel Module: Pinning

Wu Yu Wei published on

Let's talk about pinning! This is a concept we don't run into very often. Pin was stabilized in Rust 1.33, ahead of the Future trait (1.36) and the async/await syntax (1.39), as one of the types needed to build async executors. If you want to understand it comprehensively, the documentation of std::pin has an excellent explanation of why we need to pin data. In short, it's a type that pins data to its location in memory.

This is in fact rarely seen in the world of C/C++, where objects are not moved around as a matter of course. Even when we work with std::move in C++, a type that must stay in place can declare its move constructor as = delete. In Rust, however, moving is the default behavior, and cloning has to be requested explicitly.
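As a minimal sketch of that default, the snippet below passes a value to a function by move and keeps a second copy only because it asked for one with clone. The `Config` type and `consume` function are hypothetical names made up for this illustration.

```rust
// Hypothetical type for illustration; it is not part of the kernel crate.
#[derive(Clone, Debug, PartialEq)]
struct Config {
    name: String,
}

// Takes ownership of its argument: the caller's value is moved in.
fn consume(c: Config) -> usize {
    c.name.len()
}

fn main() {
    let a = Config { name: String::from("misc") };

    // A copy only exists because we explicitly asked for one.
    let b = a.clone();

    // `a` is moved here; using `a` after this line would not compile.
    let n = consume(a);

    assert_eq!(n, 4);
    assert_eq!(b.name, "misc");
}
```

Nothing in the signature of `consume` hints at a move; passing by value is simply how Rust hands data around, which is why a type that must not move needs extra help.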

The problem that requires pinning

When everything lives in Rust, this is of course not a problem; the ownership model takes care of it. But that is not the case in our kernel module. Instead of rewriting everything in Rust, it's more reasonable to build on top of what already exists. This is how the kernel crate wraps the existing kernel interfaces, and it's why we reuse refcount_t wrapped by the Ref type instead of using Arc like a normal Rust program.

This is where pinning matters to us, because many kernel data structures are self-referential: either a structure contains a list that references itself directly, or it points to another structure that eventually points back to it.

Let's look at the definition of the kernel's mutex, which we will learn how to use in a future post. It has a wait_list, which is an intrusive, circular, doubly linked list. When such a list is empty, its next and prev pointers point back at the list_head itself.

struct mutex {
	atomic_long_t		owner;
	raw_spinlock_t		wait_lock;
#ifdef CONFIG_MUTEX_SPIN_ON_OWNER
	struct optimistic_spin_queue osq;
#endif
	struct list_head	wait_list; // <==================
#ifdef CONFIG_DEBUG_MUTEXES
	void			*magic;
#endif
#ifdef CONFIG_DEBUG_LOCK_ALLOC
	struct lockdep_map	dep_map;
#endif
};

If we represent it visually, you can tell it's self-referential as soon as you see it. When we define the abstraction in Rust, we include this binding along with the data we want to protect, so the list is part of the Mutex type:

pub struct Mutex<T: ?Sized> {
    mutex: Opaque<bindings::mutex>,
    _pin: PhantomPinned,
    data: UnsafeCell<T>,
}

And we already know that Rust moves by default. When a type like this is a field of another struct, it's trivially easy to move it in basic operations. For example, if we declare a struct that contains a mutex and want to make it a reference-counted object, we run into the problem shown below:

Ref::new(AStruct {
    // ...
    inner: Mutex::new(Inner { /* ... */ }),
})

We have an original list in the inner field, and when we call Ref::new, that structure has to be moved into the Ref we are creating. A move creates the new object by copying the original bit by bit. The list pointers are copied verbatim too, so instead of pointing to the newly created list they still hold the old address, and the new list ends up pointing to the old one.
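The effect can be reproduced in ordinary userspace Rust. The sketch below is only a model: `ListHead` is a simplified stand-in for the kernel's struct list_head, and `new`, `init` and `points_to_self` are hypothetical helpers written for this illustration, not functions from the kernel crate. It initializes an empty list on the stack, moves it into a Box, and checks where the pointers lead.

```rust
use std::ptr;

/// Simplified userspace stand-in for the kernel's `struct list_head`.
struct ListHead {
    next: *const ListHead,
    prev: *const ListHead,
}

impl ListHead {
    /// Creates an uninitialized (null) list head.
    fn new() -> Self {
        ListHead { next: ptr::null(), prev: ptr::null() }
    }

    /// Models INIT_LIST_HEAD: an empty list points back at itself.
    fn init(&mut self) {
        let addr: *const ListHead = &*self;
        self.next = addr;
        self.prev = addr;
    }

    /// Compares addresses only; the pointers are never dereferenced.
    fn points_to_self(&self) -> bool {
        ptr::eq(self.next, self) && ptr::eq(self.prev, self)
    }
}

fn main() {
    let mut on_stack = ListHead::new();
    on_stack.init();
    assert!(on_stack.points_to_self());

    // Moving into a Box copies the bytes to a new heap address,
    // but the stored pointers still hold the old stack address.
    let on_heap = Box::new(on_stack);
    assert!(!on_heap.points_to_self());
}
```

The second assertion is the bug in miniature: after the move, the list no longer describes the object it lives in.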

Miscdev initialization

The same thing happens with the misc device we just learned about. Its definition also contains an intrusive, circular, doubly linked list. That's why the method signatures of miscdev::Registration always require Pin for registration:

struct miscdevice  {
	int minor;
	const char *name;
	const struct file_operations *fops;
	struct list_head list; // <==================
	struct device *parent;
	struct device *this_device;
	const struct attribute_group **groups;
	const char *nodename;
	umode_t mode;
};

pub fn new_pinned(name: fmt::Arguments<'_>, open_data: T::OpenData) -> Result<Pin<Box<Self>>> {
    Options::new().register_new(name, open_data)
}

pub fn register(
    self: Pin<&mut Self>,
    name: fmt::Arguments<'_>,
    open_data: T::OpenData,
) -> Result {
    Options::new().register(self, name, open_data)
}

You can call new and pin it manually before calling register, or just call new_pinned to do both at the same time. Either way, the interface forces you to pin the data before registration.
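To see why this ordering helps, here is a rough userspace sketch of the same pattern, not the kernel crate's implementation. `ListHead` is a hypothetical stand-in for struct list_head, and its `new_pinned`, `init` and `points_to_self` methods are invented for this example (the name `new_pinned` is borrowed from the signature above only for familiarity). The value is placed at its final heap address first, and the self-referential pointers are written only afterwards.

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;
use std::ptr;

/// Hypothetical userspace stand-in for `struct list_head`.
struct ListHead {
    next: *const ListHead,
    prev: *const ListHead,
    // Opts out of `Unpin`, so safe code cannot move the value out of a `Pin`.
    _pin: PhantomPinned,
}

impl ListHead {
    /// Allocates and pins first, then initializes at the final address.
    fn new_pinned() -> Pin<Box<ListHead>> {
        let mut boxed = Box::pin(ListHead {
            next: ptr::null(),
            prev: ptr::null(),
            _pin: PhantomPinned,
        });
        boxed.as_mut().init();
        boxed
    }

    /// Requires a pinned receiver, mirroring `self: Pin<&mut Self>` above.
    fn init(self: Pin<&mut Self>) {
        // SAFETY: we only write the pointer fields and never move the value.
        let this = unsafe { self.get_unchecked_mut() };
        let addr: *const ListHead = &*this;
        this.next = addr;
        this.prev = addr;
    }

    /// Compares addresses only; the pointers are never dereferenced.
    fn points_to_self(&self) -> bool {
        ptr::eq(self.next, self) && ptr::eq(self.prev, self)
    }
}

fn main() {
    let head = ListHead::new_pinned();
    assert!(head.points_to_self());

    // Moving the `Pin<Box<_>>` moves only the pointer, not the heap data.
    let moved = head;
    assert!(moved.points_to_self());
}
```

Because `init` takes `Pin<&mut Self>`, a caller cannot reach it with a plain value that might still be moved later, which is the same guarantee the register signature asks for.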

Conclusion

And that's the reason we used new_pinned to create the misc device in the previous post. The question we haven't answered is how to pin the mutex we just mentioned. I'll leave that to the next post, which will also be a good time to introduce the built-in concurrency primitives we can use in the kernel.
