Rust Kernel Module: Character Device Driver

Wu Yu Wei published on

One kind of kernel module is the device driver, which provides functionality for hardware such as a serial port. On Unix, each piece of hardware is represented by a file in /dev called a device file (or device node), which provides the means to communicate with the hardware. Devices are divided into two types: character devices and block devices. Block devices are usually used for storage hardware, since they have buffers that let them handle requests in whatever order suits the situation best. Character devices, on the other hand, can accept or return as many or as few bytes as they like.

Device File and Device Driver

In our simple initramfs, we can see our console device file is controlled by a character device:

# Add this line in the init script `qemu-init.sh` then run.
$ busybox ls -l /dev/
crw------- 1 0 0   5, 1 May 21 12:43 console

If the line starts with c, it is a character device; if it starts with b, it is a block device. Notice the two numbers separated by a comma. These are the device's major number and minor number. The major number tells you which driver is used to access the hardware. If you see multiple device files with the same major number, they are controlled by the same device driver. The minor number is used by the driver to distinguish between the various pieces of hardware it controls. For example, a storage device may contain multiple partitions, so there may be multiple device files in /dev with the same major number, and the driver uses the minor number to tell the partitions apart.
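To make the major/minor pair more concrete, here is a small userspace sketch of how the kernel packs both numbers into a single device number internally. It assumes the in-kernel layout from include/linux/kdev_t.h (20 bits for the minor); the function names are only illustrative, and the encoding that user space sees through stat is different.

```rust
// Assumption: the in-kernel dev_t layout, where the low 20 bits hold the
// minor number and the bits above them hold the major number.
const MINOR_BITS: u32 = 20;
const MINOR_MASK: u32 = (1 << MINOR_BITS) - 1;

/// Pack a major/minor pair into one device number.
/// (This sketch masks the minor so it cannot spill into the major bits.)
fn mkdev(major: u32, minor: u32) -> u32 {
    (major << MINOR_BITS) | (minor & MINOR_MASK)
}

/// Extract the major number: which driver owns the device.
fn major(dev: u32) -> u32 {
    dev >> MINOR_BITS
}

/// Extract the minor number: which device the driver should talk to.
fn minor(dev: u32) -> u32 {
    dev & MINOR_MASK
}

fn main() {
    // The console device from the listing above: major 5, minor 1.
    let console = mkdev(5, 1);
    println!("major = {}, minor = {}", major(console), minor(console));
}
```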

To create a device file, we can use the mknod command. For example, mknod /dev/covfefe c 12 2 will create the device file /dev/covfefe as a character device with major number 12 and minor number 2. And since we use usr/gen_init_cpio to generate our initramfs, we can also create nodes there with nod <name> <mode> <uid> <gid> <dev_type> <maj> <min>. So if you want to create console statically, you can add a line like nod /dev/console 0600 0 0 c 5 1.
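If you generate the initramfs description from a script, the nod line can be assembled mechanically. The helper below is a hypothetical sketch (nod_line is not part of any tool); it only formats the fields in the order given above.

```rust
/// Build a `nod` entry for a gen_init_cpio description file.
/// Field order: nod <name> <mode> <uid> <gid> <dev_type> <maj> <min>
fn nod_line(
    name: &str,
    mode: u32,
    uid: u32,
    gid: u32,
    dev_type: char,
    major: u32,
    minor: u32,
) -> String {
    // The mode is printed in octal, zero-padded to four digits.
    format!("nod {name} {mode:04o} {uid} {gid} {dev_type} {major} {minor}")
}

fn main() {
    // Reproduces the console example from the text.
    println!("{}", nod_line("/dev/console", 0o600, 0, 0, 'c', 5, 1));
}
```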

While we can certainly create device files this way, it isn't the best approach, because we might accidentally pick a major number that conflicts with another driver. Nowadays the kernel can allocate a major number dynamically, so we could take that approach instead. There is also another kind of module that suits us even better. I will demonstrate it along the way.

More File System Support

Before we proceed, I would like to introduce a few more file systems for us to mount, so that we can interact with our character devices and get more information about them. Let's add the following configuration:

  1. Run make menuconfig and check these options:
$ Device Drivers -> Generic Driver Options -> Maintain a devtmpfs filesystem to mount at /dev
$ File Systems -> Pseudo filesystems -> /proc file system support
$ File Systems -> Pseudo filesystems -> sysfs file system support
  2. qemu-initramfs.desc should contain at least these entries:
dir     /bin                                          0755 0 0
dir     /sys                                          0755 0 0
dir     /dev                                          0755 0 0
dir     /proc                                         0755 0 0
file    /bin/busybox  busybox                         0755 0 0
slink   /bin/sh       /bin/busybox                    0755 0 0
file    /init         .github/workflows/qemu-init.sh  0755 0 0

file    /rust_chrdev.ko  samples/rust/rust_chrdev.ko  0755 0 0
file    /rust_miscdev.ko samples/rust/rust_miscdev.ko 0755 0 0
# Other kernel modules...
  3. Update qemu-init.sh. You can add whatever commands you like; just don't rmmod the modules we just listed. To interact with our character devices, we need a shell with job control enabled. The last line of an init script like this is usually there to provide a rescue shell, but it works perfectly for our case too. You could also unmount the file systems at the end of the script, but since we are not going to do anything afterwards, it's not really necessary.
#!/bin/sh

busybox mount -t devtmpfs none /dev
busybox mount -t proc none /proc
busybox mount -t sysfs none /sys
busybox insmod rust_chrdev.ko 
busybox insmod rust_miscdev.ko
busybox insmod rust_semaphore.ko
busybox setsid sh -c 'exec sh -l </dev/ttyS0 >/dev/ttyS0 2>&1'

Okay, let's build the kernel and image again and run them in QEMU. You should get a shell prompt. This time you should see the devices created by devtmpfs, including some from the modules we just loaded:

$ busybox ls -l /dev
crw-------    1 0 0   10, 126 May 21 15:59 rust_miscdev
crw-------    1 0 0   10, 125 May 21 15:59 rust_semaphore
...

Device Registration and file::Operations

Alright! Let's see how the code works, starting with rust_chrdev.rs.

struct RustChrdev {
    _dev: Pin<Box<chrdev::Registration<2>>>,
}

impl kernel::Module for RustChrdev {
    fn init(name: &'static CStr, module: &'static ThisModule) -> Result<Self> {
        pr_info!("Rust character device sample (init)\n");

        let mut chrdev_reg = chrdev::Registration::new_pinned(name, 0, module)?;

        // Register the same kind of device twice, we're just demonstrating
        // that you can use multiple minors. There are two minors in this case
        // because its type is `chrdev::Registration<2>`
        chrdev_reg.as_mut().register::<RustFile>()?;
        chrdev_reg.as_mut().register::<RustFile>()?;

        Ok(RustChrdev { _dev: chrdev_reg })
    }
}

First, notice that the module declares a field holding a pinned chrdev::Registration. We create the instance via new_pinned with the arguments name, module, and the number 0, which indicates the starting minor number. Don't worry about what Pin is for now; we will discuss it in a future post. To register a character device, we call the register method. It accepts any type that implements the file::Operations trait, which we will explain later. In this case, we define RustFile for it. Notice that we can register the same kind of device twice because the type uses a const generic: it is chrdev::Registration<2>. This matches what we said above: a device driver can control multiple minors. Next, let's shift our focus to RustFile.
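To see how a const generic can cap the number of minors, here is a simplified userspace model. It is not the kernel's chrdev::Registration; the struct, its fields, and the error string are assumptions made only to illustrate the idea that the capacity N is part of the type.

```rust
/// Userspace stand-in for a registration that owns `N` minor numbers.
struct Registration<const N: usize> {
    first_minor: u32,
    used: usize,
}

impl<const N: usize> Registration<N> {
    fn new(first_minor: u32) -> Self {
        Self { first_minor, used: 0 }
    }

    /// Claim the next free minor, or fail once all `N` slots are taken.
    fn register(&mut self) -> Result<u32, &'static str> {
        if self.used == N {
            return Err("no free minor left");
        }
        let minor = self.first_minor + self.used as u32;
        self.used += 1;
        Ok(minor)
    }
}

fn main() {
    // Two slots, starting at minor 0, like `Registration<2>` in the sample.
    let mut reg = Registration::<2>::new(0);
    println!("{:?}", reg.register());
    println!("{:?}", reg.register());
    // A third registration does not fit into a `Registration<2>`.
    println!("{:?}", reg.register());
}
```

Because the capacity lives in the type, the compiler knows how much space the registration needs without any heap-allocated list of minors.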

struct RustFile;

impl file::Operations for RustFile {
    kernel::declare_file_operations!();

    fn open(_shared: &(), _file: &file::File) -> Result {
        Ok(())
    }
}

As you can see, it doesn't contain much yet. The file::Operations trait actually has more methods we can implement; we just start with a nearly empty implementation. The whole trait definition looks like this:

pub trait Operations {
    type Data: PointerWrapper + Send + Sync = ();
    type OpenData: Sync = ();

    const TO_USE: ToUse;

    fn open(context: &Self::OpenData, file: &File) -> Result<Self::Data>;

    fn read(
        _data: <Self::Data as PointerWrapper>::Borrowed, 
        _file: &File, 
        _writer: &mut impl IoBufferWriter, 
        _offset: u64
    ) -> Result<usize> { ... }
    fn write(
        _data: <Self::Data as PointerWrapper>::Borrowed, 
        _file: &File, 
        _reader: &mut impl IoBufferReader, 
        _offset: u64
    ) -> Result<usize> { ... }
    //...
}

All methods except open have a default implementation that simply returns the EINVAL (invalid argument) error. So any type that implements the file::Operations trait must define at least open. The trait has two associated types, OpenData and Data. OpenData is used by the open method; it is supplied when the device is registered. Data is returned by open and made available to the other methods. Since we haven't done anything yet, both fall back to their default of (), so we don't need to specify them explicitly.
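The "only open is required" pattern can be reproduced in plain Rust with default trait methods. The trait below is a reduced, hypothetical stand-in for file::Operations, not the kernel's definition. Stable Rust has no defaults for associated types (the kernel crate relies on a nightly feature for that), so this sketch has to spell out Data.

```rust
#[derive(Debug, PartialEq)]
enum Error {
    Einval,
}

/// Reduced stand-in for `file::Operations`: `open` is required,
/// everything else falls back to "invalid argument".
trait Operations {
    type Data;

    fn open() -> Result<Self::Data, Error>;

    fn read(_data: &Self::Data, _buf: &mut Vec<u8>) -> Result<usize, Error> {
        Err(Error::Einval)
    }

    fn write(_data: &Self::Data, _buf: &[u8]) -> Result<usize, Error> {
        Err(Error::Einval)
    }
}

/// A driver that implements nothing beyond the mandatory `open`.
struct Empty;

impl Operations for Empty {
    type Data = ();

    fn open() -> Result<(), Error> {
        Ok(())
    }
}

fn main() {
    let mut buf = Vec::new();
    println!("open  -> {:?}", Empty::open());
    println!("read  -> {:?}", Empty::read(&(), &mut buf));
    println!("write -> {:?}", Empty::write(&(), b"hi"));
}
```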

Last, the macro kernel::declare_file_operations!() helps generate the TO_USE constant. It is a set of bool flags that decides which methods will be installed as callbacks in the end. In our example we do nothing for now, so none of the optional methods is selected and their callbacks will be None (and hence NULL in the file_operations bindings).
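As a rough mental model of what the flags are for, the sketch below turns a set of booleans into a table of optional function pointers. ToUse here has only two fields and Vtable is a made-up stand-in for the C file_operations struct, so treat it as an illustration of the idea rather than the kernel crate's code.

```rust
/// Which optional callbacks the driver says it implements.
struct ToUse {
    read: bool,
    write: bool,
}

type Callback = fn() -> &'static str;

/// Stand-in for the C callback table: a missing callback is `None`,
/// which corresponds to a NULL function pointer on the C side.
struct Vtable {
    read: Option<Callback>,
    write: Option<Callback>,
}

fn do_read() -> &'static str {
    "read"
}

fn do_write() -> &'static str {
    "write"
}

/// Install only the callbacks that were requested.
fn build_vtable(to_use: ToUse) -> Vtable {
    Vtable {
        read: if to_use.read { Some(do_read as Callback) } else { None },
        write: if to_use.write { Some(do_write as Callback) } else { None },
    }
}

fn main() {
    let vtable = build_vtable(ToUse { read: true, write: false });
    println!("read installed:  {}", vtable.read.is_some());
    println!("write installed: {}", vtable.write.is_some());
}
```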

Miscellaneous Device Driver

We discussed that picking a major number by hand to create a device file is not safe. When we called the register method, it passed 0 as the major number under the hood, which lets the kernel allocate one dynamically. The newly registered device gets an entry in /proc/devices, so we can see its major number with the command busybox more /proc/devices. To make the device file, we can either run mknod or call the C function device_create through FFI. But there is another option that creates the device node as soon as we register: miscellaneous character drivers.
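If you want to look up the dynamically allocated major number programmatically instead of reading the file by eye, a small parser is enough. The sketch below works on text in the /proc/devices layout; the sample content and the major number 248 are made up for illustration, and char_major is a hypothetical helper.

```rust
/// Illustrative /proc/devices content; the numbers are not from a real run.
const SAMPLE: &str =
    "Character devices:\n  1 mem\n 10 misc\n248 rust_chrdev\n\nBlock devices:\n  7 loop\n";

/// Find the major number of a character driver by name.
fn char_major(proc_devices: &str, driver: &str) -> Option<u32> {
    proc_devices
        .lines()
        // Jump to the character device section and skip its header line.
        .skip_while(|line| line.trim() != "Character devices:")
        .skip(1)
        // The section ends at the first blank line.
        .take_while(|line| !line.trim().is_empty())
        .find_map(|line| {
            let mut parts = line.split_whitespace();
            let major: u32 = parts.next()?.parse().ok()?;
            let name = parts.next()?;
            (name == driver).then_some(major)
        })
}

fn main() {
    println!("{:?}", char_major(SAMPLE, "rust_chrdev"));
}
```

On a real system you would pass the contents of /proc/devices (for example from std::fs::read_to_string) instead of SAMPLE.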

Miscellaneous character drivers (miscdev for short) are a subset of character drivers that all share the same major number (10, as in the listing above) and use the minor number as their identifier. This is useful when your driver controls only one device, because it avoids reserving a whole major number and its unused minor space. Running busybox more /proc/misc will show the minor numbers of all miscdevs. What's really convenient is that when a miscdev is registered, the kernel creates a device node under /dev right away. This suits our upcoming examples well. So let's turn our chrdev sample into a miscdev and find out how to interact with it! We'll create a module that returns a string when it is opened and read, for example with cat /dev/rust_chrdev.

// Update your imports to these
use core::sync::atomic::{AtomicBool, AtomicU64, Ordering};
use kernel::io_buffer::*;
use kernel::prelude::*;
use kernel::str::CString;
use kernel::sync::{Ref, RefBorrow};
use kernel::{file, miscdev};


struct SharedState {
    already_opened: AtomicBool,
    count: AtomicU64,
}

struct RustChrdev {
    _dev: Pin<Box<miscdev::Registration<RustFile>>>,
}

impl kernel::Module for RustChrdev {
    fn init(name: &'static CStr, _module: &'static ThisModule) -> Result<Self> {
        pr_info!("Rust character device sample (init)\n");

        let state = Ref::try_new(SharedState {
            already_opened: AtomicBool::new(false),
            count: AtomicU64::new(0),
        })?;

        let miscdev_reg = miscdev::Registration::new_pinned(fmt!("{name}"), state)?;

        Ok(RustChrdev { _dev: miscdev_reg })
    }
}

You will notice that a few signatures have changed, but the types are still similar. We now declare the module with miscdev::Registration, which takes a single generic type that implements the file::Operations trait. The parameters of new_pinned are also a little different, because a miscdev can take an argument for OpenData. This time we actually want to pass something to our driver, so we declare a new type SharedState that contains a few atomic types. Notice that we create the instance with the Ref type. This is a reference-counted pointer built on the kernel's refcount_t, and it serves the same purpose as Arc in Rust's standard library.

In a multi-threaded environment, unprotected concurrent access to the same memory can lead to race conditions and hurt performance. In a kernel module, this problem can occur when multiple processes access the shared resources at the same time. Fortunately, OpenData and Data in file::Operations are passed as shared (immutable) references, so you can't modify them unless they are built from types that provide synchronization. We will introduce other concurrency primitives in later posts. For now, let's combine a reference-counted pointer with atomics and let the processor handle the synchronization for us. Now let's update RustFile as well to see how it uses SharedState:
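The same combination can be tried out in user space, where Arc plays the role of Ref. In this sketch several threads update a counter through a shared reference only; the function name and the thread counts are arbitrary choices for the demonstration.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;

struct SharedState {
    count: AtomicU64,
}

/// Spawn `threads` workers that each bump the shared counter `per_thread` times.
fn bump_from_threads(threads: usize, per_thread: u64) -> u64 {
    let state = Arc::new(SharedState { count: AtomicU64::new(0) });

    let handles: Vec<_> = (0..threads)
        .map(|_| {
            // Each thread gets its own reference-counted handle.
            let state = Arc::clone(&state);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    // No `&mut` needed: the atomic allows mutation
                    // through a shared reference.
                    state.count.fetch_add(1, Ordering::Relaxed);
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }
    state.count.load(Ordering::Relaxed)
}

fn main() {
    println!("total = {}", bump_from_threads(4, 1000));
}
```

No increment is lost even though no thread ever holds a mutable reference to SharedState.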

struct RustFile;

impl file::Operations for RustFile {
    type OpenData = Ref<SharedState>;
    type Data = Ref<SharedState>;
    kernel::declare_file_operations!(read, write);

    /// Called when a process tries to open the device file, such as `cat /dev/rust_chrdev`.
    fn open(shared: &Ref<SharedState>, _: &file::File) -> Result<Ref<SharedState>> {
        if shared
            .already_opened
            .compare_exchange(false, true, Ordering::Acquire, Ordering::Relaxed)
            .is_ok()
        {
            shared.count.fetch_add(1, Ordering::AcqRel);
            Ok(shared.clone())
        } else {
            Err(EBUSY)
        }
    }

    /// Called when a process closes the device file.
    fn release(shared: Ref<SharedState>, _: &file::File) {
        shared.already_opened.store(false, Ordering::Release);
    }

    /// Called when a process, which already opened the dev file, attempts to read from it.
    fn read(
        shared: RefBorrow<'_, SharedState>,
        _: &file::File,
        data: &mut impl IoBufferWriter,
        offset: u64,
    ) -> Result<usize> {
        // Succeed if the caller doesn't provide a buffer or if not at the start.
        if data.is_empty() || offset != 0 {
            return Ok(0);
        }

        let msg = CString::try_from_fmt(fmt!(
            "This is my {} times saying Hello World!\n",
            shared.count.load(Ordering::Acquire)
        ))?;
        
        // Put the message to the buffer
        data.write_slice(msg.as_bytes())?;
        // Most read functions return the number of bytes put into the buffer.
        Ok(msg.len())
    }

    /// Called when a process writes to the device file, such as `echo "hi" > /dev/rust_chrdev`.
    fn write(
        _: RefBorrow<'_, SharedState>,
        _: &file::File,
        _: &mut impl IoBufferReader,
        _: u64,
    ) -> Result<usize> {
        pr_alert!("Sorry, this operation is not supported.\n");
        Err(EINVAL)
    }
}

We need to declare the Data and OpenData types since we now want to share some state. They can be different types, but they don't need to be in this case. In this example, open uses the already_opened field, an AtomicBool, to determine whether the file is currently opened by someone else, and increments the counter only when the open succeeds.
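The open/release handshake can be exercised outside the kernel as well. The following userspace sketch keeps the same atomic operations and orderings as the sample; the Busy error type and the methods on SharedState are stand-ins invented for this example (the sample returns EBUSY and uses free functions in the trait).

```rust
use std::sync::atomic::{AtomicBool, AtomicU64, Ordering};

/// Stand-in for the kernel's EBUSY error.
#[derive(Debug, PartialEq)]
struct Busy;

struct SharedState {
    already_opened: AtomicBool,
    count: AtomicU64,
}

impl SharedState {
    fn new() -> Self {
        Self {
            already_opened: AtomicBool::new(false),
            count: AtomicU64::new(0),
        }
    }

    /// Succeeds only if nobody else holds the device; returns the open count.
    fn open(&self) -> Result<u64, Busy> {
        // Atomically flip false -> true. If the flag was already true,
        // someone else has the device open and we report "busy".
        self.already_opened
            .compare_exchange(false, true, Ordering::Acquire, Ordering::Relaxed)
            .map_err(|_| Busy)?;
        // `fetch_add` returns the previous value, so add one for the new count.
        Ok(self.count.fetch_add(1, Ordering::AcqRel) + 1)
    }

    /// Gives the device back so the next `open` can succeed.
    fn release(&self) {
        self.already_opened.store(false, Ordering::Release);
    }
}

fn main() {
    let state = SharedState::new();
    println!("{:?}", state.open()); // first opener gets in
    println!("{:?}", state.open()); // second opener is rejected
    state.release();
    println!("{:?}", state.open()); // works again after release
}
```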

This time we also have more operation methods, so we need to add them to the declare_file_operations! macro. open and release don't need to be listed, so the only ones we have to name in the macro are read and write. In the read method, you can see how to create a message and put it into the buffer. While we don't do anything useful in the write method, we still add it to demonstrate how to catch such attempts and tell the user that the operation is not supported.
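The offset check in read is what lets a tool like cat terminate: the first call returns the whole message, and the next call, now at a non-zero offset, returns 0 to signal end of file. The userspace sketch below models that loop. It assumes the file position advances by the number of bytes returned, and it uses a Vec<u8> and a capacity argument in place of the kernel's IoBufferWriter.

```rust
/// Mirrors the sample's `read`: hand out the whole message at offset 0,
/// then report end of file.
fn read(count: u64, buf: &mut Vec<u8>, capacity: usize, offset: u64) -> usize {
    // Succeed with 0 bytes if there is no room or we are past the start.
    if capacity == 0 || offset != 0 {
        return 0;
    }
    let msg = format!("This is my {count} times saying Hello World!\n");
    buf.extend_from_slice(msg.as_bytes());
    msg.len()
}

/// What a `cat`-like reader does: keep reading until a read returns 0.
/// Returns the collected text and the number of `read` calls made.
fn read_to_end(count: u64) -> (String, usize) {
    let mut out = Vec::new();
    let mut offset = 0u64;
    let mut calls = 0;
    loop {
        calls += 1;
        let n = read(count, &mut out, 4096, offset);
        if n == 0 {
            break;
        }
        // Assumption: the position moves forward by the bytes returned.
        offset += n as u64;
    }
    (String::from_utf8(out).unwrap(), calls)
}

fn main() {
    let (text, calls) = read_to_end(3);
    print!("{text}");
    println!("read was called {calls} times");
}
```

Without the offset check, every call would return the message again and the reader would never see end of file.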

Build and run the kernel again. You should find a device file rust_chrdev under /dev, and busybox cat /dev/rust_chrdev will tell you how many times "Hello World" has been said. Running busybox more /proc/misc will show that it is now a miscdev, along with its minor number.

Device Unregistration

Last, we need to talk about unregistration. rmmod must not be allowed to remove a kernel module that is in use. When you run cat /proc/modules or lsmod, the third field is the count of how many users the module currently has. It is increased when open is called and decreased on release. The check is performed by the sys_delete_module system call.
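To check that counter from a program, you can pick the third field out of a /proc/modules line. The sketch below parses text in that layout; the sample lines (sizes, counts, addresses) are invented for illustration, and use_count is a hypothetical helper.

```rust
/// Illustrative /proc/modules content; the values are not from a real run.
const SAMPLE: &str = "rust_chrdev 16384 1 - Live 0xffffffffc0000000\n\
                      rust_miscdev 16384 0 - Live 0xffffffffc0005000\n";

/// Return the use count (third field) of the named module.
fn use_count(proc_modules: &str, module: &str) -> Option<u32> {
    proc_modules.lines().find_map(|line| {
        let mut fields = line.split_whitespace();
        // Field 1: module name. Skip lines for other modules.
        if fields.next()? != module {
            return None;
        }
        // Field 2: module size, not needed here.
        let _size = fields.next()?;
        // Field 3: how many users currently hold the module.
        fields.next()?.parse().ok()
    })
}

fn main() {
    println!("{:?}", use_count(SAMPLE, "rust_chrdev"));
}
```

A module whose count is non-zero, like rust_chrdev in the sample, is the case in which rmmod is refused.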

Conclusion

Congratulations! Your kernel module is starting to do something. While it's not very meaningful yet, it will get more and more powerful as we add more types and functions to our tool belt. Next up are concurrency primitives like the Mutex lock. But before we explore them, we will have to figure out why some types must be pinned.

Reference

The Linux Kernel Module Programming Guide: Character Device drivers