enquiries@dvtsoftware.com | +44 020 3422 3400 | +31 208 905590

How to write your first Linux Kernel Module
Ruan de Bruyn


The Linux Kernel is perhaps the most ubiquitous (and arguably still underappreciated) piece of software around today. It forms the basis of all Linux distributions (obviously), but that’s not all. It’s also running on embedded hardware pretty much everywhere. Got a microwave? It’s probably running the Linux Kernel. Dishwasher? That too. Got enough money for a Tesla vehicle? Maybe you can fix a few bugs you find, and submit a patch to their Model S and Model X code on GitHub. Circuitry that keeps the International Space Station from crashing into the Earth in a fiery mass of death and destruction? Of course. The kernel is lightweight. Just means it plays nicely with low gravity.


The Linux kernel goes through a development cycle that is, quite frankly, insane. Statistics from the 5.10 release show that 252 new authors made their first commits to the repo (the lowest number of new contributors since 5.6), and new releases come out roughly every nine weeks. All in all, the kernel forms the solid bedrock of a large part of the computing world, but it’s not archaic by any means. All well and good, but what if you want to poke around inside it, and maybe write some code yourself? It can be a little daunting, as it’s an area of programming that most schools and boot camps don’t touch on. Plus, unlike with every flavour-of-the-month JavaScript framework that comes crawling out of the woodwork whenever you blink your eyes, you can’t go onto StackOverflow and find an odd billion or so posts to guide you through any issues.


So here we are then. Are you interested in writing a hello world project for the most persistent open source project out there? Partial, perhaps, to take a small dose of Operating Systems theory? Amenable to coding in a language that was created in the ’70s, and gives you a profound sense of accomplishment when you do literally anything at all and it works? Great, because I honestly can’t think of a better way to spend your time otherwise.


Heads up: in this article, I assume that you have a working knowledge of how to set up a Virtual Machine with Ubuntu. There are already tons of resources out there on how to do this, so fire up your favourite VM manager and get it done. I also assume that you’re a little familiar with C, as that is the language the kernel is written in. Since this is just a hello world module, we won’t be doing very complex coding at all, but I won’t be introducing any concepts from the language. At any rate, the code should be basic enough to be self-explanatory. With all that said, let’s get to it.


Writing the base module

Firstly, let’s just define what a kernel module is. A typical module is also called a driver and is kind of like an API, but between hardware and software. See, in most operating systems, you have two spaces where things happen. Kernel space, and userspace. Linux certainly works this way, and Windows does too. Userspace is where user-related stuff goes on, like you listening to a song on Spotify. Kernel space is where all of the low level, inner workings of the OS are. If you’re listening to a song on Spotify, a connection must have been created to their servers, and something on your computer is listening for network packets, retrieving the data inside of them, and eventually passing this on to your speakers or headphones so you can hear the sound. This is what happens in the kernel space. One of the drivers at work here is the software that allows the packets coming through your network port to be translated to music. The driver itself would have an API-like interface that allows user-space applications (or maybe even other kernel-space applications) to call its functions and retrieve those packets.


Luckily, our module won’t be anything like this, so don’t be daunted. It won’t even interact with any hardware. Many modules are entirely software-based. A good example of this is the process scheduler in the kernel, which dictates which cores of your CPU are working on which running process at any given time. A module that purely works with software is also the best place to start getting your hands dirty. Start up your VM, open up the terminal with Ctrl+Alt+T and do the ol’



sudo apt update && sudo apt upgrade

to make sure your software is up to date. Next, let’s get the new software packages we’ll need for this endeavour. Run



sudo apt install gcc make build-essential libncurses-dev exuberant-ctags

With that, we can finally start coding. We’ll start it off easy, and just put the following code in a source file. I put mine in Documents and named it dvt-driver.c



#include <linux/init.h>
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/proc_fs.h>
#include <linux/uaccess.h>

// Module metadata
MODULE_AUTHOR("Ruan de Bruyn");
MODULE_DESCRIPTION("Hello world driver");
MODULE_LICENSE("GPL");

// Custom init and exit methods
static int __init custom_init(void) {
    printk(KERN_INFO "Hello world driver loaded.\n");
    return 0;
}

static void __exit custom_exit(void) {
    printk(KERN_INFO "Goodbye my friend, I shall miss you dearly...\n");
}

module_init(custom_init);
module_exit(custom_exit);

Note that we don’t need all the includes right this second, but we’ll use all of them soon enough. Next, we need to compile it. Create a new file called Makefile alongside the source code, and put the following contents in it:



obj-m += dvt-driver.o

all:
    make -C /lib/modules/$(shell uname -r)/build M=$(PWD) modules
clean:
    make -C /lib/modules/$(shell uname -r)/build M=$(PWD) clean

Open the terminal in the directory of your two files, and run make. (One gotcha: the indented command lines in a Makefile must start with a tab character, not spaces, or make will complain about a missing separator.) At this point, you should see some console output of your module compiling, and this whole process should spit out a file named dvt-driver.ko. This is your fully functional, compiled kernel module. Let’s load this ground-breaking piece of intellectual property into the kernel, shall we? It’s not doing us any good sitting here by itself. In the same directory as your code, run



sudo insmod dvt-driver.ko

and your driver should be inserted into the kernel. You can verify this by running lsmod, which lists all of the modules currently in the kernel. Among them, you should see dvt_driver. Note that the kernel replaces dashes in your module’s filename with underscores when it loads it. If you want to remove it, you can run



sudo rmmod dvt_driver

In the source code, we also do some logging to let it be known our driver loaded okay, so run dmesg from the terminal. This command is a shortcut for printing the kernel’s logs to the screen, and prettifying it a bit so it’s more readable. The most recent lines of output from dmesg should be the messages from the driver, saying the hello world driver has been loaded, and so on. Note that there is sometimes a lag in seeing init and exit function messages from drivers, but if you insert and remove the module twice, you should see all these messages being logged. If you want to see these messages get logged live-action, you can open up a second terminal, and do dmesg --follow. Then, as you insert and remove your driver from the other terminal, you’ll see the messages popping up.



My dmesg --follow output after inserting and removing the module twice


So let’s examine what we have so far. In the source code, we start with some module metadata. You can get away with not specifying the author and so on, but you might as well put your name in there. The compiler would also give you a stern warning if you don’t include a license code, and my pathological desire for approval or acceptance in my life from virtually anything capable of providing it dictates that I need to specify said license code. If you are not marred by such psychological afflictions, it’s probably also good to note that the kernel maintainers are quite wary of taking in code that is not open source, and they pay good attention to details like licenses. In the past, big companies have been denied the right to put proprietary kernel modules in the source code. Don’t be like those guys. Be good. Be open-source. Use open-source licenses.


Next, we make custom init and exit functions. Whenever a module is loaded into the kernel, its init function is run, and conversely, the exit function is run when it’s removed. Our functions aren’t doing much, just logging text to the kernel logs. The printk() function is the kernel’s version of the classic print function from C. Obviously, the kernel does not have some terminal or screen available to print random things to, so printk() prints to the kernel logs. You have the KERN_INFO macro for logging general stuff. You can also use macros like KERN_ERR in case an error occurs, which will alter the output formatting in dmesg. At any rate, the two functions for init and exit are registered in the last two lines of the source code. You have to do this; your driver has no other way of knowing which functions to run. You can also name them whatever you want, so long as their signature (arguments and return type) is the same as the ones I used.


Lastly, there’s the Makefile. Many open-source projects use the GNU Make utility for compiling libraries. This is typically used for libraries coded in C/C++ and is just a way of automating compiling your code. The Makefile listed here is the standard way of compiling your module. The first line appends your to-be-compiled .o file to the obj-m variable. The kernel is also compiled this way and appends a lot of .o files to this variable before compiling. In the next line, we employ some sleight of hand. See, the rules and commands for building kernel modules are already defined in the Makefile that ships with the kernel. We don’t have to write our own, we can use the kernel’s rules instead…which is exactly what we’re doing. In the -C argument, we point to the root directory of our kernel sources. Then we tell it to target our project’s working directory and compile the modules. Voilà. GNU Make is a deceptively powerful compiling tool that can be used for automating the compilation of any kind of project, not just C/C++ projects. If you want to read up on it, you can look at this book, absolutely for free (as in beer, and as in speech).


The /proc entry

Let’s get to the meat of this post. Logging messages in the kernel is all good and well, but this is not the stuff that great modules are made of. Earlier in the article, I mentioned that kernel modules typically act as APIs for user space programs. Right now, our driver doesn’t do anything like that. Linux has a very neat way of handling this interaction; it works with an “everything is a file” abstraction.


To demonstrate, open up another terminal, and do cd /proc. Running ls, you should see a bunch of files listed. Now, run cat modules, and you’ll see some text printed to the screen. Does that look familiar? It should; all of the modules presented in the lsmod command you ran earlier are present here as well. Let’s try cat meminfo. Now we have info from the memory usage of the virtual machine. Cool. One last command to try: do ls -sh. This lists the size of each file alongside its name, and…wait, what? What is this madness?



See attached: madness


Their sizes are all 0 bytes. Nothing. And even though not a single bit is expended for these files, we just read their contents…? Well, that’s right, actually. See, /proc is the process directory, and is sort of a central place for userspace applications to get information from (and sometimes control) kernel modules. Ubuntu’s version of Task Manager is System Monitor, which you can run by tapping the OS key on your keyboard, and typing “system”, at which point a shortcut to System Monitor should be visible. System Monitor shows stats like which processes are running, CPU usage, memory usage, etc. And it gets all this information by reading the special files in /proc, like meminfo.
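You can watch this zero-size trick from userspace yourself. Here is a quick Python sketch (assuming a standard Linux /proc mount, and using /proc/meminfo as the example file):

```python
import os

# procfs entries report a size of zero bytes...
size = os.stat("/proc/meminfo").st_size
print("reported size:", size)

# ...yet reading one still yields content, generated on the fly by the kernel
with open("/proc/meminfo") as f:
    first_line = f.readline()
print("first line:", first_line.strip())
```

The contents don’t live on disk at all; the kernel generates them at the moment you read, which is why stat has no size to report.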


Let’s add the functionality to our driver so we can have our own entry in /proc. We will make it so that when a userspace application reads from it, it will greet us with a hello world message. Replace all the code under our module metadata with the following:



static struct proc_dir_entry* proc_entry;

static ssize_t custom_read(struct file* file, char __user* user_buffer, size_t count, loff_t* offset)
{
    char greeting[] = "Hello world!\n";
    int greeting_length = strlen(greeting);

    printk(KERN_INFO "calling our very own custom read method.\n");

    // a nonzero offset means the greeting has already been sent
    if (*offset > 0)
        return 0;

    // copy_to_user returns the number of bytes it could NOT copy
    if (copy_to_user(user_buffer, greeting, greeting_length))
        return -EFAULT;

    *offset = greeting_length;
    return greeting_length;
}

// Note: on kernels 5.6 and newer, /proc handlers are registered through
// struct proc_ops (with a .proc_read field) instead of file_operations
static struct file_operations fops =
{
    .owner = THIS_MODULE,
    .read = custom_read
};

// Custom init and exit methods
static int __init custom_init(void) {
    proc_entry = proc_create("helloworlddriver", 0666, NULL, &fops);
    printk(KERN_INFO "Hello world driver loaded.\n");
    return 0;
}

static void __exit custom_exit(void) {
    proc_remove(proc_entry);
    printk(KERN_INFO "Goodbye my friend, I shall miss you dearly...\n");
}

module_init(custom_init);
module_exit(custom_exit);

Now, remove the driver from the kernel, recompile, and insert the new .ko module into the kernel. Run cat /proc/helloworlddriver, and you should see our driver returning the hello world greeting to the terminal. Very neat, if you ask me. But alas, the cat command is maybe too easy to really drive the point home of what we’re doing here, so let’s write our own user space application to interact with this driver. Put the following Python code in a script in any directory (I called mine hello.py):



kernel_module = open('/proc/helloworlddriver')

greeting = kernel_module.readline()
print(greeting)

kernel_module.close()

This code should be self-explanatory, and as you can see, this is exactly how you would do file I/O in any programming language. The /proc/helloworlddriver file is our API to the kernel module we just made. If you run python3 hello.py, you should see it printing our greeting to the terminal. Cool stuff.



Result of running the Python script after removing, recompiling, and inserting the module into the kernel.


In our code, we made a custom read function. As you might guess, you can override the write function as well, if your module requires some userspace input. For instance, if you had a driver that controls the speed of the fans in your PC, you could give it a write function where you write a percentage number between 0 and 100 to the file, and your driver manually adjusts the fan speed accordingly. If you’d like to know how this function overriding actually works, read the next section. If you’re not interested, just skip on ahead to the end of the article.
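To make that fan-speed idea concrete, here is a hedged userspace sketch (plain Python, no real fan involved) of the parsing such a write handler might do: userspace writes a percentage as text, the handler validates it and reports how many bytes it consumed, just as a kernel write function returns the number of bytes written.

```python
def handle_write(data: bytes) -> int:
    """Validate a percentage written by userspace, as a hypothetical
    fan-speed write handler might, and return the bytes consumed."""
    percent = int(data.decode().strip())   # e.g. b"75\n" becomes 75
    if not 0 <= percent <= 100:
        raise ValueError("fan speed must be between 0 and 100")
    # a real driver would program the fan controller hardware here
    return len(data)                       # report everything as consumed

consumed = handle_write(b"75\n")           # 3 bytes consumed
```

From the user’s side this would just be `echo 75 > /proc/yourfandriver`; the handler sees the raw bytes and decides what they mean.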


Bonus Section — How Does This Even Work?

In this section, I figured some of you might be curious as to how overriding read/write functions for a /proc entry actually works. To know this, we need to delve into some OS theory, and we’ll use Assembly as an analogy.


In Assembly, your program has a “stack” for keeping track of variables you make during execution. It is a little different from a canonical Computer Science stack, though: you can push and pop to it, but you can also access arbitrary elements in the stack, not just the element on top, and change or read them. Alright, so let’s say you define a function with two arguments in your Assembly code. You don’t just pass these variables when calling the function, no sir. Passing variables to functions in brackets is for amateurs copying code for a Python chatbot from an online tutorial. Assembly programmers are kind of a big deal folks, putting Apollo 11 on the moon with their craft. We’re talking no pain, no gain. Before you call your two-argument function, you have to push your arguments onto the stack. Then you call your function, which usually reads the arguments from the top of the stack backwards, and uses them however it needs to. Plenty of pain to be had here, actually, since it’s all too easy to push your arguments onto the stack in the wrong order, and then your function reads the arguments as gibberish.


I mention this since your OS has very similar ways of executing code as well. It has its own stack, keeping track of variables, and when the kernel calls an OS function, that function looks for arguments on the top of the stack and then executes. If you want to read a file from disk, you call the read function with a few arguments, these arguments get put on the kernel’s stack, and the read function is then called to read the file (or parts of it) from disk. The kernel keeps track of all its functions in a huge table, where entries in the table list the function name and the address in memory where the function is stored. This is where our own custom functions come in. See, even though our module interactions happen via files, there’s no hard and fast rule that says when we read from that file, that the actual read function is called. The read function is just an address in memory looked up in a table. We can override what function in memory we’re calling when a userspace program reads our module’s /proc entry, and that’s precisely what we’re doing! In the file_operations struct, we assign the .read attribute to our custom_read function and then register the /proc entry with it. When we call the read function from our Python user space application, it might look like you’re reading a file from disk, and you’re passing all the right arguments on the kernel’s stack, but at the last moment, our custom_read function is being called instead, via its own address in memory that we made the kernel aware of. This works, because our custom_read function takes in the exact same arguments as reading a file from disk, so the correct arguments are being read from the kernel’s stack in the correct order.
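The table-of-function-addresses trick is easy to mimic in userspace. In this small Python sketch (an analogy, not kernel code), callers always go through a lookup table, so swapping the entry swaps the behaviour without the caller ever knowing, which is exactly what registering custom_read does:

```python
# a tiny stand-in for the kernel's table of operation name -> function
def default_read(offset):
    return b""                      # an empty "file"

operations = {"read": default_read}

def custom_read(offset):
    # our replacement handler: greet once, then signal end-of-file
    return b"Hello world!\n" if offset == 0 else b""

# "loading the module" swaps the function the table points to
operations["read"] = custom_read

# callers still just look up "read"; they never notice the change
print(operations["read"](0))
```

As long as the replacement takes the same arguments and returns the same kind of value, nothing upstream can tell the difference.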


The thing we have to keep in mind here is that userspace applications will treat our /proc entry as if it’s a file on disk, and will read and write to it as such. The onus falls on us to make sure that this interaction holds. Our module has to behave just like a regular file on disk, even though it’s not. When most programming languages read a file, they usually do so in chunks. Let’s say the chunks are 1024 bytes at a time. You would read the first 1024 bytes from a file into a buffer, which would contain bytes 0–1023 after it’s done. The read operations return 1024, to tell you that 1024 bytes were read successfully. Then the next 1024 bytes are read, and the buffer contains bytes 1024–2047. Eventually, we’ll reach the end of our file. Maybe the last chunk will ask for 1024 bytes, but there are only 800 left. So the read function returns 800 and puts those last 800 bytes in the buffer. Finally, the read function will ask for yet another chunk, but our file’s contents have been read fully. Then the read function will return 0. When this happens, your programming language knows that it’s reached the end of the file, and will stop trying to read from it.
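That chunked loop can be seen in a few lines of Python. Using an in-memory 2848-byte pretend file and a 1024-byte chunk size, the reads come back as 1024, 1024, 800, and finally 0, at which point the reader stops:

```python
import io

# a pretend file of 2848 bytes: two full 1024-byte chunks, then 800 left over
data = io.BytesIO(b"x" * 2848)

sizes = []
while True:
    chunk = data.read(1024)     # ask for up to 1024 bytes
    sizes.append(len(chunk))
    if not chunk:               # a zero-length read means end of file
        break

print(sizes)  # [1024, 1024, 800, 0]
```

Our /proc entry has to play by these same rules, which is exactly what the offset logic in custom_read is for.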


Looking at the arguments of our own custom_read function, you can likely see the arguments that make this happen. The file struct represents the file that our userspace application is reading from (though this specific struct is actually a kernel-only thing, but that’s not important for this article). Our last arguments are the buffer, count, and offset. The buffer is our user-space buffer, and basically contains the memory address of the array that we’re writing bytes into. The count is our chunk size. The offset is the point in the file that we’re reading a chunk from, as you’ve probably surmised. Let’s look at what we expect to happen when we’re reading from our module. We’re only returning “Hello world!” to the userspace. Including the newline at the end of the string, this is 13 characters, which will comfortably fit into pretty much any chunk size. When we’re trying to read from our /proc entry, it will go like so: we read our first chunk, write the greeting into the buffer, and return 13 (the length of our greeting string) to the user space application, since 13 bytes were read. Then the second chunk will read from an offset of 13, which is the “end” of our file (we have nothing left to send back, after all), so we return 0. The logic in our custom_read function reflects this. If the offset passed to it is greater than 0, it means we’ve already given our greeting, so we just return 0 and call it a day. Otherwise, we copy our greeting string to the user-space buffer and update our offset accordingly.
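The offset dance above can be simulated in plain Python (a sketch of the logic only, not the real kernel interface): the first call hands over the 13 greeting bytes and bumps the offset, the second call sees a nonzero offset and returns 0.

```python
GREETING = b"Hello world!\n"        # 13 bytes, newline included

def simulated_read(buffer, offset):
    """Mimic custom_read: fill the buffer, advance the offset,
    and return the number of bytes read (0 means end of file)."""
    if offset[0] > 0:               # greeting already delivered
        return 0
    buffer.extend(GREETING)
    offset[0] = len(GREETING)
    return len(GREETING)

buf, off = bytearray(), [0]
first = simulated_read(buf, off)    # 13
second = simulated_read(buf, off)   # 0
print(first, second)
```

The offset is a one-element list here only so the function can update it in place, playing the role of the loff_t pointer the kernel passes in.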


Other types of functions, like overriding a write function, should follow the same principles. Your function can do anything, just so long as it acts like a file to any userspace applications doing read/write operations on it.


Conclusion

Thank you for reading this post, and I hope you found it interesting enough to start poking around the kernel on your own. Though we used a VM in this article, knowing how to write kernel modules is a must if you’re ever writing code for embedded systems (like IoT devices). If this is the case, or you want to learn more about kernel development, check out the KernelNewbies website, in particular this tutorial. There are many books available as well, but look for a fairly recent publishing date before buying one. At any rate, you probably just wrote your first Linux kernel module ever, so be proud, and happy coding!