
Device Drivers, Part 1: Linux Device Drivers for Your Girl Friend

This series on Linux device drivers aims to present the usually technical topic in a way that is more interesting to a wider cross-section of readers.

Of drivers and buses
A driver drives, manages, controls, directs and monitors the entity under its command. What a bus driver does with a bus, a device driver does with a computer device (any piece of hardware connected to a computer) like a mouse, keyboard, monitor, hard disk, Web-camera, clock, and more. Further, a “pilot” could be a person or even an automatic system monitored by a person (an auto-pilot system in airliners, for example). Similarly, a specific piece of hardware could be controlled by a piece of software (a device driver), or could be controlled by another hardware device, which in turn could be managed by a software device driver. In the latter case, such a controlling device is commonly called a device controller. This, being a device itself, often also needs a driver, which is commonly referred to as a bus driver. General examples of device controllers include hard disk controllers, display controllers, and audio controllers that in turn manage devices connected to them. More technical examples would be an IDE controller, PCI controller, USB controller, SPI controller, I2C controller, etc. Pictorially, this whole concept can be depicted as in Figure 1.

Figure 1: Device and driver interaction

Device controllers are typically connected to the CPU through their respectively named buses (collections of physical lines), for example, the PCI bus, the IDE bus, etc. In today’s embedded world, we encounter more micro-controllers than CPUs; these are the CPU plus various device controllers built onto a single chip. This effective embedding of device controllers primarily reduces cost and space, making it suitable for embedded systems. In such cases, the buses are integrated into the chip itself. Does this change anything for the drivers, or more generically, on the software front? The answer is, not much, except that the bus drivers corresponding to the embedded device controllers are now developed under the architecture-specific umbrella.

Drivers have two parts
Bus drivers provide hardware-specific interfaces for the corresponding hardware protocols, and are the bottom-most horizontal software layers of an operating system (OS). Over these sit the actual device drivers. These operate on the underlying devices using the horizontal layer interfaces, and hence are device-specific. However, the whole idea of writing these drivers is to provide an abstraction to the user, and so, at the other “end”, these do provide an interface (which varies from OS to OS). In short, a device driver has two parts, which are: a) device-specific, and b) OS-specific. Refer to Figure 2.

Figure 2: Linux device driver partition

The device-specific portion of a device driver remains the same across all operating systems, and is more about understanding and decoding the device data sheets than software programming. A data sheet for a device is a document with technical details of the device, including its operation, performance, programming, etc. — in short a device user manual. Later, I shall show some examples of decoding data sheets as well. However, the OS-specific portion is the one that is tightly coupled with the OS mechanisms of user interfaces, and thus differentiates a Linux device driver from a Windows device driver and from a MacOS device driver.


In Linux, a device driver provides a “system call” interface to the user; this is the boundary line between the so-called kernel space and user-space of Linux, as shown in Figure 2. Figure 3 provides further classification.

Figure 3: Linux kernel overview

Based on the OS-specific interface of a driver, in Linux, a driver is broadly classified into three verticals:

Packet-oriented or the network vertical
Block-oriented or the storage vertical
Byte-oriented or the character vertical

The CPU vertical and memory vertical, taken together with the other three verticals, give the complete overview of the Linux kernel, like any textbook definition of an OS: “An OS performs 5 management functions: CPU/process, memory, network, storage, device I/O.” Though these two verticals could be classified as device drivers, where CPU and memory are the respective devices, they are treated differently, for many reasons.

These are the core functionalities of any OS, be it a micro-kernel or a monolithic kernel. More often than not, adding code in these areas is mainly a Linux porting effort, which is typically done for a new CPU or architecture. Moreover, the code in these two verticals cannot be loaded or unloaded on the fly, unlike the other three verticals. Henceforth, when we talk about Linux device drivers, we mean to talk only about the latter three verticals in Figure 3.

Let’s get a little deeper into these three verticals. The network vertical consists of two parts: a) the network protocol stack, and b) the network interface card (NIC) device drivers, or simply network device drivers, which could be for Ethernet, Wi-Fi, or any other network horizontals. Storage, again, consists of two parts: a) file-system drivers, to decode the various formats on different partitions, and b) block device drivers for various storage (hardware) protocols, i.e., horizontals like IDE, SCSI, MTD, etc.

With this, you may wonder if that is the only set of devices for which you need drivers (or for which Linux has drivers). Hold on a moment; you certainly need drivers for the whole lot of devices that interface with the system, and Linux does have drivers for them. However, their byte-oriented accessibility puts all of them under the character vertical; this is, in reality, the majority bucket. In fact, because of the vast number of drivers in this vertical, character drivers have been further sub-classified, so you have tty drivers, input drivers, console drivers, frame-buffer drivers, sound drivers, etc. The typical horizontals here would be RS232, PS/2, VGA, I2C, I2S, SPI, etc.

Multiple-vertical drivers
One final note on the complete picture (placement of all the drivers in the Linux driver ecosystem): the horizontals like USB, PCI, etc., span below multiple verticals. Why is that? Simple: you already know that you can have a USB Wi-Fi dongle, a USB pen drive, and a USB-to-serial converter. All are USB, but come under three different verticals!

In Linux, bus drivers or the horizontals are often split into two parts, or even two drivers: a) device controller-specific, and b) an abstraction layer over that for the verticals to interface, commonly called cores. A classic example would be the USB controller drivers ohci, ehci, etc., and the USB abstraction, usbcore.

Summing up
So, to conclude, a device driver is a piece of software that drives a device, though there are so many classifications. In case it drives only another piece of software, we call it just a driver. Examples are file-system drivers, usbcore, etc. Hence, all device drivers are drivers, but all drivers are not device drivers.

Device Drivers, Part 2: Writing Your First Linux Driver in the Classroom

This article, which is part of the series on Linux device drivers, deals with the concept of dynamically loading drivers, first writing a Linux driver, before building and then loading it.

Shweta and Pugs reached their classroom late, to find their professor already in the middle of a lecture. Shweta sheepishly asked for his permission to enter. An annoyed Professor Gopi responded, “Come on! You guys are late again; what is your excuse, today?”

Pugs hurriedly replied that they had been discussing the very topic for that day’s class: device drivers in Linux. Pugs was more than happy when the professor said, “Good! Then explain about dynamic loading in Linux. If you get it right, the two of you are excused!” Pugs knew that one way to make his professor happy was to criticise Windows.

He explained, “As we know, a typical driver installation on Windows needs a reboot for it to get activated. That is really not acceptable; suppose we need to do it on a server? That’s where Linux wins. In Linux, we can load or unload a driver on the fly, and it is active for use instantly after loading. Also, it is instantly disabled when unloaded. This is called dynamic loading and unloading of drivers in Linux.”

This impressed the professor. “Okay! Take your seats, but make sure you are not late again.” The professor continued to the class, “Now you already know what is meant by dynamic loading and unloading of drivers, so I’ll show you how to do it, before we move on to write our first Linux driver.”

Dynamically loading drivers
These dynamically loadable drivers are more commonly called modules and built into individual files with a .ko (kernel object) extension. Every Linux system has a standard place under the root of the file system (/) for all the pre-built modules. They are organised similar to the kernel source tree structure, under /lib/modules/<kernel_version>/kernel, where <kernel_version> would be the output of the command uname -r on the system, as shown in Figure 1.
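The location of the pre-built modules described above can also be worked out from a program. The following is only a user-space sketch, assuming a Linux system with the standard uname() call; the function names prebuilt_modules_dir and has_ko_suffix are made up for this illustration and are not part of any kernel or library API.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/utsname.h>

/*
 * Build "/lib/modules/<kernel_version>/kernel" for the running kernel.
 * <kernel_version> is the same string that `uname -r` prints.
 * Returns 0 on success, -1 if uname() fails or the buffer is too small.
 */
int prebuilt_modules_dir(char *buf, size_t len)
{
    struct utsname u;
    int n;

    assert(buf != NULL && len > 0);
    if (uname(&u) != 0)
        return -1;
    n = snprintf(buf, len, "/lib/modules/%s/kernel", u.release);
    return (n >= 0 && (size_t)n < len) ? 0 : -1;
}

/* Returns 1 if the file name ends in ".ko" (kernel object), else 0. */
int has_ko_suffix(const char *name)
{
    size_t len;

    assert(name != NULL);
    len = strlen(name);
    return len > 3 && strcmp(name + len - 3, ".ko") == 0;
}
```

Calling prebuilt_modules_dir() with a buffer of a few hundred bytes should give the same directory that Figure 1 lists for the machine it runs on.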

Figure 1: Linux pre-built modules

To dynamically load or unload a driver, use these commands, which reside in the /sbin directory, and must be executed with root privileges:

lsmod: lists currently loaded modules
insmod <module_file>: inserts/loads the specified module file
modprobe <module>: inserts/loads the module, along with any dependencies
rmmod <module>: removes/unloads the module

Let’s look at the FAT filesystem-related drivers as an example. Figure 2 demonstrates this complete process of experimentation. The module files would be fat.ko, vfat.ko, etc., in the fat (vfat for older kernels) directory under /lib/modules/`uname -r`/kernel/fs. If they are in compressed .gz format, you need to uncompress them with gunzip, before you can insmod them.

Figure 2: Linux module operations

The vfat module depends on the fat module, so fat.ko needs to be loaded first. To automatically perform decompression and dependency loading, use modprobe instead. Note that you shouldn’t specify the .ko extension to the module’s name, when using the modprobe command. rmmod is used to unload the modules.
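The lsmod listing is taken from /proc/modules, where each line carries the module name, its size, its use count, and the modules that use it. The dependency between vfat and fat shows up there as vfat appearing in the "used by" field of fat. Below is a rough user-space sketch of reading one such line; the names module_info, parse_module_line and is_used_by are hypothetical, and this is not how lsmod itself is implemented.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* The first four fields of one /proc/modules line. */
struct module_info {
    char name[64];
    unsigned long size;   /* memory size in bytes */
    int use_count;        /* number of current users */
    char used_by[128];    /* comma-separated users, "-" if none */
};

/* Returns 0 on success, -1 if the line does not have the expected layout. */
int parse_module_line(const char *line, struct module_info *out)
{
    int n;

    assert(line != NULL && out != NULL);
    n = sscanf(line, "%63s %lu %d %127s",
               out->name, &out->size, &out->use_count, out->used_by);
    return (n == 4) ? 0 : -1;
}

/* Returns 1 if `user` is listed as a user of `mod`, else 0. */
int is_used_by(const struct module_info *mod, const char *user)
{
    char copy[128];
    char *tok;

    strncpy(copy, mod->used_by, sizeof(copy) - 1);
    copy[sizeof(copy) - 1] = '\0';
    for (tok = strtok(copy, ","); tok != NULL; tok = strtok(NULL, ",")) {
        if (strcmp(tok, user) == 0)
            return 1;
    }
    return 0;
}
```

With a line such as "fat 86016 1 vfat, Live 0x0000000000000000" (sample values, not taken from the figures), is_used_by() reports vfat as a user of fat, which is why rmmod fat is refused while vfat is still loaded.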

Our first Linux driver
Before we write our first driver, let’s go over some concepts. A driver never runs by itself. It is similar to a library that is loaded for its functions to be invoked by a running application. It is written in C, but lacks a main() function. Moreover, it will be loaded/linked with the kernel, so it needs to be compiled in a similar way to the kernel, and the header files you can use are only those from the kernel sources, not from the standard /usr/include.

One interesting fact about the kernel is that it is an object-oriented implementation in C, as we will observe even with our first driver. Any Linux driver has a constructor and a destructor. The module’s constructor is called when the module is successfully loaded into the kernel, and the destructor when rmmod succeeds in unloading the module. These two are like normal functions in the driver, except that they are specified as the init and exit functions, respectively, by the macros module_init() and module_exit(), which are defined in the kernel header module.h.

/* ofd.c - Our First Driver code */
#include <linux/module.h>
#include <linux/version.h>
#include <linux/kernel.h>

static int __init ofd_init(void) /* Constructor */
{
    printk(KERN_INFO "Namaskar: ofd registered\n");
    return 0;
}

static void __exit ofd_exit(void) /* Destructor */
{
    printk(KERN_INFO "Alvida: ofd unregistered\n");
}

module_init(ofd_init);
module_exit(ofd_exit);

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Anil Kumar Pugalia <email_at_sarika-pugs_dot_com>");
MODULE_DESCRIPTION("Our First Driver");

Given above is the complete code for our first driver; let’s call it ofd.c. Note that there is no stdio.h (a user-space header); instead, we use the analogous kernel.h (a kernel space header). printk() is the equivalent of printf(). Additionally, version.h is included for the module version to be compatible with the kernel into which it is going to be loaded. The MODULE_* macros populate module-related information, which acts like the module’s “signature”.

Building our first Linux driver
Once we have the C code, it is time to compile it and create the module file ofd.ko. We use the kernel build system to do this. The following Makefile invokes the kernel’s build system from the kernel source, and the kernel’s Makefile will, in turn, invoke our first driver’s Makefile to build our first driver.

To build a Linux driver, you need to have the kernel source (or, at least, the kernel headers) installed on your system. The kernel source is assumed to be installed at /usr/src/linux. If it’s at any other location on your system, specify the location in the KERNEL_SOURCE variable in this Makefile.

# Makefile - makefile of our first driver

# if KERNELRELEASE is defined, we've been invoked from the
# kernel build system and can use its language.
ifneq (${KERNELRELEASE},)
obj-m := ofd.o
# Otherwise we were called directly from the command line.
# Invoke the kernel build system.
# Note: recent kernels no longer accept SUBDIRS=; use M=${PWD} there instead.
else
KERNEL_SOURCE := /usr/src/linux
PWD := $(shell pwd)
default:
	${MAKE} -C ${KERNEL_SOURCE} SUBDIRS=${PWD} modules

clean:
	${MAKE} -C ${KERNEL_SOURCE} SUBDIRS=${PWD} clean
endif

With the C code (ofd.c) and Makefile ready, all we need to do is invoke make to build our first driver (ofd.ko).

$ make
make -C /usr/src/linux SUBDIRS=... modules
make[1]: Entering directory `/usr/src/linux'
  CC [M]  .../ofd.o
  Building modules, stage 2.
  MODPOST 1 modules
  CC      .../ofd.mod.o
  LD [M]  .../ofd.ko
make[1]: Leaving directory `/usr/src/linux'

Summing up
Once we have the ofd.ko file, perform the usual steps as the root user, or with sudo.

# su
# insmod ofd.ko
# lsmod | head -10

lsmod should show you the ofd driver loaded.

While the students were trying their first module, the bell rang, marking the end of the session. Professor Gopi concluded, “Currently, you may not be able to observe anything other than the lsmod listing showing the driver has loaded. Where’s the printk output gone? Find that out for yourselves, in the lab session, and update me with your findings. Also note that our first driver is a template for any driver you would write in Linux. Writing a specialised driver is just a matter of what gets filled into its constructor and destructor. So, our further learning will be to enhance this driver to achieve specific driver functionalities.”

Device Drivers, Part 3: Kernel C Extras in a Linux Driver

This article in the series on Linux device drivers deals with the kernel’s message logging, and kernel-specific GCC extensions.

Enthused by how Pugs impressed their professor in the last class, Shweta wanted to do so too. And there was soon an opportunity: finding out where the output of printk had gone. So, as soon as she entered the lab, she grabbed the best system, logged in, and began work. Knowing her professor well, she realised that he would have dropped a hint about the possible solution in the previous class itself. Going over what had been taught, she remembered the error output demonstration from insmod vfat.ko: running dmesg | tail. She immediately tried that, and found the printk output there.

But how did it come to be here? A tap on her shoulder roused her from her thoughts. “Shall we go for a coffee?” proposed Pugs.

“But I need to...”

“I know what you’re thinking about,” interrupted Pugs. “Let’s go, I’ll explain all about dmesg to you.”

Kernel message logging
Over coffee, Pugs began his explanation. As far as parameters are concerned, printf and printk are the same, except that when programming for the kernel, we don’t bother about the float formats %f, %lf and the like. However, unlike printf, printk is not designed to dump its output to some console.

In fact, it cannot do so; it is something in the background, and executes like a library, only when triggered either from hardware-space or user-space. All printk calls put their output into the (log) ring buffer of the kernel. Then, the syslog daemon running in user-space picks them up for final processing and redirection to various devices, as configured in the configuration file /etc/syslog.conf.

You must have observed the out-of-place macro KERN_INFO, in the printk calls, in the last article. That is actually a constant string, which gets concatenated with the format string after it, into a single string. Note that there is no comma (,) between them; they are not two separate arguments. There are eight such macros defined in linux/kernel.h in the kernel source, namely:

#define KERN_EMERG   "<0>" /* system is unusable */
#define KERN_ALERT   "<1>" /* action must be taken immediately */
#define KERN_CRIT    "<2>" /* critical conditions */
#define KERN_ERR     "<3>" /* error conditions */
#define KERN_WARNING "<4>" /* warning conditions */
#define KERN_NOTICE  "<5>" /* normal but significant condition */
#define KERN_INFO    "<6>" /* informational */
#define KERN_DEBUG   "<7>" /* debug-level messages */

Now depending on these log levels (i.e., the first three characters in the format string), the syslog user-space daemon redirects the corresponding messages to their configured locations. A typical destination is the log file /var/log/messages, for all log levels. Hence, all the printk outputs are, by default, in that file. However, they can be configured differently: to a serial port (like /dev/ttyS0), for instance, or to all consoles, as typically happens for KERN_EMERG.

Now, /var/log/messages is buffered, and contains messages not only from the kernel, but also from various daemons running in user-space. Moreover, this file is often not readable by a normal user. Hence, a user-space utility, dmesg, is provided to directly parse the kernel ring buffer, and dump it to standard output. Figure 1 shows snippets from the two.

Figure 1: Kernel’s message logging
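The point about KERN_INFO being a string that is glued to the format string, rather than a separate argument, can be tried out in ordinary user-space C. The sketch below redefines two of the macros locally with the "<n>" values quoted in this article (newer kernels encode the prefix differently, so treat these values as an assumption tied to the article's kernel version); log_level_of and log_text_of are hypothetical helper names.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* User-space stand-ins for the kernel macros, as quoted in the article. */
#define KERN_ERR  "<3>"
#define KERN_INFO "<6>"

/* Adjacent string literals are merged by the compiler: this is ONE string. */
const char *const sample_msg = KERN_INFO "Namaskar: ofd registered";

/*
 * Extract the log level from a message that starts with "<n>".
 * Returns 0..7, or -1 if the three-character prefix is missing.
 */
int log_level_of(const char *msg)
{
    assert(msg != NULL);
    if (strlen(msg) >= 3 && msg[0] == '<' && msg[2] == '>' &&
        msg[1] >= '0' && msg[1] <= '7')
        return msg[1] - '0';
    return -1;
}

/* Returns the message text that follows the prefix (or the whole string). */
const char *log_text_of(const char *msg)
{
    return (log_level_of(msg) >= 0) ? msg + 3 : msg;
}
```

A syslog-like consumer would use something along the lines of log_level_of() to decide where a message goes, and log_text_of() to get what is finally written out.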

Kernel-specific GCC extensions
Shweta, frustrated since she could no longer show off as having discovered all these on her own, retorted, “Since you have explained all about printing in the kernel, why don’t you also tell me about the weird C in the driver as well: the special keywords __init, __exit, etc.” These are not special keywords. Kernel C is not “weird C”, but just standard C with some additional extensions from the C compiler, GCC. Macros __init and __exit are just two of these extensions. For a dynamically loadable driver their effect is limited (the kernel can still discard a module’s init code once the constructor has run); they matter most when the same code gets built into the kernel. All functions marked with __init get placed inside the init section of the kernel image automatically, by GCC, during kernel compilation; and all functions marked with __exit are placed in the exit section of the kernel image. What is the benefit of this? All functions with __init are supposed to be executed only once during bootup (and not executed again till the next bootup). So, once they are executed during bootup, the kernel frees up RAM by removing them (by freeing the init section). Similarly, all functions in the exit section are supposed to be called during system shutdown. Now, if the system is shutting down anyway, why do you need to do any cleaning up? Hence, the exit section is not even loaded into the kernel, which is another cool optimisation. This is a beautiful example of how the kernel and GCC work hand-in-hand to achieve a lot of optimisation, and many other tricks that we will see as we go along. And that is why the Linux kernel has traditionally required GCC, or a compiler that supports the same extensions: a closely knit bond.
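The mechanism behind __init and __exit, a macro that expands to a GCC attribute naming the section a function should be placed in, can be imitated in user space. The sketch below assumes GCC or Clang producing ELF objects on Linux; the macro names demo_init and demo_exit and the section names are invented for the illustration and are not the kernel's own definitions. Nothing here frees the sections afterwards, which is the part only the kernel does.

```c
#include <assert.h>

/* Hypothetical user-space imitations of the kernel's __init and __exit. */
#define demo_init __attribute__((section(".demo.init.text")))
#define demo_exit __attribute__((section(".demo.exit.text")))

static int demo_state;   /* 1 between setup and teardown, else 0 */

/* The compiler emits this function into the .demo.init.text section. */
int demo_init demo_setup(void)
{
    demo_state = 1;
    return 0;
}

/* The compiler emits this function into the .demo.exit.text section. */
void demo_exit demo_teardown(void)
{
    assert(demo_state == 1);   /* teardown only makes sense after setup */
    demo_state = 0;
}

/* Ordinary function, placed in the default .text section. */
int demo_is_active(void)
{
    return demo_state;
}
```

Running objdump -h on the compiled object file should list the two extra sections next to .text, which is the same kind of grouping the kernel relies on to throw away its init code in one go.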

The kernel function’s return guidelines
While returning from coffee, Pugs kept praising OSS and the community that’s grown around it. Do you know why different individuals are able to come together and contribute excellently without any conflicts, and in a project as huge as Linux, at that? There are many reasons, but most important amongst them is that they all follow and abide by inherent coding guidelines. Take, for example, the kernel programming guideline for returning values from a function. Any kernel function needing error handling typically returns an integer-like type, and the return value again follows a guideline. For an error, we return a negative number: a minus sign prefixed to a macro that is available through the kernel header linux/errno.h, which includes the various error number headers under the kernel sources, namely asm/errno.h, asm-generic/errno.h, and asm-generic/errno-base.h. For success, zero is the most common return value, unless there is some additional information to be provided. In that case, a positive value is returned, the value indicating the information, such as the number of bytes transferred by the function.
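The return guideline above is easy to mirror in a small user-space function. The following is a sketch only: copy_bytes is a made-up helper, and it uses the user-space <errno.h> constants, which carry the same names as the kernel ones.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper that follows the kernel's return convention:
 *   negative errno value on error,
 *   otherwise the number of bytes copied (zero or positive).
 */
long copy_bytes(char *dst, size_t dst_len, const char *src, size_t src_len)
{
    if (dst == NULL || src == NULL)
        return -EINVAL;          /* bad argument */
    if (src_len > dst_len)
        return -ENOSPC;          /* destination too small */
    memcpy(dst, src, src_len);
    return (long)src_len;        /* extra information: bytes transferred */
}

/* Small self-check showing the three kinds of return value. */
void copy_bytes_demo(void)
{
    char buf[4];

    assert(copy_bytes(buf, sizeof(buf), "ab", 2) == 2);        /* success */
    assert(copy_bytes(buf, sizeof(buf), "abcdef", 6) < 0);     /* error */
    assert(copy_bytes(buf, sizeof(buf), "", 0) == 0);          /* nothing to do */
}
```

A caller only needs one check, ret < 0, to separate failure from success, and can still read the byte count from the same value.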

Kernel C = pure C
Once back in the lab, Shweta remembered their professor mentioning that no /usr/include headers can be used for kernel programming. But Pugs had said that kernel C is just standard C with some GCC extensions. Why this conflict? Actually this is not a conflict. Standard C is pure C, just the language. The headers are not part of it. Those are part of the standard libraries built in for C programmers, based on the concept of reusing code. Does that mean that all standard libraries, and hence, all ANSI standard functions, are not part of “pure” C? Yes, that’s right. Then, was it really tough coding the kernel? Well, not for this reason. In reality, kernel developers have evolved their own set of required functions, which are all part of the kernel code. The printk function is just one of them. Similarly, many string functions, memory functions, and more, are all part of the kernel source, under various directories like kernel, ipc, lib, and so on, along with the corresponding headers under the include/linux directory.

“Oh yes! That is why we need to have the kernel source to build a driver,” agreed Shweta.

“If not the complete source, at least the headers are a must. And that is why we have separate packages to install the complete kernel source, or just the kernel headers,” added Pugs.

“In the lab, all the sources are set up. But if I want to try out drivers on my Linux system in my hostel room, how do I go about it?” asked Shweta.

“Our lab has Fedora, where the kernel sources are typically installed under /usr/src/kernels/<kernel-version>, unlike the standard /usr/src/linux. Lab administrators must have installed it using the command line yum install kernel-devel. I use Mandriva, and installed the kernel sources using urpmi kernel-source,” replied Pugs.

“But I have Ubuntu,” Shweta said.

“Okay! For that, just use the apt-get utility to fetch the source, possibly apt-get install linux-source,” replied Pugs.

Summing up
The lab session was almost over when Shweta suddenly asked, out of curiosity, “Hey Pugs, what’s the next topic we are going to learn in our Linux device drivers class?” “Hmm… most probably character drivers,” threw back Pugs. With this information, Shweta hurriedly packed her bag and headed towards her room to set up the kernel sources, and try out the next driver on her own. “In case you get stuck, just give me a call,” smiled Pugs.

Device Drivers, Part 4: Linux Character Drivers

This article in the series on Linux device drivers deals with the concepts related to character drivers and their implementation.

Shweta, at her PC in her hostel room, was all set to explore the characters of Linux character drivers, before it was taught in class. She recalled the following lines from Professor Gopi’s class: “… today’s first driver would be the template for any driver you write in Linux. Writing any specialised/advanced driver is just a matter of what gets filled into its constructor and destructor…” With that, she took out the first driver’s code, and pulled out various reference books, to start writing a character driver on her own. She also downloaded the online book, Linux Device Drivers by Jonathan Corbet, Alessandro Rubini, and Greg Kroah-Hartman. Here is the summary of what she learnt.

W’s of character drivers
We already know what drivers are, and why we need them. What is so special about character drivers? If we write drivers for byte-oriented operations (or, in C lingo, character-oriented operations), then we refer to them as character drivers. Since the majority of devices are byte-oriented, the majority of device drivers are character device drivers. Take, for example, serial drivers, audio drivers, video drivers, camera drivers, and basic I/O drivers. In fact, all device drivers that are neither storage nor network device drivers are some type of a character driver. Let’s look into the commonalities of these character drivers, and how Shweta wrote one of them.

The complete connection

Figure 1: Character driver overview

As shown in Figure 1, for any user-space application to operate on a byte-oriented device (in hardware space), it should use the corresponding character device driver (in kernel space). Character driver usage is done through the corresponding character device file(s), linked to it through the virtual file system (VFS). What this means is that an application does the usual file operations on the character device file. Those operations are translated to the corresponding functions in the linked character device driver by the VFS. Those functions then do the final low-level access to the actual device to achieve the desired results.

Note that though the application does the usual file operations, their outcome may not be the usual ones. Rather, they would be as driven by the corresponding functions in the device driver. For example, a write followed by a read may not fetch what has just been written to the character device file, unlike for regular files. Remember that this is the usual expected behaviour for device files. Let’s take an audio device file as an example. What we write into it is the audio data we want to play back, say through a speaker. However, the read would get us audio data that we are recording, say through a microphone. The recorded data need not be the played-back data.

In this complete connection from the application to the device, there are four major entities involved:

1. Application
2. Character device file
3. Character device driver
4. Character device

The interesting thing is that all of these can exist independently on a system, without the other being present. The mere existence of these on a system doesn’t mean they are linked to form the complete connection. Rather, they need to be explicitly connected. An application gets connected to a device file by invoking the open system call on the device file. Device file(s) are linked to the device driver by specific registrations done by the driver. The driver is linked to a device by its device-specific low-level operations. Thus we form the complete connection. With this, note that the character device file is not the actual device, but just a place-holder for the actual device.

Major and minor numbers
The connection between the application and the device file is based on the name of the device file. However, the connection between the device file and the device driver is based on the number of the device file, not the name. This allows a user-space application to have any name for the device file, and enables the kernel-space to have a trivial index-based linkage between the device file and the device driver. This device file number is more commonly referred to as the <major, minor> pair, or the major and minor numbers of the device file.

Earlier (till kernel 2.4), one major number was for one driver, and the minor number used to represent the sub-functionalities of the driver. With kernel 2.6, this distinction is no longer mandatory; there could be multiple drivers under the same major number, but obviously, with different minor number ranges. However, this is more common with the non-reserved major numbers, and standard major numbers are typically preserved for single drivers. For example, 4 for serial interfaces, 13 for mice, 14 for audio devices, and so on. The following command would list the various character device files on your system:

$ ls -l /dev/ | grep "^c"

<major, minor> related support in kernel 2.6
Type (defined in kernel header linux/types.h):

dev_t contains both major and minor numbers

Macros (defined in kernel header linux/kdev_t.h):

MAJOR(dev_t dev) extracts the major number from dev
MINOR(dev_t dev) extracts the minor number from dev
MKDEV(int major, int minor) creates the dev from major and minor

Connecting the device file with the device driver involves two steps:

1. Registering for the <major, minor> range of device files.
2. Linking the device file operations to the device driver functions.

The first step is achieved using either of the following two APIs, defined in the kernel header linux/fs.h:

int register_chrdev_region(dev_t first, unsigned int cnt, char *name);
int alloc_chrdev_region(dev_t *first, unsigned int firstminor, unsigned int cnt, char *name);

The first API registers the cnt number of device file numbers, starting from first, with the given name. The second API dynamically figures out a free major number, and registers the cnt number of device file numbers starting from <the free major, firstminor>, with the given name. In either case, the /proc/devices kernel window lists the name with the registered major number. With this information, Shweta added the following into the first driver code:

#include <linux/types.h>
#include <linux/kdev_t.h>
#include <linux/fs.h>

static dev_t first; // Global variable for the first device number

In the constructor, she added:

if (alloc_chrdev_region(&first, 0, 3, "Shweta") < 0)
{
    return -1;
}
printk(KERN_INFO "<Major, Minor>: <%d, %d>\n", MAJOR(first), MINOR(first));

In the destructor, she added:

unregister_chrdev_region(first, 3);

It’s all put together, as follows:

#include <linux/module.h>
#include <linux/version.h>
#include <linux/kernel.h>
#include <linux/types.h>
#include <linux/kdev_t.h>
#include <linux/fs.h>

static dev_t first; // Global variable for the first device number

static int __init ofcd_init(void) /* Constructor */
{
    printk(KERN_INFO "Namaskar: ofcd registered\n");
    /* Ask the kernel for a free major number and 3 minors starting at 0 */
    if (alloc_chrdev_region(&first, 0, 3, "Shweta") < 0)
    {
        return -1;
    }
    printk(KERN_INFO "<Major, Minor>: <%d, %d>\n", MAJOR(first), MINOR(first));
    return 0;
}

static void __exit ofcd_exit(void) /* Destructor */
{
    /* Release the same range of 3 device numbers */
    unregister_chrdev_region(first, 3);
    printk(KERN_INFO "Alvida: ofcd unregistered\n");
}

module_init(ofcd_init);
module_exit(ofcd_exit);

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Anil Kumar Pugalia <email_at_sarika-pugs_dot_com>");
MODULE_DESCRIPTION("Our First Character Driver");

Then, Shweta repeated the usual steps that she’d learnt for the first driver:

Build the driver (.ko file) by running make.
Load the driver using insmod.
List the loaded modules using lsmod.
Unload the driver using rmmod.

Summing up
Additionally, before unloading the driver, she peeped into the /proc/devices kernel window to look for the registered major number with the name “Shweta”, using cat /proc/devices. It was right there. However, she couldn’t find any device file created under /dev with the same major number, so she created them by hand, using mknod, and then tried reading and writing those. Figure 2 shows all these steps. Please note that the major number 250 may vary from system to system, based on availability. Figure 2 also shows the results Shweta got from reading and writing one of the device files. That reminded her that the second step to connect the device file with the device driver, which is linking the device file operations to the device driver functions, was not yet done. She realised that she needed to dig around for more information to complete this step, and also to figure out the reason for the missing device files under /dev.

Figure 2: Character device file experiments
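What Shweta did by eye with cat /proc/devices, and what MAJOR, MINOR and MKDEV do inside the kernel, can be approximated from user space. This is a sketch under a few assumptions: it runs on Linux with glibc, whose <sys/sysmacros.h> offers major(), minor() and makedev() as user-space counterparts of the kernel macros; the function names find_char_major and nth_device are hypothetical; and the sample text in the usage below is invented, not copied from Figure 2.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/sysmacros.h>   /* user-space major(), minor(), makedev() */

/*
 * Look up `name` in text laid out like /proc/devices and return its
 * character-device major number, or -1 if it is not listed there.
 */
int find_char_major(const char *text, const char *name)
{
    int in_char_section = 0;
    const char *line = text;

    assert(text != NULL && name != NULL);
    while (*line != '\0') {
        size_t len = strcspn(line, "\n");
        char buf[128];

        if (len < sizeof(buf)) {
            int maj;
            char dev[64];

            memcpy(buf, line, len);
            buf[len] = '\0';
            if (strcmp(buf, "Character devices:") == 0)
                in_char_section = 1;
            else if (strcmp(buf, "Block devices:") == 0)
                in_char_section = 0;
            else if (in_char_section &&
                     sscanf(buf, "%d %63s", &maj, dev) == 2 &&
                     strcmp(dev, name) == 0)
                return maj;
        }
        line += len;
        if (*line == '\n')
            line++;
    }
    return -1;
}

/* Device number for entry `index` of a range that starts at <maj, 0>. */
dev_t nth_device(int maj, unsigned int index)
{
    return makedev((unsigned int)maj, index);
}
```

Given the text "Character devices:\n  4 tty\n250 Shweta\n", find_char_major() returns 250 for "Shweta", and nth_device(250, 0) to nth_device(250, 2) are the three numbers one would pass to mknod for the range registered by the driver.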

Device Drivers, Part 5: Character Device Files: Creation & Operations

This article is a continuation of the series on Linux device drivers, and carries on the discussion on character drivers and their implementation.

In my previous article, I had mentioned that even with the registration for the <major, minor> device range, the device files were not created under /dev. Instead, Shweta had to create them manually, using mknod. However, on further study, Shweta figured out a way to automatically create the device files, using the udev daemon. She also learnt the second step to connect the device file with the device driver: linking the device file operations to the device driver functions. Here is what she learnt.

Automatic creation of device files

Earlier, in kernel 2.4, the automatic creation of device files was done by the kernel itself, by calling the appropriate APIs of devfs. However, as the kernel evolved, kernel developers realised that device files were more related to user-space and hence, as a policy, that is where they ought to be dealt with, not at the kernel. Based on this idea, the kernel now only populates the appropriate device class and device information into the /sys window, for the device under consideration. User-space then needs to interpret it and take appropriate action. In most Linux desktop systems, the udev daemon picks up that information, and accordingly creates the device files.

udev can be further configured via its configuration files to tune the device file names, their permissions, their types, etc. So, as far as the driver is concerned, the appropriate /sys entries need to be populated using the Linux device model APIs declared in <linux/device.h>. The rest should be handled by udev. The device class is created as follows:

    struct class *cl = class_create(THIS_MODULE, "<device class name>");

Then, the device info (<major, minor>) under this class is populated by:

    device_create(cl, NULL, first, NULL, "<device name format>", ...);

Here, first is the dev_t with the corresponding <major, minor>. The corresponding complementary or inverse calls, which should be called in chronologically reverse order, are as follows:

    device_destroy(cl, first);
    class_destroy(cl);

Refer to Figure 1 for the /sys entries created using chardrv as the <device class name> and mynull as the <device name format>. That also shows the device file, created by udev, based on the <major>:<minor> entry in the dev file.
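To make the <major>:<minor> entry in the dev file concrete, here is a small user-space sketch of how a program might read and parse it. It is only an illustration of the text format, not a description of how udev itself is implemented. The function names are hypothetical, and the example path in the comment assumes the chardrv class and mynull device name used above.

```c
#include <stdio.h>

/* Parse a "<major>:<minor>" string, as found in a sysfs dev file.
 * Returns 0 on success, -1 if the text does not have that shape. */
static int parse_dev_entry(const char *text, unsigned int *major, unsigned int *minor)
{
    if (sscanf(text, "%u:%u", major, minor) != 2)
        return -1;
    return 0;
}

/* Read and parse the dev file at the given path, for example
 * /sys/class/chardrv/mynull/dev (assumed path, see the lead-in).
 * Returns 0 on success, -1 if the file is missing or malformed. */
static int read_dev_entry(const char *path, unsigned int *major, unsigned int *minor)
{
    char line[32];
    int ret = -1;
    FILE *fp = fopen(path, "r");

    if (fp == NULL)
        return -1;
    if (fgets(line, sizeof(line), fp) != NULL)
        ret = parse_dev_entry(line, major, minor);
    fclose(fp);
    return ret;
}
```

A device manager would then use the parsed numbers to create a character device file with the matching <major, minor>.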

Figure 1: Automatic device file creation

In case of multiple minors, the device_create() and device_destroy() APIs may be put in a for loop, and the <device name format> string could be useful. For example, the device_create() call in a for loop indexed by i could be as follows:

    device_create(cl, NULL, MKDEV(MAJOR(first), MINOR(first) + i), NULL, "mynull%d", i);

File operations

Whatever system calls (or, more commonly, file operations) we talk of on a regular file are applicable to device files as well. That's what we say: a file is a file, and in Linux, almost everything is a file from the user-space perspective. The difference lies in the kernel space, where the virtual file system (VFS) decodes the file type and transfers the file operations to the appropriate channel, like a filesystem module in case of a regular file or directory, and the corresponding device driver in case of a device file. Our discussion focuses on the second case.

Now, for VFS to pass the device file operations onto the driver, it should have been informed about it. And yes, that is what is called registering the file operations by the driver with the VFS. This involves two steps. (The parenthesised code refers to the “null driver” code below.) First, let's fill in a file operations structure (struct file_operations pugs_fops) with the desired file operations (my_open, my_close, my_read, my_write, …) and initialise the character device structure (struct cdev c_dev) with that, using cdev_init(). Then, hand this structure to the VFS using the call cdev_add(). Both cdev_init() and cdev_add() are declared in <linux/cdev.h>. Obviously, the actual file operations (my_open, my_close, my_read, my_write) also had to be coded. So, to start with, let's keep them as simple as possible, say, as easy as the “null driver”.

The null driver

Following these steps, Shweta put the pieces together, attempting her first character device driver. Let's see what the outcome was. Here's the complete code, ofcd.c:

    #include <linux/module.h>
    #include <linux/version.h>
    #include <linux/kernel.h>
    #include <linux/types.h>
    #include <linux/kdev_t.h>
    #include <linux/fs.h>
    #include <linux/device.h>
    #include <linux/cdev.h>
    #include <linux/err.h>

    static dev_t first; // Global variable for the first device number
    static struct cdev c_dev; // Global variable for the character device structure
    static struct class *cl; // Global variable for the device class

    static int my_open(struct inode *i, struct file *f)
    {
        printk(KERN_INFO "Driver: open()\n");
        return 0;
    }
    static int my_close(struct inode *i, struct file *f)
    {
        printk(KERN_INFO "Driver: close()\n");
        return 0;
    }
    static ssize_t my_read(struct file *f, char __user *buf, size_t len, loff_t *off)
    {
        printk(KERN_INFO "Driver: read()\n");
        return 0;
    }
    static ssize_t my_write(struct file *f, const char __user *buf, size_t len, loff_t *off)
    {
        printk(KERN_INFO "Driver: write()\n");
        return len;
    }

    static struct file_operations pugs_fops =
    {
        .owner = THIS_MODULE,
        .open = my_open,
        .release = my_close,
        .read = my_read,
        .write = my_write
    };

    static int __init ofcd_init(void) /* Constructor */
    {
        int ret;
        struct device *dev_ret;

        printk(KERN_INFO "Namaskar: ofcd registered\n");
        if ((ret = alloc_chrdev_region(&first, 0, 1, "Shweta")) < 0)
        {
            return ret;
        }
        /* class_create() and device_create() report failure through ERR_PTR(), not NULL */
        if (IS_ERR(cl = class_create(THIS_MODULE, "chardrv")))
        {
            unregister_chrdev_region(first, 1);
            return PTR_ERR(cl);
        }
        if (IS_ERR(dev_ret = device_create(cl, NULL, first, NULL, "mynull")))
        {
            class_destroy(cl);
            unregister_chrdev_region(first, 1);
            return PTR_ERR(dev_ret);
        }
        cdev_init(&c_dev, &pugs_fops);
        /* cdev_add() returns a negative error code on failure */
        if ((ret = cdev_add(&c_dev, first, 1)) < 0)
        {
            device_destroy(cl, first);
            class_destroy(cl);
            unregister_chrdev_region(first, 1);
            return ret;
        }
        return 0;
    }

    static void __exit ofcd_exit(void) /* Destructor */
    {
        cdev_del(&c_dev);
        device_destroy(cl, first);
        class_destroy(cl);
        unregister_chrdev_region(first, 1);
        printk(KERN_INFO "Alvida: ofcd unregistered\n");
    }

    module_init(ofcd_init);
    module_exit(ofcd_exit);

    MODULE_LICENSE("GPL");
    MODULE_AUTHOR("Anil Kumar Pugalia <email_at_sarika-pugs_dot_com>");
    MODULE_DESCRIPTION("Our First Character Driver");

Shweta repeated the usual build process, with some new test steps, as follows:

1. Build the driver (.ko file) by running make.
2. Load the driver using insmod.
3. List the loaded modules using lsmod.
4. List the major number allocated, using cat /proc/devices.
5. “null driver”-specific experiments (refer to Figure 2 for details).
6. Unload the driver using rmmod.
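The registration idea from the “File operations” section can be pictured with a small user-space model: a table of function pointers is stored against a major number, and a dispatcher forwards each call to whichever table was registered. This is only a sketch of the concept. The real VFS and cdev_add() are considerably more involved, and every sketch_ name and the 256-entry table below are assumptions made for illustration.

```c
#include <stddef.h>
#include <sys/types.h> /* ssize_t */

/* Hypothetical stand-in for struct file_operations */
struct sketch_fops {
    ssize_t (*read)(char *buf, size_t len);
    ssize_t (*write)(const char *buf, size_t len);
};

#define SKETCH_MAX_MAJOR 256
static const struct sketch_fops *sketch_table[SKETCH_MAX_MAJOR];

/* Plays the role of cdev_add(): remember which operations serve a major number */
static int sketch_register(unsigned int major, const struct sketch_fops *fops)
{
    if (major >= SKETCH_MAX_MAJOR || sketch_table[major] != NULL)
        return -1;
    sketch_table[major] = fops;
    return 0;
}

/* Plays the role of the VFS: find the driver by major number, forward the read */
static ssize_t sketch_vfs_read(unsigned int major, char *buf, size_t len)
{
    if (major >= SKETCH_MAX_MAJOR || sketch_table[major] == NULL)
        return -1;
    return sketch_table[major]->read(buf, len);
}

/* Same lookup for the write path */
static ssize_t sketch_vfs_write(unsigned int major, const char *buf, size_t len)
{
    if (major >= SKETCH_MAX_MAJOR || sketch_table[major] == NULL)
        return -1;
    return sketch_table[major]->write(buf, len);
}

/* Operations with the same return values as the null driver above */
static ssize_t null_read(char *buf, size_t len)
{
    (void)buf;
    (void)len;
    return 0; /* no bytes delivered */
}

static ssize_t null_write(const char *buf, size_t len)
{
    (void)buf;
    return (ssize_t)len; /* claim that everything was written */
}

static const struct sketch_fops null_fops = { .read = null_read, .write = null_write };
```

Registering null_fops against a major number and then calling sketch_vfs_write() and sketch_vfs_read() reproduces, in miniature, the behaviour seen with echo and cat on /dev/mynull: writes report full success, reads deliver nothing.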

Figure 2: 'null driver' experiments

Summing up

Shweta was certainly happy; all on her own, she'd got a character driver written, which works the same as the standard /dev/null device file. To understand what this means, check the <major, minor> tuple for /dev/null, and similarly, also try out the echo and cat commands with it.

However, one thing began to bother Shweta. She had got her own calls (my_open, my_close, my_read, my_write) in her driver, but wondered why they worked so unusually, unlike any regular file system calls. What was unusual? Whatever was written, she got nothing when reading, which is unusual at least from the regular file operations' perspective. How would she crack this problem? Watch out for the next article.

Device Drivers, Part 6: Decoding Character Device File Operations

This article, which is part of the series on Linux device drivers, continues to cover the various concepts of character drivers and their implementation, which was dealt with in the previous two articles [1, 2].

So, what was your guess on how Shweta would crack the problem? Obviously, with the help of Pugs. Wasn't it obvious? In our previous article, we saw how Shweta was puzzled by not being able to read any data, even after writing into the /dev/mynull character device file. Suddenly, a bell rang, not inside her head, but a real one at the door. And for sure, there was Pugs.

“How come you're here?” exclaimed Shweta.

“I saw your tweet. It's cool that you cracked your first character driver all on your own. That's amazing. So, what are you up to now?” asked Pugs.

“I'll tell you, on the condition that you do not play spoil sport,” replied Shweta.

Pugs smiled, “Okay, I'll only give you advice.”

“And that too, only if I ask for it! I am trying to understand character device file operations,” said Shweta.

Pugs perked up, saying, “I have an idea. Why don't you decode and then explain what you've understood about it?”

Shweta felt that was a good idea. She tail'ed the dmesg log to observe the printk output from her driver. Alongside, she opened her null driver code on her console, specifically observing the device file operations my_open, my_close, my_read, and my_write.

    static int my_open(struct inode *i, struct file *f)
    {
        printk(KERN_INFO "Driver: open()\n");
        return 0;
    }
    static int my_close(struct inode *i, struct file *f)
    {
        printk(KERN_INFO "Driver: close()\n");
        return 0;
    }
    static ssize_t my_read(struct file *f, char __user *buf, size_t len, loff_t *off)
    {
        printk(KERN_INFO "Driver: read()\n");
        return 0;
    }
    static ssize_t my_write(struct file *f, const char __user *buf, size_t len, loff_t *off)
    {
        printk(KERN_INFO "Driver: write()\n");
        return len;
    }

Based on the earlier understanding of the return value of the functions in the kernel, my_open() and my_close() are trivial: their return type is int, and both of them return zero, which means success. However, the return type of both my_read() and my_write() is not int; rather, it is ssize_t. On further digging through kernel headers, that turns out to be a signed word. So, returning a negative number would be a usual error. But a non-negative return value would have additional meaning. For the read operation, it would be the number of bytes read, and for the write operation, it would be the number of bytes written.

Reading the device file

To understand this in detail, the complete flow has to be given a relook. Let's take the read operation first. When the user does a read from the device file /dev/mynull, that system call comes to the virtual file system (VFS) layer in the kernel. VFS decodes the <major, minor> tuple, and figures out that it needs to redirect it to the driver's function my_read(), that's registered with it. So from that angle, my_read() is invoked as a request to read, from us, the device-driver writers. And hence, its return value would indicate to the requesters (i.e., the users) how many bytes they are getting from the read request. In our null driver example, we returned zero, which meant no bytes available, or in other words, the end of the file. And hence, when the device file is being read, the result is always nothing, independent of what is written into it.

“Hmmm… So, if I change it to 1, would it start giving me some data?” asked Pugs, by way of verifying.

Shweta paused for a while, looked at the parameters of the function my_read() and answered in the affirmative, but with a caveat: the data sent would be some junk data, since my_read() is not really populating data into buf (the buffer variable that is the second parameter of my_read(), provided by the user). In fact, my_read() should write data into buf, according to len (the third parameter to the function), the count in bytes requested by the user.

To be more specific, it should write less than, or equal to, len bytes of data into buf, and the number of bytes written should be passed back as the return value. No, this is not a typo: in the read operation, device-driver writers “write” into the user-supplied buffer. We read the data from (possibly) an underlying device, and then write that data into the user buffer, so that the user can read it.

“That's really smart of you,” said Pugs, sarcastically.

Writing into the device file

The write operation is the reverse. The user provides len (the third parameter of my_write()) bytes of data to be written, in buf (the second parameter of my_write()). The my_write() function would read that data and possibly write it to an underlying device, and return the number of bytes that have been successfully written.

“Aha!! That's why all my writes into /dev/mynull have been successful, without actually doing any read or write,” exclaimed Shweta, filled with happiness at understanding the complete flow of device file operations.

Preserving the last character

With Shweta not giving Pugs any chance to correct her, he came up with a challenge. “Okay. Seems like you are thoroughly clear with the read/write fundamentals; so, here's a question for you. Can you modify these my_read() and my_write() functions such that whenever I read /dev/mynull, I get the last character written into /dev/mynull?”

Confidently, Shweta took on the challenge, and modified my_read() and my_write() as follows, adding a static global character variable:

    static char c;

    static ssize_t my_read(struct file *f, char __user *buf, size_t len, loff_t *off)
    {
        printk(KERN_INFO "Driver: read()\n");
        buf[0] = c;
        return 1;
    }
    static ssize_t my_write(struct file *f, const char __user *buf, size_t len, loff_t *off)
    {
        printk(KERN_INFO "Driver: write()\n");
        c = buf[len - 1];
        return len;
    }

“Almost there, but what if the user has provided an invalid buffer, or if the user buffer is swapped out? Wouldn't this direct access of the user-space buf just crash and oops the kernel?” pounced Pugs.

Shweta, refusing to be intimidated, dived into her collated material and figured out that there are two APIs just to ensure that user-space buffers are safe to access, and then updated them. With the complete understanding of the APIs, she rewrote the above code snippet as follows:

    static char c;

    static ssize_t my_read(struct file *f, char __user *buf, size_t len, loff_t *off)
    {
        printk(KERN_INFO "Driver: read()\n");
        if (copy_to_user(buf, &c, 1) != 0)
            return -EFAULT;
        else
            return 1;
    }
    static ssize_t my_write(struct file *f, const char __user *buf, size_t len, loff_t *off)
    {
        printk(KERN_INFO "Driver: write()\n");
        if (copy_from_user(&c, buf + len - 1, 1) != 0)
            return -EFAULT;
        else
            return len;
    }

Then Shweta repeated the usual build-and-test steps as follows:

1. Build the modified “null” driver (.ko file) by running make.
2. Load the driver using insmod.
3. Write into /dev/mynull, say, using echo -n "Pugs" > /dev/mynull
4. Read from /dev/mynull using cat /dev/mynull (stop by using Ctrl+C)
5. Unload the driver using rmmod.

On cat'ing /dev/mynull, the output was a non-stop infinite sequence of s, as my_read() gives the last one character forever. So, Pugs intervened and pressed Ctrl+C to stop the infinite read, and tried to explain, “If this is to be changed to ‘the last character only once’, my_read() needs to return 1 the first time, and zero from the second time onwards. This can be achieved using off (the fourth parameter of my_read()).” Shweta nodded her head obligingly, just to bolster Pugs' ego.
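Pugs' suggestion of using off can be tried out in plain user space before touching the driver. The sketch below models the “last character only once” behaviour with ordinary memory and an explicit offset variable. It is an assumption-laden illustration, not the driver code itself: the sketch_ functions are hypothetical, and in the real driver the buffer accesses would go through copy_to_user() and copy_from_user() as shown above.

```c
#include <stddef.h>
#include <sys/types.h> /* ssize_t */

static char last_char; /* plays the role of the driver's static global c */

/* Model of my_write(): remember the last byte, report all bytes as written */
static ssize_t sketch_write(const char *buf, size_t len, long long *off)
{
    (void)off;
    if (len == 0)
        return 0; /* nothing supplied, keep the previous character */
    last_char = buf[len - 1];
    return (ssize_t)len;
}

/* Model of my_read(): deliver one byte at offset 0, then report end of file */
static ssize_t sketch_read(char *buf, size_t len, long long *off)
{
    if (*off != 0 || len == 0)
        return 0; /* already delivered, or no room in the buffer */
    buf[0] = last_char;
    *off += 1; /* advance the file position so the next read returns 0 */
    return 1;
}
```

Because the second call sees a non-zero offset and returns zero, a reader such as cat would see end of file and stop, instead of looping forever.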

Device Drivers, Part 7: Generic Hardware Access in Linux

This article, which is part of the series on Linux device drivers, talks about accessing hardware in Linux.

Shweta was all jubilant about her character driver achievements, as she entered the Linux device drivers laboratory on the second floor of her college. Many of her classmates had already read her blog and commented on her expertise. And today was a chance to show off at another level. Till now, it was all software, but today's lab was on accessing hardware in Linux. In the lab, students are expected to learn “by experiment” how to access different kinds of hardware in Linux, on various architectures, over multiple lab sessions. Members of the lab staff are usually reluctant to let students work on the hardware straight away without any experience, so they had prepared some presentations for the students.

Generic hardware interfacing

As everyone settled down in the laboratory, lab expert Priti started with an introduction to hardware interfacing in Linux. Skipping the theoretical details, the first interesting slide was about generic architecture-transparent hardware interfacing (see Figure 1).

Figure 1: Hardware mapping

The basic assumption is that the architecture is 32-bit. For others, the memory map would change accordingly. For a 32-bit address bus, the address/memory map ranges from 0 (0x00000000) to 2^32 - 1 (0xFFFFFFFF). An architecture-independent layout of this memory map would be like what's shown in Figure 1: memory (RAM) and device regions (registers and memories of devices) mapped in an interleaved fashion. These addresses actually are architecture-dependent. For example, in an x86 architecture, the initial 3GB (0x00000000 to 0xBFFFFFFF) is typically for RAM, and the later 1GB (0xC0000000 to 0xFFFFFFFF) for device maps. However, if the RAM is less, say 2GB, device maps could start from 2GB (0x80000000).

Run cat /proc/iomem to list the memory map on your system. Run cat /proc/meminfo to get the approximate RAM size on your system. Refer to Figure 2 for a snapshot.
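The address arithmetic in the x86 example can be checked with a few lines of C. This sketch takes the simplified picture from the text at face value (RAM below a boundary, device maps from the boundary up to 0xFFFFFFFF), whereas a real memory map, as /proc/iomem shows, is interleaved. The names are hypothetical.

```c
#include <stdint.h>

enum sketch_region { SKETCH_RAM, SKETCH_DEVICE };

/* Classify a 32-bit address, given the address where device maps begin */
static enum sketch_region sketch_classify(uint32_t addr, uint32_t device_start)
{
    return (addr >= device_start) ? SKETCH_DEVICE : SKETCH_RAM;
}

/* Size in bytes of the window from device_start up to 0xFFFFFFFF, inclusive */
static uint64_t sketch_device_window(uint32_t device_start)
{
    return ((uint64_t)UINT32_MAX - device_start) + 1;
}
```

With device maps starting at 0xC0000000, the window is 0x40000000 bytes (1GB); starting at 0x80000000, it is 0x80000000 bytes (2GB), matching the two cases described above.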

Figure 2: Physical and bus addresses on an x86 system

Irrespective of the actual values, the addresses referring to RAM are termed physical addresses, and those referring to device maps are termed bus addresses, since these devices are always mapped through some architecture-specific bus: for example, the PCI bus in the x86 architecture, the AMBA bus in ARM architectures, the SuperHyway bus in SuperH architectures, etc.

All the architecture-dependent values of these physical and bus addresses are either dynamically configurable, or are to be obtained from the data sheets (i.e., hardware manuals) of the corresponding architecture processors/controllers. The interesting part is that in Linux, none of these are directly accessible, but are to be mapped to virtual addresses and then accessed through them, thus making the RAM and device accesses generic enough. The corresponding APIs (prototyped in <asm/io.h>) for mapping and unmapping the device bus addresses to virtual addresses are:

    void *ioremap(unsigned long device_bus_address, unsigned long device_region_size);
    void iounmap(void *virt_addr);

Once mapped to virtual addresses, it depends on the device data sheet as to which set of device registers and/or device memory to read from or write into, by adding their offsets to the virtual address returned by ioremap(). For that, the following are the APIs (also prototyped in <asm/io.h>):

    unsigned int ioread8(void *virt_addr);
    unsigned int ioread16(void *virt_addr);
    unsigned int ioread32(void *virt_addr);
    void iowrite8(u8 value, void *virt_addr);
    void iowrite16(u16 value, void *virt_addr);
    void iowrite32(u32 value, void *virt_addr);

Accessing the video RAM of ‘DOS’ days

After this first set of information, students were directed to the live experiments. The suggested initial experiment was with the video RAM of “DOS” days, to understand the usage of the above APIs. Shweta got onto the system and went through /proc/iomem (as in Figure 2) and got the video RAM address, ranging from 0x000A0000 to 0x000BFFFF. She added the above APIs, with appropriate parameters, into the constructor and destructor of her existing “null” driver, to convert it into a “vram” driver. Then she added the user access to the video RAM through read and write calls of the “vram” driver; here's her new file, video_ram.c:

    #include <linux/module.h>
    #include <linux/version.h>
    #include <linux/kernel.h>
    #include <linux/types.h>
    #include <linux/kdev_t.h>
    #include <linux/fs.h>
    #include <linux/device.h>
    #include <linux/cdev.h>
    #include <linux/err.h>
    #include <linux/uaccess.h>
    #include <asm/io.h>

    #define VRAM_BASE 0x000A0000
    #define VRAM_SIZE 0x00020000

    static void __iomem *vram;
    static dev_t first;
    static struct cdev c_dev;
    static struct class *cl;

    static int my_open(struct inode *i, struct file *f)
    {
        return 0;
    }
    static int my_close(struct inode *i, struct file *f)
    {
        return 0;
    }
    static ssize_t my_read(struct file *f, char __user *buf, size_t len, loff_t *off)
    {
        size_t i;
        u8 byte;

        if (*off >= VRAM_SIZE)
        {
            return 0;
        }
        /* Clip the request so that it never runs past the end of the video RAM */
        if (*off + len > VRAM_SIZE)
        {
            len = VRAM_SIZE - *off;
        }
        for (i = 0; i < len; i++)
        {
            byte = ioread8((u8 *)vram + *off + i);
            if (copy_to_user(buf + i, &byte, 1))
            {
                return -EFAULT;
            }
        }
        *off += len;

        return len;
    }
    static ssize_t my_write(struct file *f, const char __user *buf, size_t len, loff_t *off)
    {
        size_t i;
        u8 byte;

        if (*off >= VRAM_SIZE)
        {
            return 0;
        }
        if (*off + len > VRAM_SIZE)
        {
            len = VRAM_SIZE - *off;
        }
        for (i = 0; i < len; i++)
        {
            if (copy_from_user(&byte, buf + i, 1))
            {
                return -EFAULT;
            }
            iowrite8(byte, (u8 *)vram + *off + i);
        }
        *off += len;

        return len;
    }

    static struct file_operations vram_fops =
    {
        .owner = THIS_MODULE,
        .open = my_open,
        .release = my_close,
        .read = my_read,
        .write = my_write
    };

    static int __init vram_init(void) /* Constructor */
    {
        int ret;
        struct device *dev_ret;

        if ((vram = ioremap(VRAM_BASE, VRAM_SIZE)) == NULL)
        {
            printk(KERN_ERR "Mapping video RAM failed\n");
            return -ENOMEM;
        }
        /* Every later failure path also undoes the mapping */
        if ((ret = alloc_chrdev_region(&first, 0, 1, "vram")) < 0)
        {
            iounmap(vram);
            return ret;
        }
        if (IS_ERR(cl = class_create(THIS_MODULE, "chardrv")))
        {
            unregister_chrdev_region(first, 1);
            iounmap(vram);
            return PTR_ERR(cl);
        }
        if (IS_ERR(dev_ret = device_create(cl, NULL, first, NULL, "vram")))
        {
            class_destroy(cl);
            unregister_chrdev_region(first, 1);
            iounmap(vram);
            return PTR_ERR(dev_ret);
        }

        cdev_init(&c_dev, &vram_fops);
        if ((ret = cdev_add(&c_dev, first, 1)) < 0)
        {
            device_destroy(cl, first);
            class_destroy(cl);
            unregister_chrdev_region(first, 1);
            iounmap(vram);
            return ret;
        }
        return 0;
    }

    static void __exit vram_exit(void) /* Destructor */
    {
        cdev_del(&c_dev);
        device_destroy(cl, first);
        class_destroy(cl);
        unregister_chrdev_region(first, 1);
        iounmap(vram);
    }

    module_init(vram_init);
    module_exit(vram_exit);

    MODULE_LICENSE("GPL");
    MODULE_AUTHOR("Anil Kumar Pugalia <email_at_sarika-pugs_dot_com>");
    MODULE_DESCRIPTION("Video RAM Driver");

Summing up

Shweta then repeated the usual steps:

1. Build the “vram” driver (video_ram.ko file) by running make with a changed Makefile.
2. Load the driver using insmod video_ram.ko.
3. Write into /dev/vram, say, using echo -n "0123456789" > /dev/vram.
4. Read the /dev/vram contents using od -t x1 -v /dev/vram | less. (The usual cat /dev/vram can also be used, but that would give all the binary content. od -t x1 shows it as hexadecimal. For more details, run man od.)
5. Unload the driver using rmmod video_ram.

With half an hour still left for the end of the practical class, Shweta decided to walk around and possibly help somebody else with their experiments.
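The bounds handling at the top of my_read() and my_write() in video_ram.c is easy to get wrong, so it may help to isolate it as a pure function and test it in user space. The helper below is a hypothetical restatement of that logic, not part of the driver. It compares len against the remaining space instead of computing off + len, which is one way to sidestep integer overflow for very large requests.

```c
#include <stddef.h>

#define SKETCH_VRAM_SIZE 0x00020000UL /* same value as VRAM_SIZE in video_ram.c */

/* Number of bytes a transfer of len bytes starting at offset off may cover */
static size_t sketch_clamp(unsigned long off, size_t len)
{
    if (off >= SKETCH_VRAM_SIZE)
        return 0; /* at or past the end: nothing to transfer */
    if (len > SKETCH_VRAM_SIZE - off)
        return SKETCH_VRAM_SIZE - off; /* clip to the bytes that remain */
    return len;
}
```

For example, a 10-byte request at offset 0x1FFFE is clipped to the 2 bytes that remain, and any request at offset 0x20000 yields 0, which the caller sees as end of file.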

Device Drivers, Part 8: Accessing x86-Specific I/O-Mapped Hardware

This article, which is part of the series on Linux device drivers, continues the discussion on accessing hardware in Linux.

The second day in the Linux device drivers' laboratory was expected to be quite different from the typical software-oriented class. Apart from accessing and programming architecture-specific I/O mapped hardware in x86, it had a lot to offer first-timers with regard to reading hardware device manuals (commonly called data sheets) and how to understand them to write device drivers. In contrast, the previous session about generic architecture-transparent hardware interfacing was about mapping and accessing memory-mapped devices in Linux without any device-specific details.

x86-specific hardware interfacing

Unlike most other architectures, x86 has an additional hardware accessing mechanism, through direct I/O mapping. It is a direct 16-bit addressing scheme, and doesn't need mapping to a virtual address for access. These addresses are referred to as port addresses, or ports. Since this is an additional access mechanism, it has an additional set of x86 (assembly/machine code) instructions. And yes, there are the input instructions inb, inw, and inl for reading an 8-bit byte, a 16-bit word, and a 32-bit long word, respectively, from I/O mapped devices, through ports. The corresponding output instructions are outb, outw and outl, respectively. The equivalent C functions/macros (available through the header <asm/io.h>) are as follows:

    u8 inb(unsigned long port);
    u16 inw(unsigned long port);
    u32 inl(unsigned long port);
    void outb(u8 value, unsigned long port);
    void outw(u16 value, unsigned long port);
    void outl(u32 value, unsigned long port);

Figure 1: x86-specific I/O ports

apart from serial. Yes.The basic question that may arise relates to which devices are I/O mapped and what the port addresses of these devices are. open it up and check it out. As per x86-standard. how does one get the device number? Simple… by having a look at the device. to use the device. The data-sheet for this [PDF] can be downloaded as part of the self-extracting package [BIN file] used for the Linux device driver kit. does one get these device data sheets? Typically. The answer is pretty simple. Figure 1 shows a snippet of these mappings through the kernel window /proc/ioports. . as it is these registers that writers need to program. this is the least you may have to do to get going with the hardware. Simplest: serial port on x86 For example. If it is inside a desktop. A serial port is controlled by the serial controller device. it is time to peep into the data sheet of the PC16550D UART. the timer and RTC. Page 14 of the data sheet (also shown in Figure 2) shows the complete table of all the twelve 8-bit registers present in the UART PC16550D.esrijan. the typical UART used is the PC16550D. the first serial port is always I/O mapped from 0x3F8 to 0x3FF. in order to write device drivers. Generally speaking. commonly known as an UART (Universal Asynchronous Receiver/Transmitter) or at times a USART (Universal Synchronous/Asynchronous Receiver/Transmitter). Assuming all this has been done. all these devices and their mappings are predefined. On PCs. Then. and how. But what does this mapping mean? What do we do with this? How does it help us to use the serial port? That is where a data-sheet of the corresponding device needs to be looked up. an online search with the corresponding device number should yield their data-sheet links. Device driver writers need to understand the details of the registers of the device. to name a few. parallel and PCI bus interfaces.com. available at lddk. from where. The listing includes predefined DMA.

Figure 2: Registers of UART PC16550D

Each of the eight rows corresponds to the respective bit of the registers. Also, note that the register addresses start from 0 and go up to 7. The interesting thing about this is that a data sheet always gives the register offsets, which then need to be added to the base address of the device, to get the actual register addresses.

Who decides the base address and where is it obtained from? Base addresses are typically board/platform specific, unless they are dynamically configurable like in the case of PCI devices. In this case, i.e., a serial device on x86, it is dictated by the x86 architecture, and that precisely was the starting serial port address mentioned above: 0x3F8. Thus, the eight register offsets, 0 to 7, exactly map to the eight port addresses 0x3F8 to 0x3FF. So, these are the actual addresses to be read or written, for reading or writing the corresponding serial registers, to achieve the desired serial operations, as per the register descriptions.

All the serial register offsets and the register bit masks are defined in the header <linux/serial_reg.h>. So, rather than hard-coding these values from the data sheet, the corresponding macros could be used instead. All the following code uses these macros, along with the following:

    #define SERIAL_PORT_BASE 0x3F8

Operating on the device registers

To summarise the decoding of the PC16550D UART data sheet, here are a few examples of how to do read and write operations of the serial registers and their bits.

Reading and writing the ‘Line Control Register (LCR)’:

    u8 val;

    val = inb(SERIAL_PORT_BASE + UART_LCR /* 3 */);
    outb(val, SERIAL_PORT_BASE + UART_LCR /* 3 */);

Setting and clearing the ‘Divisor Latch Access Bit (DLAB)’ in LCR:

    u8 val;

    val = inb(SERIAL_PORT_BASE + UART_LCR /* 3 */);

    /* Setting DLAB */
    val |= UART_LCR_DLAB /* 0x80 */;
    outb(val, SERIAL_PORT_BASE + UART_LCR /* 3 */);

    /* Clearing DLAB */
    val &= ~UART_LCR_DLAB /* 0x80 */;
    outb(val, SERIAL_PORT_BASE + UART_LCR /* 3 */);

Reading and writing the ‘Divisor Latch’:

    u8 dlab;
    u16 val;

    dlab = inb(SERIAL_PORT_BASE + UART_LCR);
    dlab |= UART_LCR_DLAB; // Setting DLAB to access Divisor Latch
    outb(dlab, SERIAL_PORT_BASE + UART_LCR);

    val = inw(SERIAL_PORT_BASE + UART_DLL /* 0 */);
    outw(val, SERIAL_PORT_BASE + UART_DLL /* 0 */);

Blinking an LED

To get a real experience of low-level hardware access and Linux device drivers, the best way would be to play with the Linux device driver kit (LDDK) mentioned above. However, just for a feel of low-level hardware access, a blinking light-emitting diode (LED) may be tried, as follows:

Connect a light-emitting diode (LED) with a 330 ohm resistor in series across Pin 3 (Tx) and Pin 5 (Gnd) of the DB9 connector of your PC.
Pull up and down the transmit (Tx) line with a 500 ms delay, by loading and unloading the blink_led driver, using insmod blink_led.ko and rmmod blink_led, respectively.

Driver file blink_led.ko can be created from its source file blink_led.c by running make with the usual driver Makefile. Given below is the complete blink_led.c:

    #include <linux/module.h>
    #include <linux/version.h>
    #include <linux/types.h>
    #include <linux/delay.h>
    #include <asm/io.h>

    #include <linux/serial_reg.h>

    #define SERIAL_PORT_BASE 0x3F8

    int __init init_module(void)
    {
        int i;
        u8 data;

        data = inb(SERIAL_PORT_BASE + UART_LCR);
        for (i = 0; i < 5; i++)
        {
            /* Pulling the Tx line low */
            data |= UART_LCR_SBC;
            outb(data, SERIAL_PORT_BASE + UART_LCR);
            msleep(500);
            /* Defaulting the Tx line high */
            data &= ~UART_LCR_SBC;
            outb(data, SERIAL_PORT_BASE + UART_LCR);
            msleep(500);
        }
        return 0;
    }

    void __exit cleanup_module(void)
    {
    }

    MODULE_LICENSE("GPL");
    MODULE_AUTHOR("Anil Kumar Pugalia <email_at_sarika-pugs_dot_com>");
    MODULE_DESCRIPTION("Blinking LED Hack");

Looking ahead

You might have wondered why Shweta is missing from this article. She bunked all the classes! Watch out for the next article to find out why.
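The two ideas used throughout this article, base address plus register offset, and read-modify-write of individual register bits, can be rehearsed in user space without touching any port. The sketch below operates on a plain variable instead of calling inb() and outb(). The SKETCH_ constants are local stand-ins: the LCR offset (3) and DLAB mask (0x80) are the values quoted above, and 0x40 is the value of UART_LCR_SBC in <linux/serial_reg.h>.

```c
#include <stdint.h>

#define SKETCH_PORT_BASE 0x3F8 /* first serial port on x86 */
#define SKETCH_LCR       3     /* Line Control Register offset */
#define SKETCH_LCR_DLAB  0x80  /* Divisor Latch Access Bit */
#define SKETCH_LCR_SBC   0x40  /* Set Break Control bit */

/* Port address of a register: base address plus the data sheet offset */
static uint16_t sketch_port(unsigned int offset)
{
    return (uint16_t)(SKETCH_PORT_BASE + offset);
}

/* Value a register would hold after setting the bits in mask */
static uint8_t sketch_set_bits(uint8_t reg, uint8_t mask)
{
    return (uint8_t)(reg | mask);
}

/* Value a register would hold after clearing the bits in mask */
static uint8_t sketch_clear_bits(uint8_t reg, uint8_t mask)
{
    return (uint8_t)(reg & (uint8_t)~mask);
}
```

In a driver, the value passed to these helpers would come from inb(), and the result would be written back with outb() to the address given by sketch_port().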

Device Drivers, Part 9: I/O Control in Linux

This article, which is part of the series on Linux device drivers, talks about the typical ioctl() implementation and usage in Linux.

"Get me a laptop, and tell me about the x86 hardware interfacing experiments in the last Linux device drivers' lab session, and also about what's planned for the next session," cried Shweta, exasperated at being confined to bed due to food poisoning at a friend's party. Shweta's friends summarised the session, and told her that they didn't know what the upcoming sessions, though related to hardware, would be about. When the doctor requested them to leave, they took the opportunity to plan and talk about the most common hardware-controlling operation, ioctl().

Introducing ioctl()
Input/Output Control (ioctl, in short) is a common operation, or system call, available in most driver categories. It is a one-bill-fits-all kind of system call. If there is no other system call that meets a particular requirement, then ioctl() is the one to use. Practical examples include volume control for an audio device, display configuration for a video device, reading device registers, and so on. Basically, it covers anything to do with device input/output or device-specific operations, yet it is versatile enough for any kind of operation (for example, for debugging a driver by querying driver data structures).

The question is: how can all this be achieved by a single function prototype? The trick lies in using its two key parameters: command and argument. The command is a number representing an operation. The argument is the corresponding parameter for the operation. The ioctl() function implementation does a switch … case over the command to implement the corresponding functionality. The following has been its prototype in the Linux kernel for quite some time:

    int ioctl(struct inode *i, struct file *f, unsigned int cmd, unsigned long arg);

However, from kernel 2.6.35, it changed to:

    long ioctl(struct file *f, unsigned int cmd, unsigned long arg);

If there is a need for more arguments, all of them are put in a structure, and a pointer to the structure becomes the 'one' command argument. Whether integer or pointer, the argument is taken as a long integer in kernel-space, and accordingly type-cast and processed.

ioctl() is typically implemented as part of the corresponding driver, and then an appropriate function pointer is initialised with it, exactly as in other system calls like open(), read(), etc. For example, in character drivers, it is the ioctl or unlocked_ioctl (since kernel 2.6.35) function pointer field in the struct file_operations that is to be initialised.

Again, like other system calls, it can be equivalently invoked from user-space using the ioctl() system call, prototyped in <sys/ioctl.h> as:

    int ioctl(int fd, int cmd, ...);

Here, cmd is the same as what is implemented in the driver's ioctl(), and the variable argument construct (...) is a hack to be able to pass any type of argument (though only one) to the driver's ioctl(). Other parameters will be ignored.
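The command-and-argument trick described above can be tried out entirely in user space. The following is a minimal sketch, not kernel code: demo_ioctl(), demo_arg_t and the DEMO_* command numbers are hypothetical names invented for this illustration, and the cast from unsigned long to a pointer assumes a platform where a long is as wide as a pointer (as on Linux).

```c
#include <stdint.h>

/* Hypothetical argument structure: several values travel behind one pointer. */
typedef struct {
    int volume;
    int balance;
} demo_arg_t;

/* Hypothetical command numbers, chosen only for this sketch. */
enum { DEMO_GET = 1, DEMO_SET = 2, DEMO_RESET = 3 };

/* Stand-in for the driver's internal state. */
static demo_arg_t demo_state = { 5, 0 };

/* Same shape as a driver's ioctl(): one command plus one long-sized argument. */
long demo_ioctl(unsigned int cmd, unsigned long arg)
{
    /* The argument arrives as a plain number and is type-cast back to a pointer. */
    demo_arg_t *p = (demo_arg_t *)(uintptr_t)arg;

    switch (cmd) {
    case DEMO_GET:              /* copy state out to the caller */
        *p = demo_state;
        break;
    case DEMO_SET:              /* copy the caller's values in */
        demo_state = *p;
        break;
    case DEMO_RESET:            /* this command needs no argument */
        demo_state.volume = 0;
        demo_state.balance = 0;
        break;
    default:                    /* unknown command */
        return -1;
    }
    return 0;
}
```

A caller would pass the address of a demo_arg_t cast to unsigned long, for example demo_ioctl(DEMO_SET, (unsigned long)(uintptr_t)&a). A real driver must not dereference the pointer directly as done here; it has to go through copy_to_user() and copy_from_user(), as the driver listing in this article does.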

Querying driver-internal variables
To better understand the boring theory explained above, here's the code set for the "debugging a driver" example mentioned earlier. This driver has three static global variables: status, dignity, and ego, which need to be queried and possibly operated from an application. The header file query_ioctl.h defines the corresponding commands and command argument type. A listing follows:

    #ifndef QUERY_IOCTL_H
    #define QUERY_IOCTL_H
    #include <linux/ioctl.h>

    typedef struct
    {
        int status, dignity, ego;
    } query_arg_t;

    #define QUERY_GET_VARIABLES _IOR('q', 1, query_arg_t *)
    #define QUERY_CLR_VARIABLES _IO('q', 2)
    #define QUERY_SET_VARIABLES _IOW('q', 3, query_arg_t *)

    #endif

Note that both the command and command argument type definitions need to be shared across the driver (in kernel-space) and the application (in user-space). Thus, these definitions are commonly put into header files for each space.

Using these, the driver's ioctl() implementation in query_ioctl.c would be as follows:

    #include <linux/kernel.h>
    #include <linux/module.h>
    #include <linux/version.h>
    #include <linux/fs.h>
    #include <linux/cdev.h>
    #include <linux/device.h>
    #include <linux/errno.h>
    #include <asm/uaccess.h>

    #include "query_ioctl.h"

    #define FIRST_MINOR 0
    #define MINOR_CNT 1

    static dev_t dev;
    static struct cdev c_dev;
    static struct class *cl;
    static int status = 1, dignity = 3, ego = 5;

    static int my_open(struct inode *i, struct file *f)
    {
        return 0;
    }
    static int my_close(struct inode *i, struct file *f)
    {
        return 0;
    }
    #if (LINUX_VERSION_CODE < KERNEL_VERSION(2,6,35))
    static int my_ioctl(struct inode *i, struct file *f, unsigned int cmd, unsigned long arg)
    #else
    static long my_ioctl(struct file *f, unsigned int cmd, unsigned long arg)
    #endif
    {
        query_arg_t q;

        switch (cmd)
        {
            case QUERY_GET_VARIABLES:
                q.status = status;
                q.dignity = dignity;
                q.ego = ego;
                if (copy_to_user((query_arg_t *)arg, &q, sizeof(query_arg_t)))
                {
                    return -EACCES;
                }
                break;
            case QUERY_CLR_VARIABLES:
                status = 0;
                dignity = 0;
                ego = 0;
                break;
            case QUERY_SET_VARIABLES:
                if (copy_from_user(&q, (query_arg_t *)arg, sizeof(query_arg_t)))
                {
                    return -EACCES;
                }
                status = q.status;
                dignity = q.dignity;
                ego = q.ego;
                break;
            default:
                return -EINVAL;
        }

        return 0;
    }

    static struct file_operations query_fops =
    {
        .owner = THIS_MODULE,
        .open = my_open,
        .release = my_close,
    #if (LINUX_VERSION_CODE < KERNEL_VERSION(2,6,35))
        .ioctl = my_ioctl
    #else
        .unlocked_ioctl = my_ioctl
    #endif
    };

    static int __init query_ioctl_init(void)
    {
        int ret;
        struct device *dev_ret;

        if ((ret = alloc_chrdev_region(&dev, FIRST_MINOR, MINOR_CNT, "query_ioctl")) < 0)
        {
            return ret;
        }

        cdev_init(&c_dev, &query_fops);

        if ((ret = cdev_add(&c_dev, dev, MINOR_CNT)) < 0)
        {
            /* Release the device numbers allocated above before bailing out */
            unregister_chrdev_region(dev, MINOR_CNT);
            return ret;
        }

        if (IS_ERR(cl = class_create(THIS_MODULE, "char")))
        {
            cdev_del(&c_dev);
            unregister_chrdev_region(dev, MINOR_CNT);
            return PTR_ERR(cl);
        }
        if (IS_ERR(dev_ret = device_create(cl, NULL, dev, NULL, "query")))
        {
            class_destroy(cl);
            cdev_del(&c_dev);
            unregister_chrdev_region(dev, MINOR_CNT);
            return PTR_ERR(dev_ret);
        }

        return 0;
    }

    static void __exit query_ioctl_exit(void)
    {
        device_destroy(cl, dev);
        class_destroy(cl);
        cdev_del(&c_dev);
        unregister_chrdev_region(dev, MINOR_CNT);
    }

    module_init(query_ioctl_init);
    module_exit(query_ioctl_exit);

    MODULE_LICENSE("GPL");
    MODULE_AUTHOR("Anil Kumar Pugalia <email_at_sarika-pugs_dot_com>");
    MODULE_DESCRIPTION("Query ioctl() Char Driver");

And finally, the corresponding invocation functions from the application query_app.c would be as follows:

    #include <stdio.h>
    #include <sys/types.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <string.h>
    #include <sys/ioctl.h>

    #include "query_ioctl.h"

    void get_vars(int fd)
    {
        query_arg_t q;

        if (ioctl(fd, QUERY_GET_VARIABLES, &q) == -1)
        {
            perror("query_apps ioctl get");
        }
        else
        {
            printf("Status : %d\n", q.status);
            printf("Dignity: %d\n", q.dignity);
            printf("Ego    : %d\n", q.ego);
        }
    }
    void clr_vars(int fd)
    {
        if (ioctl(fd, QUERY_CLR_VARIABLES) == -1)
        {
            perror("query_apps ioctl clr");
        }
    }
    void set_vars(int fd)
    {
        int v;
        query_arg_t q;

        printf("Enter Status: ");
        scanf("%d", &v);
        getchar();
        q.status = v;
        printf("Enter Dignity: ");
        scanf("%d", &v);
        getchar();
        q.dignity = v;
        printf("Enter Ego: ");
        scanf("%d", &v);
        getchar();
        q.ego = v;

        if (ioctl(fd, QUERY_SET_VARIABLES, &q) == -1)
        {
            perror("query_apps ioctl set");
        }
    }

    int main(int argc, char *argv[])
    {
        char *file_name = "/dev/query";
        int fd;
        enum
        {
            e_get,
            e_clr,
            e_set
        } option;

        if (argc == 1)
        {
            option = e_get;
        }
        else if (argc == 2)
        {
            if (strcmp(argv[1], "-g") == 0)
            {
                option = e_get;
            }
            else if (strcmp(argv[1], "-c") == 0)
            {
                option = e_clr;
            }
            else if (strcmp(argv[1], "-s") == 0)
            {
                option = e_set;
            }
            else
            {
                fprintf(stderr, "Usage: %s [-g | -c | -s]\n", argv[0]);
                return 1;
            }
        }
        else
        {
            fprintf(stderr, "Usage: %s [-g | -c | -s]\n", argv[0]);
            return 1;
        }
        fd = open(file_name, O_RDWR);
        if (fd == -1)
        {
            perror("query_apps open");
            return 2;
        }

        switch (option)
        {
            case e_get:
                get_vars(fd);
                break;
            case e_clr:
                clr_vars(fd);
                break;
            case e_set:
                set_vars(fd);
                break;
            default:
                break;
        }

        close(fd);

        return 0;
    }

Now try out query_app.c and query_ioctl.c with the following operations:

- Build the query_ioctl driver (query_ioctl.ko file) and the application (query_app file) by running make, using the following Makefile:

    # If called directly from the command line, invoke the kernel build system.
    ifeq ($(KERNELRELEASE),)

    	KERNEL_SOURCE := /usr/src/linux
    	PWD := $(shell pwd)
    default: module query_app

    module:
    	$(MAKE) -C $(KERNEL_SOURCE) SUBDIRS=$(PWD) modules

    clean:
    	$(MAKE) -C $(KERNEL_SOURCE) SUBDIRS=$(PWD) clean
    	${RM} query_app

    # Otherwise KERNELRELEASE is defined; we've been invoked from the
    # kernel build system and can use its language.
    else

    	obj-m := query_ioctl.o

    endif

- Load the driver using insmod query_ioctl.ko.
- With appropriate privileges and command-line arguments, run the application query_app:
    ./query_app      (to display the driver variables)
    ./query_app -c   (to clear the driver variables)
    ./query_app -g   (to display the driver variables)
    ./query_app -s   (to set the driver variables; not mentioned above)
- Unload the driver using rmmod query_ioctl.

Defining the ioctl() commands
"Visiting time is over," yelled the security guard. Shweta thanked her friends since she could understand most of the code now, including the need for copy_to_user(), as learnt earlier. But she wondered about _IOR, _IOW, etc., which were used in defining commands in query_ioctl.h. These are usual numbers only, as mentioned earlier for an ioctl() command. It is just that some useful command-related information is additionally encoded into these numbers using various macros, following the Linux kernel's convention for ioctl command numbers. The convention uses 32-bit command numbers, formed of four components embedded into the [31:0] bits:

1. The direction of command operation [bits 31:30]: read, write, both, or none. It is filled in by the corresponding macro (_IOR, _IOW, _IOWR, _IO).
2. The size of the command argument [bits 29:16]: computed using sizeof() with the command argument's type, which is the third argument to these macros.
3. The 8-bit magic number [bits 15:8]: used to render the commands unique enough, and typically an ASCII character (the first argument to these macros).
4. The original command number [bits 7:0]: the actual command number (1, 2, 3, ...), defined as per our requirement. It is the second argument to these macros.

Check out the header <asm-generic/ioctl.h> for implementation details.
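To see the four components of a command number in practice, the commands from query_ioctl.h can be taken apart again in user space. The sketch below assumes a Linux system with the kernel headers installed; cmd_fields_t and decode_cmd() are hypothetical helpers written for this illustration, while the _IOC_DIR, _IOC_TYPE, _IOC_NR and _IOC_SIZE macros come from the kernel's own ioctl header.

```c
#include <linux/ioctl.h>   /* _IO, _IOR, _IOW and the _IOC_* decoding macros */

/* Same definitions as in the article's query_ioctl.h */
typedef struct {
    int status, dignity, ego;
} query_arg_t;

#define QUERY_GET_VARIABLES _IOR('q', 1, query_arg_t *)
#define QUERY_CLR_VARIABLES _IO('q', 2)
#define QUERY_SET_VARIABLES _IOW('q', 3, query_arg_t *)

/* Hypothetical container for the four fields of a command number. */
typedef struct {
    unsigned int dir;    /* _IOC_NONE, _IOC_READ, _IOC_WRITE or both */
    unsigned int size;   /* sizeof() of the type given as third macro argument */
    unsigned int magic;  /* the magic character, 'q' here */
    unsigned int nr;     /* the actual command number */
} cmd_fields_t;

/* Split a command number into its components using the kernel's macros. */
cmd_fields_t decode_cmd(unsigned long cmd)
{
    cmd_fields_t f;

    f.dir = _IOC_DIR(cmd);
    f.size = _IOC_SIZE(cmd);
    f.magic = _IOC_TYPE(cmd);
    f.nr = _IOC_NR(cmd);
    return f;
}
```

Note that because query_ioctl.h passes query_arg_t * as the type, the size field of QUERY_GET_VARIABLES and QUERY_SET_VARIABLES holds the size of a pointer rather than the size of the structure. Comparing the direction against the _IOC_READ and _IOC_WRITE names, rather than against literal numbers, keeps the check portable, since the numeric values differ between architectures.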

Device Drivers, Part 10: Kernel-Space Debuggers in Linux

This article, which is part of the series on Linux device drivers, talks about kernel-space debugging in Linux.

Shweta, back from hospital, was relaxing in the library, reading various books. Ever since she learned of the ioctl way of debugging, she was impatient to find out more about debugging in kernel-space. She was curious about how and where to run the kernel-space debugger, if there was any. This was in contrast with application/user-space debugging, where we have the OS running underneath, and a shell or a GUI over it to run the debugger (like gdb, and the data display debugger, ddd). Then she came across this interesting kernel-space debugging mechanism using kgdb, provided as part of the kernel itself, since kernel 2.6.26.

The debugger challenge in kernel-space
As we need some interface to be up to run a debugger to debug anything, a kernel debugger could be visualised in two possible ways:

- Put the debugger into the kernel itself, accessible via the usual console. For example, in the case of kdb, which was not official until kernel 2.6.35, one had to download source code (two sets of patches: one architecture-dependent, one architecture-independent) from the project's FTP site and then patch these into the kernel source. However, since kernel 2.6.35, the majority of it is in the officially released kernel source. In either case, kdb support needs to be enabled in the kernel source, and the kernel compiled, installed and booted. The boot screen itself would give the kdb debugging interface.
- Put a minimal debugging server into the kernel; a client would connect to it from a remote host or local user-space over some interface (say serial or Ethernet). This is kgdb, the kernel's gdb server, to be used with gdb as its client. Since kernel 2.6.26, its serial interface is part of the official kernel release. However, if you're interested in a network interface, you still need to patch with one of the releases from the kgdb project page. In either case, you need to enable kgdb support in the kernel, recompile, install and boot the new kernel.

Please note that in both the above cases, the complete kernel source for the kernel to be debugged is needed, unlike for building modules, where just headers are sufficient. Here is how to play around with kgdb over the serial interface.

Setting up the Linux kernel with kgdb
Here are the prerequisites:

- Either the kernel source package for the running kernel should be installed on your system, or a corresponding kernel source release should have been downloaded from kernel.org.

First of all, the kernel to be debugged needs to have kgdb enabled and built into it. To achieve that, the kernel source has to be configured with CONFIG_KGDB=y. Additionally, for kgdb over serial, CONFIG_KGDB_SERIAL_CONSOLE=y needs to be configured. And CONFIG_DEBUG_INFO is preferred for symbolic data to be built into the kernel, to make debugging with gdb more meaningful. CONFIG_FRAME_POINTER=y enables frame pointers in the kernel, allowing gdb to construct more accurate stack back-traces. All these options are available under "Kernel hacking" in the menu obtained in the kernel source directory (preferably as root, or using sudo), by issuing the following commands:

    $ make mrproper    # To clean up properly
    $ make oldconfig   # Configure the kernel same as the current running one
    $ make menuconfig  # Start the ncurses based menu for further configuration

Figure 1: Configuring kernel options for kgdb

See the highlighted selections in Figure 1, for how and where these options would be:

    "KGDB: kernel debugging with remote gdb" -> CONFIG_KGDB
    "KGDB: use kgdb over the serial console" -> CONFIG_KGDB_SERIAL_CONSOLE
    "Compile the kernel with debug info" -> CONFIG_DEBUG_INFO
    "Compile the kernel with frame pointers" -> CONFIG_FRAME_POINTER

Once configuration is saved, build the kernel (run make), and then run make install to install it, along with adding an entry for the installed kernel in the GRUB configuration file. Depending on the distribution, the GRUB configuration file may be /boot/grub/menu.lst, /etc/grub.cfg, or something similar. Once installed, the kgdb-related kernel boot parameters need to be added to this new entry, as shown in the highlighted text in Figure 2.

Figure 2: GRUB configuration for kgdb

kgdboc is for gdb connecting over the console, and the basic format is kgdboc=<serial_device>,<baud-rate> where:

- <serial_device> is the serial device file (port) on the system running the kernel to be debugged
- <baud-rate> is the baud rate of this serial port

kgdbwait tells the kernel to delay booting till a gdb client connects to it; this parameter should be given only after kgdboc.

With this, we're ready to begin. Make a copy of the vmlinux kernel image for use on the gdb client system. Reboot, and at the GRUB menu, choose the new kernel, and then it will wait for gdb to connect over the serial port.

All the above snapshots are with kernel version 2.6.33.14. The same should work for any 2.6.3x release of the kernel source. Also, the snapshots for kgdb are captured over the serial device file /dev/ttyS0, i.e., the first serial port.

Setting up gdb on another system
Following are the prerequisites:

- Serial ports of the system to be debugged, and the other system to run gdb, should be connected using a null modem (i.e., a cross-over serial) cable.
- The vmlinux kernel image built, with kgdb enabled, needs to be copied from the system to be debugged, into the working directory on the system where gdb is going to be run.

To get gdb to connect to the waiting kernel, launch gdb from the shell and run these commands:

    (gdb) file vmlinux
    (gdb) set remote interrupt-sequence Ctrl-C
    (gdb) set remotebaud 115200
    (gdb) target remote /dev/ttyS0
    (gdb) continue

In the above commands, vmlinux is the kernel image copied from the system to be debugged.

Debugging using gdb with kgdb
After this, it is all like debugging an application from gdb. One may stop execution using Ctrl+C, add break points using b[reak], step through execution using s[tep] or n[ext] …, the usual gdb way. There are enough GDB tutorials available online, if you need them. In fact, if you are not comfortable with text-based GDB, use any of the standard GUI tools over gdb, like ddd, Eclipse, etc.

Summing up
By now, Shweta was excited about wanting to try out kgdb. Since she needed two systems to try it out, she went to the Linux device drivers' lab. There, she set up the systems and ran gdb as described above.

Device Drivers, Part 11: USB Drivers in Linux

This article, which is part of the series on Linux device drivers, gets you started with writing your first USB driver in Linux.

Pugs' pen drive was the device Shweta was playing with, when both of them sat down to explore the world of USB drivers in Linux. The fastest way to get the hang of it, and Pugs' usual way, was to pick up a USB device, and write a driver for it, to experiment with. So they chose a pen drive (a.k.a. USB stick) that was at hand: a JetFlash from Transcend, with vendor ID 0x058f and product ID 0x6387.

USB device detection in Linux
Whether a driver for a USB device is there or not on a Linux system, a valid USB device will always be detected at the hardware and kernel spaces of a USB-enabled Linux system, since it is designed (and detected) as per the USB protocol specifications. Hardware-space detection is done by the USB host controller, typically a native bus device, like a PCI device on x86 systems. The corresponding host controller driver would pick and translate the low-level physical layer information into higher-level USB protocol-specific information. The USB protocol formatted information about the USB device is then populated into the generic USB core layer (the usbcore driver) in kernel-space, thus enabling the detection of a USB device in kernel-space, even without having its specific driver.

After this, it is up to various drivers, interfaces, and applications (which are dependent on the various Linux distributions), to have the user-space view of the detected devices. Figure 1 shows a top-to-bottom view of the USB subsystem in Linux.

Figure 1: USB subsystem in Linux

A basic listing of all detected USB devices can be obtained using the lsusb command, as root. Figure 2 shows this, with and without the pen drive plugged in. A -v option to lsusb provides detailed information.

Figure 2: Output of lsusb

In many Linux distributions like Mandriva, Fedora, … the usbfs driver is loaded as part of the default configuration. This enables the detected USB device details to be viewed in a more techno-friendly way through the /proc window, using cat /proc/bus/usb/devices. Figure 3 shows a typical snippet of the same, clipped around the pen drive-specific section. The listing basically contains one such section for each valid USB device detected on the system.

Figure 3: USB's proc window snippet

Decoding a USB device section
To further decode these sections, a valid USB device needs to be understood first. All valid USB devices contain one or more configurations. A configuration of a USB device is like a profile, where the default one is the commonly used one. As such, Linux supports only one configuration per device: the default one. For every configuration, the device may have one or more interfaces. An interface corresponds to a function provided by the device.

There would be as many interfaces as the number of functions provided by the device. So, say an MFD (multi-function device) USB printer can do printing, scanning and faxing, then it most likely would have at least three interfaces, one for each of the functions. So, unlike other device drivers, a USB device driver is typically associated/written per interface, rather than the device as a whole. This means that one USB device may have multiple device drivers, and different device interfaces may have the same driver, though, of course, one interface can have a maximum of one driver only.

It is okay and fairly common to have a single USB device driver for all the interfaces of a USB device. The Driver=... entry in the proc window output (Figure 3) shows the interface to driver mapping, a (none) indicating no associated driver.

For every interface, there would be one or more end-points. An end-point is like a pipe for transferring information either into or from the interface of the device, depending on the functionality. Based on the type of information, the endpoints have four types: Control, Interrupt, Bulk and Isochronous.

As per the USB protocol specification, all valid USB devices have an implicit special control end-point zero, the only bi-directional end-point. Figure 4 shows the complete pictorial representation of a valid USB device, based on the above explanation.

Figure 4: USB device overview

Coming back to the USB device sections (Figure 3), the first letter on each line represents the various parts of the USB device specification just explained. For example, D for device, C for configuration, I for interface, E for endpoint, etc. Details about these and various others are available in the kernel source, in Documentation/usb/proc_usb_info.txt.

The USB pen drive driver registration
"Seems like there are so many things to know about the USB protocol, to be able to write the first USB driver itself: device configuration, interfaces, transfer pipes, their four types, and so many other symbols like T, B, S, … under a USB device specification," sighed Shweta.

"Yes, but don't you worry. All of that can be covered in detail later. Let's do first things first: get the pen drive's interface associated with our USB device driver (pen_register.ko)," consoled Pugs.

Like any other Linux device driver, here, too, the constructor and the destructor are required; it is basically the same driver template that has been used for all the drivers. However, the content would vary, as this is a hardware protocol layer driver, i.e., a horizontal driver, unlike a character driver, which was one of the vertical drivers discussed earlier. The difference would be that instead of registering with and unregistering from VFS, here this would be done with the corresponding protocol layer, the USB core in this case; and instead of providing a user-space interface like a device file, it would get connected with the actual device in hardware-space.

The USB core APIs for the same are as follows (prototyped in <linux/usb.h>):

    int usb_register(struct usb_driver *driver);
    void usb_deregister(struct usb_driver *);

As part of the usb_driver structure, the fields to be provided are the driver's name, the ID table for auto-detecting the particular device, and the two callback functions to be invoked by the USB core during a hot plugging and a hot removal of the device, respectively.

Putting it all together, pen_register.c would look like what follows:

    #include <linux/module.h>
    #include <linux/kernel.h>
    #include <linux/usb.h>

    static int pen_probe(struct usb_interface *interface, const struct usb_device_id *id)
    {
        printk(KERN_INFO "Pen drive (%04X:%04X) plugged\n", id->idVendor, id->idProduct);
        return 0;
    }

    static void pen_disconnect(struct usb_interface *interface)
    {
        printk(KERN_INFO "Pen drive removed\n");
    }

    static struct usb_device_id pen_table[] =
    {
        { USB_DEVICE(0x058F, 0x6387) },
        {} /* Terminating entry */
    };
    MODULE_DEVICE_TABLE (usb, pen_table);

    static struct usb_driver pen_driver =
    {
        .name = "pen_driver",
        .id_table = pen_table,
        .probe = pen_probe,
        .disconnect = pen_disconnect,
    };

    static int __init pen_init(void)
    {
        return usb_register(&pen_driver);
    }

    static void __exit pen_exit(void)
    {
        usb_deregister(&pen_driver);
    }

    module_init(pen_init);
    module_exit(pen_exit);

    MODULE_LICENSE("GPL");
    MODULE_AUTHOR("Anil Kumar Pugalia <email_at_sarika-pugs_dot_com>");
    MODULE_DESCRIPTION("USB Pen Registration Driver");

Then, the usual steps for any Linux device driver may be repeated:

- Build the driver (.ko file) by running make.
- Load the driver using insmod.
- List the loaded modules using lsmod.
- Unload the driver using rmmod.

But surprisingly, the results wouldn't be as expected. Check dmesg and the proc window to see the various logs and details. This is not because a USB driver is different from a character driver, but there's a catch. Figure 3 shows that the pen drive has one interface (numbered 0), which is already associated with the usual usb-storage driver.

Now, in order to get our driver associated with that interface, we need to unload the usb-storage driver (i.e., rmmod usb-storage) and replug the pen drive. Once that's done, the results would be as expected. Figure 5 shows a glimpse of the possible logs and a proc window snippet. Repeat hot-plugging in and hot-plugging out the pen drive to observe the probe and disconnect calls in action.

Figure 5: Pen driver in action

Summing up
"Finally! Something in action!" a relieved Shweta said. "But it seems like there are so many things (like the device ID table, probe, disconnect, etc.), yet to be understood to get a complete USB device driver in place."

"Yes, you are right. Let's take them up, one by one, with breaks," replied Pugs, taking a break himself.

Device Drivers, Part 12: USB Drivers in Linux Continued

The 12th part of the series on Linux device drivers takes you further along the path to writing your first USB driver in Linux, a continuation from the previous article.

Pugs continued, "Let's build upon the USB device driver coded in our previous session, using the same handy JetFlash pen drive from Transcend, with the vendor ID 0x058f and product ID 0x6387. For that, let's dig further into the USB protocol, and then convert our learning into code."

USB endpoints and their types
Depending on the type and attributes of information to be transferred, a USB device may have one or more endpoints, each belonging to one of the following four categories:

- Control: to transfer control information. Examples include resetting the device, querying information about the device, etc. All USB devices always have the default control endpoint, numbered zero.
- Interrupt: for small and fast data transfers, typically of up to 8 bytes. Examples include data transfer for serial ports, human interface devices (HIDs) like keyboards, mice, etc.
- Bulk: for big but comparatively slower data transfers. A typical example is data transfers for mass-storage devices.
- Isochronous: for big data transfers with a bandwidth guarantee, though data integrity may not be guaranteed. Typical practical usage examples include transfers of time-sensitive data like audio, video, etc.

Additionally, all but control endpoints could be "in" or "out", indicating the direction of data transfer; "in" indicates data flow from the USB device to the host machine, and "out", the other way. Technically, an endpoint is identified using an 8-bit number, the most significant bit (MSB) of which indicates the direction: 0 means "out", and 1 means "in". Control endpoints are bi-directional, and the MSB is ignored.

Figure 1 shows a typical snippet of USB device specifications for devices connected on a system.
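The 8-bit endpoint address described above can be decoded with two bit masks. This is a small user-space sketch: ep_is_in() and ep_number() are hypothetical helper names, and the fact that the low four bits carry the endpoint number comes from the USB specification rather than from this article. The mask values mirror the kernel's USB_ENDPOINT_DIR_MASK and USB_ENDPOINT_NUMBER_MASK.

```c
#include <stdbool.h>
#include <stdint.h>

#define EP_DIR_MASK 0x80u   /* MSB: 1 means "in" (device to host), 0 means "out" */
#define EP_NUM_MASK 0x0Fu   /* low four bits: the endpoint number */

/* True for an "in" endpoint, i.e. data flows from the device to the host. */
bool ep_is_in(uint8_t addr)
{
    return (addr & EP_DIR_MASK) != 0;
}

/* Endpoint number with the direction bit stripped off. */
unsigned int ep_number(uint8_t addr)
{
    return addr & EP_NUM_MASK;
}
```

With these helpers, an address of 0x81 decodes to endpoint 1 in the "in" direction, while 0x01 is the same endpoint number in the "out" direction.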

Figure 1: USB's proc window snippet

To be specific, the E: lines in the figure show examples of an interrupt endpoint of a UHCI Host Controller, and two bulk endpoints of the pen drive under consideration. Also, the endpoint numbers (in hex) are, respectively, 0x81, 0x01 and 0x82. The MSB of the first and third is 1, indicating 'in' endpoints, represented by (I) in the figure; the second is an (O) or 'out' endpoint. MxPS specifies the maximum packet size, i.e., the data size that can be transferred in a single go. Again, as expected, for the interrupt endpoint, it is 2 (<=8), and 64 for the bulk endpoints. Ivl specifies the interval in milliseconds to be given between two consecutive data packet transfers for proper transfer, and is more significant for the interrupt endpoints.

Decoding a USB device section
As we have just discussed regarding the E: line, it is the right time to decode the relevant fields of others as well. In short, these lines in a USB device section give a complete overview of the device as per the USB specifications, as discussed in our previous article. Refer back to Figure 1. The first letter of the first line of every device section is a T, indicating the position of the device in the USB tree, uniquely identified by the triplet <usb bus number, usb tree level, usb port>. D represents the device descriptor, containing at least the device version, device class/category, and the number of configurations available for this device. There would be as many C lines as the number of configurations, though typically, it is one. C, the configuration descriptor, contains its index, the device attributes in this configuration, the maximum power (actually, current) the device would draw in this configuration, and the number of interfaces under this configuration. Depending on this, there would be at least that many I lines. There could be more in case of an interface having alternates, i.e., the same interface number but with different properties, a typical scenario for Web-cams.

I represents the interface descriptor with its index, alternate number, the functionality class/category of this interface, the driver associated with this interface, and the number of endpoints under this interface.

The interface class may or may not be the same as that of the device class. And depending on the number of endpoints, there would be as many E lines, details of which have already been discussed earlier. The * after the C and I represents the currently active configuration and interface, respectively. The P line provides the vendor ID, product ID, and the product revision. S lines are string descriptors showing up some vendor-specific descriptive information about the device.

"Peeping into cat /proc/bus/usb/devices is good in order to figure out whether a device has been detected or not, and possibly to get the first-cut overview of the device. But most probably this information would be required to write the driver for the device as well. So, is there a way to access it using C code?" Shweta asked.

"Yes, definitely; that's what I am going to tell you about, next. Do you remember that as soon as a USB device is plugged into the system, the USB host controller driver populates its information into the generic USB core layer? To be precise, it puts that into a set of structures embedded into one another, exactly as per the USB specifications," Pugs replied. The following are the exact data structures defined in <linux/usb.h>, ordered here in reverse, for flow clarity:

    struct usb_device
    {
        …
        struct usb_device_descriptor descriptor;
        struct usb_host_config *config, *actconfig;
        …
    };
    struct usb_host_config
    {
        struct usb_config_descriptor desc;
        …
        struct usb_interface *interface[USB_MAXINTERFACES];
        …
    };
    struct usb_interface
    {
        struct usb_host_interface *altsetting /* array */, *cur_altsetting;
        …
    };
    struct usb_host_interface
    {
        struct usb_interface_descriptor desc;
        struct usb_host_endpoint *endpoint /* array */;
        …
    };
    struct usb_host_endpoint
    {
        struct usb_endpoint_descriptor desc;
        …
    };

So, with access to the struct usb_device handle for a specific device, all the USB-specific information about the device can be decoded, as shown through the /proc window. But how does one get the device handle? In fact, the device handle is not available directly in a driver; rather, the per-interface handles (pointers to struct usb_interface) are available, as USB drivers are written for device interfaces rather than the device as a whole. Recall that the probe and disconnect callbacks, which are invoked by the USB core for every interface of the registered device, have the corresponding interface handle as their first parameter. Refer to the prototypes below:

const struct iface_desc = interface->cur_altsetting. endpoint->bmAttributes). id- printk(KERN_INFO "ED[%d]->bEndpointAddress: 0x%02X\n". with the interface pointer. id->idVendor.bInterfaceNumber). void (*disconnect)(struct usb_interface *interface).h> 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 static struct usb_device *device. } device = interface_to_usbdev(interface).c): 1 #include <linux/module.bNumEndpoints. static int pen_probe(struct usb_interface *interface. endpoint->bEndpointAddress). } static struct usb_device_id pen_table[] = { { USB_DEVICE(0x058F. usb_device_id *id) { struct usb_host_interface *iface_desc. all information about the corresponding interface can be accessed — and to get the container device handle. printk(KERN_INFO "Pen i/f %d now probed: (%04X:%04X)\n". iface_desc->desc. int i. printk(KERN_INFO "ID->bNumEndpoints: %02X\n".h> 2 #include <linux/usb.bInterfaceClass).h> #include <linux/kernel. the following macro comes to the rescue: struct usb_device device = interface_to_usbdev(interface). i < iface_desc->desc. Adding this new learning into last month’s registration-only driver gets the following code listing (pen_info.desc.bNumEndpoints). struct usb_endpoint_descriptor *endpoint. interface->cur_altsetting->desc. const struct usb_device_id *id). return 0. printk(KERN_INFO "ED[%d]->wMaxPacketSize: 0x%04X (%d)\n". endpoint->wMaxPacketSize. i++) { endpoint = &iface_desc->endpoint[i]. So. printk(KERN_INFO "ID->bInterfaceClass: %02X\n". endpoint>wMaxPacketSize). for (i = 0. i. } static void pen_disconnect(struct usb_interface *interface) { printk(KERN_INFO "Pen i/f %d now disconnected\n". iface_desc->desc. {} /* Terminating entry */ }. . i. printk(KERN_INFO "ED[%d]->bmAttributes: 0x%02X\n".bInterfaceNumber. i.int (*probe)(struct usb_interface *interface. 0x6387) }. iface_desc->desc. >idProduct).

    MODULE_DEVICE_TABLE (usb, pen_table);

    static struct usb_driver pen_driver =
    {
        .name = "pen_driver",
        .probe = pen_probe,
        .disconnect = pen_disconnect,
        .id_table = pen_table,
    };

    static int __init pen_init(void)
    {
        return usb_register(&pen_driver);
    }

    static void __exit pen_exit(void)
    {
        usb_deregister(&pen_driver);
    }

    module_init(pen_init);
    module_exit(pen_exit);

    MODULE_LICENSE("GPL");
    MODULE_AUTHOR("Anil Kumar Pugalia <email@sarika-pugs.com>");
    MODULE_DESCRIPTION("USB Pen Info Driver");

Then, the usual steps for any Linux device driver may be repeated, along with the pen drive steps:

Build the driver (pen_info.ko file) by running make.
Load the driver using insmod pen_info.ko.
Plug in the pen drive (after making sure that the usb-storage driver is not already loaded).
Unplug the pen drive.
Check the output of dmesg for the logs.
Unload the driver using rmmod pen_info.

Figure 2 shows a snippet of the above steps on Pugs’ system. Remember to ensure (in the output of cat /proc/bus/usb/devices) that the usual usb-storage driver is not the one associated with the pen drive interface, but rather the pen_info driver.
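The bEndpointAddress and bmAttributes values that pen_probe prints are bit fields. As a minimal user-space sketch (the helper names ep_is_in, ep_number and ep_transfer_type are made up for illustration and are not kernel APIs), they could be decoded like this, assuming the standard USB descriptor layout where bit 7 of the address is the direction and the low two bits of the attributes are the transfer type:

```c
#include <stdint.h>

/* Bit layout assumed from the standard USB endpoint descriptor */
#define EP_DIR_IN_MASK   0x80u  /* bit 7 set: IN (device to host) */
#define EP_NUMBER_MASK   0x0Fu  /* bits 3..0: endpoint number */
#define EP_XFERTYPE_MASK 0x03u  /* bits 1..0 of bmAttributes */

/* Returns 1 for an IN endpoint, 0 for an OUT endpoint */
int ep_is_in(uint8_t bEndpointAddress)
{
    return (bEndpointAddress & EP_DIR_IN_MASK) != 0;
}

/* Returns the endpoint number without the direction bit */
int ep_number(uint8_t bEndpointAddress)
{
    return bEndpointAddress & EP_NUMBER_MASK;
}

/* Maps the transfer-type bits of bmAttributes to a printable name */
const char *ep_transfer_type(uint8_t bmAttributes)
{
    static const char *const names[] =
        { "control", "isochronous", "bulk", "interrupt" };

    return names[bmAttributes & EP_XFERTYPE_MASK];
}
```

With these helpers, an address of 0x82 reads as IN endpoint 2, and an attribute byte of 0x02 as a bulk endpoint.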

Figure 2: Output of dmesg

Summing up
Before taking another break, Pugs shared two of the many mechanisms for a driver to specify its device to the USB core, using the struct usb_device_id table. The first one is by specifying the <vendor id, product id> pair using the USB_DEVICE() macro (as done above). The second one is by specifying the device class/category using the USB_DEVICE_INFO() macro. In fact, many more macros are available in <linux/usb.h> for various combinations. Moreover, multiple of these macros could be specified in the usb_device_id table (terminated by a null entry), for matching with any one of the criteria, which makes it possible to write a single driver for possibly many devices.

“Earlier, you mentioned writing multiple drivers for a single device, as well. Basically, how do we selectively register or not register a particular interface of a USB device?”, queried Shweta. “Sure. That’s next in line of our discussion, along with the ultimate task in any device driver: the data-transfer mechanisms,” replied Pugs.
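To make the idea of a null-terminated ID table concrete, here is a small user-space sketch of how such a table could be walked. The struct sketch_usb_id type and the sketch_match_id function are hypothetical stand-ins, far simpler than the kernel's struct usb_device_id and its matching logic, and only cover the <vendor id, product id> case:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for struct usb_device_id */
struct sketch_usb_id
{
    uint16_t vendor;
    uint16_t product;
};

/*
 * Walks a table that ends with an all-zero entry.
 * Returns the first matching entry, or NULL if none matches.
 */
const struct sketch_usb_id *sketch_match_id(const struct sketch_usb_id *table,
        uint16_t vendor, uint16_t product)
{
    for (; table->vendor != 0 || table->product != 0; table++)
    {
        if (table->vendor == vendor && table->product == product)
            return table;
    }
    return NULL;
}

/* Example table: the pen drive from the listing, then the terminating entry */
const struct sketch_usb_id sketch_pen_table[] =
{
    { 0x058F, 0x6387 },
    { 0, 0 } /* Terminating entry */
};
```

Adding more entries before the terminating one is what lets a single driver claim several devices.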

Device Drivers, Part 13: Data Transfer to and from USB Devices

This article, which is part of the series on Linux device drivers, continues from the previous two articles. It details the ultimate step of data transfer to and from a USB device, using your first USB driver in Linux.

Pugs continued, “To answer your question about how a driver selectively registers or skips a particular interface of a USB device, you need to understand the significance of the return value of the probe() callback.” Note that the USB core would invoke probe for all the interfaces of a detected device, except the ones which are already registered; thus, while doing it for the first time, it will probe for all interfaces. Now, if the probe returns 0, it means the driver has registered for that interface. Returning an error code indicates not registering for it. That’s all. “That was simple,” commented Shweta.

“Now, let’s talk about the ultimate: data transfers to and from a USB device,” continued Pugs.

“But before that, tell me, what is this MODULE_DEVICE_TABLE? This has been bothering me since you explained the USB device ID table macros,” asked Shweta, urging Pugs to slow down.

“That’s trivial stuff. It is mainly for the user-space depmod,” he said. ‘Module’ is another term for a driver, which can be dynamically loaded/unloaded. The macro MODULE_DEVICE_TABLE generates two variables in a module’s read-only section, which is extracted by depmod and stored in global map files under /lib/modules/<kernel_version>. Two such files are modules.usbmap and modules.pcimap, for USB and PCI device drivers, respectively. This enables auto-loading of these drivers, as we saw the usb-storage driver getting auto-loaded.

USB data transfer
“Time for USB data transfers. Let’s build upon the USB device driver coded in our previous sessions, using the same handy JetFlash pen drive from Transcend, with vendor ID 0x058f and product ID 0x6387,” said Pugs, enthusiastically.

USB, being a hardware protocol, forms the usual horizontal layer in the kernel space. And hence, for it to provide an interface to user-space, it has to connect through one of the vertical layers. As the character (driver) vertical has already been discussed, it is the current preferred choice for the connection with the USB horizontal, in order to understand the complete data transfer flow.

Also, we do not need to get a free unreserved character major number, but can use the character major number 180, reserved for USB-based character device files. Moreover, to achieve this complete character driver logic with the USB horizontal in one go, the following are the APIs declared in <linux/usb.h>:

    int usb_register_dev(struct usb_interface *intf, struct usb_class_driver *class_driver);
    void usb_deregister_dev(struct usb_interface *intf, struct usb_class_driver *class_driver);

Usually, we would expect these functions to be invoked in the constructor and the destructor of a module, respectively. However, to achieve the hot-plug-n-play behaviour for the (character) device files corresponding to USB devices, these are instead invoked in the probe and disconnect callbacks, respectively.

The first parameter in the above functions is the interface pointer received as the first parameter in both probe and disconnect. The second parameter, struct usb_class_driver, needs to be populated with the suggested device file name and the set of device file operations, before invoking usb_register_dev. For the actual usage, refer to the functions pen_probe and pen_disconnect in the code listing of pen_driver.c below.

Moreover, as the file operations (write, read, etc.) are now provided, that is exactly where we need to do the data transfers to and from the USB device. So, pen_write and pen_read below show the possible calls to usb_bulk_msg() (prototyped in <linux/usb.h>) to do the transfers over the pen drive’s bulk endpoints 0x01 and 0x82, respectively. Refer to the ‘E’ lines of the middle section in Figure 1 for the endpoint number listings of our pen drive.

Figure 1: USB specifications for the pen drive

Refer to the header file <linux/usb.h> under kernel sources, for the complete list of USB core API prototypes for other endpoint-specific data transfer functions like usb_control_msg(), usb_interrupt_msg(), etc. usb_rcvbulkpipe(), usb_sndbulkpipe(), and many such other macros, also defined in <linux/usb.h>, compute the actual endpoint bit-mask to be passed to the various USB core APIs.

Note that a pen drive belongs to a USB mass storage class, which expects a set of SCSI-like commands to be transacted over the bulk endpoints. So, a raw read/write as shown in the code listing below may not really do a data transfer as expected, unless the data is appropriately formatted. But still, this summarises the overall code flow of a USB driver. To get a feel of a real working USB data transfer in a simple and elegant way, one would need some kind of custom USB device, something like the one available here.

    #include <linux/module.h>
    #include <linux/kernel.h>
    #include <linux/usb.h>

    #define MIN(a,b) (((a) <= (b)) ? (a) : (b))
    #define BULK_EP_OUT 0x01
    #define BULK_EP_IN 0x82
    #define MAX_PKT_SIZE 512

    static struct usb_device *device;

    static struct usb_class_driver class;
    static unsigned char bulk_buf[MAX_PKT_SIZE];

    static int pen_open(struct inode *i, struct file *f)
    {
        return 0;
    }

    static int pen_close(struct inode *i, struct file *f)
    {
        return 0;
    }

    static ssize_t pen_read(struct file *f, char __user *buf, size_t cnt, loff_t *off)
    {
        int retval;
        int read_cnt;

        /* Read the data from the bulk endpoint */
        retval = usb_bulk_msg(device, usb_rcvbulkpipe(device, BULK_EP_IN),
                bulk_buf, MAX_PKT_SIZE, &read_cnt, 5000);
        if (retval)
        {
            printk(KERN_ERR "Bulk message returned %d\n", retval);
            return retval;
        }
        if (copy_to_user(buf, bulk_buf, MIN(cnt, read_cnt)))
        {
            return -EFAULT;
        }

        return MIN(cnt, read_cnt);
    }

    static ssize_t pen_write(struct file *f, const char __user *buf, size_t cnt, loff_t *off)
    {
        int retval;
        int wrote_cnt = MIN(cnt, MAX_PKT_SIZE);

        if (copy_from_user(bulk_buf, buf, MIN(cnt, MAX_PKT_SIZE)))
        {
            return -EFAULT;
        }

        /* Write the data into the bulk endpoint */
        retval = usb_bulk_msg(device, usb_sndbulkpipe(device, BULK_EP_OUT),
                bulk_buf, MIN(cnt, MAX_PKT_SIZE), &wrote_cnt, 5000);
        if (retval)
        {
            printk(KERN_ERR "Bulk message returned %d\n", retval);
            return retval;
        }

        return wrote_cnt;
    }

    static struct file_operations fops =
    {
        .open = pen_open,

        .release = pen_close,
        .read = pen_read,
        .write = pen_write,
    };

    static int pen_probe(struct usb_interface *interface, const struct usb_device_id *id)
    {
        int retval;

        device = interface_to_usbdev(interface);

        class.name = "usb/pen%d";
        class.fops = &fops;
        if ((retval = usb_register_dev(interface, &class)) < 0)
        {
            /* Something prevented us from registering this driver */
            err("Not able to get a minor for this device.");
        }
        else
        {
            printk(KERN_INFO "Minor obtained: %d\n", interface->minor);
        }

        return retval;
    }

    static void pen_disconnect(struct usb_interface *interface)
    {
        usb_deregister_dev(interface, &class);
    }

    /* Table of devices that work with this driver */
    static struct usb_device_id pen_table[] =
    {
        { USB_DEVICE(0x058F, 0x6387) },
        {} /* Terminating entry */
    };
    MODULE_DEVICE_TABLE (usb, pen_table);

    static struct usb_driver pen_driver =
    {
        .name = "pen_driver",
        .probe = pen_probe,
        .disconnect = pen_disconnect,
        .id_table = pen_table,
    };

    static int __init pen_init(void)
    {
        int result;

        /* Register this driver with the USB subsystem */
        if ((result = usb_register(&pen_driver)))
        {
            err("usb_register failed. Error number %d", result);
        }

        return result;
    }

    static void __exit pen_exit(void)
    {
        /* Deregister this driver with the USB subsystem */
        usb_deregister(&pen_driver);
    }

    module_init(pen_init);
    module_exit(pen_exit);

    MODULE_LICENSE("GPL");
    MODULE_AUTHOR("Anil Kumar Pugalia <email_at_sarika-pugs_dot_com>");
    MODULE_DESCRIPTION("USB Pen Device Driver");

As a reminder, the usual steps for any Linux device driver may be repeated with the above code, along with the following steps for the pen drive:

Build the driver (pen_driver.ko) by running make.
Load the driver using insmod pen_driver.ko.
Plug in the pen drive (after making sure that the usb-storage driver is not already loaded).
Check for the dynamic creation of /dev/pen0 (0 being the minor number obtained; check dmesg logs for the value on your system).
Possibly try some write/read on /dev/pen0 (you most likely will get a connection timeout and/or broken pipe errors, because of non-conforming SCSI commands).
Unplug the pen drive and look for /dev/pen0 to be gone.
Unload the driver using rmmod pen_driver.

Meanwhile, Pugs hooked up his first-of-its-kind creation, the Linux device driver kit (LDDK), into his system for a live demonstration of the USB data transfers.

“Aha! Finally a cool complete working USB driver,” quipped Shweta, excited. “Want to have more fun? We could do a block driver over it…,” added Pugs. “Oh! Really?” asked Shweta, with glee. “Yes. But before that, we need to understand the partitioning mechanisms,” commented Pugs.
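To give a rough idea of what "appropriately formatted" data could look like for a mass storage device, the sketch below fills a 31-byte command wrapper around a SCSI INQUIRY command. It is an illustration only: the layout is assumed from the USB mass storage bulk-only transport as commonly documented, and the names build_inquiry_cbw, put_le32 and the CBW_* constants are invented for this example, not taken from the article or the kernel.

```c
#include <stdint.h>
#include <string.h>

#define CBW_SIZE      31           /* assumed fixed wrapper length */
#define CBW_SIGNATURE 0x43425355u  /* reads "USBC" when stored little-endian */
#define CBW_FLAG_IN   0x80         /* data phase flows from device to host */

/* Stores a 32-bit value in little-endian byte order */
static void put_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v & 0xFF);
    p[1] = (uint8_t)((v >> 8) & 0xFF);
    p[2] = (uint8_t)((v >> 16) & 0xFF);
    p[3] = (uint8_t)((v >> 24) & 0xFF);
}

/* Fills out[0..CBW_SIZE-1] with a wrapper carrying a 6-byte INQUIRY command */
void build_inquiry_cbw(uint8_t *out, uint32_t tag, uint8_t alloc_len)
{
    memset(out, 0, CBW_SIZE);
    put_le32(out + 0, CBW_SIGNATURE);
    put_le32(out + 4, tag);        /* echoed back in the status phase */
    put_le32(out + 8, alloc_len);  /* bytes expected in the data phase */
    out[12] = CBW_FLAG_IN;
    out[13] = 0;                   /* logical unit 0 */
    out[14] = 6;                   /* length of the command block */
    out[15] = 0x12;                /* SCSI INQUIRY opcode */
    out[15 + 4] = alloc_len;       /* allocation length field of INQUIRY */
}
```

A buffer built this way is the kind of payload a write to the bulk OUT endpoint would need to carry before a read from the bulk IN endpoint could be expected to return anything meaningful.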

Device Drivers, Part 14: A Dive Inside the Hard Disk for Understanding Partitions

This article, which is part of the series on Linux device drivers, takes you on a tour inside a hard disk.

“Doesn’t it sound like a mechanical engineering subject: The design of the hard disk?” questioned Shweta. “Yes, it does. But understanding it gives us an insight into its programming aspect,” reasoned Pugs, while waiting for the commencement of the seminar on storage systems.

The seminar started with a few hard disks in the presenter’s hand and then a dive into her system, showing the output of fdisk -l (Figure 1).

Figure 1: Partition listing by fdisk

The first line shows the hard disk size in human-friendly format and in bytes. The second line mentions the number of logical heads, logical sectors per track, and the actual number of cylinders on the disk, together known as the geometry of the disk.

The 255 heads indicate the number of platters or disks, as one read-write head is needed per disk. Let’s number them, say D1, D2, … D255. Now, each disk would have the same number of concentric circular tracks, starting from the outside to the inside. In the above case, there are 60,801 such tracks per disk. Let’s number them, say T1, T2, … T60801. And a particular track number from all the disks forms a cylinder of the same number. For example, tracks T2 from D1, D2, … D255 will together form the cylinder C2. Now, each track has the same number of logical sectors (63 in our case), say S1, S2, … S63. And each sector is typically 512 bytes.

Given this data, one can actually compute the total usable hard disk size, using the following formula:

Usable hard disk size in bytes = (Number of heads or disks) * (Number of tracks per disk) * (Number of sectors per track) * (Number of bytes per sector, i.e. sector size)

For the disk under consideration, it would be: 255 * 60801 * 63 * 512 bytes = 500105249280 bytes.

Note that this number may be slightly less than the actual hard disk size (500107862016 bytes, in our case). The reason is that the formula doesn’t consider the bytes in the last partial or incomplete cylinder. The primary reason for that is the difference between today’s technology of organising the actual physical disk geometry and the traditional geometry representation using heads, cylinders and sectors.
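The arithmetic above can be checked with a few lines of C. This is only a sketch of the formula as stated in the text; the function names chs_usable_bytes and chs_cylinder_bytes are made up for illustration.

```c
#include <stdint.h>

/* Bytes covered by the complete cylinders of a heads/cylinders/sectors geometry */
uint64_t chs_usable_bytes(uint64_t heads, uint64_t cylinders,
        uint64_t sectors_per_track, uint64_t bytes_per_sector)
{
    return heads * cylinders * sectors_per_track * bytes_per_sector;
}

/* Bytes in one cylinder: every head contributes one track */
uint64_t chs_cylinder_bytes(uint64_t heads, uint64_t sectors_per_track,
        uint64_t bytes_per_sector)
{
    return heads * sectors_per_track * bytes_per_sector;
}
```

For the geometry in the text, chs_usable_bytes(255, 60801, 63, 512) gives 500105249280. The shortfall against the actual size of 500107862016 bytes is 2612736 bytes, which is less than one full cylinder (8225280 bytes), consistent with an incomplete last cylinder being left out.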

Note that in the fdisk output, we referred to the heads and sectors per track as logical, not physical. One may ask that if today’s disks don’t have such physical geometry concepts, then why still maintain them and represent them in a logical form? The main reason is to be able to continue with the same concepts of partitioning, and be able to maintain the same partition table formats, especially for the most prevalent DOS-type partition tables, which heavily depend on this simplistic geometry. Note the computation of cylinder size (255 heads * 63 sectors / track * 512 bytes / sector = 8225280 bytes) in the third line and then the demarcation of partitions in units of complete cylinders.

DOS-type partition tables
This brings us to the next important topic: understanding DOS-type partition tables. But first, what is a partition, and why should we partition? A hard disk can be divided into one or more logical disks, each of which is called a partition. This is useful for organising different types of data separately, for example, different operating system data, user data, temporary data, etc.

So, partitions are basically logical divisions and need to be maintained by metadata, which is the partition table. A DOS-type partition table contains four partition entries, each a 16-byte entry. Each of these entries can be depicted by the following ‘C’ structure:

    typedef struct
    {
        unsigned char boot_type; // 0x00 - Inactive; 0x80 - Active (Bootable)
        unsigned char start_head;
        unsigned char start_sec:6;
        unsigned char start_cyl_hi:2;
        unsigned char start_cyl;
        unsigned char part_type;
        unsigned char end_head;
        unsigned char end_sec:6;
        unsigned char end_cyl_hi:2;
        unsigned char end_cyl;
        unsigned int abs_start_sec; // 32-bit field (unsigned long would be 8 bytes on 64-bit systems)
        unsigned int sec_in_part;   // 32-bit field
    } PartEntry;

This partition table, followed by the two-byte signature 0xAA55, resides at the end of the disk’s first sector, commonly known as the Master Boot Record (MBR). Hence, the starting offset of this partition table within the MBR is 512 - (4 * 16 + 2) = 446. Also, a 4-byte disk signature is placed at offset 440.

The remaining top 440 bytes of the MBR are typically used to place the first piece of boot code, that is loaded by the BIOS to boot the system from the disk. The part_info.c listing contains these various definitions, along with code for parsing and printing a formatted output of the partition table.

From the partition table entry structure, it could be noted that the start and end cylinder fields are only 10 bits long, thus allowing cylinder values only up to 1023. However, for today’s huge hard disks, this is in no way sufficient. Hence, in overflow cases, the corresponding <head, cylinder, sector> triplet in the partition table entry is set to the maximum value, and the actual value is computed using the last two fields: the absolute start sector number (abs_start_sec) and the number of sectors in this partition (sec_in_part). The code for this too is in part_info.c:

    #include <stdio.h>
    #include <sys/types.h>
    #include <sys/stat.h>
    #include <fcntl.h>
    #include <unistd.h>

    #define SECTOR_SIZE 512
    #define MBR_SIZE SECTOR_SIZE
    #define MBR_DISK_SIGNATURE_OFFSET 440
    #define MBR_DISK_SIGNATURE_SIZE 4
    #define PARTITION_TABLE_OFFSET 446
    #define PARTITION_ENTRY_SIZE 16 // sizeof(PartEntry)
    #define PARTITION_TABLE_SIZE 64 // sizeof(PartTable)

    #define MBR_SIGNATURE_OFFSET 510
    #define MBR_SIGNATURE_SIZE 2
    #define MBR_SIGNATURE 0xAA55
    #define BR_SIZE SECTOR_SIZE
    #define BR_SIGNATURE_OFFSET 510
    #define BR_SIGNATURE_SIZE 2
    #define BR_SIGNATURE 0xAA55

    typedef struct
    {
        unsigned char boot_type; // 0x00 - Inactive; 0x80 - Active (Bootable)
        unsigned char start_head;
        unsigned char start_sec:6;
        unsigned char start_cyl_hi:2;
        unsigned char start_cyl;
        unsigned char part_type;
        unsigned char end_head;
        unsigned char end_sec:6;
        unsigned char end_cyl_hi:2;
        unsigned char end_cyl;
        unsigned int abs_start_sec; // 32-bit field (unsigned long is 8 bytes on 64-bit systems)
        unsigned int sec_in_part;   // 32-bit field
    } PartEntry;

    typedef struct
    {
        unsigned char boot_code[MBR_DISK_SIGNATURE_OFFSET];
        unsigned int disk_signature; // 4 bytes at offset 440
        unsigned short pad;
        unsigned char pt[PARTITION_TABLE_SIZE];
        unsigned short signature;
    } MBR;

    /* Re-computes and prints the (H/C/S) triplet from an absolute sector number */
    void print_computed(unsigned long sector)
    {
        unsigned long heads, cyls, tracks, sectors;

        sectors = sector % 63 + 1 /* As indexed from 1 */;
        tracks = sector / 63;
        cyls = tracks / 255 + 1 /* As indexed from 1 */;
        heads = tracks % 255;
        printf("(%3lu/%5lu/%1lu)", heads, cyls, sectors);
    }

    int main(int argc, char *argv[])
    {
        char *dev_file = "/dev/sda";
        int fd, i, rd_val;
        MBR m;
        PartEntry *p = (PartEntry *)(m.pt);

        if (argc == 2)
        {
            dev_file = argv[1];
        }
        if ((fd = open(dev_file, O_RDONLY)) == -1)
        {
            fprintf(stderr, "Failed opening %s: ", dev_file);
            perror("");
            return 1;
        }
        if ((rd_val = read(fd, &m, sizeof(m))) != sizeof(m))
        {
            fprintf(stderr, "Failed reading %s: ", dev_file);

            perror("");
            close(fd);
            return 2;
        }
        close(fd);

        /* First pass: the triplets exactly as stored in the table */
        printf("\nDOS type Partition Table of %s:\n", dev_file);
        printf(" B Start (H/C/S) End (H/C/S) Type StartSec TotSec\n");
        for (i = 0; i < 4; i++)
        {
            printf("%d:%d (%3d/%4d/%2d) (%3d/%4d/%2d) %02X %10u %9u\n",
                i + 1, !!(p[i].boot_type & 0x80),
                p[i].start_head,
                1 + ((p[i].start_cyl_hi << 8) | p[i].start_cyl),
                p[i].start_sec,
                p[i].end_head,
                1 + ((p[i].end_cyl_hi << 8) | p[i].end_cyl),
                p[i].end_sec,
                p[i].part_type,
                p[i].abs_start_sec, p[i].sec_in_part);
        }

        /* Second pass: the triplets re-computed from the sector fields */
        printf("\nRe-computed Partition Table of %s:\n", dev_file);
        printf(" B Start (H/C/S) End (H/C/S) Type StartSec TotSec\n");
        for (i = 0; i < 4; i++)
        {
            printf("%d:%d ", i + 1, !!(p[i].boot_type & 0x80));
            print_computed(p[i].abs_start_sec);
            printf(" ");
            print_computed(p[i].abs_start_sec + p[i].sec_in_part - 1);
            printf(" %02X %10u %9u\n", p[i].part_type,
                p[i].abs_start_sec, p[i].sec_in_part);
        }
        printf("\n");
        return 0;
    }

As the above is an application, compile it with gcc part_info.c -o part_info, and then run ./part_info /dev/sda to check out your primary partitioning information on /dev/sda. Figure 2 shows the output of ./part_info on the presenter’s system. Compare it with the fdisk output in Figure 1.
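The re-computation done by print_computed can also be written as a function that returns the triplet instead of printing it, which makes it easier to check by hand. This is a sketch under the same assumptions as the listing (63 sectors per track, 255 heads, cylinders and sectors counted from 1); struct chs and lba_to_chs are names invented for this example.

```c
/* A head/cylinder/sector triplet, with cylinder and sector indexed from 1 */
struct chs
{
    unsigned long head;
    unsigned long cyl;
    unsigned long sec;
};

/* Converts an absolute sector number into the logical 255-head, 63-sector geometry */
struct chs lba_to_chs(unsigned long lba)
{
    struct chs r;
    unsigned long track = lba / 63;  /* absolute track number, from 0 */

    r.sec = lba % 63 + 1;            /* sector within the track, from 1 */
    r.head = track % 255;            /* head (disk) within the cylinder, from 0 */
    r.cyl = track / 255 + 1;         /* cylinder number, from 1 */
    return r;
}
```

For example, absolute sector 63 is the first sector of the second track, so it maps to head 1, cylinder 1, sector 1, while sector 16065 (255 * 63) is the first sector of cylinder 2.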

Figure 2: Output of ./part_info

Partition types and boot records
Now, as this partition table is hard-coded to have four entries, that’s the maximum number of partitions you can have. These are called primary partitions, each having an associated type in the corresponding partition table entry. These types are typically coined by various OS vendors, and hence sort of map to various OSs like DOS, Minix, Linux, Solaris, BSD, FreeBSD, QNX, W95, Novell Netware, etc., to be used for/with the particular OS. However, this is more a formality than a real requirement.

Besides this, one of the four primary partitions can be labelled as something called an extended partition, which has a special significance. As the name suggests, it is used to further extend hard disk division, i.e., to have more partitions. These are called logical partitions and are created within the extended partition. The metadata of these is maintained in a linked-list format, allowing an unlimited number of logical partitions (at least theoretically).

For that, the first sector of the extended partition, commonly called the Boot Record (BR), is used like the MBR to store (the linked-list head of) the partition table for the logical partitions. Subsequent linked-list nodes are stored in the first sector of the subsequent logical partitions, referred to as the Logical Boot Record (LBR). Each linked-list node is a complete 4-entry partition table, though only the first two entries are used: the first for the linked-list data, namely, information about the immediate logical partition, and the second as the linked list’s next pointer, pointing to the list of remaining logical partitions.

To compare and understand the primary partitioning details on your system’s hard disk, follow the steps (as the root user, hence with care) given below:

    ./part_info /dev/sda ## Displays the partition table on /dev/sda
    fdisk -l /dev/sda ## To display and compare the partition table entries with the above

In case you have multiple hard disks (/dev/sdb, …), hard disk device files with other names (/dev/hda, …), or an extended partition, you may try ./part_info <device_file_name> on them as well. Trying on an extended partition would give you the information about the starting partition table of the logical partitions.

Right now, we have carefully and selectively played (read-only) with the system’s hard disk. Why carefully? Since otherwise, we may render our system non-bootable. But no learning is complete without a total exploration. Hence, in our next session, we will create a dummy disk in RAM and do destructive exploration on it.
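As a first step towards following that linked list, a program has to recognise a valid boot record and spot the extended-type entry in it. The sketch below does this on a raw 512-byte sector buffer, decoding bytes explicitly rather than casting to a structure. The offsets follow the layout described above; the names pt_entry_info, parse_pt_entry, boot_signature_ok and is_extended are hypothetical, and type 0x05 is used as the extended partition type, as in the partition tables of this series.

```c
#include <stdint.h>

#define PT_OFFSET        446   /* partition table offset within the sector */
#define PT_ENTRY_SIZE    16    /* bytes per partition entry */
#define PT_TYPE_EXTENDED 0x05  /* extended partition type */

/* The fields of one entry that matter for walking the partition list */
struct pt_entry_info
{
    int bootable;
    uint8_t type;
    uint32_t start_lba;
    uint32_t sectors;
};

/* Reads a 32-bit little-endian value from a byte buffer */
static uint32_t get_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* The 0xAA55 signature is stored as bytes 0x55, 0xAA at offsets 510 and 511 */
int boot_signature_ok(const uint8_t *sector)
{
    return sector[510] == 0x55 && sector[511] == 0xAA;
}

/* Decodes entry number index (0 to 3) of the table in the given sector */
struct pt_entry_info parse_pt_entry(const uint8_t *sector, int index)
{
    const uint8_t *e = sector + PT_OFFSET + index * PT_ENTRY_SIZE;
    struct pt_entry_info info;

    info.bootable = (e[0] & 0x80) != 0;
    info.type = e[4];
    info.start_lba = get_le32(e + 8);
    info.sectors = get_le32(e + 12);
    return info;
}

/* Tells whether the entry describes an extended partition */
int is_extended(const struct pt_entry_info *info)
{
    return info->type == PT_TYPE_EXTENDED;
}
```

Once an extended entry is found, its start sector is where the first BR would be read from, and the same two functions could be applied to that sector in turn.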

Device Drivers, Part 15: Disk on RAM - Playing with Block Drivers

This article, which is part of the series on Linux device drivers, experiments with a dummy hard disk on RAM to demonstrate how block drivers work.

After a delicious lunch, theory makes the audience sleepy. So, let’s start with the code itself.

Disk On RAM source code
Let us create a directory called DiskOnRAM which holds the following six files: three ‘C’ source files, two ‘C’ headers, and one Makefile.

partition.h

    #ifndef PARTITION_H
    #define PARTITION_H

    #include <linux/types.h>

    extern void copy_mbr_n_br(u8 *disk);
    #endif

partition.c

    #include <linux/string.h>

    #include "partition.h"


    #define ARRAY_SIZE(a) (sizeof(a) / sizeof(*a))

    #define SECTOR_SIZE 512
    #define MBR_SIZE SECTOR_SIZE
    #define MBR_DISK_SIGNATURE_OFFSET 440
    #define MBR_DISK_SIGNATURE_SIZE 4
    #define PARTITION_TABLE_OFFSET 446
    #define PARTITION_ENTRY_SIZE 16 // sizeof(PartEntry)
    #define PARTITION_TABLE_SIZE 64 // sizeof(PartTable)
    #define MBR_SIGNATURE_OFFSET 510
    #define MBR_SIGNATURE_SIZE 2
    #define MBR_SIGNATURE 0xAA55
    #define BR_SIZE SECTOR_SIZE
    #define BR_SIGNATURE_OFFSET 510
    #define BR_SIGNATURE_SIZE 2
    #define BR_SIGNATURE 0xAA55

    typedef struct
    {
        unsigned char boot_type; // 0x00 - Inactive; 0x80 - Active (Bootable)
        unsigned char start_head;
        unsigned char start_sec:6;
        unsigned char start_cyl_hi:2;
        unsigned char start_cyl;
        unsigned char part_type;
        unsigned char end_head;
        unsigned char end_sec:6;

        unsigned char end_cyl_hi:2;
        unsigned char end_cyl;
        unsigned int abs_start_sec; // 32-bit field (unsigned long is 8 bytes on 64-bit kernels)
        unsigned int sec_in_part;   // 32-bit field
    } PartEntry;

    typedef PartEntry PartTable[4];

    static PartTable def_part_table =
    {
        {
            boot_type: 0x00,
            start_head: 0x00,
            start_sec: 0x2,
            start_cyl: 0x00,
            part_type: 0x83,
            end_head: 0x00,
            end_sec: 0x20,
            end_cyl: 0x09,
            abs_start_sec: 0x00000001,
            sec_in_part: 0x0000013F
        },
        {
            boot_type: 0x00,
            start_head: 0x00,
            start_sec: 0x1,
            start_cyl: 0x0A, // extended partition start cylinder (BR location)

            part_type: 0x05,
            end_head: 0x00,
            end_sec: 0x20,
            end_cyl: 0x13,
            abs_start_sec: 0x00000140,
            sec_in_part: 0x00000140
        },
        {
            boot_type: 0x00,
            start_head: 0x00,
            start_sec: 0x1,
            start_cyl: 0x14,
            part_type: 0x83,
            end_head: 0x00,
            end_sec: 0x20,
            end_cyl: 0x1F,
            abs_start_sec: 0x00000280,
            sec_in_part: 0x00000180
        },
        {
        }
    };
    static unsigned int def_log_part_br_cyl[] = {0x0A, 0x0E, 0x12};
    static const PartTable def_log_part_table[] =
    {

        {
            {
                boot_type: 0x00,
                start_head: 0x00,
                start_sec: 0x2,
                start_cyl: 0x0A,
                part_type: 0x83,
                end_head: 0x00,
                end_sec: 0x20,
                end_cyl: 0x0D,
                abs_start_sec: 0x00000001,
                sec_in_part: 0x0000007F
            },
            {
                boot_type: 0x00,
                start_head: 0x00,
                start_sec: 0x1,
                start_cyl: 0x0E,
                part_type: 0x05,
                end_head: 0x00,
                end_sec: 0x20,
                end_cyl: 0x11,
                abs_start_sec: 0x00000080,
                sec_in_part: 0x00000080
            },
        },

        {
            {
                boot_type: 0x00,
                start_head: 0x00,
                start_sec: 0x2,
                start_cyl: 0x0E,
                part_type: 0x83,
                end_head: 0x00,
                end_sec: 0x20,
                end_cyl: 0x11,
                abs_start_sec: 0x00000001,
                sec_in_part: 0x0000007F
            },
            {
                boot_type: 0x00,
                start_head: 0x00,
                start_sec: 0x1,
                start_cyl: 0x12,
                part_type: 0x05,
                end_head: 0x00,
                end_sec: 0x20,
                end_cyl: 0x13,
                abs_start_sec: 0x00000100,
                sec_in_part: 0x00000040
            },
        },

        {
            {
                boot_type: 0x00,
                start_head: 0x00,
                start_sec: 0x2,
                start_cyl: 0x12,
                part_type: 0x83,
                end_head: 0x00,
                end_sec: 0x20,
                end_cyl: 0x13,
                abs_start_sec: 0x00000001,
                sec_in_part: 0x0000003F
            },
        }
    };

    /* Fills in the first sector: disk signature, partition table and MBR signature */
    static void copy_mbr(u8 *disk)
    {
        memset(disk, 0x0, MBR_SIZE);
        /* The disk signature is 4 bytes, hence the 32-bit store */
        *(unsigned int *)(disk + MBR_DISK_SIGNATURE_OFFSET) = 0x36E5756D;
        memcpy(disk + PARTITION_TABLE_OFFSET, &def_part_table, PARTITION_TABLE_SIZE);
        *(unsigned short *)(disk + MBR_SIGNATURE_OFFSET) = MBR_SIGNATURE;
    }

    static void copy_br(u8 *disk, int start_cylinder, const PartTable *part_table)

    {
        disk += (start_cylinder * 32 /* sectors / cyl */ * SECTOR_SIZE);
        memset(disk, 0x0, BR_SIZE);
        memcpy(disk + PARTITION_TABLE_OFFSET, part_table,
            PARTITION_TABLE_SIZE);
        *(unsigned short *)(disk + BR_SIGNATURE_OFFSET) = BR_SIGNATURE;
    }

    void copy_mbr_n_br(u8 *disk)
    {
        int i;

        copy_mbr(disk);
        for (i = 0; i < ARRAY_SIZE(def_log_part_table); i++)
        {
            copy_br(disk, def_log_part_br_cyl[i], &def_log_part_table[i]);
        }
    }
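The numbers in these tables can be cross-checked with a little arithmetic. The sketch below assumes, as copy_br does, 32 sectors per cylinder, and it assumes that the second entry of each logical table gives the next boot record's position relative to the start of the extended partition (which is how the values in the listing line up). The helper names are made up for this example.

```c
#define SKETCH_SECTORS_PER_CYL 32  /* same value that copy_br uses */

/* First sector of a cylinder, counting cylinders from 0 */
unsigned int cyl_to_sector(unsigned int cylinder)
{
    return cylinder * SKETCH_SECTORS_PER_CYL;
}

/* Absolute sector of the next boot record in the logical partition chain */
unsigned int next_br_sector(unsigned int ext_start_sector,
        unsigned int rel_to_ext_start)
{
    return ext_start_sector + rel_to_ext_start;
}

/* Absolute first sector of a logical partition, given its own boot record */
unsigned int logical_start_sector(unsigned int br_sector,
        unsigned int rel_to_br)
{
    return br_sector + rel_to_br;
}
```

With the listing's values, the extended partition starts at cylinder 0x0A, that is, sector 0x140, matching its abs_start_sec. The next-pointer values 0x80 and 0x100 then land on cylinders 0x0E and 0x12, the remaining two entries of def_log_part_br_cyl, and the last primary partition (start 0x280, 0x180 sectors) ends exactly at sector 1024.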

ram_device.h

    #ifndef RAMDEVICE_H
    #define RAMDEVICE_H

    #define RB_SECTOR_SIZE 512

    extern int ramdevice_init(void);
    extern void ramdevice_cleanup(void);
    extern void ramdevice_write(sector_t sector_off, u8 *buffer, unsigned int sectors);
    extern void ramdevice_read(sector_t sector_off, u8 *buffer, unsigned int sectors);
    #endif

ram_device.c

    #include <linux/types.h>
    #include <linux/vmalloc.h>
    #include <linux/string.h>

    #include "ram_device.h"
    #include "partition.h"

    #define RB_DEVICE_SIZE 1024 /* sectors */
    /* So, total device size = 1024 * 512 bytes = 512 KiB */

    /* Array where the disk stores its data */
    static u8 *dev_data;

    int ramdevice_init(void)
    {
        dev_data = vmalloc(RB_DEVICE_SIZE * RB_SECTOR_SIZE);
        if (dev_data == NULL)
            return -ENOMEM;
        /* Setup its partition table */
        copy_mbr_n_br(dev_data);
        return RB_DEVICE_SIZE;
    }

    void ramdevice_cleanup(void)
    {
        vfree(dev_data);
    }

    void ramdevice_write(sector_t sector_off, u8 *buffer, unsigned int sectors)
    {
        memcpy(dev_data + sector_off * RB_SECTOR_SIZE, buffer,
            sectors * RB_SECTOR_SIZE);
    }

    void ramdevice_read(sector_t sector_off, u8 *buffer, unsigned int sectors)
    {
        memcpy(buffer, dev_data + sector_off * RB_SECTOR_SIZE,
            sectors * RB_SECTOR_SIZE);
    }
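The same sector-addressed copy logic can be tried out in user space, without loading any module. The following is a minimal sketch that swaps vmalloc/vfree for calloc/free; the sim_* names and SIM_* constants are invented for this example and are not part of the driver.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define SIM_SECTOR_SIZE 512
#define SIM_DEVICE_SIZE 1024 /* sectors */

/* Backing store of the simulated disk */
static uint8_t *sim_data;

/* Allocates a zero-filled disk; returns its size in sectors, or -1 on failure */
int sim_init(void)
{
    sim_data = calloc(SIM_DEVICE_SIZE, SIM_SECTOR_SIZE);
    return sim_data ? SIM_DEVICE_SIZE : -1;
}

void sim_cleanup(void)
{
    free(sim_data);
    sim_data = NULL;
}

/* Copies whole sectors from buffer into the disk, starting at sector_off */
void sim_write(unsigned long sector_off, const uint8_t *buffer, unsigned int sectors)
{
    memcpy(sim_data + sector_off * SIM_SECTOR_SIZE, buffer,
        (size_t)sectors * SIM_SECTOR_SIZE);
}

/* Copies whole sectors from the disk into buffer, starting at sector_off */
void sim_read(unsigned long sector_off, uint8_t *buffer, unsigned int sectors)
{
    memcpy(buffer, sim_data + sector_off * SIM_SECTOR_SIZE,
        (size_t)sectors * SIM_SECTOR_SIZE);
}
```

Like the driver's functions, this sketch does no bounds checking; the caller is expected to keep sector_off plus sectors within the device size.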

    /* Disk on RAM Driver */
    #include <linux/module.h>
    #include <linux/kernel.h>
    #include <linux/fs.h>
    #include <linux/types.h>
    #include <linux/genhd.h>
    #include <linux/blkdev.h>
    #include <linux/errno.h>

    #include "ram_device.h"

    #define RB_FIRST_MINOR 0
    #define RB_MINOR_CNT 16

    static u_int rb_major = 0;

    /*
     * The internal structure representation of our Device
     */
    static struct rb_device
    {
        /* Size is the size of the device (in sectors) */
        unsigned int size;
        /* For exclusive access to our request queue */
        spinlock_t lock;
        /* Our request queue */
        struct request_queue *rb_queue;
        /* This is kernel's representation of an individual disk device */
        struct gendisk *rb_disk;
    } rb_dev;

    static int rb_open(struct block_device *bdev, fmode_t mode)
    {
        unsigned unit = iminor(bdev->bd_inode);

        printk(KERN_INFO "rb: Device is opened\n");
        printk(KERN_INFO "rb: Inode number is %d\n", unit);

        if (unit > RB_MINOR_CNT)
            return -ENODEV;
        return 0;

    }

    static int rb_close(struct gendisk *disk, fmode_t mode)
    {
        printk(KERN_INFO "rb: Device is closed\n");
        return 0;
    }

    /*
     * Actual Data transfer
     */
    static int rb_transfer(struct request *req)
    {
        //struct rb_device *dev = (struct rb_device *)(req->rq_disk->private_data);

        int dir = rq_data_dir(req);
        sector_t start_sector = blk_rq_pos(req);
        unsigned int sector_cnt = blk_rq_sectors(req);

        struct bio_vec *bv;
        struct req_iterator iter;

        sector_t sector_offset;
        unsigned int sectors;
        u8 *buffer;


        int ret = 0;

        //printk(KERN_DEBUG "rb: Dir:%d; Sec:%lld; Cnt:%d\n", dir, start_sector, sector_cnt);

        sector_offset = 0;
        rq_for_each_segment(bv, req, iter)
        {
            buffer = page_address(bv->bv_page) + bv->bv_offset;
            if (bv->bv_len % RB_SECTOR_SIZE != 0)
            {
                printk(KERN_ERR "rb: Should never happen: "
                    "bio size (%d) is not a multiple of RB_SECTOR_SIZE (%d).\n"
                    "This may lead to data truncation.\n",
                    bv->bv_len, RB_SECTOR_SIZE);
                ret = -EIO;
            }
            sectors = bv->bv_len / RB_SECTOR_SIZE;
            printk(KERN_DEBUG "rb: Sector Offset: %lld; Buffer: %p; Length: %d sectors\n",
                sector_offset, buffer, sectors);
            if (dir == WRITE) /* Write to the device */
            {
                ramdevice_write(start_sector + sector_offset, buffer, sectors);

            }
            else /* Read from the device */
            {
                ramdevice_read(start_sector + sector_offset, buffer, sectors);
            }
            sector_offset += sectors;
        }
        if (sector_offset != sector_cnt)
        {
            printk(KERN_ERR "rb: bio info doesn't match with the request info");
            ret = -EIO;
        }

        return ret;
    }

    /*
     * Represents a block I/O request for us to execute
     */
    static void rb_request(struct request_queue *q)
    {
        struct request *req;
        int ret;

        /* Gets the current request from the dispatch queue */
        while ((req = blk_fetch_request(q)) != NULL)

127 //__blk_end_request(req.111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 { #if 0 /* * This function tells us whether we are looking at a filesystem request * . blk_rq_bytes(req)). continue. 0.one that moves block of data */ if (!blk_fs_request(req)) { printk(KERN_NOTICE "rb: Skip non-fs request\n"). //__blk_end_request(req. } #endif ret = rb_transfer(req). 0). /* We pass 0 to indicate that we successfully completed the request */ __blk_end_request_all(req. ret. blk_rq_bytes(req)). 128 } 129 } 130 131 132 133 134 /* * These are the file operations that performed on the ram block device */ static struct block_device_operations rb_fops = . ret). __blk_end_request_all(req.

139 140 /* 141 142 143 144 145 { * This is the registration and initialization section of the ram block device * driver */ static int __init rb_init(void) 146 int ret. if (rb_major <= 0) { printk(KERN_ERR "rb: Unable to get Major Number\n"). "rb").open = rb_open.135 136 137 138 { . 147 148 /* Set up our RAM Device */ 149 if ((ret = ramdevice_init()) < 0) 150 { 151 152 153 154 155 156 157 158 } return ret.size = ret. /* Get Registered */ rb_major = register_blkdev(rb_major. .owner = THIS_MODULE. rb_dev. .release = rb_close. }. .

rb_queue = blk_init_queue(rb_request.lock).lock). 165 if (rb_dev. ramdevice_cleanup(). 168 169 170 171 172 173 174 175 176 177 */ /* } unregister_blkdev(rb_major.159 160 161 162 163 164 } ramdevice_cleanup(). &rb_dev. return -EBUSY.rb_queue). /* Get a request queue (here queue is created) */ spin_lock_init(&rb_dev.rb_disk) 180 { 181 printk(KERN_ERR "rb: alloc_disk failure\n"). * Add the gendisk structure * By using this memory allocation is involved.rb_queue == NULL) 166 { 167 printk(KERN_ERR "rb: blk_init_queue failure\n"). rb_dev. * the minor number we need to pass bcz the device * will support this much partitions 178 rb_dev. 182 blk_cleanup_queue(rb_dev. "rb"). 179 if (!rb_dev.rb_disk = alloc_disk(RB_MINOR_CNT). . return -ENOMEM.

rb_disk->first_minor = RB_FIRST_MINOR.size). "rb"). /* Driver-specific own internal data */ rb_dev. /* Setting the major number */ 188 rb_dev. rb_dev. "rb").rb_disk->private_data = &rb_dev. 206 /* Now the disk is "live" */ printk(KERN_INFO "rb: Ram Block driver initialised (%d sectors. sprintf(rb_dev. ramdevice_cleanup(). /* * You do not want partition information to show up in * cat /proc/partitions set this flags */ //rb_dev. 189 /* Setting the first mior number */ 190 rb_dev.rb_disk.rb_disk->queue = rb_dev.rb_disk).183 184 185 186 187 } unregister_blkdev(rb_major. %d .rb_disk->disk_name. 201 /* Setting the capacity of the device in its gendisk structure */ 202 set_capacity(rb_dev. 203 204 /* Adding the disk to the system */ 205 add_disk(rb_dev.rb_disk->major = rb_major. rb_dev.rb_disk->fops = &rb_fops.rb_disk->flags = GENHD_FL_SUPPRESS_PARTITION_INFO.rb_queue. return -ENOMEM. 191 /* Initializing the device operations */ 192 193 194 195 196 197 198 199 200 rb_dev.

MODULE_DESCRIPTION("Ram Block Driver").207 208 209 bytes)\n". 220 221 222 223 224 225 226 227 228 229 } unregister_blkdev(rb_major.size * RB_SECTOR_SIZE). "rb"). return 0. 210 } 211 /* 212 213 214 215 static void __exit rb_cleanup(void) * This is the unregistration and uninitialization section of the ram block * device driver */ 216 { 217 del_gendisk(rb_dev.rb_disk). 218 put_disk(rb_dev. MODULE_LICENSE("GPL"). module_exit(rb_cleanup).rb_queue).size. ramdevice_cleanup().rb_disk). MODULE_AUTHOR("Anil Kumar Pugalia <email@sarika-pugs. rb_dev. rb_dev. 219 blk_cleanup_queue(rb_dev.com>"). module_init(rb_init). MODULE_ALIAS_BLOCKDEV_MAJOR(rb_major). 230 .
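The segment walk in rb_transfer() can also be tried outside the kernel. What follows is only a minimal user-space sketch under stated assumptions, not the driver's code: struct segment, transfer() and the static disk array are hypothetical stand-ins for struct bio_vec, rb_transfer() and the RAM behind ramdevice_read()/ramdevice_write(). Unlike the driver, which records -EIO and carries on, this sketch simply stops at the first bad segment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE  512u   /* sector unit used throughout the article */
#define DISK_SECTORS 1024u  /* 512 KiB of "disk", as in the DOR example */

/* Hypothetical stand-in for one scatter-gather segment (struct bio_vec). */
struct segment {
    uint8_t *buf;      /* buffer of this segment */
    unsigned int len;  /* its length in bytes */
};

/* Hypothetical stand-in for the RAM that backs the disk. */
static uint8_t disk[DISK_SECTORS * SECTOR_SIZE];

/*
 * Walks the segments of one request, keeping a running sector offset.
 * Returns 0 on success, and -1 if a segment is not a whole number of
 * sectors, falls outside the disk, or if the segments do not add up to
 * sector_cnt.
 */
static int transfer(int write, unsigned int start_sector,
                    unsigned int sector_cnt,
                    const struct segment *segs, size_t nsegs)
{
    unsigned int sector_offset = 0;
    size_t i;

    assert(segs != NULL || nsegs == 0);

    for (i = 0; i < nsegs; i++) {
        unsigned int sectors = segs[i].len / SECTOR_SIZE;
        uint8_t *where;

        if (segs[i].len % SECTOR_SIZE != 0)
            return -1;
        if ((uint64_t)start_sector + sector_offset + sectors > DISK_SECTORS)
            return -1;

        where = disk + (size_t)(start_sector + sector_offset) * SECTOR_SIZE;
        if (write)  /* write to the device */
            memcpy(where, segs[i].buf, segs[i].len);
        else        /* read from the device */
            memcpy(segs[i].buf, where, segs[i].len);

        sector_offset += sectors;
    }
    return (sector_offset == sector_cnt) ? 0 : -1;
}
```

Writing two segments of 1024 and 512 bytes at sector 4 and then reading 3 sectors back from sector 4 as a single segment should return the same bytes in the same order, which is the point of the running sector_offset.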

You can also download the code demonstrated from here.

As usual, executing make will build the ‘Disk on RAM’ driver (dor.ko), combining the three ‘C’ files. Check out the Makefile to see how.

Makefile

# If called directly from the command line, invoke the kernel build system.
ifeq ($(KERNELRELEASE),)

KERNEL_SOURCE := /usr/src/linux
PWD := $(shell pwd)
default: module

module:
	$(MAKE) -C $(KERNEL_SOURCE) SUBDIRS=$(PWD) modules

clean:
	$(MAKE) -C $(KERNEL_SOURCE) SUBDIRS=$(PWD) clean

# Otherwise KERNELRELEASE is defined; we've been invoked from the
# kernel build system and can use its language.
else

obj-m := dor.o
dor-y := ram_block.o ram_device.o partition.o

endif

To clean the built files, run the usual make clean.

Once built, the following are the experimental steps (refer to Figures 1 to 3).

Figure 1: Playing with the ‘Disk on RAM’ driver
Figure 2: xxd showing the initial data on the first partition (/dev/rb1)
Figure 3: Formatting the third partition (/dev/rb3)

Please note that all these need to be executed with root privileges:
Load the driver dor.ko using insmod. This would create the block device files representing the disk on 512 KiB of RAM, with three primary and three logical partitions.
Check out the automatically created block device files (/dev/rb*). /dev/rb is the entire disk, which is 512 KiB in size. rb1, rb2 and rb3 are the primary partitions, with rb2 being the extended partition and containing three logical partitions rb5, rb6 and rb7.
Read the entire disk (/dev/rb) using the disk dump utility dd.
Zero out the first sector of the disk’s first partition (/dev/rb1), again using dd.
Write some text into the disk’s first partition (/dev/rb1) using cat.
Display the initial contents of the first partition (/dev/rb1) using the xxd utility. See Figure 2 for xxd output.
Display the partition information for the disk using fdisk. See Figure 3 for fdisk output.
Quick-format the third primary partition (/dev/rb3) as a vfat filesystem (like your pen drive), using mkfs.vfat (Figure 3).
Mount the newly formatted partition using mount, say at /mnt (Figure 3).
The disk usage utility df would now show this partition mounted at /mnt (Figure 3). You may go ahead and store files there, but remember that this is a disk on RAM, and so is non-persistent.
Unload the driver using rmmod dor after unmounting the partition using umount /mnt. All data on the disk will be lost.

Now let’s learn the rules

We have just played around with the disk on RAM (DOR), but without actually knowing the rules, i.e., the internal details of the game. So, let’s dig into the nitty-gritty to decode the rules. Each of the three .c files represents a specific part of the driver. ram_device.c and ram_device.h abstract the underlying RAM operations like vmalloc/vfree, memcpy, etc., providing disk operation APIs like init/cleanup, read/write, etc.

partition.c and partition.h provide the functionality to emulate the various partition tables on the DOR. Recall the pre-lunch session (i.e., the previous article) to understand the details of partitioning. The code in these files is responsible for the partition information (number, type, size, etc.) that is shown using fdisk.

The ram_block.c file is the core block driver implementation, exposing the DOR as the block device files (/dev/rb*) to user-space. In other words, four of the five files, ram_device.* and partition.*, form the horizontal layer of the device driver, and ram_block.c forms the vertical (block) layer of the device driver. So, let’s understand that in detail.

The block driver basics

Conceptually, the block drivers are very similar to character drivers, especially with regards to the following:
Usage of device files
Major and minor numbers
Device file operations
Concept of device registration
So, if you already know character driver implementation, it would be easy to understand block drivers.

However, they are definitely not identical. The key differences are as follows:
Abstraction for block-oriented versus byte-oriented devices.
Block drivers are designed to be used by I/O schedulers, for optimal performance. Compare that with character drivers that are to be used by VFS.
Block drivers are designed to be integrated with the Linux buffer cache mechanism for efficient data access. Character drivers are pass-through drivers, accessing the hardware directly.
And these cause the implementation differences. Let’s analyse the key code snippets from ram_block.c, starting at the driver’s constructor rb_init().

The first step is to register for an 8-bit (block) major number (which implicitly means registering for all 256 8-bit minor numbers associated with it). The function for that is as follows:

int register_blkdev(unsigned int major, const char *name);

Here, major is the major number to be registered, and name is a registration label displayed under the kernel window /proc/devices. Interestingly, register_blkdev() tries to allocate and register a freely available major number, when 0 is passed for its first parameter major; on success, the allocated major number is returned. The corresponding de-registration function is as follows:

void unregister_blkdev(unsigned int major, const char *name);

Both these are prototyped in <linux/fs.h>.

The second step is to provide the device file operations, through the struct block_device_operations (prototyped in <linux/blkdev.h>) for the registered major number device files. However, these operations are too few compared to the character device file operations, and mostly insignificant. To elaborate, there are no operations even to read and write, which is surprising. But as we already know that block drivers need to integrate with the I/O schedulers, the read-write implementation is achieved through something called request queues. So, along with providing the device file operations, the following need to be provided:
The request queue for queuing the read/write requests
The spin lock associated with the request queue to protect its concurrent access
The request function to process the requests in the request queue

Also, there is no separate interface for block device file creations, so the following are also provided:
The device file name prefix, commonly referred to as disk_name (rb in the dor driver)
The starting minor number for the device files, commonly referred to as first_minor

Finally, two block-device-specific things are also provided, namely:
The maximum number of partitions supported for this block device, by specifying the total minors
The underlying device size in units of 512-byte sectors, for the logical block access abstraction

All these are registered through the struct gendisk using the following function:

void add_disk(struct gendisk *disk);

The corresponding delete function is as follows:

void del_gendisk(struct gendisk *disk);

Prior to add_disk(), the various fields of struct gendisk need to be initialised, either directly or using various macros/functions like set_capacity(). major, first_minor, fops, queue and disk_name are the minimal fields to be initialised directly. And even before the initialisation of these fields, the struct gendisk needs to be allocated, using the function given below:

struct gendisk *alloc_disk(int minors);

Here, minors is the total number of partitions supported for this disk. And the corresponding inverse function would be:

void put_disk(struct gendisk *disk);

All these are prototyped in <linux/genhd.h>.

Request queue and the request function

The request queue also needs to be initialised and set up into the struct gendisk, before add_disk(). The request queue is initialised by calling:

struct request_queue *blk_init_queue(request_fn_proc *, spinlock_t *);

We provide the request-processing function and the initialised concurrency protection spin-lock as parameters. The corresponding queue clean-up function is given below:

void blk_cleanup_queue(struct request_queue *);

The request (processing) function should be defined with the following prototype:

void request_fn(struct request_queue *q);

It should be coded to fetch a request from its parameter q, for instance, by using the following:

struct request *blk_fetch_request(struct request_queue *q);

Then it should either process it, or initiate processing. Whatever it does should be non-blocking, as this request function is called from a non-process context, and also after taking the queue’s spin-lock. Moreover, only functions not releasing or taking the queue’s spin-lock should be used within the request function.

A typical example of request processing, as demonstrated by the function rb_request() in ram_block.c, is given below:

while ((req = blk_fetch_request(q)) != NULL) /* Fetching a request */
{
	/* Processing the request: the actual data transfer */
	ret = rb_transfer(req); /* Our custom function */
	/* Informing that the request has been processed with return of ret */
	__blk_end_request_all(req, ret);
}

Requests and their processing

Our key function is rb_transfer(), which parses a struct request and accordingly does the actual data transfer. The struct request primarily contains the direction of data transfer, the starting sector for the data transfer, the total number of sectors for the data transfer, and the scatter-gather buffer for the data transfer. The various macros to extract these from the struct request are as follows:

rq_data_dir(req); /* Operation type: 0 - read from device; otherwise - write to device */
blk_rq_pos(req); /* Starting sector to process */
blk_rq_sectors(req); /* Total sectors to process */
rq_for_each_segment(bv, req, iter) /* Iterator to extract individual buffers */

rq_for_each_segment() is the special one, which iterates over the struct request (req) using iter, extracting the individual buffer information into the struct bio_vec (bv: basic input/output vector) on each iteration. And then, on each extraction, the appropriate data transfer is done, based on the operation type, invoking one of the following APIs from ram_device.c:

void ramdevice_write(sector_t sector_off, u8 *buffer, unsigned int sectors);
void ramdevice_read(sector_t sector_off, u8 *buffer, unsigned int sectors);

Check out the complete code of rb_transfer() in ram_block.c.

Summing up

With that, we have actually learnt the beautiful block drivers by traversing through the design of a hard disk and playing around with partitioning, formatting and various other raw operations on a hard disk. Thanks for patiently listening. Now, the session is open for questions; please feel free to leave your queries as comments.

Device Drivers, Part 16: Kernel Window - Peeping through /proc

This article, which is part of the series on Linux device drivers, demonstrates the creation and usage of files under the /proc virtual filesystem.

After many months, Shweta and Pugs got together for some peaceful technical romancing. All through, they had been using all kinds of kernel windows, especially through the /proc virtual filesystem (using cat), to help them decode various details of Linux device drivers. Here’s a non-exhaustive summary listing:
/proc/modules - dynamically loaded modules
/proc/devices - registered character and block major numbers
/proc/iomem - on-system physical RAM and bus device addresses
/proc/ioports - on-system I/O port addresses (especially for x86 systems)
/proc/interrupts - registered interrupt request numbers
/proc/softirqs - registered soft IRQs
/proc/kallsyms - running kernel symbols, including from loaded modules
/proc/partitions - currently connected block devices and their partitions
/proc/filesystems - currently active filesystem drivers
/proc/swaps - currently active swaps
/proc/cpuinfo - information about the CPU(s) on the system
/proc/meminfo - information about the memory on the system, viz., RAM, swap, …

Custom kernel windows

“Yes, these have been really helpful in understanding and debugging Linux device drivers. But is it possible for us to also provide some help? Yes, I mean can we create one such kernel window through /proc?” asked Shweta.
“Why just one? You can have as many as you want. And it’s simple: just use the right set of APIs, and there you go.”
“For you, everything is simple,” Shweta grumbled.
“No yaar, this is seriously simple,” smiled Pugs. “Just watch me creating one for you,” he added.

And in a jiffy, Pugs created the proc_window.c file below:

#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/proc_fs.h>
#include <linux/jiffies.h>

static struct proc_dir_entry *parent, *file, *link;
static int state = 0;

int time_read(char *page, char **start, off_t off, int count, int *eof, void *data)
{
	int len, val;
	unsigned long act_jiffies;

	len = sprintf(page, "state = %d\n", state);
	act_jiffies = jiffies - INITIAL_JIFFIES;
	val = jiffies_to_msecs(act_jiffies);
	switch (state)
	{
		case 0:
			len += sprintf(page + len, "time = %ld jiffies\n", act_jiffies);
			break;
		case 1:
			len += sprintf(page + len, "time = %d msecs\n", val);
			break;
		case 2:
			len += sprintf(page + len, "time = %ds %dms\n", val / 1000, val % 1000);
			break;
		case 3:
			val /= 1000;
			len += sprintf(page + len, "time = %02d:%02d:%02d\n",
				val / 3600, (val / 60) % 60, val % 60);
			break;
		default:
			len += sprintf(page + len, "<not implemented>\n");
			break;
	}
	len += sprintf(page + len, "{offset = %ld; count = %d;}\n", off, count);

	return len;
}

int time_write(struct file *file, const char __user *buffer, unsigned long count, void *data)
{
	if (count > 2)
		return count;
	if ((count == 2) && (buffer[1] != '\n'))
		return count;
	if ((buffer[0] < '0') || ('9' < buffer[0]))
		return count;
	state = buffer[0] - '0';
	return count;
}

static int __init proc_win_init(void)
{
	if ((parent = proc_mkdir("anil", NULL)) == NULL)
	{
		return -1;
	}
	if ((file = create_proc_entry("rel_time", 0666, parent)) == NULL)
	{
		remove_proc_entry("anil", NULL);
		return -1;
	}
	file->read_proc = time_read;
	file->write_proc = time_write;
	if ((link = proc_symlink("rel_time_l", parent, "rel_time")) == NULL)
	{
		remove_proc_entry("rel_time", parent);
		remove_proc_entry("anil", NULL);
		return -1;
	}
	link->uid = 0;
	link->gid = 100;
	return 0;
}

static void __exit proc_win_exit(void)
{
	remove_proc_entry("rel_time_l", parent);
	remove_proc_entry("rel_time", parent);
	remove_proc_entry("anil", NULL);
}

module_init(proc_win_init);
module_exit(proc_win_exit);

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Anil Kumar Pugalia <email_at_sarika-pugs_dot_com>");
MODULE_DESCRIPTION("Kernel window /proc Demonstration Driver");

And then Pugs did the following:
Built the driver file (proc_window.ko) using the usual driver’s Makefile.
Loaded the driver using insmod.
Showed various experiments using the newly created proc windows. (Refer to Figure 1.)
And finally, unloaded the driver using rmmod.

Figure 1: Peeping through /proc

Demystifying the details

Starting from the constructor proc_win_init(), three proc entries have been created:
Directory anil under /proc (i.e., NULL parent) with default permissions 0755, using proc_mkdir()
Regular file rel_time in the above directory, with permissions 0666, using create_proc_entry()
Soft link rel_time_l to the file rel_time, in the same directory, using proc_symlink()

The corresponding removal of these is done with remove_proc_entry() in the destructor, proc_win_exit(), in chronological reverse order. For every entry created under /proc, a corresponding struct proc_dir_entry is created. For each, many of its fields could be further updated as needed:
mode - Permissions of the file
uid - User ID of the file
gid - Group ID of the file

Additionally, for a regular file, the following two function pointers for reading and writing over the file could be provided, respectively:

int (*read_proc)(char *page, char **start, off_t off, int count, int *eof, void *data)
int (*write_proc)(struct file *file, const char __user *buffer, unsigned long count, void *data)

write_proc() is very similar to the character driver’s file operation write(). The above implementation lets the user write a digit from 0 to 9, and accordingly sets the internal state. read_proc() in the above implementation provides the current state, and the time since the system has been booted up, in different units, based on the current state. These are jiffies in state 0; milliseconds in state 1; seconds and milliseconds in state 2; hours, minutes and seconds in state 3; and <not implemented> in other states. And to check the computation accuracy, Figure 2 highlights the system uptime in the output of top. read_proc’s page parameter is a page-sized buffer, typically to be filled up with count bytes from offset off. But more often than not (because of less content), just the page is filled up, ignoring all other parameters.

Figure 2: Comparison with top’s output

All the /proc-related structure definitions and function declarations are available through <linux/proc_fs.h>. The jiffies-related function declarations and macro definitions are in <linux/jiffies.h>. As a special note, the actual jiffies are calculated by subtracting INITIAL_JIFFIES, since on boot-up, jiffies is initialised to INITIAL_JIFFIES instead of zero.

Summing up

“Hey Pugs! Why did you set the folder name to anil? Who is this Anil? You could have used my name, or maybe yours,” suggested Shweta.
“Ha! That’s a surprise. My real name is Anil; it’s just that everyone in college knows me as Pugs,” smiled Pugs.

Watch out for further technical romancing from Pugs a.k.a. Anil.
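As a postscript to the read_proc() discussion, the unit conversions behind states 0 to 3 can be checked in user space. This is only a rough sketch, not the module’s code: struct uptime, split_uptime() and format_uptime() are hypothetical names, and the elapsed jiffies and milliseconds are passed in as plain numbers, since jiffies and jiffies_to_msecs() exist only inside the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical holder for an uptime broken into display units. */
struct uptime {
    unsigned int hours, minutes, seconds, millis;
};

/* Splits milliseconds with the same arithmetic time_read() uses. */
static struct uptime split_uptime(unsigned int msecs)
{
    struct uptime u;
    unsigned int secs = msecs / 1000;

    u.millis = msecs % 1000;
    u.hours = secs / 3600;
    u.minutes = (secs / 60) % 60;
    u.seconds = secs % 60;
    return u;
}

/*
 * Writes the "time = ..." line for the given state into out and returns
 * the number of characters snprintf() reports.
 */
static int format_uptime(char *out, size_t size, int state,
                         unsigned long elapsed_jiffies, unsigned int msecs)
{
    struct uptime u = split_uptime(msecs);

    assert(out != NULL && size > 0);

    switch (state) {
    case 0: /* raw jiffies */
        return snprintf(out, size, "time = %lu jiffies\n", elapsed_jiffies);
    case 1: /* milliseconds */
        return snprintf(out, size, "time = %u msecs\n", msecs);
    case 2: /* seconds and milliseconds */
        return snprintf(out, size, "time = %us %ums\n", msecs / 1000, u.millis);
    case 3: /* hours, minutes and seconds */
        return snprintf(out, size, "time = %02u:%02u:%02u\n",
                        u.hours, u.minutes, u.seconds);
    default:
        return snprintf(out, size, "<not implemented>\n");
    }
}
```

For example, 3723004 milliseconds split into 1 hour, 2 minutes, 3 seconds and 4 milliseconds, so state 3 would show 01:02:03.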

the <linux/module. demonstrates various interactions with a Linux module. by default. it isn’t despite of recommendations to make everything static. Hence. by default there always have been cases where non-static globals may be needed.h> #include <linux/device. Now. right? In the general application development paradigm. respectively. to avoid any kernel name-space collisions even with such cases. static int __init glob_sym_init(void) { if (IS_ERR(cool_cl = class_create(THIS_MODULE. And so.h> static struct class *cool_cl. even if we want to. they’re closing in on some final titbits of technical romancing. "cool"))) /* Creates /sys/class/cool/ */ . As Shweta and Pugs gear up for their final semester’s project on Linux drivers. only one of them needs to be used for a particular symbol — though the symbol could be either a variable name or a function name. with function(s) from one file needing to be called in the other. it’s this simple — but in kernel development. Thus. zero collision is achieved. Part 17: Module Interactions Aticle. and passing parameters to it.Device Drivers. This mainly includes the various communications with a Linux module (dynamically loadable and unloadable driver) like accessing its variables. every module is embodied in its own namespace. by default. calling its functions. Here’s the complete code (our_glob_syms. include the header and access. for exactly such scenarios.c) to demonstrate this: 1 2 3 4 5 6 7 8 9 10 11 12 #include <linux/module. nothing from a module can be made really global throughout the kernel. _gpl or _gpl_future sections. However. Global variables and functions One might wonder what the big deal is about accessing a module’s variables and functions from outside it. EXPORT_SYMBOL_GPL(get_cool_cl). this also implies that. declare them extern in a header.h> header defines the following macros: EXPORT_SYMBOL(sym) EXPORT_SYMBOL_GPL(sym) EXPORT_SYMBOL_GPL_FUTURE(sym) Each of these exports the symbol passed as their parameter. 
A simple example could be a driver spanning multiple files. additionally putting them in one of the default. } EXPORT_SYMBOL(cool_cl). which is part of the series on Linux device drivers. And we know that two modules with the same name cannot be loaded at the same time. static struct class *get_cool_cl(void) { return cool_cl. Just make them global.

kernel string table (__kstrtab). MODULE_DESCRIPTION("Global Symbols exporting Driver"). } static void __exit glob_sym_exit(void) { /* Removes /sys/class/cool/ */ class_destroy(cool_cl). Figure 1: Our global symbols module The following code shows the supporting header file (our_glob_syms.h). } module_init(glob_sym_init). MODULE_LICENSE("GPL"). to be included by modules using the exported symbols cool_cl and get_cool_cl: #ifndef OUR_GLOB_SYMS_H . <email_at_sarika-pugs.13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 { return PTR_ERR(cool_cl). marking it to be globally accessible. MODULE_AUTHOR("Anil Kumar Pugalia <email_at_sarika pugs. Each exported symbol also has a corresponding structure placed into (each of) the kernel symbol table (__ksymtab). } return 0. and kernel CRC table (__kcrctab) sections.com>"). which has been compiled using the usual driverMakefile.com>"). module_exit(glob_sym_exit). Figure 1 shows a filtered snippet of the /proc/kallsyms kernel window. before and after loading the module our_glob_syms.ko.

h> extern struct class *cool_cl. static int __init mod_par_init(void) { printk(KERN_INFO "Loaded with %d\n". these can be modified even later. long. MODULE_AUTHOR("Anil Kumar Pugalia <email@sarika-pugs. cfg_value). yes. The following module code (module_param. when using insmod. The module parameters are set up using the following macro (defined in<linux/moduleparam. return 0.com>"). modules using the exported symbols should possibly have this file Module. } static void __exit mod_par_exit(void) { printk(KERN_INFO "Unloaded cfg value: %d\n". Note that the <linux/device.h> #include <linux/kernel. for instance. included through <linux/module. and in contrast to the command-line arguments to an application.h>. . name is the parameter name. int. bool or invbool (inverted Boolean). charp (character pointer). int. module_exit(mod_par_exit).#define OUR_GLOB_SYMS_H #ifdef __KERNEL__ #include <linux/device. extern struct class *get_cool_cl(void). perm) Here. This contains the various details of all the exported symbols in its directory. through sysfs interactions. generated by compiling the moduleour_glob_syms. Interestingly enough. ushort. Apart from including the above header file.c) demonstrates a module parameter: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 #include <linux/module. The supported type values are: byte. uint. Parameters can be passed to a module while loading it. it can.h> static int cfg_value = 3.h>): module_param(name. and perm refers to the permissions of the sysfs file corresponding to this parameter. } module_init(mod_par_init). type. it would be natural to ask if something similar can be done with a module — and the answer is. Module parameters Being aware of passing command-line arguments to an application. ulong. type is the type of the parameter.symvers in their build directory. which have already been covered in the earlier discussion on character drivers. MODULE_LICENSE("GPL"). module_param(cfg_value. short. 
MODULE_DESCRIPTION("Module Parameter demonstration Driver"). #endif #endif Figure 1 also shows the file Module. cfg_value). 0764).h> header in the above examples is being included for the various class-related declarations and definitions.symvers.
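The perm argument in module_param(cfg_value, int, 0764) is an ordinary octal file mode. As a small user-space illustration (perm_to_string() is a hypothetical helper, not a kernel API), the nine permission bits can be decoded like this:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Renders the low nine permission bits of an octal mode as the familiar
 * nine-character string (user, group, others), e.g. for ls-style output.
 * The caller provides room for nine characters plus the terminator.
 */
static void perm_to_string(unsigned int perm, char out[10])
{
    static const char flags[] = "rwxrwxrwx";
    size_t i;

    assert(out != NULL);

    for (i = 0; i < 9; i++) {
        unsigned int bit = 1u << (8 - i);  /* highest bit is user read */

        out[i] = (perm & bit) ? flags[i] : '-';
    }
    out[9] = '\0';
}
```

Passing 0764 fills the buffer with rwx for the user, rw- for the group and r-- for the others.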

Figure 2: Experiments with the module parameter
Figure 3: Experiments with the module parameter (as root)

Note that before the parameter setup, a variable of the same name and compatible type needs to be defined. Subsequently, the following steps and experiments are shown in Figures 2 and 3:
Building the driver (module_param.ko file) using the usual driver Makefile
Loading the driver using insmod (with and without parameters)
Various experiments through the corresponding /sys entries
And finally, unloading the driver using rmmod

Note the following:
Initial value (3) of cfg_value becomes its default value when insmod is done without any parameters.
Permission 0764 gives rwx to the user, rw- to the group, and r-- for the others on the file cfg_value under the parameters of module_param under /sys/module/.

Check for yourself:
The output of dmesg/tail on every insmod and rmmod, for the printk outputs.
Try writing into the /sys/module/module_param/parameters/cfg_value file as a normal (non-root) user.

Summing up

With this, the duo have a fairly good understanding of Linux drivers, and are all set to start working on their final semester project. Any guesses what their project is about? Hint: They have picked up one of the most daunting Linux driver topics. Let us see how they fare with it next month.