VFS, proc and root filesystems

The VFS (Virtual File System) is a subsystem of the kernel that implements the file and filesystem-related interfaces provided to user-space programs. All filesystems rely on the VFS in order to coexist and interoperate. Figure 1 shows the three major layers of a filesystem implementation. The first layer is the filesystem interface, based on the open(), read(), write() and close() calls and on file descriptors. The second layer, the VFS, enables these system calls to work regardless of the filesystem or underlying physical medium, as Figure 2 indicates. The third layer implements the filesystem type or the remote filesystem protocol.
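As a rough user-space illustration of the first layer, the sketch below drives the open(), write(), read() and close() calls through Python's os module, which wraps them thinly. The helper name roundtrip and the file name demo.txt are made up for this example. The same code should work unchanged whatever filesystem backs the path, because the VFS picks the filesystem-specific implementation.

```python
import os
import tempfile


def roundtrip(path: str, payload: bytes) -> bytes:
    """Write payload to path, then read it back, using raw file descriptors.

    Hypothetical helper for illustration only. It assumes a small payload,
    so a single write() and a single read() are enough.
    """
    fd = os.open(path, os.O_CREAT | os.O_WRONLY | os.O_TRUNC, 0o600)
    try:
        os.write(fd, payload)   # generic call; the VFS dispatches to the filesystem
    finally:
        os.close(fd)

    fd = os.open(path, os.O_RDONLY)
    try:
        return os.read(fd, len(payload))
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(roundtrip(os.path.join(tmp, "demo.txt"), b"hello via the VFS\n"))
```

Nothing in the caller names a filesystem type; only the path decides which implementation ends up handling the request.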

Figure 1: Schematic view of a VFS [1].
Figure 2: The flow of data from user-space issuing a write() call, through the VFS’s generic system call, into the filesystem’s specific write method, and finally arriving at the physical media [2].
In other words, the VFS provides a common file model that can represent the general features and behavior of any filesystem. The VFS layer performs two important tasks. First, it separates generic filesystem operations from their implementation by defining a clean VFS interface. Several implementations of the VFS interface may coexist on the same machine, allowing access to different types of filesystems. Second, the VFS allows a file to be uniquely represented throughout a network. The VFS is based on the V-node, a file representation structure that contains a numerical designator for a network-wide unique file. The kernel retains one V-node structure for each active node (directory or file). Linux, however, does not have a V-node; it uses a generic i-node structure instead. Although the implementations differ, the V-node is conceptually the same as a generic i-node: both point to an i-node structure specific to the filesystem.

The flow in Figures 1 and 2 shows that the VFS activates filesystem operations to handle requests (local or remote) according to their filesystem types. File handles are constructed from the relevant V-nodes and are passed as arguments to these procedures.

Figure 3 shows a generic overview of VFS that separates the filesystem-specific structures from the rest of the kernel by “translating” system calls to proper filesystem type functions.

Figure 3: Overview of filesystem management structure [3].
The VFS is object-oriented. The four primary object types of the VFS are:

  • The superblock object, which represents a specific mounted filesystem.
  • The i-node object, which represents a specific file.
  • The dentry object, which represents a directory entry, which is a single component of a path.
  • The file object, which represents an open file as associated with a process.

Each of the above objects contains a pointer to a function table that lists the addresses of the actual functions implementing the defined operations (listed below) for that particular object. Hence, the VFS layer can perform an operation on any of the four objects by calling the suitable function from the object’s function table, without knowing what kind of object it is.

Each of the four objects includes an operations object that describes the methods the kernel can invoke on it:

  • The superblock operations object (<linux/fs.h>) contains methods such as write_inode() and sync_fs() that the kernel can invoke on a specific filesystem.
  • The i-node operations object (<linux/fs.h>) contains methods such as create() and link() that the kernel can invoke on a specific file.
  • The dentry operations object (<linux/dcache.h>) contains methods such as d_compare() and d_delete() that the kernel can invoke on a specific directory entry.
  • The file operations object (<linux/fs.h>) contains methods such as read() and write() that the kernel can invoke on an open file.
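The function-table idea can be modelled in a few lines. The sketch below is a toy user-space model, not kernel code: the names FileOps, OpenFile, MEM_OPS, NULL_OPS, vfs_read and vfs_write are invented for illustration and only loosely mirror the file object and its file operations object. The point is that the generic vfs_read() and vfs_write() functions never check which "filesystem" they are talking to.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class FileOps:
    """Toy stand-in for a file operations table: one slot per operation."""
    read: Callable[["OpenFile", int], bytes]
    write: Callable[["OpenFile", bytes], int]


@dataclass
class OpenFile:
    """Toy stand-in for the file object: state plus a reference to its table."""
    f_op: FileOps
    data: bytearray = field(default_factory=bytearray)
    pos: int = 0


# "Filesystem" 1: keeps the bytes in memory.
def mem_read(f: OpenFile, count: int) -> bytes:
    chunk = bytes(f.data[f.pos:f.pos + count])
    f.pos += len(chunk)
    return chunk


def mem_write(f: OpenFile, buf: bytes) -> int:
    f.data[f.pos:f.pos + len(buf)] = buf
    f.pos += len(buf)
    return len(buf)


# "Filesystem" 2: discards writes and always reads as empty.
def null_read(f: OpenFile, count: int) -> bytes:
    return b""


def null_write(f: OpenFile, buf: bytes) -> int:
    return len(buf)


MEM_OPS = FileOps(read=mem_read, write=mem_write)
NULL_OPS = FileOps(read=null_read, write=null_write)


# Generic layer: looks up the operation in the object's table and calls it.
def vfs_read(f: OpenFile, count: int) -> bytes:
    return f.f_op.read(f, count)


def vfs_write(f: OpenFile, buf: bytes) -> int:
    return f.f_op.write(f, buf)


if __name__ == "__main__":
    for ops in (MEM_OPS, NULL_OPS):
        f = OpenFile(f_op=ops)
        vfs_write(f, b"hello")
        f.pos = 0
        print(vfs_read(f, 5))
```

Adding a third behaviour only requires another table; the generic functions stay untouched, which is the property the VFS relies on.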

/proc filesystem (/procfs)

The /procfs (process filesystem) is an additional mechanism (registered to the VFS layer) for the kernel and kernel modules to send information to processes. It exists only in kernel memory and is typically mounted at /proc. Reading or writing files in /procfs invokes kernel functions that simulate reading or writing from a real file. /procfs contents are computed on demand according to user file I/O requests. In other words, /procfs provides a method of communication between kernel space and user space.

The /proc filesystem implements two important functions: a directory structure and the file contents within it. A UNIX filesystem is a set of file and directory i-nodes identified by their i-node numbers. The /proc filesystem defines a unique and persistent i-node number for each directory and its associated files. Given this i-node number, /procfs identifies the required operation when a user tries to read from a file i-node or performs a lookup in a particular directory i-node. When one of these files is read, /procfs collects the appropriate information, formats it into textual form, and places it into the requesting process’s read buffer.
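Because /procfs hands back plain text, a user program can consume it with ordinary file I/O and string handling. The sketch below assumes the "Key: value" line layout used by files such as /proc/self/status on Linux; that file and the helper names parse_proc_fields and read_proc_file are choices made for this example, not something the text above prescribes.

```python
import os
from typing import Dict


def parse_proc_fields(text: str) -> Dict[str, str]:
    """Turn 'Key: value' lines into a dict; lines without a colon are skipped."""
    fields: Dict[str, str] = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def read_proc_file(path: str) -> str:
    """Read a /proc entry like any regular text file."""
    with open(path, "r") as fh:
        return fh.read()


if __name__ == "__main__":
    path = "/proc/self/status"   # assumed to exist; true on typical Linux systems
    if os.path.exists(path):
        status = parse_proc_fields(read_proc_file(path))
        print(status.get("Name"), status.get("Pid"))
```

The text is generated at the moment of the read, so two consecutive reads of the same entry may return different values.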

The mapping from i-node number to information type splits the i-node number into two fields. In Linux the i-node number is 32 bits. The top 16 bits define the Process ID (PID) and the remaining 16 bits define the type of information requested about that process. A PID of zero in the i-node number means that the i-node contains global information. Separate global files exist in /proc to report information such as the kernel version, the drivers currently running and performance statistics. The kernel can also allocate new /proc i-node mappings dynamically. Each global /proc filesystem entry contains the file’s i-node number, file name and access permissions, along with the special functions used to generate the file’s contents. Figures 4, 5 and 6 show a tour of the /proc filesystem.
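The two-field split described above is plain bit arithmetic. The sketch below follows the layout as stated in the text (PID in the top 16 bits, information type in the low 16 bits); the function names and the example type code 3 are hypothetical, and a current kernel may number its /proc i-nodes differently.

```python
from typing import Tuple

PID_SHIFT = 16       # PID occupies the top 16 bits, as described in the text
FIELD_MASK = 0xFFFF  # each field is 16 bits wide


def make_ino(pid: int, info_type: int) -> int:
    """Pack a PID and an information-type code into one 32-bit number."""
    if not 0 <= pid <= FIELD_MASK:
        raise ValueError("pid does not fit in 16 bits")
    if not 0 <= info_type <= FIELD_MASK:
        raise ValueError("info_type does not fit in 16 bits")
    return (pid << PID_SHIFT) | info_type


def split_ino(ino: int) -> Tuple[int, int]:
    """Recover (pid, info_type) from a packed i-node number."""
    return (ino >> PID_SHIFT) & FIELD_MASK, ino & FIELD_MASK


def is_global(ino: int) -> bool:
    """A zero PID field marks an entry that holds global information."""
    return split_ino(ino)[0] == 0


if __name__ == "__main__":
    ino = make_ino(1797, 3)   # 3 is an arbitrary, made-up type code
    print(hex(ino), split_ino(ino), is_global(ino))
```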

Figure 4: Interactive tour of /proc.

/root filesystem

A filesystem* is a hierarchically structured tree. The root of the tree on UNIX-like operating systems is the root directory/filesystem which is identified by the slash character “/”. The sub-trees of the root directory include other subdirectories and files as shown in Figure 7.

The root filesystem is contained on the same partition as the root directory (“root on root”). It should not be confused with the /root directory, which is the home directory of the root user.

The root filesystem is the “parent” of the entire filesystem [6]. All other directories are “children” of this directory. The partition on which the root filesystem resides is mounted during boot, and all other filesystems are then mounted onto the root filesystem. If the root filesystem is corrupted, the system becomes unbootable (although it can still be started from external media such as a boot flash drive).
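One way to see this arrangement on a running system is to list the mount table and pick out the entry whose mount point is “/”. The sketch below assumes the whitespace-separated layout of /proc/mounts (device, mount point, filesystem type, then options); the file choice and the names Mount, parse_mounts and root_mount are assumptions made for this example.

```python
import os
from typing import List, NamedTuple, Optional


class Mount(NamedTuple):
    device: str
    mount_point: str
    fs_type: str


def parse_mounts(text: str) -> List[Mount]:
    """Parse mount-table text, keeping the first three fields of each line."""
    mounts = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 3:
            mounts.append(Mount(parts[0], parts[1], parts[2]))
    return mounts


def root_mount(mounts: List[Mount]) -> Optional[Mount]:
    """Return the entry mounted at '/', or None if there is none.

    A later entry can be mounted over an earlier one, so the last match wins.
    """
    found = None
    for m in mounts:
        if m.mount_point == "/":
            found = m
    return found


if __name__ == "__main__":
    path = "/proc/mounts"   # assumed to exist; true on typical Linux systems
    if os.path.exists(path):
        with open(path) as fh:
            mounts = parse_mounts(fh.read())
        print("root:", root_mount(mounts))
        print("others:", [m.mount_point for m in mounts if m.mount_point != "/"])
```

Every other mount point printed is a path below “/”, which reflects the parent and child relationship described above.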

*A definition of “filesystem” can be found in several papers, books and websites, such as [7] and [8]: “A filesystem is the methods and data structures that an operating system uses to keep track of files on a disk or partition; that is, the way the files are organized on the disk.”

Figure 5: /proc CPU Information file.
Figure 6: Information about Mozilla running process with PID 1797 in /proc.

Figure 7: Linux file system layout from a RedHat system [5].
Important directories in Linux include:

/bin: Holds the most commonly used essential user programs.
/sbin: Holds essential maintenance or system programs (unlike the programs in /bin, those in /sbin are intended for system administration and are normally run by root).
/etc: Stores the system-wide configuration files required by many programs.
/home: The place where the home directories of all users on a system are stored.
/root: The home directory of the root user.
/dev: Holds the special files that represent hardware.
/tmp and /var: Directories used to hold temporary files or files with constantly varying content.
/usr: Stores most programs and files directly relating to users of the system.

/proc: Explained above. To sum up, it is a virtual filesystem, i.e. a special filesystem provided by the kernel as a way of giving user programs information about the system. The main task of the /proc filesystem is to provide information about the kernel and processes.

References

[1] Abraham Silberschatz, Peter B. Galvin, Greg Gagne, “Operating System Concepts”, 7th edition.
[2] Linux Kernel Development, 3rd edition.
[3] Linux Virtual FileSystem, Online: http://bethechip.com/
[4] Advanced Linux Programming, Online: http://www.advancedlinuxprogramming.com/
[5] General overview of the Linux file system, Online: http://www.tldp.org/LDP/intro-linux/html/sect_03_01.html
[6] Sven Vermeulen, “Linux Sea”, Chapter 5: The Linux File System
[7] Wikipedia, File System, Online: http://en.wikipedia.org/wiki/File_system
[8] Moshe Bar, “Linux File Systems”.
