How Does Linux Boot?
A fundamental part of troubleshooting a problem is understanding how everything works when it is problem-free. With that in mind, we’ll look over the process by which Linux systems boot so you can better understand what has happened if your system fails to boot.
In the beginning…
It all starts with the boot loader: a small program that the computer runs from the Master Boot Record (MBR) of the boot disk. It is responsible for handing over from the computer’s Basic Input/Output System (BIOS) to the operating system that the computer runs. This is a legacy design harking back to the days of early IBM PCs. Modern systems that pair UEFI firmware with GUID Partition Table (GPT) disks are now replacing the old MBR scheme, and these use a dedicated partition on the disk (the EFI System Partition) for the boot loader rather than restricting it to the small space the MBR provides. For most modern Linux systems the boot loader of choice is the GNU GRand Unified Bootloader, commonly referred to as GRUB. GRUB’s job is to load the Linux kernel image and a small initialization RAM disk that can then be used to boot the main Linux operating system. One nice feature of GRUB is that as you upgrade the kernel of your Linux system, it can offer you a menu with a choice to boot from a previously installed kernel.
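The dedicated boot loader partition described above belongs to systems that boot through UEFI firmware. On a running Linux system, the kernel exposes a /sys/firmware/efi directory only after a UEFI boot, so checking for it is a quick way to tell which scheme was used. The following is a minimal sketch, assuming sysfs is mounted at /sys; the function name and the sysfs_root parameter are illustrative and not part of any library.

```python
import os


def firmware_mode(sysfs_root="/sys"):
    """Return "UEFI" if the kernel exposes EFI data, otherwise "BIOS".

    "BIOS" here simply means "no EFI directory found"; it is a best guess
    for PC hardware, not a definitive detection.
    """
    efi_dir = os.path.join(sysfs_root, "firmware", "efi")
    return "UEFI" if os.path.isdir(efi_dir) else "BIOS"


if __name__ == "__main__":
    print("This system appears to have booted via", firmware_mode())
```

Passing a different sysfs_root lets you point the check at the mounted file system of a broken install when working from a rescue environment, though the result then describes the rescue boot only if you leave the default in place.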
If you get to the GRUB menu screen then the chances are that the boot loader is working fine on your system. An indefinitely flashing cursor, a GRUB error message or a prompt that says “grub rescue>” are general indications that you have an issue with the boot loader, and you will likely need to boot from an alternate Linux system, such as a Live CD, to repair it. On a VPS it’s not so simple, so if your system fails to get as far as GRUB you may need to contact support to get it fixed.
From GRUB to Kernel
Once a Linux install is selected from the GRUB menu, the system kernel and a boot image are loaded. This is the next point at which things could fail. If GRUB cannot find either the boot image or the kernel specified in its configuration, you will likely get an error message and be dropped into a “grub>” command prompt. From here you can do almost everything you need to fix the problem, though doing so is beyond the scope of this article, and there is plenty of information online about resolving it. A slightly easier fix may be to reboot the system and select an alternate kernel image in the GRUB menu at bootup, especially if the failure follows a recent upgrade.
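To see which kernels GRUB could offer as alternatives, you can list the menuentry titles in its generated configuration file. The sketch below parses the text of such a file; the SAMPLE content is made up for illustration, and the helper name is hypothetical. On a real system the file is commonly found at /boot/grub/grub.cfg or /boot/grub2/grub.cfg, but that depends on the distribution.

```python
import re

# Matches lines such as:  menuentry 'Some title' --class gnu-linux {
# The title may be wrapped in single or double quotes.
MENUENTRY = re.compile(r"""^\s*menuentry\s+(['"])(.*?)\1""", re.MULTILINE)


def menu_entries(grub_cfg_text):
    """Return the menuentry titles found in GRUB configuration text, in order."""
    return [match.group(2) for match in MENUENTRY.finditer(grub_cfg_text)]


# Hypothetical configuration fragment, used only to demonstrate the parser.
SAMPLE = """
menuentry 'Example Linux' --class gnu-linux {
    linux /vmlinuz-6.1.0 root=/dev/sda1 ro
    initrd /initrd.img-6.1.0
}
submenu 'Advanced options' {
    menuentry 'Example Linux, with Linux 5.10.0' {
        linux /vmlinuz-5.10.0 root=/dev/sda1 ro
        initrd /initrd.img-5.10.0
    }
}
"""

if __name__ == "__main__":
    for title in menu_entries(SAMPLE):
        print(title)
```

Each entry pairs a kernel (the linux line) with its boot image (the initrd line), which is why GRUB fails at this stage if either file is missing.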
Once the kernel and boot image are loaded, Linux begins to boot proper. The hardware is checked, the system is handed over to the kernel, and the disk partitions are then mounted. Problems can also occur at this stage, such as an inability to mount the root file system, or incompatibilities between the loaded kernel and the system’s hardware that cause the boot process to crash. In this case, booting from a previous kernel in the GRUB menu can often help get the system started so you can resolve the problem for next time. Once this is done, the kernel runs the first process, init, which is always given process ID (PID) 1. On old systems this would likely be the SystemV init system, which was replaced with Upstart on Ubuntu systems some years back. On most Linux distributions released in recent years, systemd is the preferred init system.
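If you are unsure which init system a machine uses, you can ask the kernel for the name of process 1. The sketch below reads /proc/1/comm, assuming procfs is mounted at /proc; the function name and the proc_root parameter are illustrative.

```python
import os


def init_process_name(proc_root="/proc"):
    """Return the command name of PID 1 as reported by the kernel."""
    comm_path = os.path.join(proc_root, "1", "comm")
    with open(comm_path) as handle:
        return handle.read().strip()


if __name__ == "__main__":
    # Typically prints "systemd" on a systemd-based system, or "init" on
    # systems using SystemV init or Upstart.
    print("PID 1 is:", init_process_name())
```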
Init System
The init system is responsible for loading every other service on the system. These range from modules that interface with the kernel through to the services that you have running all the time, such as the Apache web server or the Dovecot mail server. It’s possible to boot the system into a number of different modes, commonly referred to as run levels. For servers the normal run level is traditionally 3, which provides a multi-user mode with networking, while desktops use run level 5, which is the same mode with the addition of a graphical user interface. Another commonly used run level is 1, which is single-user mode. Run levels 0 and 6 exist for shutting down and rebooting the computer respectively. With systemd, numbered run levels have given way to named targets, such as multi-user.target and graphical.target, which makes things a bit simpler to understand when looking at configuration files for services loaded by systemd.
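The relationship between the old numbered run levels and systemd’s named equivalents, which systemd calls targets, can be written down as a small lookup table. The mapping below follows the compatibility aliases that systemd provides (runlevel0.target through runlevel6.target); the helper function is purely illustrative.

```python
# SysV run level -> the systemd target it corresponds to.
RUNLEVEL_TARGETS = {
    0: "poweroff.target",    # shut down
    1: "rescue.target",      # single-user mode
    2: "multi-user.target",  # multi-user, no GUI
    3: "multi-user.target",  # multi-user with networking
    4: "multi-user.target",  # unused/custom on most systems
    5: "graphical.target",   # multi-user with a GUI
    6: "reboot.target",      # reboot
}


def target_for_runlevel(level):
    """Return the systemd target name for a SysV run level (0-6)."""
    try:
        return RUNLEVEL_TARGETS[level]
    except KeyError:
        raise ValueError(f"unknown run level: {level!r}") from None


if __name__ == "__main__":
    for level in sorted(RUNLEVEL_TARGETS):
        print(level, "->", target_for_runlevel(level))
```

Note that run levels 2, 3 and 4 all collapse into the same target, since systemd does not distinguish between them by default.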
Getting Started
Once the system has booted, the init system remains running the whole time, and is responsible for starting users’ sessions when they log in by launching their shell or GUI session. Problems can also occur during the init stage, but normally these are not fatal: the system will finish booting, enabling you to log in and view the system logs to analyze and resolve the failure.
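On a systemd-based system, a reasonable first step after logging in is to ask which units failed to start. The sketch below runs systemctl list-units --failed and extracts the unit names. It assumes systemctl is on the PATH, and the parsing is a rough approximation of the column layout rather than a stable interface; both function names are hypothetical.

```python
import subprocess

# --plain and --no-legend keep the output to one unit per line.
FAILED_UNITS_CMD = [
    "systemctl", "list-units", "--failed",
    "--plain", "--no-legend", "--no-pager",
]


def parse_failed_units(output):
    """Extract unit names from 'systemctl list-units --failed' output.

    Unit names always carry a type suffix (.service, .mount, ...), so the
    first whitespace-separated field containing a dot is taken as the name.
    This also skips any status marker printed before the name.
    """
    units = []
    for line in output.splitlines():
        unit = next((field for field in line.split() if "." in field), None)
        if unit is not None:
            units.append(unit)
    return units


def failed_units():
    """Run systemctl and return the names of units in the failed state."""
    result = subprocess.run(
        FAILED_UNITS_CMD, capture_output=True, text=True, check=False
    )
    return parse_failed_units(result.stdout)


if __name__ == "__main__":
    for name in failed_units():
        print("failed:", name)
```

For any unit this reports, running journalctl -b -u followed by the unit name shows its log messages from the current boot.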
So there we have it: a brief overview of how a Linux system boots, along with some pointers on what to look at depending on how far the system gets before an error stops it.