initramfs
In the previous chapter, the boot process advanced to the point where the guest tries to load an initramfs. Since the guest wants one, let's give it one. In this chapter, we'll prepare an initramfs for Linux to load, and Surtr and Ymir will work together to place it in the guest's memory space. Spoiler alert: by the end of this chapter, the guest Linux will boot completely. Sorry for the spoiler.
important
The source code for this chapter is in the whiz-vmm-initramfs branch.
Creating initramfs
tip
For those who find creating initramfs troublesome, you can download the filesystem image generated by the following steps from here.
initramfs is a type of RAM filesystem that is unpacked into memory. It has a simple structure: directories and files are archived with cpio and, optionally, compressed with gzip. The Linux kernel itself can unpack an initramfs, which lets it serve as a temporary filesystem before any other filesystem is mounted. Typically, kernel modules are loaded from the initramfs and then used to mount the actual root filesystem. In Ymir, for simplicity, we keep using the initramfs exclusively and do not use any other filesystem.
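As a quick way to check which of these two forms an image is in, the sketch below classifies a file by its leading magic bytes: gzip streams begin with the bytes 1f 8b, and cpio archives in the newc format begin with the ASCII string 070701. The function name initramfs_kind and the sample files are made up for illustration; they are not part of Ymir or Buildroot.

```shell
#!/bin/sh
# Sketch: classify an initramfs image by its first bytes.
# initramfs_kind is a hypothetical helper name used only in this example.
initramfs_kind() {
    # Dump the first 6 bytes as hex without offsets, then strip whitespace.
    magic=$(od -An -tx1 -N6 "$1" | tr -d ' \n')
    case "$magic" in
        1f8b*)        echo "gzip" ;;       # gzip magic: 1f 8b
        303730373031) echo "cpio-newc" ;;  # ASCII "070701"
        *)            echo "unknown" ;;
    esac
}

# Throwaway sample files standing in for rootfs.cpio and rootfs.cpio.gz.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
printf '070701' > "$work/sample.cpio"
gzip -c "$work/sample.cpio" > "$work/sample.cpio.gz"

echo "sample.cpio:    $(initramfs_kind "$work/sample.cpio")"
echo "sample.cpio.gz: $(initramfs_kind "$work/sample.cpio.gz")"
```

The sample .cpio file here contains only the magic string, so it is not a valid archive; it exists purely to exercise the check.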
First, create the initramfs that the guest will load. There are various ways to create an initramfs, but here we will use Buildroot. Buildroot is a toolchain for building embedded Linux systems, but in this case we'll use it only to generate the filesystem. Download a suitable version from the Buildroot download page and extract it. In the extracted directory, run make menuconfig to configure it. Since we don't need to build the Linux kernel itself this time, disable BR2_LINUX_KERNEL. Then, enable the following option to generate a cpio-format initramfs:
cpio the root filesystem (for use as an initial RAM filesystem)
Run make to generate the filesystem. The file will be output as ./output/images/rootfs.cpio. You can extract and repack the cpio file with the following steps [1]:
# Extract
mkdir x && cd x
cpio -idv 2>/dev/null <../rootfs.cpio
# Repack
cd x
find . -print0 | cpio --owner root --null -o --format=newc > ../rootfs.cpio
After the Linux kernel boots, it executes /init from the filesystem. /init calls /sbin/init as the PID 1 process, and /sbin/init runs /etc/init.d/rcS. The rcS script sequentially executes all scripts starting with S located in /etc/init.d/ as subprocesses.
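To see why a script's name determines when it runs, the sketch below mimics an rcS-style loop over a scratch directory. It assumes rcS selects scripts with a glob such as S??*, which the shell expands in sorted order; check the actual rcS in your extracted filesystem, as this loop only approximates it. The file names match the final directory listing in this chapter.

```shell
#!/bin/sh
# Sketch: show the order in which an rcS-style loop would visit start scripts.
# Use the C locale so the glob sorts by plain byte order.
LC_ALL=C
export LC_ALL

# Scratch directory standing in for ./x/etc/init.d.
initd=$(mktemp -d)
trap 'rm -rf "$initd"' EXIT

# Create empty placeholder files in deliberately shuffled order.
for name in S999whiz S20seedrng S01syslogd rcS S02sysctl rcK S02klogd; do
    : > "$initd/$name"
done

# Assumed loop shape: rcS and rcK do not match because they do not start
# with "S"; everything else is visited in sorted order.
run_order=""
for script in "$initd"/S??*; do
    [ -f "$script" ] || continue
    run_order="$run_order ${script##*/}"
done
echo "start order:$run_order"
```

Because names are compared character by character, S999whiz sorts after S20seedrng, so a script with that name is visited last.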
Extract the generated initramfs and remove unnecessary startup scripts. Since this series does not support networking, delete all network-related scripts:
rm ./x/etc/init.d/S41dhcpcd
rm ./x/etc/init.d/S40network
Also, add the following startup script as S999whiz instead:
#!/bin/sh
mdev -s
mount -t proc none /proc
stty -opost
/bin/sh
umount /proc
poweroff -d 0 -f
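One way to install this script is sketched below. It writes the same commands into etc/init.d/S999whiz under a root directory and marks the file executable, on the assumption that rcS runs the scripts directly and therefore needs the executable bit. The ROOTFS variable is introduced only for this example; it defaults to a scratch directory, so set ROOTFS=./x to apply it to the extracted filesystem. The comments describe what each command does in a BusyBox userland.

```shell
#!/bin/sh
# Sketch: install the custom startup script into an extracted rootfs.
# ROOTFS is a variable made up for this example; it defaults to a scratch dir.
ROOTFS=${ROOTFS:-$(mktemp -d)}
mkdir -p "$ROOTFS/etc/init.d"

# The quoted heredoc delimiter keeps the host shell from expanding anything.
cat > "$ROOTFS/etc/init.d/S999whiz" <<'EOF'
#!/bin/sh
# Populate /dev by scanning the devices the kernel already knows about.
mdev -s
# Mount procfs so tools that read /proc work.
mount -t proc none /proc
# Turn off terminal output post-processing on the console.
stty -opost
# Hand the console to an interactive shell.
/bin/sh
# Once the shell exits, clean up and power off immediately.
umount /proc
poweroff -d 0 -f
EOF

# Assumption: rcS executes the file directly, so it must be executable.
chmod +x "$ROOTFS/etc/init.d/S999whiz"
echo "installed: $ROOTFS/etc/init.d/S999whiz"
```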
The final directory structure will look like this:
> tree ./x/etc/init.d
./x/etc/init.d
├── rcK
├── rcS
├── S01syslogd
├── S02klogd
├── S02sysctl
├── S20seedrng
└── S999whiz
Once the preparations above are complete, gzip-compress rootfs.cpio to create rootfs.cpio.gz and copy it to the zig-out/img directory where the Ymir kernel is located:
> tree ./zig-out/img
./zig-out/img
├── bzImage
├── efi
│   └── boot
│       └── BOOTX64.EFI
├── rootfs.cpio.gz
└── ymir.elf
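The compress-and-copy step can be sketched as follows. To stay self-contained, the example works in a scratch directory with a stand-in rootfs.cpio; in practice you would run the gzip line against your real archive and your real zig-out/img directory. Using gzip -c leaves the original .cpio file in place, which is convenient if you want to repack it later.

```shell
#!/bin/sh
# Sketch: compress a cpio archive and place it next to the kernel images.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Stand-in for the repacked archive (only the newc magic, not a real archive).
printf '070701' > "$work/rootfs.cpio"

# Stand-in for the zig-out/img directory from the text.
img_dir="$work/zig-out/img"
mkdir -p "$img_dir"

# -c writes to stdout, so rootfs.cpio itself is left untouched.
gzip -c "$work/rootfs.cpio" > "$img_dir/rootfs.cpio.gz"

# -t checks the integrity of the compressed file.
gzip -t "$img_dir/rootfs.cpio.gz" && echo "rootfs.cpio.gz is a valid gzip stream"
```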
Loading initramfs by Surtr
Just like loading bzImage into memory, it is Surtr's responsibility to load rootfs.cpio.gz into memory and pass it to Ymir. Add code to boot.zig to allocate memory for rootfs.cpio.gz and load it:
const initrd = openFile(root_dir, "rootfs.cpio.gz") catch return .Aborted;
const initrd_info_buffer_size: usize = @sizeOf(uefi.FileInfo) + 0x100;
var initrd_info_actual_size = initrd_info_buffer_size;
var initrd_info_buffer: [initrd_info_buffer_size]u8 align(@alignOf(uefi.FileInfo)) = undefined;
status = initrd.getInfo(&uefi.FileInfo.guid, &initrd_info_actual_size, &initrd_info_buffer);
if (status != .Success) return status;
const initrd_info: *const uefi.FileInfo = @alignCast(@ptrCast(&initrd_info_buffer));
var initrd_size = initrd_info.file_size;
var initrd_start: u64 = undefined;
const initrd_size_pages = (initrd_size + (page_size - 1)) / page_size;
status = boot_service.allocatePages(.AllocateAnyPages, .LoaderData, initrd_size_pages, @ptrCast(&initrd_start));
if (status != .Success) return status;
status = initrd.read(&initrd_size, @ptrFromInt(initrd_start));
if (status != .Success) return status;
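The initrd_size_pages expression above rounds the file size up to a whole number of pages, because allocatePages() works in page units. The sketch below reproduces the same arithmetic in shell, assuming a 4 KiB page size (the size of a UEFI page); the helper name pages_for is made up for this example.

```shell
#!/bin/sh
# Sketch: round a byte count up to whole pages, as in initrd_size_pages.
# Assumption: page_size is 4096 bytes.
page_size=4096

# pages_for is a hypothetical helper name used only in this example.
pages_for() {
    # Adding (page_size - 1) before the integer division rounds up.
    echo $(( ($1 + page_size - 1) / page_size ))
}

for size in 1 4096 4097; do
    echo "$size bytes -> $(pages_for "$size") page(s)"
done
```

A file of exactly 4096 bytes needs one page, while a single extra byte pushes it to two, which is why plain division without the added term would under-allocate.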
Also, add information about the location where initramfs is loaded to GuestInfo, which is the data passed from Surtr to Ymir:
pub const GuestInfo = extern struct {
    ...
    /// Physical address where the initrd is loaded.
    initrd_addr: [*]u8,
    /// Size in bytes of the initrd.
    initrd_size: usize,
};
Before Surtr transfers control to Ymir, fill in the initramfs information:
const boot_info = defs.BootInfo{
    ...
    .guest_info = .{
        ...
        .initrd_addr = @ptrFromInt(initrd_start),
        .initrd_size = initrd_size,
    },
};
kernel_entry(boot_info);
With this, Surtr can load initramfs into memory and pass its information to Ymir. Let's verify that Ymir can access the passed information. Add the following code to kernelMain(). Note that the initramfs address passed from Surtr is a physical address, so you need to convert it to a virtual address using phys2virt():
const initrd = b: {
    const ptr: [*]u8 = @ptrFromInt(ymir.mem.phys2virt(guest_info.initrd_addr));
    break :b ptr[0..guest_info.initrd_size];
};
log.info("initrd: 0x{X:0>16} (size=0x{X})", .{ @intFromPtr(initrd.ptr), initrd.len });
Passing initramfs to Guest
The SetupHeader within BootParams, used in the x86 Linux boot protocol, contains fields for specifying the physical address and size of the initramfs. By loading the initramfs at an appropriate address in guest memory and setting that address and size in SetupHeader, the Linux kernel will recognize and unpack the initramfs. In this series, we will load the initramfs at 0x0600_0000:
pub const layout = struct {
    ...
    pub const initrd = 0x0600_0000;
};
Let's load the initramfs in loadKernel():
fn loadKernel(self: *Self, kernel: []u8, initrd: []u8) Error!void {
    ...
    // Load initrd
    bp.hdr.ramdisk_image = linux.layout.initrd;
    bp.hdr.ramdisk_size = @truncate(initrd.len);
    try loadImage(guest_mem, initrd, linux.layout.initrd);
    ...
}
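Two conditions are worth checking by hand before booting. First, ramdisk_size is a 32-bit field in the boot protocol, and @truncate would silently drop higher bits. Second, the image must fit between 0x0600_0000 and the end of guest memory. The sketch below checks both in shell arithmetic. The helper name fits_in_guest and the 100 MiB guest memory size used in the usage lines are assumptions for illustration; substitute the size your guest actually has.

```shell
#!/bin/sh
# Sketch: sanity-check that an initrd of a given size fits at the load address.
initrd_base=$((0x06000000))   # linux.layout.initrd from the text
u32_max=$((0xFFFFFFFF))       # ramdisk_size is a 32-bit field

# fits_in_guest is a hypothetical helper: fits_in_guest INITRD_SIZE GUEST_MEM_SIZE
fits_in_guest() {
    # The size must be representable in ramdisk_size without truncation.
    [ "$1" -le "$u32_max" ] || return 1
    # The image must end at or before the end of guest memory.
    [ $((initrd_base + $1)) -le "$2" ]
}

mib=$((1024 * 1024))
guest_mem=$((100 * mib))      # assumed guest memory size, for illustration only

for size_mib in 2 8; do
    if fits_in_guest $((size_mib * mib)) "$guest_mem"; then
        echo "${size_mib} MiB initrd: fits"
    else
        echo "${size_mib} MiB initrd: does not fit"
    fi
done
```

With these assumed numbers, only 4 MiB of room remains above the load address, so a 2 MiB image fits and an 8 MiB image does not.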
Summary
In this chapter, we created initramfs, had Surtr load it, and then passed it to the guest. Let's run the guest and see how it behaves:
[ 0.364946] Loading compiled-in X.509 certificates
[ 0.364946] PM: Magic number: 0:110:269243
[ 0.364946] printk: legacy console [netcon0] enabled
[ 0.364946] netconsole: network logging started
[ 0.364946] cfg80211: Loading compiled-in X.509 certificates for regulatory database
[ 0.364946] modprobe (41) used greatest stack depth: 13840 bytes left
[ 0.364946] Loaded X.509 cert 'sforshee: 00b28ddf47aef9cea7'
[ 0.364946] Loaded X.509 cert 'wens: 61c038651aabdcf94bd0ac7ff06c7248db18c600'
[ 0.364946] platform regulatory.0: Direct firmware load for regulatory.db failed with error -2
[ 0.364946] cfg80211: failed to load regulatory.db
[ 0.364946] ALSA device list:
[ 0.364946] No soundcards found.
[ 0.364946] Freeing unused kernel image (initmem) memory: 2704K
[ 0.365946] Write protecting the kernel read-only data: 26624k
[ 0.365946] Freeing unused kernel image (rodata/data gap) memory: 1488K
[ 0.395946] x86/mm: Checked W+X mappings: passed, no W+X pages found.
[ 0.395946] x86/mm: Checking user space page tables
[ 0.423946] x86/mm: Checked W+X mappings: passed, no W+X pages found.
[ 0.423946] Run /init as init process
[ 0.423946] mount (44) used greatest stack depth: 13832 bytes left
[ 0.424946] ln (53) used greatest stack depth: 13824 bytes left
Starting syslogd: OK
Starting klogd: OK
Running sysctl: OK
Saving 256 bits of non-creditable seed for next boot
/bin/sh: can't access tty; job control turned off
~ # ls
bin init linuxrc opt run tmp
dev lib media proc sbin usr
etc lib64 mnt root sys var
Finally! Linux has booted! The virtualized serial input is also working, allowing you to freely operate the shell! It has been a long journey spanning nearly 30 chapters, but we have finally been able to run the guest. Although it is still at a toy level, this can be considered a functioning "hypervisor." In the next chapter, as a bonus, we will implement VMCALL to allow the guest to invoke VMM functions, concluding this series.
[1] The author provides a custom script, smallkirby/lysithea, to automate these operations. Feel free to try it out if you're interested.