Page Allocator

In the previous chapters, we replaced the data structures provided by UEFI with Ymir's versions. The page table is one such structure, and currently, we are using the page tables provided by UEFI. To create new page tables tailored for Ymir, we first need to implement a page allocator.

In this chapter, we'll implement a page allocator responsible for managing page allocations. One notable feature of Zig is that it performs very little implicit memory allocation. Functions in the std library that require memory allocation always take an Allocator as an argument. Conversely, functions that don't take an Allocator do not perform dynamic memory allocation. In this chapter, we'll implement a page allocator that implements Zig's core Allocator interface1.

important

The source code for this chapter is in whiz-ymir-page_allocator branch.

Table of Contents

Allocator Interface

First, to get an overview of how to implement Zig's Allocator, let's start with a skeleton implementation:

ymir/mem/PageAllocator.zig
const Allocator = std.mem.Allocator;
const Self = @This();
const PageAllocator = Self;

pub const vtable = Allocator.VTable{
    .alloc = allocate,
    .free = free,
    .resize = resize,
};

fn allocate(ctx: *anyopaque, _: usize, _: u8, _: usize) ?[*]u8 { @panic("unimplemented"); }
fn free(ctx: *anyopaque, _: []u8, _: u8, _: usize) void { }
fn resize(ctx: *anyopaque, _: []u8, _: u8, _: usize, _: usize) bool { @panic("unimplemented"); }

PageAllocator.zig differs slightly from previous files in that the file itself is treated as a struct (type)2. Because of this, the type can be accessed from other files like this:

zig
const PageAllocator = @import("mem/PageAllocator.zig");
// 以下と同じ
const PageAllocator = @import("mem/PageAllocator.zig").PageAllocator; // <= 冗長

Self and PageAllocator are type aliases referring to the PageAllocator.zig file itself. Since this file is treated as a struct, the vtable becomes a constant field of this struct (and soon after, you will see definitions of non-constant member variables).

Allocator type has two member variables: ptr and vtable. ptr is a pointer to the actual allocator instance, while vtable is a table of function pointers that the allocator must implement, as defined here. The standard Zig allocators provide methods that return this Allocator type. This design allows users of an allocator to treat it uniformly as an Allocator regardless of its internal implementation, providing great flexibility and abstraction.

The vtable requires three functions, whose roles are fairly clear from their names: they handle memory allocation, deallocation, and reallocation. When these functions are called through the Allocator interface, the first argument ctx receives the Allocator.ptr. Since this points to the allocator instance itself, calling through the common Allocator interface still allows invoking each allocator’s internal implementation. In other words, what we need to do here is define the internal implementation of the page allocator in PageAllocator.zig and provide the three APIs: allocate(), free(), and resize(). Once these three are implemented, the rest of the utility functions will be supplied by the Allocator interface itself.

Bitmap

Memory Usable for Ymir

In Ymir's PageAllocator, available (allocatable) pages are managed using a bitmap. How do we determine which pages are available? We use the memory map provided by UEFI. During the kernel boot process, Surtr passed the memory map to Ymir. The PageAllocator takes this memory map for initialization and scans the memory to record the available pages in the bitmap:

ymir/mem/PageAllocator.zig
pub fn init(self: *Self, map: MemoryMap) void {
    var avail_end: Phys = 0;
    var desc_iter = MemoryDescriptorIterator.new(map);

    while (true) {
        const desc: *uefi.tables.MemoryDescriptor = desc_iter.next() orelse break;
        ...
    }
}

MemoryDescriptorIterator is the iterator we developed in Memory Map and Cleanup, providing a way to iterate over the memory map. Using this iterator, we sequentially retrieve entries from the memory map. The memory map also records the memory types. Among these, Ymir treats Conventional Memory and Boot Services Code as areas that Ymir can freely use.

note

Actually, Boot Services Data is also a usable area. However, this region still contains data currently used by Ymir; the page tables. Since Ymir hasn’t yet created its own page tables and is still using those provided by UEFI, this area must not be overwritten. Later chapters will create Ymir’s own page tables, after which the Boot Services Data region can be freed and made available to Ymir. But this series does not cover that. The original Ymir project treats this region as usable, so if you’re interested, feel free to check it out.

Define a function that takes a memory descriptor as input and returns whether the memory is available for use:

ymir/mem/PageAllocator.zig
inline fn isUsableMemory(descriptor: *uefi.tables.MemoryDescriptor) bool {
    return switch (descriptor.type) {
        .ConventionalMemory,
        .BootServicesCode,
        => true,
        else => false,
    };
}

Memory Size to Manage

The bitmap used by PageAllocator maps one bit to one page. Although Zig allows integer types of arbitrary bit widths, creating an array like [N]u1 results in each element occupying one byte. Therefore, we implement the bitmap as an array of u64 instead:

ymir/mem/PageAllocator.zig
/// Maximum physical memory size in bytes that can be managed by this allocator.
const max_physical_size = 128 * gib;
/// Maximum page frame count.
const frame_count = max_physical_size / 4096; // 32Mi frames

/// Single unit of bitmap line.
const MapLineType = u64;
/// Bits per map line.
const bits_per_mapline = @sizeOf(MapLineType) * 8; // 64
/// Number of map lines.
const num_maplines = frame_count / bits_per_mapline; // 512Ki lines
/// Bitmap type.
const BitMap = [num_maplines]MapLineType;

The bitmap size is fixed, which means the size of the bitmap directly limits the maximum manageable memory size. This time, it's set to 128 GiB. In terms of page count, that's 128 GiB / 4 KiB = 32 Mi pages (frame_count). This should be more than enough. Based on this page count, the size of the bitmap is calculated as num_maplines. Since num_maplines is 512Ki, the bitmap size ends up being \(512 \text{Ki} \times 8 = 4 \text{MiB}\). Using 4 MiB just for the bitmap might seem a bit excessive, but it simplifies implementation, so it's an acceptable trade-off. Besides, Ymir itself uses very little memory, so this isn't a real concern.

Phys-Virt Translation

PageAllocator manages pages using page numbers. A page number can be calculated from a physical address as follows:

ymir/mem/PageAllocator.zig
const FrameId = u64;
const bytes_per_frame = 4 * kib;

inline fn phys2frame(phys: Phys) FrameId {
    return phys / bytes_per_frame;
}

inline fn frame2phys(frame: FrameId) Phys {
    return frame * bytes_per_frame;
}

FrameId represents a page number. You can obtain the page number by truncating the lower 10 bits of the physical address.

Since we're dealing with page numbers, this allocator operates on physical addresses. However, the allocator must return virtual addresses. This means we need to perform translation between physical and virtual addresses. For now, the page table provided by UEFI uses a direct mapping, so physical and virtual addresses are the same. But in the next chapter, we'll reconstruct the memory map, and they will no longer be equal. To prepare for that, let's define functions that convert between physical and virtual addresses.

ymir/mem.zig
pub fn virt2phys(addr: anytype) Phys {
    return @intCast(addr);
}
pub fn phys2virt(addr: anytype) Virt {
    return @intCast(addr);
}

For now, these functions simply return the input as-is. However, once we reconstruct the memory map, they will be updated to perform proper address translation.

Utilities

Let's define some helper functions to operate on the bitmap we created:

ymir/mem/PageAllocator.zig
const Status = enum(u1) {
    /// Page frame is in use.
    used = 0,
    /// Page frame is unused.
    unused = 1,

    pub inline fn from(boolean: bool) Status {
        return if (boolean) .used else .unused;
    }
};

fn get(self: *Self, frame: FrameId) Status {
    const line_index = frame / bits_per_mapline;
    const bit_index: u6 = @truncate(frame % bits_per_mapline);
    return Status.from(self.bitmap[line_index] & bits.tobit(MapLineType, bit_index) != 0);
}

fn set(self: *Self, frame: FrameId, status: Status) void {
    const line_index = frame / bits_per_mapline;
    const bit_index: u6 = @truncate(frame % bits_per_mapline);
    switch (status) {
        .used => self.bitmap[line_index] |= bits.tobit(MapLineType, bit_index),
        .unused => self.bitmap[line_index] &= ~bits.tobit(MapLineType, bit_index),
    }
}

Status represents the allocation status of a single page, corresponding to a single bit in the bitmap. The get() function returns the Status of the page specified by a FrameId by reading the appropriate bit from the bitmap. The bit_index variable indicates the offset within a 64-bit word of the bitmap. That ranges from 0 to 63, so we use the u6 type for it. Conversely, the set() function sets the Status of the specified page by updating the corresponding bit in the bitmap.

We'll also prepare a helper function that modifies the status of multiple pages at once, rather than updating them one page at a time.

ymir/mem/PageAllocator.zig
fn markAllocated(self: *Self, frame: FrameId, num_frames: usize) void {
    for (0..num_frames) |i| {
        self.set(frame + i, .used);
    }
}

fn markNotUsed(self: *Self, frame: FrameId, num_frames: usize) void {
    for (0..num_frames) |i| {
        self.set(frame + i, .unused);
    }
}

Iterating over Memory Map

Now that we've built the bitmap, we'll use it to initialize the memory allocator. In init(), we iterate over each memory descriptor in the memory map, and if the region is usable by Ymir, we mark the corresponding pages in the bitmap.

ymir/mem/PageAllocator.zig
frame_begin: FrameId = 1,
frame_end: FrameId,

pub fn init(self: *Self, map: MemoryMap) void {
    ...
    while (true) {
        const desc: *uefi.tables.MemoryDescriptor = desc_iter.next() orelse break;

        // Mark holes between regions as allocated (used).
        if (avail_end < desc.physical_start) {
            self.markAllocated(phys2frame(avail_end), desc.number_of_pages);
        }
        // Mark the region described by the descriptor as used or unused.
        const phys_end = desc.physical_start + desc.number_of_pages * page_size;
        if (isUsableMemory(desc)) {
            avail_end = phys_end;
            self.markNotUsed(phys2frame(desc.physical_start), desc.number_of_pages);
        } else {
            self.markAllocated(phys2frame(desc.physical_start), desc.number_of_pages);
        }

        self.frame_end = phys2frame(avail_end);
    }
}

frame_begin and frame_end are member variables of PageAllocator and define the range of page numbers that this allocator manages. In the second half of the code, we check whether the memory region is usable, and mark the corresponding pages in the bitmap as either allocated or available accordingly.

With this, we've successfully scanned the UEFI memory map and recorded the allocatable pages in our bitmap.

allocate

From here, we'll implement each function required by the Allocator vtable. First up is allocate(), which allocates memory of the requested size:

ymir/mem/PageAllocator.zig
const p2v = phys2virt;
const v2p = virt2phys;

fn allocate(ctx: *anyopaque, n: usize, _: u8, _: usize) ?[*]u8 {
    const self: *PageAllocator = @alignCast(@ptrCast(ctx));

    const num_frames = (n + page_size - 1) / page_size;
    var start_frame = self.frame_begin;

    while (true) {
        var i: usize = 0;
        while (i < num_frames) : (i += 1) {
            if (start_frame + i >= self.frame_end) return null;
            if (self.get(start_frame + i) == .used) break;
        }
        if (i == num_frames) {
            self.markAllocated(start_frame, num_frames);
            return @ptrFromInt(p2v(frame2phys(start_frame)));
        }

        start_frame += i + 1;
    }
}
ArgumentDescription
0: ctxAllocator.ptr. Pointer to PageAllocator instance.
1: nSize in bytes to allocate.
2: _要求するアラインメント
Required alignment.I don't know3

The first argument, ctx, is a Allocator.ptr. Since the actual type of the allocator can be any structure, it's typed as anyopaque. Here, we cast the received pointer to *PageAllocator so that we can use it as *Self.

First, the requested address is converted into a page number. Then, the bitmap is scanned to find a sequence of available pages. Since the allocated region must be contiguous, we look for a block of consecutive free pages. If such a region is found, markAllocated() is used to mark those pages as allocated, and the corresponding address is returned. If no suitable region is found, null is returned.

note

The second argument of allocate() specifies the requested alignment. For example, if 0x30 is specified, the returned pointer must be aligned to 0x00, 0x30, 0x60, and so on. However, the maximum alignment expected by Allocator is the page size4. Since the page allocator inherently only returns page-aligned addresses, this alignment argument can be safely ignored.

free

Next, we implement free(), which deallocates allocated memory. The pointer passed to free() is of type []u8. This is called a slice type, a fat pointer that has both a pointer and a size. Thanks to this, Zig's allocator does not need to keep metadata linking the freed memory address with its size. Instead, the caller is responsible for providing both the address and size (the slice). If only a raw pointer were passed, the allocator would have to maintain metadata to know the size of the allocated block at that address (like C's malloc() does). This design significantly simplifies the implementation:

ymir/mem/PageAllocator.zig
fn free(ctx: *anyopaque, slice: []u8, _: u8, _: usize) void {
    const self: *PageAllocator = @alignCast(@ptrCast(ctx));

    const num_frames = (slice.len + page_size - 1) / page_size;
    const start_frame_vaddr: Virt = @intFromPtr(slice.ptr) & ~page_mask;
    const start_frame = phys2frame(v2p(start_frame_vaddr));
    self.markNotUsed(start_frame, num_frames);
}

resize

Finally, there's resize(), which changes the size of allocated memory. In this series, we will not implement this function. Since there is no use case that requires resizing, this is not an issue.

ymir/mem/PageAllocator.zig
fn resize(_: *anyopaque, _: []u8, _: u8, _: usize, _: usize) bool {
    @panic("PageAllocator does not support resizing");
}

A proper implementation of resize() itself is not very difficult. It basically just calls free() followed by allocate(). Feel free to implement it if you want to.

Page-Level Allocation

With this, the implementation of the Allocator interface is complete. You can now create an Allocator, but I will implement one more helper function. The Allocator interface itself does not inherently assume page-level memory allocation5. However, in an OS context, it is common to allocate memory in page-sized units. Therefore, having a function that allows allocation by specifying the number of pages is convenient. Additionally, such a function is necessary if you want to specify an alignment larger than the page size4.

The function you created cannot be called directly through the Allocator. However, since Allocator is just an interface, it is possible to call functions not provided by Allocator by accessing the underlying allocator instance directly.

Now, let's implement a function that allocates memory in page-sized units:

ymir/mem/PageAllocator.zig
pub fn allocPages(self: *PageAllocator, num_pages: usize, align_size: usize) ?[]u8 {
    const num_frames = num_pages;
    const align_frame = (align_size + page_size - 1) / page_size;
    var start_frame = align_frame;

    while (true) {
        var i: usize = 0;
        while (i < num_frames) : (i += 1) {
            if (start_frame + i >= self.frame_end) return null;
            if (self.get(start_frame + i) == .used) break;
        }
        if (i == num_frames) {
            self.markAllocated(start_frame, num_frames);
            const virt_addr: [*]u8 = @ptrFromInt(p2v(frame2phys(start_frame)));
            return virt_addr[0 .. num_pages * page_size];
        }

        start_frame += align_frame;
        if (start_frame + num_frames >= self.frame_end) return null;
    }
}

中身はほぼ allocate() と同じです。 引数はサイズをの代わりにページ数を受け取ります。 align_size にはページサイズ以上のアラインメントを指定することができ、空きページを探索する際にはこのアラインメントを考慮します。

Instantiation of Allocator

With the preparations complete, let's instantiate an Allocator that can be used in Ymir:

ymir/mem/PageAllocator.zig
pub fn newUninit() Self {
    return Self{
        .frame_end = undefined,
        .bitmap = undefined,
    };
}
ymir/mem.zig
pub const PageAllocator = @import("mem/PageAllocator.zig");
pub var page_allocator_instance = PageAllocator.newUninit();
pub const page_allocator = Allocator{
    .ptr = &page_allocator_instance,
    .vtable = &PageAllocator.vtable,
};

pub fn initPageAllocator(map: MemoryMap) void {
    page_allocator_instance.init(map);
}

page_allocator_instance is the sole instance of PageAllocator. Generally, this instance should not be accessed directly. The only time you need to use it is when calling the previously mentioned allocPages() function. In fact, we don't really want to expose this instance directly, so ideally it shouldn't be marked as pub. The same goes for the PageAllocator type itself. However, since Allocator.alignedAlloc() does not support alignments larger than the page size, this is unavoidable4.

The core Allocator is created by specifying a ptr and a vtable. The ptr points to the page_allocator_instance. This setup allows you to use not only the three functions we implemented earlier but also various functions provided by the Allocator interface, such as alloc(), create(), alignedAlloc(), and allocSentinel().

When using it, you simply treat it as an Allocator like this (no need to worry about the internal implementation):

ymir/main.zig
mem.initPageAllocator(boot_info.memory_map);
log.info("Initialized page allocator", .{});
const page_allocator = ymir.mem.page_allocator;

const array = try page_allocator.alloc(u32, 4);
log.debug("Memory allocated @ {X:0>16}", .{@intFromPtr(array.ptr)});
page_allocator.free(array);

Summary

In this chapter, we implemented a PageAllocator that tracks available pages based on the memory map provided by UEFI. With the memory allocator in place, many new possibilities open up. For example, you can now allocate pages for page tables, enabling you to rebuild the memory map. Additionally, with VT-x, you need to allocate pages for VMCS for each vCPU. The PageAllocator we implemented is also usable as a general-purpose allocator for smaller chunks. Of course, allocating something like 8 bytes will still reserve a full 4 KiB page, so it's not memory-efficient. With that said, next we will implement a more efficient allocator for general use. But before that, in the next chapter, let's first rebuild the page tables. After physical and virtual addresses are no longer directly mapped, we will revisit allocators again.

1

Strictly speaking, Zig does not have the concept of an interface. However, since it is quite similar to what other languages call an interface, we refer to it as such.

2

Strictly speaking, all the files we've created so far are treated as types (similar to structs). That's why accessing functions like ymir.bits.XXX() is possible.

3

No one knows what this argument is.

4

With Allocator, alignments larger than the page size are disallowed as specified here.

5

This limitation is not because allocator instances are not designed for page-level allocations, but rather a restriction of the Allocator interface itself.