Page Allocator
In the previous chapters, we replaced the data structures provided by UEFI with Ymir's versions. The page table is one such structure, and currently, we are using the page tables provided by UEFI. To create new page tables tailored for Ymir, we first need to implement a page allocator.
In this chapter, we'll implement a page allocator responsible for managing page allocations. One notable feature of Zig is that it performs very little implicit memory allocation. Functions in the std
library that require memory allocation always take an Allocator
as an argument. Conversely, functions that don't take an Allocator
do not perform dynamic memory allocation. In this chapter, we'll implement a page allocator that implements Zig's core Allocator
interface1.
important
The source code for this chapter is in whiz-ymir-page_allocator
branch.
Table of Contents
- Allocator Interface
- Bitmap
- allocate
- free
- resize
- Page-Level Allocation
- Instantiation of Allocator
- Summary
Allocator Interface
First, to get an overview of how to implement Zig's Allocator
, let's start with a skeleton implementation:
const Allocator = std.mem.Allocator;
const Self = @This();
const PageAllocator = Self;
pub const vtable = Allocator.VTable{
.alloc = allocate,
.free = free,
.resize = resize,
};
fn allocate(ctx: *anyopaque, _: usize, _: u8, _: usize) ?[*]u8 { @panic("unimplemented"); }
fn free(ctx: *anyopaque, _: []u8, _: u8, _: usize) void { }
fn resize(ctx: *anyopaque, _: []u8, _: u8, _: usize, _: usize) bool { @panic("unimplemented"); }
PageAllocator.zig
differs slightly from previous files in that the file itself is treated as a struct (type)2. Because of this, the type can be accessed from other files like this:
const PageAllocator = @import("mem/PageAllocator.zig");
// 以下と同じ
const PageAllocator = @import("mem/PageAllocator.zig").PageAllocator; // <= 冗長
Self
and PageAllocator
are type aliases referring to the PageAllocator.zig
file itself. Since this file is treated as a struct, the vtable
becomes a constant field of this struct (and soon after, you will see definitions of non-constant member variables).
Allocator
type has two member variables: ptr
and vtable
. ptr
is a pointer to the actual allocator instance, while vtable
is a table of function pointers that the allocator must implement, as defined here. The standard Zig allocators provide methods that return this Allocator
type. This design allows users of an allocator to treat it uniformly as an Allocator
regardless of its internal implementation, providing great flexibility and abstraction.
The vtable
requires three functions, whose roles are fairly clear from their names: they handle memory allocation, deallocation, and reallocation. When these functions are called through the Allocator
interface, the first argument ctx
receives the Allocator.ptr
. Since this points to the allocator instance itself, calling through the common Allocator
interface still allows invoking each allocator’s internal implementation. In other words, what we need to do here is define the internal implementation of the page allocator in PageAllocator.zig
and provide the three APIs: allocate()
, free()
, and resize()
. Once these three are implemented, the rest of the utility functions will be supplied by the Allocator
interface itself.
Bitmap
Memory Usable for Ymir
In Ymir's PageAllocator
, available (allocatable) pages are managed using a bitmap. How do we determine which pages are available? We use the memory map provided by UEFI. During the kernel boot process, Surtr passed the memory map to Ymir. The PageAllocator
takes this memory map for initialization and scans the memory to record the available pages in the bitmap:
pub fn init(self: *Self, map: MemoryMap) void {
var avail_end: Phys = 0;
var desc_iter = MemoryDescriptorIterator.new(map);
while (true) {
const desc: *uefi.tables.MemoryDescriptor = desc_iter.next() orelse break;
...
}
}
MemoryDescriptorIterator
is the iterator we developed in Memory Map and Cleanup, providing a way to iterate over the memory map. Using this iterator, we sequentially retrieve entries from the memory map. The memory map also records the memory types. Among these, Ymir treats Conventional Memory and Boot Services Code as areas that Ymir can freely use.
note
Actually, Boot Services Data is also a usable area. However, this region still contains data currently used by Ymir; the page tables. Since Ymir hasn’t yet created its own page tables and is still using those provided by UEFI, this area must not be overwritten. Later chapters will create Ymir’s own page tables, after which the Boot Services Data region can be freed and made available to Ymir. But this series does not cover that. The original Ymir project treats this region as usable, so if you’re interested, feel free to check it out.
Define a function that takes a memory descriptor as input and returns whether the memory is available for use:
inline fn isUsableMemory(descriptor: *uefi.tables.MemoryDescriptor) bool {
return switch (descriptor.type) {
.ConventionalMemory,
.BootServicesCode,
=> true,
else => false,
};
}
Memory Size to Manage
The bitmap used by PageAllocator
maps one bit to one page. Although Zig allows integer types of arbitrary bit widths, creating an array like [N]u1
results in each element occupying one byte. Therefore, we implement the bitmap as an array of u64
instead:
/// Maximum physical memory size in bytes that can be managed by this allocator.
const max_physical_size = 128 * gib;
/// Maximum page frame count.
const frame_count = max_physical_size / 4096; // 32Mi frames
/// Single unit of bitmap line.
const MapLineType = u64;
/// Bits per map line.
const bits_per_mapline = @sizeOf(MapLineType) * 8; // 64
/// Number of map lines.
const num_maplines = frame_count / bits_per_mapline; // 512Ki lines
/// Bitmap type.
const BitMap = [num_maplines]MapLineType;
The bitmap size is fixed, which means the size of the bitmap directly limits the maximum manageable memory size. This time, it's set to 128 GiB. In terms of page count, that's 128 GiB / 4 KiB = 32 Mi pages (frame_count
). This should be more than enough. Based on this page count, the size of the bitmap is calculated as num_maplines
. Since num_maplines
is 512Ki, the bitmap size ends up being \(512 \text{Ki} \times 8 = 4 \text{MiB}\). Using 4 MiB just for the bitmap might seem a bit excessive, but it simplifies implementation, so it's an acceptable trade-off. Besides, Ymir itself uses very little memory, so this isn't a real concern.
Phys-Virt Translation
PageAllocator
manages pages using page numbers. A page number can be calculated from a physical address as follows:
const FrameId = u64;
const bytes_per_frame = 4 * kib;
inline fn phys2frame(phys: Phys) FrameId {
return phys / bytes_per_frame;
}
inline fn frame2phys(frame: FrameId) Phys {
return frame * bytes_per_frame;
}
FrameId
represents a page number. You can obtain the page number by truncating the lower 10 bits of the physical address.
Since we're dealing with page numbers, this allocator operates on physical addresses. However, the allocator must return virtual addresses. This means we need to perform translation between physical and virtual addresses. For now, the page table provided by UEFI uses a direct mapping, so physical and virtual addresses are the same. But in the next chapter, we'll reconstruct the memory map, and they will no longer be equal. To prepare for that, let's define functions that convert between physical and virtual addresses.
pub fn virt2phys(addr: anytype) Phys {
return @intCast(addr);
}
pub fn phys2virt(addr: anytype) Virt {
return @intCast(addr);
}
For now, these functions simply return the input as-is. However, once we reconstruct the memory map, they will be updated to perform proper address translation.
Utilities
Let's define some helper functions to operate on the bitmap we created:
const Status = enum(u1) {
/// Page frame is in use.
used = 0,
/// Page frame is unused.
unused = 1,
pub inline fn from(boolean: bool) Status {
return if (boolean) .used else .unused;
}
};
fn get(self: *Self, frame: FrameId) Status {
const line_index = frame / bits_per_mapline;
const bit_index: u6 = @truncate(frame % bits_per_mapline);
return Status.from(self.bitmap[line_index] & bits.tobit(MapLineType, bit_index) != 0);
}
fn set(self: *Self, frame: FrameId, status: Status) void {
const line_index = frame / bits_per_mapline;
const bit_index: u6 = @truncate(frame % bits_per_mapline);
switch (status) {
.used => self.bitmap[line_index] |= bits.tobit(MapLineType, bit_index),
.unused => self.bitmap[line_index] &= ~bits.tobit(MapLineType, bit_index),
}
}
Status
represents the allocation status of a single page, corresponding to a single bit in the bitmap. The get()
function returns the Status
of the page specified by a FrameId
by reading the appropriate bit from the bitmap. The bit_index
variable indicates the offset within a 64-bit word of the bitmap. That ranges from 0 to 63, so we use the u6
type for it. Conversely, the set()
function sets the Status
of the specified page by updating the corresponding bit in the bitmap.
We'll also prepare a helper function that modifies the status of multiple pages at once, rather than updating them one page at a time.
fn markAllocated(self: *Self, frame: FrameId, num_frames: usize) void {
for (0..num_frames) |i| {
self.set(frame + i, .used);
}
}
fn markNotUsed(self: *Self, frame: FrameId, num_frames: usize) void {
for (0..num_frames) |i| {
self.set(frame + i, .unused);
}
}
Iterating over Memory Map
Now that we've built the bitmap, we'll use it to initialize the memory allocator. In init()
, we iterate over each memory descriptor in the memory map, and if the region is usable by Ymir, we mark the corresponding pages in the bitmap.
frame_begin: FrameId = 1,
frame_end: FrameId,
pub fn init(self: *Self, map: MemoryMap) void {
...
while (true) {
const desc: *uefi.tables.MemoryDescriptor = desc_iter.next() orelse break;
// Mark holes between regions as allocated (used).
if (avail_end < desc.physical_start) {
self.markAllocated(phys2frame(avail_end), desc.number_of_pages);
}
// Mark the region described by the descriptor as used or unused.
const phys_end = desc.physical_start + desc.number_of_pages * page_size;
if (isUsableMemory(desc)) {
avail_end = phys_end;
self.markNotUsed(phys2frame(desc.physical_start), desc.number_of_pages);
} else {
self.markAllocated(phys2frame(desc.physical_start), desc.number_of_pages);
}
self.frame_end = phys2frame(avail_end);
}
}
frame_begin
and frame_end
are member variables of PageAllocator
and define the range of page numbers that this allocator manages. In the second half of the code, we check whether the memory region is usable, and mark the corresponding pages in the bitmap as either allocated or available accordingly.
With this, we've successfully scanned the UEFI memory map and recorded the allocatable pages in our bitmap.
allocate
From here, we'll implement each function required by the Allocator
vtable. First up is allocate()
, which allocates memory of the requested size:
const p2v = phys2virt;
const v2p = virt2phys;
fn allocate(ctx: *anyopaque, n: usize, _: u8, _: usize) ?[*]u8 {
const self: *PageAllocator = @alignCast(@ptrCast(ctx));
const num_frames = (n + page_size - 1) / page_size;
var start_frame = self.frame_begin;
while (true) {
var i: usize = 0;
while (i < num_frames) : (i += 1) {
if (start_frame + i >= self.frame_end) return null;
if (self.get(start_frame + i) == .used) break;
}
if (i == num_frames) {
self.markAllocated(start_frame, num_frames);
return @ptrFromInt(p2v(frame2phys(start_frame)));
}
start_frame += i + 1;
}
}
Argument | Description |
---|---|
0: ctx | Allocator.ptr . Pointer to PageAllocator instance. |
1: n | Size in bytes to allocate. |
2: _ | 要求するアラインメント |
Required alignment. | I don't know3 |
The first argument, ctx
, is a Allocator.ptr
. Since the actual type of the allocator can be any structure, it's typed as anyopaque
. Here, we cast the received pointer to *PageAllocator
so that we can use it as *Self
.
First, the requested address is converted into a page number. Then, the bitmap is scanned to find a sequence of available pages. Since the allocated region must be contiguous, we look for a block of consecutive free pages. If such a region is found, markAllocated()
is used to mark those pages as allocated, and the corresponding address is returned. If no suitable region is found, null
is returned.
note
The second argument of allocate()
specifies the requested alignment. For example, if 0x30
is specified, the returned pointer must be aligned to 0x00
, 0x30
, 0x60
, and so on. However, the maximum alignment expected by Allocator
is the page size4. Since the page allocator inherently only returns page-aligned addresses, this alignment argument can be safely ignored.
free
Next, we implement free()
, which deallocates allocated memory. The pointer passed to free()
is of type []u8
. This is called a slice type, a fat pointer that has both a pointer and a size. Thanks to this, Zig's allocator does not need to keep metadata linking the freed memory address with its size. Instead, the caller is responsible for providing both the address and size (the slice). If only a raw pointer were passed, the allocator would have to maintain metadata to know the size of the allocated block at that address (like C's malloc()
does). This design significantly simplifies the implementation:
fn free(ctx: *anyopaque, slice: []u8, _: u8, _: usize) void {
const self: *PageAllocator = @alignCast(@ptrCast(ctx));
const num_frames = (slice.len + page_size - 1) / page_size;
const start_frame_vaddr: Virt = @intFromPtr(slice.ptr) & ~page_mask;
const start_frame = phys2frame(v2p(start_frame_vaddr));
self.markNotUsed(start_frame, num_frames);
}
resize
Finally, there's resize()
, which changes the size of allocated memory. In this series, we will not implement this function. Since there is no use case that requires resizing, this is not an issue.
fn resize(_: *anyopaque, _: []u8, _: u8, _: usize, _: usize) bool {
@panic("PageAllocator does not support resizing");
}
A proper implementation of resize()
itself is not very difficult. It basically just calls free()
followed by allocate()
. Feel free to implement it if you want to.
Page-Level Allocation
With this, the implementation of the Allocator
interface is complete. You can now create an Allocator
, but I will implement one more helper function. The Allocator
interface itself does not inherently assume page-level memory allocation5. However, in an OS context, it is common to allocate memory in page-sized units. Therefore, having a function that allows allocation by specifying the number of pages is convenient. Additionally, such a function is necessary if you want to specify an alignment larger than the page size4.
The function you created cannot be called directly through the Allocator
. However, since Allocator
is just an interface, it is possible to call functions not provided by Allocator
by accessing the underlying allocator instance directly.
Now, let's implement a function that allocates memory in page-sized units:
pub fn allocPages(self: *PageAllocator, num_pages: usize, align_size: usize) ?[]u8 {
const num_frames = num_pages;
const align_frame = (align_size + page_size - 1) / page_size;
var start_frame = align_frame;
while (true) {
var i: usize = 0;
while (i < num_frames) : (i += 1) {
if (start_frame + i >= self.frame_end) return null;
if (self.get(start_frame + i) == .used) break;
}
if (i == num_frames) {
self.markAllocated(start_frame, num_frames);
const virt_addr: [*]u8 = @ptrFromInt(p2v(frame2phys(start_frame)));
return virt_addr[0 .. num_pages * page_size];
}
start_frame += align_frame;
if (start_frame + num_frames >= self.frame_end) return null;
}
}
中身はほぼ allocate()
と同じです。 引数はサイズをの代わりにページ数を受け取ります。 align_size
にはページサイズ以上のアラインメントを指定することができ、空きページを探索する際にはこのアラインメントを考慮します。
Instantiation of Allocator
With the preparations complete, let's instantiate an Allocator
that can be used in Ymir:
pub fn newUninit() Self {
return Self{
.frame_end = undefined,
.bitmap = undefined,
};
}
pub const PageAllocator = @import("mem/PageAllocator.zig");
pub var page_allocator_instance = PageAllocator.newUninit();
pub const page_allocator = Allocator{
.ptr = &page_allocator_instance,
.vtable = &PageAllocator.vtable,
};
pub fn initPageAllocator(map: MemoryMap) void {
page_allocator_instance.init(map);
}
page_allocator_instance
is the sole instance of PageAllocator
. Generally, this instance should not be accessed directly. The only time you need to use it is when calling the previously mentioned allocPages()
function. In fact, we don't really want to expose this instance directly, so ideally it shouldn't be marked as pub
. The same goes for the PageAllocator
type itself. However, since Allocator.alignedAlloc()
does not support alignments larger than the page size, this is unavoidable4.
The core Allocator
is created by specifying a ptr
and a vtable
. The ptr
points to the page_allocator_instance
. This setup allows you to use not only the three functions we implemented earlier but also various functions provided by the Allocator
interface, such as alloc()
, create()
, alignedAlloc()
, and allocSentinel()
.
When using it, you simply treat it as an Allocator
like this (no need to worry about the internal implementation):
mem.initPageAllocator(boot_info.memory_map);
log.info("Initialized page allocator", .{});
const page_allocator = ymir.mem.page_allocator;
const array = try page_allocator.alloc(u32, 4);
log.debug("Memory allocated @ {X:0>16}", .{@intFromPtr(array.ptr)});
page_allocator.free(array);
Summary
In this chapter, we implemented a PageAllocator
that tracks available pages based on the memory map provided by UEFI. With the memory allocator in place, many new possibilities open up. For example, you can now allocate pages for page tables, enabling you to rebuild the memory map. Additionally, with VT-x, you need to allocate pages for VMCS for each vCPU. The PageAllocator
we implemented is also usable as a general-purpose allocator for smaller chunks. Of course, allocating something like 8 bytes will still reserve a full 4 KiB page, so it's not memory-efficient. With that said, next we will implement a more efficient allocator for general use. But before that, in the next chapter, let's first rebuild the page tables. After physical and virtual addresses are no longer directly mapped, we will revisit allocators again.
Strictly speaking, Zig does not have the concept of an interface. However, since it is quite similar to what other languages call an interface, we refer to it as such.
Strictly speaking, all the files we've created so far are treated as types (similar to structs). That's why accessing functions like ymir.bits.XXX()
is possible.
No one knows what this argument is.
With Allocator
, alignments larger than the page size are disallowed as specified here.
This limitation is not because allocator instances are not designed for page-level allocations, but rather a restriction of the Allocator
interface itself.