Serial Output

In the previous chapter, we handed control over from Surtr to Ymir. Starting with this chapter, we'll begin implementing the Ymir kernel. The very first thing we need is log output, just as we did back in Surtr. We'll use the serial port[1] for this. Thanks to the boot options we set in build.zig, QEMU redirects the serial port to standard output. Once we have log output, further development becomes much smoother. Note that we'll build a full logging system in a later chapter; for now, the goal is simply to get basic serial output working.

important

The source code for this chapter is in the whiz-ymir-serial_output branch.

Creating the arch Directory

x86 Directory

Just like we did for Surtr, we'll separate architecture-specific code into its own directory. Here's what the Ymir directory structure looks like:

sh
tree ./ymir

./ymir
├── arch
│   └── x86
│       ├── arch.zig
│       ├── asm.zig
│       └── serial.zig
├── arch.zig
├── linker.ld
└── main.zig

The contents of ymir/arch.zig are identical to those of surtr/arch.zig. Its purpose is to export ymir/arch/x86/arch.zig to the higher-level directory.

ymir/arch/x86/arch.zig serves as the root of the ymir/arch/x86 directory. When using architecture-specific code from directories above ymir/arch, this is the file that should always be imported. Here's what ymir/arch/x86/arch.zig looks like:

ymir/arch/x86/arch.zig
// Modules that should be usable from outside `/arch`
pub const serial = @import("serial.zig");
// Modules that should not be exposed outside `/arch`
const am = @import("asm.zig");

Since serial.zig is marked pub, it can be accessed from higher-level directories as arch.serial. On the other hand, asm.zig is not public, so it's inaccessible outside the arch directory. This is intentional; we want to encapsulate architecture-specific code as much as possible, so any assembly code will be kept within asm.zig and hidden from the rest of the codebase.

Creating ymir Module

At the moment, to reference ymir/piyo/neko.zig from ymir/hoge/fuga.zig, you need to use a relative path like this:

ymir/hoge/fuga.zig
const neko = @import("../piyo/neko.zig");

This not only looks awkward, but it also carries the risk of accidentally referencing modules that are meant to be hidden, as shown below:

zig
const am = @import("arch/x86/asm.zig"); // This should not be accessible from here

To prevent this, we'll create a root module and require all modules to be accessed through this root module. The module setup is defined in build.zig, just like we did for Surtr:

build.zig
const ymir_module = b.createModule(.{
    .root_source_file = b.path("ymir/ymir.zig"),
});
ymir_module.addImport("ymir", ymir_module);
ymir_module.addImport("surtr", surtr_module);

Add the defined ymir module to the ymir executable:

build.zig
ymir.root_module.addImport("ymir", ymir_module);

With this setup, you can now import the Ymir module simply by using @import("ymir"). The root module file, ymir/ymir.zig, exports all the necessary child modules:

ymir/ymir.zig
pub const arch = @import("arch.zig");

Now, you can access arch/x86/arch.zig from any file using @import("ymir").arch. To test this, let's define a simple function in arch/x86/arch.zig and call it from main.zig:

zig
// -- ymir/arch/x86/arch.zig --
pub fn someFunction() void {}

// -- ymir/main.zig --
const ymir = @import("ymir");
const arch = ymir.arch;
arch.someFunction();

You can now access the ymir module just like std. From here on, to prevent unintended imports, we will generally avoid importing files outside the current directory via relative paths.

tip

For those unfamiliar with Zig, adding the ymir module to the ymir executable itself might feel awkward. The author initially felt the same, but after asking on Ziggit, it turned out that this approach is not only legal but also considered natural[2].

Basic Definition of Serial

The UART we are targeting here is the 8250 UART. It supports both input and output, but in this chapter, we'll focus solely on output.

Let's define constants and structures for the serial port. We'll start with the COM ports, which are accessed via I/O ports. To map COM ports to their corresponding I/O ports, create ymir/arch/x86/serial.zig:

ymir/arch/x86/serial.zig
pub const Ports = enum(u16) {
    com1 = 0x3F8,
    com2 = 0x2F8,
    com3 = 0x3E8,
    com4 = 0x2E8,
};

This time, we'll define only four ports. Depending on the actual hardware, there might be more or fewer. For Ymir, we'll be using only COM1.

Each COM port has its own set of registers. A register is accessed at a fixed offset from the base I/O address of its COM port:

ymir/arch/x86/serial.zig
const offsets = struct {
    /// Transmitter Holding Buffer: DLAB=0, W
    pub const txr = 0;
    /// Receiver Buffer: DLAB=0, R
    pub const rxr = 0;
    /// Divisor Latch Low Byte: DLAB=1, R/W
    pub const dll = 0;
    /// Interrupt Enable Register: DLAB=0, R/W
    pub const ier = 1;
    /// Divisor Latch High Byte: DLAB=1, R/W
    pub const dlh = 1;
    /// Interrupt Identification Register: DLAB=X, R
    pub const iir = 2;
    /// FIFO Control Register: DLAB=X, W
    pub const fcr = 2;
    /// Line Control Register: DLAB=X, R/W
    pub const lcr = 3;
    /// Modem Control Register: DLAB=X, R/W
    pub const mcr = 4;
    /// Line Status Register: DLAB=X, R
    pub const lsr = 5;
    /// Modem Status Register: DLAB=X, R
    pub const msr = 6;
    /// Scratch Register: DLAB=X, R/W
    pub const sr = 7;
};

In reality, the register accessed depends on three factors: the offset, whether the access is a read or a write, and the current value of the DLAB bit. For which register corresponds to which combination, refer to the doc comments within offsets or consult the reference material[1].
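As a small illustration of the offset scheme, the following sketch computes the I/O port address of a register from the COM base address. The helper name regAddr is hypothetical and only a subset of offsets is reproduced; it is not part of Ymir's code.

```zig
const std = @import("std");

const Ports = enum(u16) { com1 = 0x3F8, com2 = 0x2F8, com3 = 0x3E8, com4 = 0x2E8 };

// Subset of the offsets table from this chapter.
const offsets = struct {
    pub const txr = 0;
    pub const dll = 0;
    pub const lsr = 5;
};

/// Hypothetical helper: I/O port address of the register at `offset` from the COM base.
fn regAddr(port: Ports, offset: u16) u16 {
    return @intFromEnum(port) + offset;
}

pub fn main() void {
    // THR and DLL share one address; DLAB and the access direction select the register.
    std.debug.print("COM1 THR/DLL: 0x{X}\n", .{regAddr(.com1, offsets.txr)});
    std.debug.print("COM1 LSR:     0x{X}\n", .{regAddr(.com1, offsets.lsr)});
}
```

Note that the address alone does not identify a register: offset 0 refers to the transmit buffer, the receive buffer, or DLL depending on the access direction and DLAB.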

Initialization

Let's initialize the serial port. Access to the COM port is done using the IN and OUT instructions to the corresponding I/O ports. Let's define the necessary assembly instructions:

ymir/arch/x86/asm.zig
pub inline fn inb(port: u16) u8 {
    return asm volatile (
        \\inb %[port], %[ret]
        : [ret] "={al}" (-> u8),
        : [port] "{dx}" (port),
    );
}

pub inline fn outb(value: u8, port: u16) void {
    asm volatile (
        \\outb %[value], %[port]
        :
        : [value] "{al}" (value),
          [port] "{dx}" (port),
    );
}

Using these functions, we'll read from and write to the I/O ports to initialize the serial port. The registers are configured as shown in the following table[3]. Note that all registers are 8 bits wide:

| Register | Description | Value |
| --- | --- | --- |
| LCR (Line Control) | Line protocol | 8n1 (8 data bits / no parity / 1 stop bit) |
| IER (Interrupt Enable) | Interrupts to enable | 0 (disable all interrupts) |
| FCR (FIFO Control) | FIFO buffer | 0 (disable FIFO) |

ymir/arch/x86/serial.zig
const am = @import("asm.zig");

pub fn initSerial(port: Ports, baud: u32) void {
    const p = @intFromEnum(port);
    am.outb(0b00_000_0_11, p + offsets.lcr); // 8n1: no parity, 1 stop bit, 8 data bits (word length bits = 0b11)
    am.outb(0, p + offsets.ier); // Disable interrupts
    am.outb(0, p + offsets.fcr); // Disable FIFO
    ...
}
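As a cross-check on the value written to LCR, the bit layout can be sketched as a packed struct. This assumes the commonly documented 8250/16550 layout, in which bits 1:0 select the word length and 0b11 means 8 data bits. The type LineControl and its field names are hypothetical and are not part of Ymir.

```zig
const std = @import("std");

/// Hypothetical view of the LCR bit layout (fields listed from bit 0 upward).
const LineControl = packed struct(u8) {
    /// Bits 1:0, word length: 0b00 = 5, 0b01 = 6, 0b10 = 7, 0b11 = 8 data bits.
    word_length: u2,
    /// Bit 2, stop bits: 0 = one stop bit.
    stop_bits: u1,
    /// Bits 5:3, parity: 0b000 = no parity.
    parity: u3,
    /// Bit 6, break control.
    set_break: bool,
    /// Bit 7, Divisor Latch Access Bit.
    dlab: bool,
};

/// 8n1 with DLAB cleared.
const lcr_8n1 = LineControl{
    .word_length = 0b11,
    .stop_bits = 0,
    .parity = 0b000,
    .set_break = false,
    .dlab = false,
};

pub fn main() void {
    const raw: u8 = @bitCast(lcr_8n1);
    std.debug.print("LCR for 8n1: 0b{b:0>8}\n", .{raw});
}
```

An emulated UART such as QEMU's generally does not depend on the word length setting, so a wrong value here may go unnoticed until the code runs on real hardware.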

Next, let's set the baud rate. The baud rate is the number of bits transmitted per second on the signal line. Each character is framed by a start bit at the beginning and stop bits at the end (one stop bit in this case). With 8n1 framing, transmitting one 8-bit character therefore takes 10 bits in total, so about 80% of the transmitted bits carry actual data. The maximum stable baud rate is said to be 115200, so Ymir uses this value.
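As a rough worked example of the framing overhead, assuming 8n1 framing on a line running at 115200 baud:

\[ \frac{115200 \ \text{bits/s}} { 10 \ \text{bits/character} } = 11520 \ \text{characters/s}, \qquad \frac{8}{10} = 80\% \]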

The UART operates with a clock of 115200 ticks per second. It derives the baud rate by dividing this clock frequency by a value called the divisor:

\[ \text{Baud Rate} = \frac{115200} { \text{Divisor} } \]

Therefore, if you want to set the baud rate to \(\text{B}\), you calculate the divisor as follows:

\[ \text{Divisor} = \frac{115200} { \text{B} } \]
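The formula can be checked with a few concrete rates. The sketch below uses a hypothetical helper divisorFor, which assumes the requested rate is non-zero and divides 115200 evenly, and shows how the result splits into the low and high bytes destined for DLL and DLH.

```zig
const std = @import("std");

/// UART base clock used in the formula above, in ticks per second.
const base_clock: u32 = 115200;

/// Hypothetical helper: divisor for a requested baud rate.
/// Assumes `baud` is non-zero and divides 115200 evenly.
fn divisorFor(baud: u32) u16 {
    return @intCast(base_clock / baud);
}

pub fn main() void {
    const rates = [_]u32{ 115200, 38400, 9600 };
    for (rates) |baud| {
        const divisor = divisorFor(baud);
        // DLL receives the low byte and DLH the high byte of the divisor.
        const low: u8 = @truncate(divisor);
        const high: u8 = @truncate(divisor >> 8);
        std.debug.print("baud={d} divisor={d} DLL=0x{X:0>2} DLH=0x{X:0>2}\n", .{ baud, divisor, low, high });
    }
}
```

For Ymir's rate of 115200 the divisor is 1, so DLL is 1 and DLH is 0.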

Set the baud rate as follows:

ymir/arch/x86/serial.zig
{
    ...
    const divisor = 115200 / baud;
    const c = am.inb(p + offsets.lcr);
    am.outb(c | 0b1000_0000, p + offsets.lcr); // Enable DLAB
    am.outb(@truncate(divisor & 0xFF), p + offsets.dll);
    am.outb(@truncate((divisor >> 8) & 0xFF), p + offsets.dlh);
    am.outb(c & 0b0111_1111, p + offsets.lcr); // Disable DLAB
}

Setting the DLAB (Divisor Latch Access Bit) in the LCR enables access to the DLL (Divisor Latch Low) and DLH (Divisor Latch High) registers. After setting DLAB, we write the lower and upper bytes of the computed divisor to DLL and DLH respectively. The variable c holds the original LCR value from before DLAB was set and is used to restore LCR once the divisor has been configured.

Writing Characters

Next, let's try writing some characters to the initialized serial port.

To write to the serial port, you need to wait until the TX-buffer becomes empty. You can check this by examining the THRE (Transmitter Holding Register Empty) bit in the LSR (Line Status Register). If the buffer is not empty, wait until it is:

ymir/arch/x86/serial.zig
pub fn writeByte(byte: u8, port: Ports) void {
    // Wait until the transmitter holding buffer is empty
    while ((am.inb(@intFromEnum(port) + offsets.lsr) & 0b0010_0000) == 0) {
        am.relax();
    }

    // Put char into the transmitter holding buffer
    am.outb(byte, @intFromEnum(port));
}

Since LSR[5] corresponds to THRE, we wait until this bit is set, meaning the TX-buffer is empty. The am.relax() function issues a rep; nop instruction (the encoding of PAUSE), which hints to the CPU that it is inside a spin-wait loop.
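The chapter does not show the body of relax(). A minimal sketch, assuming it is defined in asm.zig alongside inb() and outb() and does nothing beyond issuing the instruction described above, could look like this. The bounded loop in main is only a stand-in for polling LSR.

```zig
const std = @import("std");

/// Sketch of am.relax(): `rep; nop` is the encoding of the x86 PAUSE instruction.
pub inline fn relax() void {
    asm volatile ("rep; nop");
}

pub fn main() void {
    // Bounded spin loop standing in for polling LSR until THRE is set.
    var remaining: u32 = 1000;
    while (remaining > 0) : (remaining -= 1) {
        relax();
    }
    std.debug.print("spin loop finished\n", .{});
}
```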

Once the TX-buffer is empty, we write the given byte to the COM port. Since offset 0 from the COM port base corresponds to the TX-buffer, we can write directly using am.outb(byte, @intFromEnum(port)).

Let's test whether we can actually write characters. Add some test code to main.zig. Unlike Surtr's log output, we don't need UCS-2 here; plain ASCII characters are fine:

zig
const ymir = @import("ymir");
const arch = ymir.arch;

arch.serial.initSerial(.com1, 115200);
for ("Hello, Ymir!\n") |c|
    arch.serial.writeByte(c, .com1);

If you run it and see Hello, Ymir! printed, then it’s working correctly.

Serial Wrapper Class

While serial output is now functional, it's cumbersome to write a for loop just to print the string Hello, Ymir!. Also, directly calling files under arch feels a bit too architecture-dependent. Therefore, we'll create a Serial struct to wrap the raw serial functionality.

To abstract arch/x86/serial.zig, create ymir/serial.zig directly under the root directory:

ymir/serial.zig
const ymir = @import("ymir");
const arch = ymir.arch;

pub const Serial = struct {
    const Self = @This();
    const WriteFn = *const fn (u8) void;
    const ReadFn = *const fn () ?u8;

    _write_fn: WriteFn = undefined,
    _read_fn: ReadFn = undefined,
    ...
};

Serial holds the function pointers _write_fn and _read_fn for serial output and input respectively[4]. It is instantiated with a baud rate of 115200 as follows:

ymir/serial.zig
pub fn init() Serial {
    var serial = Serial{};
    arch.serial.initSerial(&serial, .com1, 115200);
    return serial;
}

After creating an empty Serial struct, we pass it to initSerial(). Modify the previously implemented initSerial() to accept a *Serial as its first argument, and have it store the output function for the selected port in the _write_fn field of the passed Serial:

ymir/arch/x86/serial.zig
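const ymir = @import("ymir");
const Serial = ymir.serial.Serial; // Assumes ymir/ymir.zig also exports serial.zig as `serial`

pub fn initSerial(serial: *Serial, port: Ports, baud: u32) void {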
    ...
    serial._write_fn = switch (port) {
        .com1 => writeByteCom1,
        .com2 => writeByteCom2,
        .com3 => writeByteCom3,
        .com4 => writeByteCom4,
    };
}

writeByteComN() is a helper function that outputs to the COM port it is named after. Each one simply calls the writeByte() function we implemented earlier with the port fixed:

ymir/arch/x86/serial.zig
fn writeByteCom1(byte: u8) void {
    writeByte(byte, .com1);
}
...
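Four near-identical helpers are needed because a plain fn (u8) void pointer cannot capture the port at runtime. As an alternative sketch (not the approach this chapter takes), the helpers could be generated at compile time. The names writerFor and selectWriter are hypothetical, and writeByte() is replaced here by a stand-in that records its arguments so the example can run on a host.

```zig
const std = @import("std");

const Ports = enum(u16) { com1 = 0x3F8, com2 = 0x2F8, com3 = 0x3E8, com4 = 0x2E8 };

// Stand-in for the real writeByte(): records the call instead of touching I/O ports.
var last_port: ?Ports = null;
var last_byte: u8 = 0;
fn writeByte(byte: u8, port: Ports) void {
    last_port = port;
    last_byte = byte;
}

/// Hypothetical generator: returns a `fn (u8) void` bound to `port` at compile time.
fn writerFor(comptime port: Ports) *const fn (u8) void {
    return struct {
        fn write(byte: u8) void {
            writeByte(byte, port);
        }
    }.write;
}

/// Hypothetical replacement for the four-way switch: `inline else` makes the
/// matched tag comptime-known, so it can be passed to writerFor().
fn selectWriter(port: Ports) *const fn (u8) void {
    return switch (port) {
        inline else => |p| writerFor(p),
    };
}

pub fn main() void {
    const write_fn = selectWriter(.com2);
    write_fn('A');
    std.debug.print("wrote 0x{X} to {s}\n", .{ last_byte, @tagName(last_port.?) });
}
```

The explicit helpers used in this chapter are easier to read at a glance; the generated form mainly pays off if the number of ports grows.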

Now the output function is set in Serial. To make it user-friendly, let's provide functions for outputting a single character and a string:

ymir/serial.zig
pub fn write(self: Self, c: u8) void {
    self._write_fn(c);
}

pub fn writeString(self: Self, s: []const u8) void {
    for (s) |c| {
        self.write(c);
    }
}
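Because Serial only depends on a function pointer, the wrapper can be exercised without real hardware. The sketch below reproduces a trimmed copy of the struct and plugs in a hypothetical captureByte function that stores bytes in a buffer instead of writing to a COM port. The buffer and its size are assumptions made for illustration.

```zig
const std = @import("std");

// Trimmed copy of the Serial struct from this chapter (output side only).
const Serial = struct {
    const Self = @This();
    const WriteFn = *const fn (u8) void;

    _write_fn: WriteFn = undefined,

    pub fn write(self: Self, c: u8) void {
        self._write_fn(c);
    }

    pub fn writeString(self: Self, s: []const u8) void {
        for (s) |c| {
            self.write(c);
        }
    }
};

// Hypothetical capture buffer standing in for the COM port.
var captured: [64]u8 = undefined;
var captured_len: usize = 0;

/// Stores the byte in the capture buffer; bytes beyond its capacity are dropped.
fn captureByte(byte: u8) void {
    if (captured_len < captured.len) {
        captured[captured_len] = byte;
        captured_len += 1;
    }
}

pub fn main() void {
    const sr = Serial{ ._write_fn = captureByte };
    sr.writeString("Hello, Ymir!\n");
    std.debug.print("{s}", .{captured[0..captured_len]});
}
```

The same indirection is what lets the architecture-specific code choose the output function in initSerial() without the rest of the kernel knowing about COM ports.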

Summary

In this chapter, we implemented the basic functionality for serial output. We also separated the architecture-dependent parts into the arch directory and created the Serial struct to abstract them. In main.zig, initialize and use it as follows:

ymir/main.zig
const sr = serial.init();
sr.writeString("Hello, Ymir!\n");

When you run it, you should see Hello, Ymir! displayed as follows:

txt
[INFO ] (surtr): Initialized bootloader log.
[INFO ] (surtr): Got boot services.
[INFO ] (surtr): Located simple file system protocol.
[INFO ] (surtr): Opened filesystem volume.
[INFO ] (surtr): Opened kernel file.
[INFO ] (surtr): Parsed kernel ELF header.
[INFO ] (surtr): Kernel image: 0x0000000000100000 - 0x0000000000108000 (0x8 pages)
[INFO ] (surtr): Allocated memory for kernel image @ 0x0000000000100000 ~ 0x0000000000108000
[INFO ] (surtr): Mapped memory for kernel image.
[INFO ] (surtr): Loading kernel image...
[INFO ] (surtr):   Seg @ 0xFFFFFFFF80100000 - 0xFFFFFFFF80100273
[INFO ] (surtr):   Seg @ 0xFFFFFFFF80101000 - 0xFFFFFFFF80102000
[INFO ] (surtr):   Seg @ 0xFFFFFFFF80102000 - 0xFFFFFFFF80107000
[INFO ] (surtr):   Seg @ 0xFFFFFFFF80107000 - 0xFFFFFFFF80108000
[INFO ] (surtr): Exiting boot services.
Hello, Ymir!

Using the serial output implemented in this chapter, you can build a logging system similar to the one implemented in Surtr. In the next chapter, we will first create a library to assist with bitwise operations, and then implement the logging system in the following chapter.

[3] asm.zig is imported as am in the code because asm is a reserved keyword in Zig.

[4] Note that serial input will be covered in a much later chapter.