Serial Output
In the previous chapter, we handed control over from Surtr to Ymir. Starting with this chapter, we'll begin implementing the Ymir kernel. The very first thing we need is logging output, just like in Surtr. We'll use the serial port [1] for this. Thanks to the boot options we set in build.zig, QEMU redirects the serial port to standard output. Once we have logging output, further development becomes much smoother. Note that we'll build a full logging system in a later chapter; for now, the goal is simply to get basic serial output working.
important
The source code for this chapter is in the whiz-ymir-serial_output branch.
Table of Contents
- Creating the arch Directory
- Basic Definition of Serial
- Initialization
- Writing Characters
- Serial Wrapper Class
- Summary
Creating the arch Directory
x86 Directory
Just like we did for Surtr, we'll separate architecture-specific code into its own directory. Here's what the Ymir directory structure looks like:
tree ./ymir
./ymir
├── arch
│ └── x86
│ ├── arch.zig
│ ├── asm.zig
│ └── serial.zig
├── arch.zig
├── linker.ld
└── main.zig
The contents of ymir/arch.zig are identical to those of surtr/arch.zig. Its purpose is to export ymir/arch/x86/arch.zig to the higher-level directory.
ymir/arch/x86/arch.zig serves as the root of the ymir/arch/x86 directory. When using architecture-specific code from directories above ymir/arch, this is the file that should always be imported. Here's what ymir/arch/x86/arch.zig looks like:
// Modules that should be usable from outside `/arch`
pub const serial = @import("serial.zig");
// Modules that should not be exposed outside `/arch`
const am = @import("asm.zig");
Since serial.zig is marked pub, it can be accessed from higher-level directories as arch.serial. On the other hand, asm.zig is not public, so it's inaccessible outside the arch directory. This is intentional; we want to encapsulate architecture-specific code as much as possible, so any assembly code will be kept within asm.zig and hidden from the rest of the codebase.
Creating the ymir Module
At the moment, to reference ymir/piyo/neko.zig from ymir/hoge/fuga.zig, you need to use a relative path like this:
const neko = @import("../piyo/neko.zig");
This not only looks awkward, but it also carries the risk of accidentally referencing modules that are meant to be hidden, as shown below:
const am = @import("arch/x86/asm.zig"); // This should not be accessible from here
To prevent this, we'll create a root module and require all modules to be accessed through this root module. The module setup is defined in build.zig, just like we did for Surtr:
const ymir_module = b.createModule(.{
    .root_source_file = b.path("ymir/ymir.zig"),
});
ymir_module.addImport("ymir", ymir_module);
ymir_module.addImport("surtr", surtr_module);
Add the defined ymir module to the ymir executable:
ymir.root_module.addImport("ymir", ymir_module);
With this setup, you can now import the Ymir module simply by using @import("ymir"). The root module file, ymir/ymir.zig, exports all the necessary child modules:
pub const arch = @import("arch.zig");
Now you can access arch/x86/arch.zig from any file using @import("ymir").arch. To test this, let's define a simple function in arch/x86/arch.zig and call it from main.zig:
// -- ymir/arch/x86/arch.zig --
pub fn someFunction() void {}
// -- ymir/main.zig --
const ymir = @import("ymir");
const arch = ymir.arch;
arch.someFunction();
You can now access the ymir module just like std. From here on, to prevent unintended imports, we'll generally avoid importing files from other directories directly via relative paths.
tip
For those unfamiliar with Zig, adding the ymir module to the ymir executable might feel awkward. The author initially felt the same, but after asking on Ziggit, it turns out this approach is not only legal but also considered natural [2].
Basic Definition of Serial
The UART we are targeting here is the 8250 UART. It supports both input and output, but in this chapter, we'll focus solely on output.
Let's define constants and structures for the serial port. We'll start with the COM ports, which are accessed via I/O ports. To map COM ports to their corresponding I/O ports, create ymir/arch/x86/serial.zig:
pub const Ports = enum(u16) {
    com1 = 0x3F8,
    com2 = 0x2F8,
    com3 = 0x3E8,
    com4 = 0x2E8,
};
This time, we'll define only four ports. Depending on the actual hardware, there might be more or fewer. For Ymir, we'll be using only COM1.
Each COM port has its own set of registers. Each register is accessed at a fixed offset from the base I/O address of its COM port:
const offsets = struct {
    /// Transmitter Holding Buffer: DLAB=0, W
    pub const txr = 0;
    /// Receiver Buffer: DLAB=0, R
    pub const rxr = 0;
    /// Divisor Latch Low Byte: DLAB=1, R/W
    pub const dll = 0;
    /// Interrupt Enable Register: DLAB=0, R/W
    pub const ier = 1;
    /// Divisor Latch High Byte: DLAB=1, R/W
    pub const dlh = 1;
    /// Interrupt Identification Register: DLAB=X, R
    pub const iir = 2;
    /// FIFO Control Register: DLAB=X, W
    pub const fcr = 2;
    /// Line Control Register: DLAB=X, R/W
    pub const lcr = 3;
    /// Modem Control Register: DLAB=X, R/W
    pub const mcr = 4;
    /// Line Status Register: DLAB=X, R
    pub const lsr = 5;
    /// Modem Status Register: DLAB=X, R
    pub const msr = 6;
    /// Scratch Register: DLAB=X, R/W
    pub const sr = 7;
};
In reality, the register accessed depends on three factors: the offset, whether it's a read or write operation, and the current value of the DLAB bit. For details on which register corresponds to which conditions, refer to the comments within offsets or consult the reference material [1].
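To make the base-plus-offset addressing concrete, here is a minimal sketch that computes the I/O port address of a register. The helper name regAddr is hypothetical and not part of Ymir; only a few of the offsets are repeated so the snippet stands on its own.

```zig
const std = @import("std");

pub const Ports = enum(u16) {
    com1 = 0x3F8,
    com2 = 0x2F8,
    com3 = 0x3E8,
    com4 = 0x2E8,
};

// Subset of the offsets defined above, repeated to keep the sketch self-contained.
const offsets = struct {
    pub const txr = 0;
    pub const dll = 0;
    pub const lcr = 3;
    pub const lsr = 5;
};

/// Hypothetical helper (an assumption, not Ymir's API):
/// I/O port address of a register = COM port base + register offset.
fn regAddr(port: Ports, offset: u16) u16 {
    return @intFromEnum(port) + offset;
}

test "register addresses are base plus offset" {
    // The Line Status Register of COM1 lives at 0x3F8 + 5.
    try std.testing.expectEqual(@as(u16, 0x3FD), regAddr(.com1, offsets.lsr));
    // TXR and DLL share the same address; DLAB decides which one is accessed.
    try std.testing.expectEqual(regAddr(.com1, offsets.txr), regAddr(.com1, offsets.dll));
}
```

The second assertion shows why the offset alone is not enough to identify a register: several registers share an address and are distinguished only by access direction and DLAB.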
Initialization
Let's initialize the serial port. Access to the COM port is done using the IN and OUT instructions to the corresponding I/O ports. Let's define the necessary assembly instructions:
pub inline fn inb(port: u16) u8 {
    return asm volatile (
        \\inb %[port], %[ret]
        : [ret] "={al}" (-> u8),
        : [port] "{dx}" (port),
    );
}
pub inline fn outb(value: u8, port: u16) void {
    asm volatile (
        \\outb %[value], %[port]
        :
        : [value] "{al}" (value),
          [port] "{dx}" (port),
    );
}
Using these functions, we'll read from and write to the I/O ports to initialize the serial port. The registers are configured as shown in the following table [3]. Note that all registers are 8-bit:
| Register | Description | Value |
|---|---|---|
| LCR (Line Control) | Line protocol | 8n1 (8 data bits / no parity / 1 stop bit) |
| IER (Interrupt Enable) | Interrupts to enable | 0 (disable all interrupts) |
| FCR (FIFO Control) | FIFO buffer | 0 (disable FIFO) |
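const am = @import("asm.zig");
pub fn initSerial(port: Ports, baud: u32) void {
    const p = @intFromEnum(port);
    // LCR layout: [7] DLAB, [6] break, [5:3] parity, [2] stop bits, [1:0] word length.
    // Word length 0b11 selects 8 data bits (0b00 would select 5).
    am.outb(0b00_000_0_11, p + offsets.lcr); // 8n1: no parity, 1 stop bit, 8 data bits
    am.outb(0, p + offsets.ier); // Disable interrupts
    am.outb(0, p + offsets.fcr); // Disable FIFO
    ...
}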
Next, let's set the baud rate. The baud rate is the number of bits transmitted per second on the signal line. Each character is framed by a start bit at the beginning and one or more stop bits at the end (one stop bit in this case). Therefore, with 8n1 framing, transmitting one 8-bit character takes 10 bits in total, so about 80% of the transmitted bits carry actual data. The maximum stable baud rate is said to be 115200, so Ymir will use this value.
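As a quick worked example of the framing arithmetic, the sketch below computes the frame size and the resulting character throughput. The function names are made up for illustration and do not appear in Ymir.

```zig
const std = @import("std");

/// Bits on the wire per character: 1 start bit + data bits + parity bits + stop bits.
fn frameBits(data_bits: u32, parity_bits: u32, stop_bits: u32) u32 {
    return 1 + data_bits + parity_bits + stop_bits;
}

/// Characters transmitted per second at the given baud rate (integer division).
fn charsPerSecond(baud: u32, frame_bits: u32) u32 {
    return baud / frame_bits;
}

test "8n1 at 115200 baud" {
    const bits = frameBits(8, 0, 1);
    try std.testing.expectEqual(@as(u32, 10), bits);
    // 8 of the 10 bits are payload, i.e. 80%.
    try std.testing.expectEqual(@as(u32, 80), 8 * 100 / bits);
    try std.testing.expectEqual(@as(u32, 11520), charsPerSecond(115200, bits));
}
```

So at 115200 baud with 8n1 framing, the line carries at most 11520 characters per second.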
The UART operates with a clock of 115200 ticks per second. It derives the baud rate by dividing this clock frequency by a value called the divisor:
\[ \text{Baud Rate} = \frac{115200} { \text{Divisor} } \]
Therefore, if you want to set the baud rate to \(\text{B}\), you calculate the divisor as follows:
\[ \text{Divisor} = \frac{115200} { \text{B} } \]
Set the baud rate as follows:
{
    ...
    const divisor = 115200 / baud;
    const c = am.inb(p + offsets.lcr);
    am.outb(c | 0b1000_0000, p + offsets.lcr); // Enable DLAB
    am.outb(@truncate(divisor & 0xFF), p + offsets.dll);
    am.outb(@truncate((divisor >> 8) & 0xFF), p + offsets.dlh);
    am.outb(c & 0b0111_1111, p + offsets.lcr); // Disable DLAB
}
Setting the DLAB (Divisor Latch Access Bit) in the LCR enables access to the DLL (Divisor Latch Low) and DLH (Divisor Latch High) registers. After setting DLAB, write the lower and upper bytes of the computed divisor to DLL and DLH respectively. The variable c holds the original LCR value before DLAB was set, and is used to restore LCR after the divisor has been configured.
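To see which bytes end up in DLL and DLH for a few baud rates, here is a small hardware-free sketch of the same computation. The names uart_clock, DivisorBytes and divisorBytes are assumptions introduced for this example; baud must be non-zero, since the divisor is obtained by integer division.

```zig
const std = @import("std");

/// UART clock in ticks per second, as described in the text.
const uart_clock: u32 = 115200;

/// Hypothetical container for the two divisor latch bytes.
const DivisorBytes = struct { low: u8, high: u8 };

/// Splits the divisor for `baud` into the bytes written to DLL (low) and DLH (high).
/// `baud` must not be zero.
fn divisorBytes(baud: u32) DivisorBytes {
    const divisor = uart_clock / baud;
    const low: u8 = @truncate(divisor & 0xFF);
    const high: u8 = @truncate((divisor >> 8) & 0xFF);
    return DivisorBytes{ .low = low, .high = high };
}

test "divisor bytes for common baud rates" {
    // 115200 baud: divisor = 1
    try std.testing.expectEqual(@as(u8, 1), divisorBytes(115200).low);
    try std.testing.expectEqual(@as(u8, 0), divisorBytes(115200).high);
    // 300 baud: divisor = 384 = 0x0180, so both bytes are needed
    try std.testing.expectEqual(@as(u8, 0x80), divisorBytes(300).low);
    try std.testing.expectEqual(@as(u8, 0x01), divisorBytes(300).high);
}
```

For the baud rate Ymir uses, the divisor is simply 1, so DLL receives 1 and DLH receives 0.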
Writing Characters
Next, let's try writing some characters to the initialized serial port.
To write to the serial port, you need to wait until the TX-buffer becomes empty. You can check this by examining the THRE (Transmitter Holding Register Empty) bit in the LSR (Line Status Register). If the buffer is not empty, wait until it is:
pub fn writeByte(byte: u8, port: Ports) void {
    // Wait until the transmitter holding buffer is empty
    while ((am.inb(@intFromEnum(port) + offsets.lsr) & 0b0010_0000) == 0) {
        am.relax();
    }
    // Put char into the transmitter holding buffer
    am.outb(byte, @intFromEnum(port));
}
Since LSR[5] corresponds to THRE, we wait until this bit is set, meaning the TX-buffer is empty. The am.relax() function issues a rep; nop instruction, which is used here to slightly relax the CPU during the wait.
Once the TX-buffer is empty, we write the given byte to the COM port. Since offset 0 from the COM port base corresponds to the TX-buffer, we can write directly using am.outb(byte, @intFromEnum(port)).
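The text mentions am.relax() without showing it. Below is one possible sketch of it, together with a hardware-free version of the polling loop. Everything except the rep; nop instruction and the THRE bit position is an assumption for illustration: txEmpty, waitTxEmpty and fakeLsr are hypothetical names, and the architecture guard exists only so the sketch compiles on non-x86 hosts.

```zig
const std = @import("std");
const builtin = @import("builtin");

/// LSR bit 5: Transmitter Holding Register Empty.
const lsr_thre: u8 = 0b0010_0000;

/// Hypothetical helper: true if the given LSR value reports an empty TX-buffer.
fn txEmpty(lsr: u8) bool {
    return (lsr & lsr_thre) != 0;
}

/// Sketch of `relax()`: `rep; nop` is the encoding of the x86 PAUSE hint.
pub inline fn relax() void {
    if (builtin.cpu.arch == .x86_64 or builtin.cpu.arch == .x86) {
        asm volatile ("rep; nop");
    }
}

/// Polls the LSR through `readLsr` until THRE is set.
/// In the kernel, the read would be `am.inb(base + offsets.lsr)`.
fn waitTxEmpty(readLsr: *const fn () u8) void {
    while (!txEmpty(readLsr())) {
        relax();
    }
}

// Fake LSR for testing: reports "busy" three times, then "empty".
var polls_left: u8 = 3;
fn fakeLsr() u8 {
    if (polls_left == 0) return 0b0110_0000;
    polls_left -= 1;
    return 0;
}

test "polling stops once THRE is set" {
    polls_left = 3;
    waitTxEmpty(fakeLsr);
    try std.testing.expectEqual(@as(u8, 0), polls_left);
}
```

Passing the register read in as a function pointer is only a testing convenience here; the real writeByte() reads the port directly.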
Let's test whether we can actually write characters. Add some test code to main.zig. Unlike Surtr's log output, we don't need UCS-2 here; plain ASCII characters are fine:
const ymir = @import("ymir");
const arch = ymir.arch;
arch.serial.initSerial(.com1, 115200);
for ("Hello, Ymir!\n") |c|
    arch.serial.writeByte(c, .com1);
If you run it and see Hello, Ymir! printed, then it's working correctly.
Serial Wrapper Class
While serial output is now functional, it's cumbersome to write a for loop just to print the string Hello, Ymir!. Also, directly calling files under arch feels a bit too architecture-dependent. Therefore, we'll create a Serial struct to wrap the raw serial functionality.
To abstract arch/x86/serial.zig, create ymir/serial.zig directly under the root directory:
const ymir = @import("ymir");
const arch = ymir.arch;
pub const Serial = struct {
    const Self = @This();
    const WriteFn = *const fn (u8) void;
    const ReadFn = *const fn () ?u8;
    _write_fn: WriteFn = undefined,
    _read_fn: ReadFn = undefined,
    ...
};
Serial holds function pointers _write_fn and _read_fn for serial output and input respectively [4]. It is instantiated with a baud rate of 115200 as follows:
pub fn init() Serial {
    var serial = Serial{};
    arch.serial.initSerial(&serial, .com1, 115200);
    return serial;
}
After creating an empty Serial struct, it is passed to initSerial(). Modify the previously implemented initSerial() to accept a *Serial as its first argument. Then assign the appropriate output function to the _write_fn function pointer of the passed *Serial:
pub fn initSerial(serial: *Serial, port: Ports, baud: u32) void {
    ...
    serial._write_fn = switch (port) {
        .com1 => writeByteCom1,
        .com2 => writeByteCom2,
        .com3 => writeByteCom3,
        .com4 => writeByteCom4,
    };
}
writeByteComN() is a helper function that outputs to the corresponding COM port. It simply calls the writeByte() function we implemented earlier with the matching Ports value:
fn writeByteCom1(byte: u8) void {
    writeByte(byte, .com1);
}
...
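As an aside, the four near-identical wrappers could also be generated at compile time. The sketch below shows that alternative; it is not how the chapter's code is written. writeByteFor is a hypothetical name, and writeByte() is replaced by a stand-in that records its arguments so the example runs without hardware.

```zig
const std = @import("std");

pub const Ports = enum(u16) {
    com1 = 0x3F8,
    com2 = 0x2F8,
    com3 = 0x3E8,
    com4 = 0x2E8,
};

// Stand-in for the real `writeByte()`: records its arguments instead of touching I/O ports.
var last_port: ?Ports = null;
var last_byte: u8 = 0;
fn writeByte(byte: u8, port: Ports) void {
    last_port = port;
    last_byte = byte;
}

/// Hypothetical alternative to hand-writing writeByteCom1..4:
/// returns a `fn (u8) void` bound to a compile-time-known port.
fn writeByteFor(comptime port: Ports) *const fn (u8) void {
    return struct {
        fn write(byte: u8) void {
            writeByte(byte, port);
        }
    }.write;
}

test "generated wrapper forwards to its port" {
    const write_com2 = writeByteFor(.com2);
    write_com2('A');
    try std.testing.expectEqual(Ports.com2, last_port.?);
    try std.testing.expectEqual(@as(u8, 'A'), last_byte);
}
```

Because the port must be known at compile time, a switch over a runtime port value would still need one prong per port, for example .com1 => writeByteFor(.com1).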
Now the output function is set in Serial. To make it user-friendly, let's provide functions for outputting a single character and a string:
pub fn write(self: Self, c: u8) void {
    self._write_fn(c);
}
pub fn writeString(self: Self, s: []const u8) void {
    for (s) |c| {
        self.write(c);
    }
}
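One benefit of routing output through a function pointer is that Serial can be exercised without any hardware. The following sketch plugs in a capturing backend in place of a COM port. The captureByte function and its buffer are assumptions made for this example, and the struct is a trimmed-down copy of Serial containing only the output side.

```zig
const std = @import("std");

// Hypothetical test backend: stores written bytes in a fixed buffer (at most 64 bytes).
var captured: [64]u8 = undefined;
var captured_len: usize = 0;
fn captureByte(byte: u8) void {
    captured[captured_len] = byte;
    captured_len += 1;
}

/// Trimmed-down copy of `Serial` with only the output side.
const Serial = struct {
    const Self = @This();
    const WriteFn = *const fn (u8) void;

    _write_fn: WriteFn = undefined,

    pub fn write(self: Self, c: u8) void {
        self._write_fn(c);
    }

    pub fn writeString(self: Self, s: []const u8) void {
        for (s) |c| {
            self.write(c);
        }
    }
};

test "writeString forwards every byte to the backend" {
    captured_len = 0;
    const sr = Serial{ ._write_fn = captureByte };
    sr.writeString("Hello, Ymir!\n");
    try std.testing.expectEqualStrings("Hello, Ymir!\n", captured[0..captured_len]);
}
```

In the kernel, initSerial() fills in _write_fn with one of the writeByteComN() functions instead.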
Summary
In this chapter, we implemented the basic functionality for serial output. We also separated the architecture-dependent parts into the arch directory and created the Serial struct to abstract them. In main.zig, initialize and use it as follows:
const sr = serial.init();
sr.writeString("Hello, Ymir!\n");
When you run it, you should see Hello, Ymir! displayed as follows:
[INFO ] (surtr): Initialized bootloader log.
[INFO ] (surtr): Got boot services.
[INFO ] (surtr): Located simple file system protocol.
[INFO ] (surtr): Opened filesystem volume.
[INFO ] (surtr): Opened kernel file.
[INFO ] (surtr): Parsed kernel ELF header.
[INFO ] (surtr): Kernel image: 0x0000000000100000 - 0x0000000000108000 (0x8 pages)
[INFO ] (surtr): Allocated memory for kernel image @ 0x0000000000100000 ~ 0x0000000000108000
[INFO ] (surtr): Mapped memory for kernel image.
[INFO ] (surtr): Loading kernel image...
[INFO ] (surtr): Seg @ 0xFFFFFFFF80100000 - 0xFFFFFFFF80100273
[INFO ] (surtr): Seg @ 0xFFFFFFFF80101000 - 0xFFFFFFFF80102000
[INFO ] (surtr): Seg @ 0xFFFFFFFF80102000 - 0xFFFFFFFF80107000
[INFO ] (surtr): Seg @ 0xFFFFFFFF80107000 - 0xFFFFFFFF80108000
[INFO ] (surtr): Exiting boot services.
Hello, Ymir!
Using the serial output implemented in this chapter, you can build a logging system similar to the one implemented in Surtr. In the next chapter, we will first create a library to assist with bitwise operations, and then implement the logging system in the following chapter.
asm.zig is imported as am in the code because asm is a reserved keyword in Zig.
Note that serial input will be covered in a much later chapter.