King Don Jon: February 2025

Monday, 17 February 2025

RISC-V : Bare Metal OS : 3 User Applications

Intro

We have some basic capabilities in our supervisor now and we can turn our attention to user applications. We can already run multiple independent processes with their own registers and stack. We need a few more features to be able to run user applications as processes within our OS.

Firstly we need a virtual address mechanism so that each process/application can have its own address space. The same virtual addresses can be used in multiple applications but these will map to different physical addresses so that applications don't interfere with each other.

Secondly we need a mechanism for creating and running user applications. Our applications will be written in C and the compiled executables will be loaded at the same time as the kernel.

Finally we need the ability to switch between supervisor mode and user mode so that applications don't have the capability to interfere with each other or the kernel.

Virtual Memory

Hardware architectures incorporate methods to implement virtual memory within their specification. With 32-bit addresses from 0x0000-0000 to 0xffff-ffff each program can have a 4GB virtual address space. The physical address space may be much smaller.
Processor hardware (not the Operating System) provides the ability to translate virtual addresses to physical addresses.

This implementation uses the 32-bit RISC-V virtual memory scheme called Sv32.
Memory is divided into 4KB pages. Virtual addresses are mapped to physical memory using a two level page table.
The first level page table contains 1024 4-byte entries pointing to up to 1024 second level page tables, each of which also holds 1024 entries.
This structure allows up to 1024 x 1024 = 1M pages which equates to 4GB of virtual memory.
If the kernel+data starts at 0x8000-0000 and is 2MB in size, the first level page table will just have entry 512 (counting from zero) completed, pointing to a single second level page table containing entries for 512 data pages.
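To make the arithmetic concrete, here is a small sketch of how an Sv32 virtual address splits into a first level index, a second level index and a page offset. The helper names (vpn1, vpn0, page_offset, pages_for) are my own for illustration and are not taken from the tutorial.

```c
#include <stdint.h>

#define PAGE_SIZE 4096u

/* Bits 31..22 of the virtual address: index into the first level table. */
static inline uint32_t vpn1(uint32_t vaddr) { return (vaddr >> 22) & 0x3ffu; }

/* Bits 21..12: index into the second level table. */
static inline uint32_t vpn0(uint32_t vaddr) { return (vaddr >> 12) & 0x3ffu; }

/* Bits 11..0: byte offset inside the 4KB page. */
static inline uint32_t page_offset(uint32_t vaddr) { return vaddr & (PAGE_SIZE - 1u); }

/* Number of 4KB pages needed to hold `bytes` bytes, rounded up. */
static inline uint32_t pages_for(uint32_t bytes) {
    return (bytes + PAGE_SIZE - 1u) / PAGE_SIZE;
}
```

For a kernel at 0x8000-0000, vpn1 gives 512 for every address in the first 4MB, so a 2MB kernel needs only one second level table.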

Page table entries are set up using the map_page function, which looks at a data page's virtual address to determine the associated 1st level table entry. If this is zero, it sets up a new 2nd level page table. It then stores the data page's physical page number in the 2nd level table entry.
In this way page table entries are set up for each page in a process.
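The following is a minimal sketch of the map_page idea that runs on an ordinary PC rather than in the kernel. Physical memory is simulated with an array, and sim_ptr, sim_alloc_page and translate are hypothetical helpers I added so the result can be checked. The flag bit positions follow the Sv32 page table entry layout. The real kernel version works on actual physical addresses instead.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE 4096u
#define PAGE_V (1u << 0)  /* entry is valid */
#define PAGE_R (1u << 1)  /* readable */
#define PAGE_W (1u << 2)  /* writable */
#define PAGE_X (1u << 3)  /* executable */
#define PAGE_U (1u << 4)  /* accessible from user mode */

/* Simulated physical memory (assumption: 16 pages starting at SIM_BASE). */
#define SIM_BASE  0x80000000u
#define SIM_PAGES 16u
static uint32_t sim_mem[SIM_PAGES * (PAGE_SIZE / 4u)];
static uint32_t sim_next = SIM_BASE;

/* Convert a simulated physical address into a host pointer. */
static uint32_t *sim_ptr(uint32_t paddr) {
    return &sim_mem[(paddr - SIM_BASE) / 4u];
}

/* Hand out one zeroed page of simulated physical memory. */
static uint32_t sim_alloc_page(void) {
    uint32_t paddr = sim_next;
    assert(paddr - SIM_BASE < SIM_PAGES * PAGE_SIZE);
    sim_next += PAGE_SIZE;
    memset(sim_ptr(paddr), 0, PAGE_SIZE);
    return paddr;
}

/* Add one virtual-to-physical mapping to a two level page table. */
static void map_page(uint32_t table1_paddr, uint32_t vaddr,
                     uint32_t paddr, uint32_t flags) {
    assert(vaddr % PAGE_SIZE == 0 && paddr % PAGE_SIZE == 0);
    uint32_t *table1 = sim_ptr(table1_paddr);
    uint32_t vpn1 = (vaddr >> 22) & 0x3ffu;
    if ((table1[vpn1] & PAGE_V) == 0) {
        /* No 2nd level table yet: allocate one and point the entry at it. */
        uint32_t pt = sim_alloc_page();
        table1[vpn1] = ((pt / PAGE_SIZE) << 10) | PAGE_V;
    }
    uint32_t *table0 = sim_ptr((table1[vpn1] >> 10) * PAGE_SIZE);
    uint32_t vpn0 = (vaddr >> 12) & 0x3ffu;
    table0[vpn0] = ((paddr / PAGE_SIZE) << 10) | flags | PAGE_V;
}

/* Walk the table the way the hardware would; returns 0 if unmapped. */
static uint32_t translate(uint32_t table1_paddr, uint32_t vaddr) {
    uint32_t pte1 = sim_ptr(table1_paddr)[(vaddr >> 22) & 0x3ffu];
    if ((pte1 & PAGE_V) == 0) return 0;
    uint32_t pte0 = sim_ptr((pte1 >> 10) * PAGE_SIZE)[(vaddr >> 12) & 0x3ffu];
    if ((pte0 & PAGE_V) == 0) return 0;
    return (pte0 >> 10) * PAGE_SIZE + (vaddr & (PAGE_SIZE - 1u));
}
```

Each entry stores a physical page number shifted left by 10 bits, with the low bits used for the flags, which is why the address is divided by the page size before being stored.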

In our implementation we use virtual addresses for the kernel which are the same as physical addresses. This allows kernel programs to work regardless of whether virtual addressing is on or off.
When we create a process we must map all the kernel pages into the table so that they can be accessed by the process.
We must also put the pointer to the start of the page table into our process structure.

Now comes the magic. When we context switch to a new process we simply change the satp (supervisor address translation and protection) register to point to that process's page table. The processor will then automatically translate the process's virtual addresses to physical addresses using the page table.
Note that the sfence.vma instructions placed around the satp write make sure that earlier page table updates are visible and that stale cached translations are discarded.
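As a rough illustration, the value written to satp for Sv32 is the mode bit plus the physical page number of the first level table. The make_satp helper is a name I made up for this sketch. The privileged instructions themselves only assemble for RISC-V, so they are shown as a comment.

```c
#include <stdint.h>

#define PAGE_SIZE 4096u
#define SATP_SV32 (1u << 31)  /* MODE bit: 1 selects Sv32 translation */

/* Build the satp value for a page-aligned first level page table. */
static inline uint32_t make_satp(uint32_t page_table_paddr) {
    return SATP_SV32 | (page_table_paddr / PAGE_SIZE);
}

/*
 * In the kernel's context switch the value would be installed roughly like
 * this (RISC-V only, so not compiled here):
 *
 *   sfence.vma
 *   csrw satp, <value from make_satp>
 *   sfence.vma
 */
```

For example, a page table at 8022-1000 gives a satp value of 8008-0221.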

This is awesome 😁😁😁😁

Application

I confess I got a bit lost setting up the application, partly because the instructions aren't so clear and partly because my attempt didn't work.

First we set up a basic program user.c which has a function start. The linker specifies that the program starts at 0x1000000 and that start is the entry point. The start function sets up the stack, calls main to carry out functions then calls exit which is initially an infinite loop.

main is defined in the file shell.c and initially it just loops forever.

The build script is full of weird magic. First we build the application as shell.elf. Then we convert it from elf format to a raw binary and THEN we convert it into a format which can be embedded into the kernel.
Finally the kernel is built, including shell.bin.o which is our application.

Run Application In User Mode

Now we have the program we want to run appended to the kernel, we can create a process by allocating memory for the program, copying the program into the memory pages and adding page table entries. The jiggery pokery in the build script gives us the information we need regarding the program location within the kernel. We now have a process containing our application ready to run.
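Here is a host-side sketch of the copy step: the embedded image is copied one page at a time and each page is given the next virtual address from the application's base. The load_image function and user_mapping structure are hypothetical names, and calloc stands in for the kernel's page allocator. In the kernel each recorded pair would be passed to map_page instead.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define PAGE_SIZE 4096u
#define USER_BASE 0x1000000u  /* start address from the linker script */

/* One virtual page and the (host) memory that backs it. */
struct user_mapping {
    uint32_t vaddr;
    uint8_t *page;
};

/* Copy the image page by page; returns the number of pages used. */
static size_t load_image(const uint8_t *image, size_t image_size,
                         struct user_mapping *out, size_t max_pages) {
    size_t count = 0;
    for (size_t off = 0; off < image_size; off += PAGE_SIZE) {
        if (count == max_pages) break;        /* out of mapping slots */
        uint8_t *page = calloc(1, PAGE_SIZE); /* zeroed, like a fresh page */
        if (page == NULL) break;
        size_t remaining = image_size - off;
        size_t copy_size = remaining < PAGE_SIZE ? remaining : PAGE_SIZE;
        memcpy(page, image + off, copy_size);
        out[count].vaddr = USER_BASE + (uint32_t)off;
        out[count].page = page;
        count++;
    }
    return count;
}
```

The last page is usually only partly filled, so the copy size is capped at the bytes remaining and the rest of the page stays zero.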

Finally we need to switch to user mode which is accomplished with the privileged sret instruction.

This is executed in the user_entry function so is executed as soon as the context switch to the user program is completed. The sepc register contains the program counter which sret jumps to.

Our first program is pretty dull; we can only look at memory and status registers to see what it is doing.

Of course in user mode we can't read/write from/to the console as that is the supervisor's role. We need to be able to execute system calls.

System Calls

System calls are familiar from assembly language programming. You ask the kernel to do some work for you. In this example, we ask the kernel to read a character from the console and return it in an integer variable.

You specify a number of parameters then issue an ecall instruction. ecall causes an exception which the kernel must handle.

In this case the only argument we pass is the ecall identifier SYS_GETCHAR.

Within the kernel, the exception causes the handle_trap function to be executed.
By checking CSRs handle_trap determines that an ecall is responsible for the exception and invokes handle_syscall.

handle_syscall looks at the identifier, and sees that it is being requested to read a character from the console. Using its own getchar() function it makes an SBI call to see whether a console character is available so that it can return it to the user. Note that the function loops until a character is available. After each check it yields so that other kernel processes can take their turn.
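The dispatch logic can be sketched on a PC as follows. Everything outside handle_syscall is a stand-in I invented so the sketch runs without a kernel: console_poll plays the part of the SBI console read, sim_yield replaces the real yield, and the system call numbers and the use of register a3 for the identifier are assumptions.

```c
#include <stdint.h>

#define SYS_PUTCHAR 1  /* assumed numbering */
#define SYS_GETCHAR 2

/* Registers saved on trap entry; only the ones this sketch needs. */
struct trap_frame {
    uint32_t a0, a1, a2, a3;
};

/* Hypothetical stand-ins for the console and the scheduler. */
static const char *sim_input = ""; /* pending console characters */
static int sim_delay = 0;          /* polls that find nothing first */
static int sim_yields = 0;         /* how often we gave up the CPU */

/* Returns the next character, or -1 when none is available yet. */
static long console_poll(void) {
    if (sim_delay > 0) { sim_delay--; return -1; }
    if (*sim_input == '\0') return -1;
    return (unsigned char)*sim_input++;
}

static void sim_yield(void) { sim_yields++; }

/* Look at the identifier and carry out the requested service. */
static void handle_syscall(struct trap_frame *f) {
    switch (f->a3) {
    case SYS_GETCHAR:
        for (;;) {
            long ch = console_poll();
            if (ch >= 0) {
                f->a0 = (uint32_t)ch; /* result goes back in a0 */
                break;
            }
            sim_yield(); /* let other processes run while we wait */
        }
        break;
    default:
        f->a0 = (uint32_t)-1; /* sketch only: report unknown calls */
        break;
    }
}
```

Note that the loop never ends if no character ever arrives, which is the intended blocking behaviour for a console read.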

User Shell

Finally we enter familiar territory. We can write a C program to display a prompt, get some input and display a console message.

Outro

The tutorial has a couple more chapters demonstrating disk i/o and a simple file system. These are the essential next step if you are writing a disk Operating System. However I have learnt so much on this journey that I will stop here.

There is a complete working copy of "OS in 1,000 lines" on github which I have tried out.
It gives you a nice warm feeling to be able to execute commands from the shell, even though they are very basic.

Looking back I have found out so many new concepts:

QEMU provides a pretty realistic RISC-V environment for investigations. It needs openSBI to initialise the "virtual hardware" for us.
openSBI is another level of software to consider, it runs in M-mode (machine) and does all the hardware specific (memory, interfaces, peripherals) initialisation for us.
openSBI calls allow programs running in S-mode (supervisor) to use hardware functions such as writing to the console.
We can write our Operating System in C. The occasional assembly language instructions we need to use for privileged instructions and initialisation can easily be specified using inline assembler.
Our OS could use the C standard library.
Our OS is running in S-mode and can access CSRs (Control and Status Registers) to find the cause and type of exceptions and other information.
We define two areas of storage for a stack and free memory. We allocate memory in 4096 byte pages.
A process has its own stack and memory. We move between processes using a context switch.
RISC-V hardware converts virtual to physical addresses based on a two level page table.
Switching to user mode protects the kernel and isolates processes from each other. It is simple to implement.
We can easily write a user shell to allow us to use basic functions.

As a follow-on I may investigate xv6, first as a QEMU VM, followed by running it on Nezha or VisionFive 2 hardware.

Thursday, 13 February 2025

RISC-V Bare Metal OS : 2

The Story So Far

We have made an excellent start to our OS: we have loaded SBI, initialised hardware, booted the processor and communicated with SBI to print a Hello World message. In this installment we will do some consolidation, making our environment usable.

Kernel Panic

If our kernel crashes, we want it to print some diagnostic information. We define a short macro called PANIC which prints out an error message.
Because PANIC is a macro the text is inserted inline before compilation and the values for source file and line number are filled in so they can be printed if PANIC is called.

The while(1) makes the program loop forever once it has printed the message. The do...while(0) wrapper runs the block exactly once and lets the macro be used like a single statement.
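A simplified version of such a macro, written so that it compiles on a PC with the standard printf, might look like this. It is a sketch of the idiom rather than the tutorial's exact macro, which uses the kernel's own printf.

```c
#include <stdio.h>

/*
 * Print the source location and a formatted message, then loop forever.
 * __FILE__ and __LINE__ are filled in at the place where PANIC is used.
 * The do { ... } while (0) wrapper makes the macro act as one statement.
 */
#define PANIC(...)                                       \
    do {                                                 \
        printf("PANIC: %s:%d: ", __FILE__, __LINE__);    \
        printf(__VA_ARGS__);                             \
        printf("\n");                                    \
        while (1) {}                                     \
    } while (0)
```

Thanks to the wrapper, a line such as if (error) PANIC("bad value %d", x); else carry_on(); compiles as expected, which would not be the case with a bare block in braces.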

When the PANIC macro is included in our kernel it causes the "booted!" message to be printed out

followed by the program source file name and line number. This simple mechanism will be very helpful in debugging our programs if we put PANIC calls wherever necessary.

Exceptions

An exception will occur if our processor encounters an event such as an invalid instruction, an invalid memory access (page fault) or a system call. When an exception occurs the values in various control and status registers (CSRs) provide details to help resolve the issue.

User mode applications are not able to access CSRs but if the kernel operates in supervisor mode it can read and write CSRs to deal with problems. Thankfully openSBI initialises our machine in supervisor mode so our kernel can process exceptions.

RISC-V also incorporates machine mode which deals with hardware level issues. We are not concerned with these as we are building a kernel and openSBI has already configured the platform for us. I may return to M-mode if/when I look at bare metal programming on a real RISC-V processor, for example the VisionFive 2.

We will configure our kernel to print CSRs and stop when it encounters an exception.

First we can define a couple of macros with inline assembly language so that we can use privileged instructions CSRR and CSRW to respectively read and write CSRs.

Next we define a handle_trap function which obtains the values of three CSRs, prints them and uses the PANIC macro to stop processing.

We also set up a function called kernel_entry which saves registers then calls handle_trap. Near the start of our kernel we set up the exception handler by setting the stvec register to point to kernel_entry.

The kernel then executes "unimp", a pseudo instruction which triggers an illegal instruction exception.

When our kernel runs it encounters the illegal instruction, which triggers an exception.

Processing jumps to the address in stvec, which is the kernel_entry function.

kernel_entry saves registers then calls handle_trap to retrieve CSR values and display them using the PANIC macro. These values can be used to determine what caused the exception. In the example above we can see that the program counter sepc was at address 80200134 when the exception occurred. scause=2 indicates an illegal instruction. Using the llvm-addr2line utility we discover that address 80200134 corresponds to line 135 in kernel.c. Looking at kernel.c we see that the "unimp" instruction is indeed at line 135! Magic😀
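To save looking up the codes each time, a small lookup function like the one below could turn scause into text. The function name scause_name is my own. The codes listed are the standard RISC-V exception codes, and only a handful are included.

```c
#include <stdint.h>

/* Translate a 32-bit scause value into a short description. */
static const char *scause_name(uint32_t scause) {
    if (scause & 0x80000000u) {
        return "interrupt"; /* top bit set means interrupt, not exception */
    }
    switch (scause) {
    case 2:  return "illegal instruction";
    case 8:  return "environment call from U-mode";
    case 12: return "instruction page fault";
    case 13: return "load page fault";
    case 15: return "store/AMO page fault";
    default: return "other exception";
    }
}
```

With this in place the trap handler could print "illegal instruction" next to scause=2 instead of just the number.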

Allocate Memory

We use a very simple allocation system which allocates physical memory in 4096 byte (hex 0x1000) pages and never frees up memory after use. Each time alloc_pages is called it returns the current free memory address, which starts at __free_ram, and advances it by 4096 bytes for each page requested.

Our test program calls alloc_pages twice: the first time two pages are allocated starting at 8022-1000, and then one page is allocated at 8022-3000.

Physical memory starts at 8020-0000 and we start allocating physical memory at the initial value of __free_ram 8022-1000 which follows programs, data, variables and the stack.
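The allocator is simple enough to sketch in a few lines. This version only does the address arithmetic so it can run on a PC. The 64MB size of the free region is an assumption, and returning 0 when memory runs out is a choice made for this sketch (the real kernel also zeroes the pages it hands out).

```c
#include <stdint.h>

#define PAGE_SIZE 4096u
#define FREE_RAM_START 0x80221000u /* initial value of __free_ram */
#define FREE_RAM_END (FREE_RAM_START + 64u * 1024u * 1024u) /* assumed */

static uint32_t next_paddr = FREE_RAM_START;

/*
 * Return the start address of n contiguous pages, or 0 if the region is
 * exhausted. Memory is never given back.
 */
static uint32_t alloc_pages(uint32_t n) {
    uint32_t paddr = next_paddr;
    if (n > (FREE_RAM_END - paddr) / PAGE_SIZE) {
        return 0; /* not enough room left */
    }
    next_paddr += n * PAGE_SIZE;
    return paddr;
}
```

Calling alloc_pages(2) and then alloc_pages(1) returns 8022-1000 followed by 8022-3000, matching the test program's output.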

Process

Now we need the concept of a process. The kernel needs to be able to run more than one application at a time. Each process will have its own execution context (registers etc) and resources (address space etc). We need the ability to set up a process for each application and to switch between the contexts to execute different applications.

A Process Control Block (PCB) structure is defined which contains pid, state, stack pointer and an 8KB stack.

Our kernel has PROCS_MAX = 8 so it can run 8 processes.
A PCB structure array procs contains 8 entries.

The create_process function is quite simple. First it checks whether there is a free process slot.

If there is a free process slot it initialises the stack pointer and saves (initialises) 0s to the stack for register values.

These values will be restored to registers the first time the process is started.
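The structure and the setup steps can be sketched like this. It is written with uintptr_t so that it compiles on a PC as well as on the 32-bit target, and it returns NULL when no slot is free, which is a simplification of my own. The state constants are assumed names.

```c
#include <stddef.h>
#include <stdint.h>

#define PROCS_MAX 8
#define PROC_UNUSED   0 /* slot is free */
#define PROC_RUNNABLE 1 /* process can be scheduled */

struct process {
    int pid;
    int state;
    uintptr_t *sp;                             /* saved stack pointer */
    uintptr_t stack[8192 / sizeof(uintptr_t)]; /* 8KB stack */
};

static struct process procs[PROCS_MAX];

/* Set up a process that will begin executing at address pc. */
static struct process *create_process(uintptr_t pc) {
    struct process *proc = NULL;
    int i;
    for (i = 0; i < PROCS_MAX; i++) {
        if (procs[i].state == PROC_UNUSED) {
            proc = &procs[i];
            break;
        }
    }
    if (proc == NULL) {
        return NULL; /* no free process slot */
    }

    /* Start at the top of the stack and push the initial register values. */
    uintptr_t *sp = &proc->stack[sizeof(proc->stack) / sizeof(proc->stack[0])];
    for (int r = 0; r < 12; r++) {
        *--sp = 0; /* s11 down to s0 */
    }
    *--sp = pc; /* ra: where the first context switch "returns" to */

    proc->pid = i + 1;
    proc->state = PROC_RUNNABLE;
    proc->sp = sp;
    return proc;
}
```

Because the return address sits at the lowest position on the new stack, the first context switch into the process restores it into ra and the function's return lands at the process entry point.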

The context switch is also very simple. First it saves the callee-saved registers (ra, s0-s11), next it switches to the stack pointer for the new process and restores its callee-saved registers.

Now comes the real magic. We define two functions proc_a_entry and proc_b_entry which will do the work. We keep this first test simple, each process prints a character then switches to the other process. A delay is incorporated to stop characters coming out too quickly.

Next, processes proc_a and proc_b are created and each function's start address is passed over. This will become the program counter (PC).

Finally proc_a_entry is invoked. It prints 'A' and context switches to proc_b. The switch includes restoring the return register for proc_b so the context_switch returns into proc_b which prints 'B' then switches back to proc_a.

The processes continue to swap and a string of As and Bs is printed.
I think this is awesome.

The actual C programming is a little complicated for me because it uses pointers so much, but the result is wonderful.😁😁

Scheduler

Our first approach to context switching was awesome but we certainly don't want each process to specify which process will run next. Instead, at a convenient time we tell the function to yield. The yield function is responsible for scheduling another task to run. Note that functions are still running within the kernel and the programmer has responsibility to pass control to the scheduler. I believe (not sure) that you can call yield at multiple points in a function.

The yield function contains the scheduler. It steps through the Process Control Blocks until it finds another process that is ready to run and then does a context_switch to that process. If it doesn't find another process it continues with the current process.
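The selection step can be sketched on its own, leaving out the actual context_switch. The pick_next function is a name I made up, and the process structure is cut down to the two fields the search needs.

```c
#include <stddef.h>

#define PROCS_MAX 8
#define PROC_UNUSED   0
#define PROC_RUNNABLE 1

struct process {
    int pid;
    int state;
};

/*
 * Round-robin search: start at the slot after the current process and
 * return the first runnable one. If no other process is runnable,
 * carry on with the current process.
 */
static struct process *pick_next(struct process *procs,
                                 struct process *current) {
    size_t start = (size_t)(current - procs);
    for (size_t i = 1; i < PROCS_MAX; i++) {
        struct process *p = &procs[(start + i) % PROCS_MAX];
        if (p->state == PROC_RUNNABLE) {
            return p;
        }
    }
    return current;
}
```

A yield built on this would call pick_next and only do a context switch when the result differs from the current process. Starting the search just after the current slot is what gives every runnable process a turn.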

This is beautifully simple 😁

There is a little extra work required in the exception handler to keep track of the current stack pointers but it isn't a major change (well I don't really understand it) so I won't include it here.

The Story So Far

We have made awesome progress on our kernel; we can boot it up, write output to a console, use standard library functions (ours are somewhat simplified). We can also deal with crashes, allocate memory and run / schedule multiple processes.

So far everything is running within the kernel which is running in supervisor mode. Next we must prepare the ability to run user mode applications and prevent them adversely affecting the kernel or other user programs.