Saturday, 25 January 2025

RISC-V Bare Metal OS

Intro

I saw this article on Hackaday and, as you can imagine, it set my heart racing.

Bare-metal programming isn't trivial: it typically requires intimate knowledge of the hardware and a good understanding of assembly language programming.  The thought of writing a bare-metal OS is pretty stunning; it isn't something I had contemplated before.

I don't want to try learning a new assembly language but I am familiar with RISC-V assembler so I am not put off by it.
This project is built on QEMU, so it avoids the need to purchase new hardware.  In addition, real hardware often needs a tortuous procedure to load and test programs, which it is good to avoid.

Preparation

There is an excellent online "book" which shows the code required at each step and gives clear instructions on how to build and test it.

The installation instructions didn't work on Windows (there is always some complication with Windows), so I set up on PI41 Linux, where everything worked perfectly.

Code for the project is compiled using Clang (front-end) / LLVM (back-end).  Programs are written in C.  The small amount of assembly language required is incorporated using inline assembly instructions.  This closely integrates the C and assembly code so that they share data easily.

The only other software required in addition to QEMU, Clang and LLVM is OpenSBI.

QEMU includes a nice "generic" platform for RISC-V virtual machines called 'virt'.  The specification is shown on the right.  There is much more virtual hardware provided than we need.  In fact we just need an RV32 core, some memory and the UART console.  Later on we will add disk I/O for which we will attach a virtio device to the Virtual Machine.

Even though we are doing bare-metal programming, we still need lower-level firmware to initialise the hardware before our programs are loaded.  On RISC-V this firmware implements the SBI (Supervisor Binary Interface) and plays a role similar to BIOS/UEFI.  Fortunately there is an OpenSBI implementation for our 'virt' machine.  This gives us everything we need.

First Boot

There is a lot of magic in our first step.

The memory layout is defined in kernel.ld.  There are four sections: text (the code), rodata (read-only data), data (variables) and bss (zero-initialised variables).
Code and data are placed in memory starting at the 32-bit address 0x80200000.  32-bit addresses can cover 4GB of storage.  The leading "80" puts us in the top half of that address space; on QEMU's 'virt' machine this is where RAM begins (0x80000000), rather than a region reserved for supervisor mode.  The "20" tells us that our start address is 2MB above the start of RAM.
The entry point will be the boot function address.
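
The address arithmetic is easy to check with a few lines of C.  This is only a sketch: the constant and function names are my own, and RAM_BASE assumes QEMU's 'virt' machine, where RAM starts at 0x80000000.

```c
#include <stdint.h>

/* Assumption: on QEMU's 'virt' machine, RAM starts at this address. */
#define RAM_BASE    UINT32_C(0x80000000)
/* Start address of our kernel, as set in kernel.ld. */
#define KERNEL_BASE UINT32_C(0x80200000)

/* Distance of the kernel from the start of RAM, in bytes. */
static uint32_t kernel_offset_bytes(void) {
    return KERNEL_BASE - RAM_BASE;
}

/* The same distance in megabytes (counting 1024 * 1024 bytes each). */
static uint32_t kernel_offset_mb(void) {
    return kernel_offset_bytes() / (UINT32_C(1024) * 1024);
}
```

Calling kernel_offset_mb() gives 2, which matches the "20" in the address.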

Our minimal first supervisor code is very short.  Luckily most of it can be written in C; inline assembly is used where C cannot cope.
The boot function simply sets the stack pointer and jumps into the kernel_main program.  Only two assembly instructions are required: a move and a jump.
All the main program does is initialise memory and then loop for ever.  The important memory addresses (such as stack_top) are defined in kernel.ld and can then be referenced in kernel.c.
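
To make this concrete, here is a minimal sketch of what such a kernel.c might look like; it is my reconstruction, not a copy of the book's code.  The linker symbol names (__bss, __bss_end, __stack_top), the section name .text.boot and the BARE_METAL_BUILD flag are all assumptions of mine.  The flag only exists so that the memory-clearing helper can also be tried on an ordinary PC, where the linker symbols and RISC-V instructions are not available.

```c
#include <stddef.h>

/* Zero every byte in the half-open range [start, end). */
static void clear_region(char *start, char *end) {
    for (char *p = start; p < end; p++)
        *p = 0;
}

#ifdef BARE_METAL_BUILD  /* hypothetical flag: define it only for the kernel build */

/* Assumed names for addresses defined in the linker script kernel.ld. */
extern char __bss[], __bss_end[], __stack_top[];

void kernel_main(void) {
    clear_region(__bss, __bss_end);  /* initialise the zero-initialised data */
    for (;;)
        ;                            /* nothing else to do yet: loop for ever */
}

/* Entry point: set the stack pointer, then jump into kernel_main. */
__attribute__((section(".text.boot")))
__attribute__((naked))
void boot(void) {
    __asm__ __volatile__(
        "mv sp, %[stack_top]\n"      /* move: load the stack pointer */
        "j kernel_main\n"            /* jump: kernel_main never returns */
        :
        : [stack_top] "r" (__stack_top)
    );
}

#endif
```

The naked attribute asks the compiler not to generate its usual function entry and exit code, which matters here because there is no valid stack until the first instruction has run.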



The program is compiled to an ELF executable, kernel.elf, and we can look at the assembly using llvm-objdump, as shown above.  QEMU starts OpenSBI to initialise the hardware, then loads and executes kernel.elf.  Our program doesn't do much: using the QEMU console we can see it looping at 80200050.

After this exercise we have a working environment: we have initialised our machine and loaded a minimal supervisor.  Excellent!

A Diversion: online compiler

The "book" mentions that Compiler Explorer is an online tool which lets you see how a compiler translates C into assembly language.  This can be very useful when writing a mixture of C and assembler.  The code produced is very specific to the compiler brand, its version and the switches used to invoke it.  For example, optimisation switches affect the code generated.  In our case we will be using RISC-V 32-bit clang as the variant.  The screenshot below shows how the compiler translates a simple square function.
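
For anyone who wants to try it, a function of the following kind is all you need to paste into the left-hand pane.  This is my own minimal example of a square function, not necessarily the exact code in the screenshot.

```c
/* A tiny function that is easy to follow in the generated assembly. */
int square(int num) {
    return num * num;
}
```

With a RISC-V 32-bit clang target and optimisation switched on, I would expect this to reduce to little more than a multiply followed by a return, but the exact output depends on the compiler version and switches.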



Hello World!

We have now mastered our memory and CPU, and we have compiled, loaded and executed our simplest kernel.  An OS which doesn't communicate with the outside world is useless, so our next step is to print messages to our console.  Luckily OpenSBI has set up our virtual hardware console, so we don't need to control the virtual serial UART directly to write to it.  Instead we make an "ecall" to OpenSBI, passing the character we want to print as a parameter, and let it work out the details.  The SBI specification, which OpenSBI implements, is available on GitHub.





To write a character to the console, SBI provides a "putchar" function.  I don't quite understand the inline assembly syntax, but the idea is that the sbi_call function loads its parameters into RISC-V registers a0-a7 and then executes ecall.  For putchar we put the character to print, ch, in a0 and the ID of the Console Putchar extension, 0x01, in a7, then make the call.
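
The register set-up can be sketched as follows.  This is my understanding of the SBI calling convention rather than a copy of the book's code: a0-a5 carry arguments, a6 carries a function ID, a7 carries an extension ID, and the results come back in a0 (error) and a1 (value).  The names console_putchar and console_puts, and the fallback branch that lets the sketch run on an ordinary PC by using the normal C putchar, are my own additions.

```c
#ifndef __riscv
#include <stdio.h>   /* only needed for the PC stand-in below */
#endif

/* An SBI call returns an error code in a0 and a value in a1. */
struct sbiret {
    long error;
    long value;
};

#define SBI_EID_CONSOLE_PUTCHAR 1   /* legacy Console Putchar extension */

#ifdef __riscv

struct sbiret sbi_call(long arg0, long arg1, long arg2, long arg3, long arg4,
                       long arg5, long fid, long eid) {
    /* Pin each C variable to the register the SBI convention expects. */
    register long a0 __asm__("a0") = arg0;
    register long a1 __asm__("a1") = arg1;
    register long a2 __asm__("a2") = arg2;
    register long a3 __asm__("a3") = arg3;
    register long a4 __asm__("a4") = arg4;
    register long a5 __asm__("a5") = arg5;
    register long a6 __asm__("a6") = fid;
    register long a7 __asm__("a7") = eid;

    /* ecall traps into the firmware; a0 and a1 hold the results afterwards. */
    __asm__ __volatile__("ecall"
                         : "=r"(a0), "=r"(a1)
                         : "r"(a0), "r"(a1), "r"(a2), "r"(a3), "r"(a4),
                           "r"(a5), "r"(a6), "r"(a7)
                         : "memory");
    return (struct sbiret){.error = a0, .value = a1};
}

#else

/* PC stand-in (not part of the kernel): imitate Console Putchar only. */
struct sbiret sbi_call(long arg0, long arg1, long arg2, long arg3, long arg4,
                       long arg5, long fid, long eid) {
    (void) arg1; (void) arg2; (void) arg3; (void) arg4; (void) arg5; (void) fid;
    if (eid == SBI_EID_CONSOLE_PUTCHAR) {
        putchar((int) arg0);
        return (struct sbiret){.error = 0, .value = 0};
    }
    return (struct sbiret){.error = -2, .value = 0};  /* "not supported" */
}

#endif

/* Print one character: the character goes in a0, the extension ID in a7. */
void console_putchar(char ch) {
    sbi_call(ch, 0, 0, 0, 0, 0, 0, SBI_EID_CONSOLE_PUTCHAR);
}

/* Print a whole string one character at a time. */
void console_puts(const char *s) {
    while (*s)
        console_putchar(*s++);
}
```

With this in place, console_puts("Hello World!\n") is all kernel_main needs to call.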

The result is a "Hello World!" message on the virtual console.


Once we have a putchar function available in our C environment we can write our own printf function so that we can print formatted numbers, strings etc.



The printf function provided here is a simplified version which prints only numbers and strings, but it is still very powerful, allowing a list of variables and expressions to be printed.
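
To show the idea, here is a sketch of a cut-down printf of my own; it is not the book's code.  It handles only %d, %x, %s and %%.  The name mini_printf is hypothetical, and emit writes into a buffer as a stand-in for the console so that the sketch runs anywhere; in the kernel it would call the SBI putchar routine instead.  The %x case assumes a 32-bit unsigned int.

```c
#include <stdarg.h>
#include <stddef.h>

/* Stand-in console: collects output in a buffer (assumption for this sketch). */
static char console_buf[128];
static size_t console_len;

static void emit(char c) {
    if (console_len + 1 < sizeof(console_buf)) {
        console_buf[console_len++] = c;
        console_buf[console_len] = '\0';
    }
}

/* Print an unsigned number in decimal, most significant digit first. */
static void emit_unsigned(unsigned value) {
    unsigned divisor = 1;
    while (value / divisor > 9)
        divisor *= 10;
    while (divisor > 0) {
        emit((char) ('0' + value / divisor));
        value %= divisor;
        divisor /= 10;
    }
}

/* Simplified printf: supports %d, %x, %s and %% only. */
void mini_printf(const char *fmt, ...) {
    va_list args;
    va_start(args, fmt);
    for (; *fmt; fmt++) {
        if (*fmt != '%') {
            emit(*fmt);
            continue;
        }
        fmt++;                          /* look at the character after '%' */
        switch (*fmt) {
        case '\0':                      /* lone '%' at the end: print and stop */
            emit('%');
            va_end(args);
            return;
        case '%':
            emit('%');
            break;
        case 's': {
            const char *s = va_arg(args, const char *);
            while (*s)
                emit(*s++);
            break;
        }
        case 'd': {
            int value = va_arg(args, int);
            unsigned magnitude = (unsigned) value;
            if (value < 0) {
                emit('-');
                magnitude = 0u - magnitude;   /* safe even for the minimum int */
            }
            emit_unsigned(magnitude);
            break;
        }
        case 'x': {
            unsigned value = va_arg(args, unsigned);
            for (int i = 7; i >= 0; i--)      /* eight hex digits, high first */
                emit("0123456789abcdef"[(value >> (i * 4)) & 0xf]);
            break;
        }
        default:                        /* unknown conversion: print it as-is */
            emit('%');
            emit(*fmt);
            break;
        }
    }
    va_end(args);
}
```

For example, mini_printf("%s=%d, hex %x, 100%%", "n", -42, 255) leaves "n=-42, hex 000000ff, 100%" in the buffer.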



C Standard Library

Previously we implemented a simplified version of the standard library function printf to print a string to the console.  Looking at the code in printf we can see it is standard C; we don't need any environment-specific coding.  This hints that we could copy functions from the standard C library into our program, saving us a load of work and giving us many tools to help write our kernel.  However, in practice the C standard library is very complicated.  I looked at vprintf.c, which is the "Actual printf innards", and it says in the header "This code is large and complicated".  The C header files and code are much too advanced for me, and I don't want to make this project about learning a lot about C pointers.

The instructions provided actually give us a small set of subroutines which we will need to help us.  The ones we have available are memset (fill a range of memory with a value), memcpy (copy a range of memory), strcpy (copy a string) and strcmp (compare two strings).
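
None of these needs more than a short loop.  The versions below are my own sketches rather than the book's code, and I have given them a k_ prefix (my choice) so they can be compiled on a PC without clashing with the real library functions.

```c
#include <stddef.h>
#include <stdint.h>

/* Fill n bytes starting at buf with the byte value c. */
void *k_memset(void *buf, int c, size_t n) {
    uint8_t *p = (uint8_t *) buf;
    while (n--)
        *p++ = (uint8_t) c;
    return buf;
}

/* Copy n bytes from src to dst (the two ranges must not overlap). */
void *k_memcpy(void *dst, const void *src, size_t n) {
    uint8_t *d = (uint8_t *) dst;
    const uint8_t *s = (const uint8_t *) src;
    while (n--)
        *d++ = *s++;
    return dst;
}

/* Copy a string including its terminating zero; dst must be big enough. */
char *k_strcpy(char *dst, const char *src) {
    char *d = dst;
    while (*src)
        *d++ = *src++;
    *d = '\0';
    return dst;
}

/* Return 0 if equal, negative if s1 sorts first, positive if s2 does. */
int k_strcmp(const char *s1, const char *s2) {
    while (*s1 && *s1 == *s2) {
        s1++;
        s2++;
    }
    return *(const unsigned char *) s1 - *(const unsigned char *) s2;
}
```

Note that k_strcpy does no bounds checking, so it will happily write past the end of a destination that is too small.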


Intermission

We are making wonderful progress with our kernel.  We have been able to use OpenSBI to set up our RISC-V VM, and we are able to write C programs which communicate with the SBI API to use our virtual hardware, in particular to print messages on the virtual console.

