Monday, 17 February 2025

RISC-V : Bare Metal OS : 3 User Applications

 Intro

We have some basic capabilities in our supervisor now and we can turn our attention to user applications.  We can already run multiple independent processes with their own registers and stack.  We need a few more features to be able to run user applications as processes within our OS.

Firstly we need a virtual address mechanism so that each process/application can have its own address space.  The same virtual addresses can be used in multiple applications but these will map to different physical addresses so that applications don't interfere with each other.

Secondly we need a mechanism for creating and running user applications.  Our applications will be written in C and the compiled executables will be loaded at the same time as the kernel.

Finally we need the ability to switch between supervisor mode and user mode so that applications don't have the capability to interfere with each other or the kernel.

Virtual Memory

Hardware architectures incorporate methods to implement virtual memory within their specification.  With 32-bit addresses from 0x0000-0000 to 0xffff-ffff each program can have a 4GB virtual address space.  The physical address space may be much smaller.
Processor hardware (not the Operating System) provides the ability to translate virtual addresses to physical addresses.


This implementation uses the 32-bit RISC-V paging scheme called Sv32.
Memory is divided into 4KB pages.  Virtual addresses are mapped to physical memory using a two level page table.
The first level page table contains 1024 4 byte entries pointing to up to 1024 second level tables, each of which also holds 1024 entries.
This structure allows up to 1024x1024 = 1M pages which equates to 4GB virtual memory.
If the kernel+data starts at 0x8000-0000 and is 2MB in size the first level page table will just have entry 512 (counting from zero) completed, pointing to a single second level table containing entries for 512 data pages (2MB / 4KB).
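
To check the arithmetic I find it helps to write the address split out as code.  This is only a sketch: the function names are my own and are not part of the tutorial.  It splits a 32-bit virtual address into the first level index, the second level index and the offset within the page.

```c
#include <stdint.h>

/* Sv32 virtual address layout: 10-bit VPN[1], 10-bit VPN[0], 12-bit offset. */
uint32_t sv32_vpn1(uint32_t vaddr)   { return (vaddr >> 22) & 0x3ffu; }
uint32_t sv32_vpn0(uint32_t vaddr)   { return (vaddr >> 12) & 0x3ffu; }
uint32_t sv32_offset(uint32_t vaddr) { return vaddr & 0xfffu; }

/* Number of 4KB pages needed to hold `bytes` bytes, rounded up.
   (Sketch only: it would overflow for sizes within 4095 bytes of 4GB.) */
uint32_t pages_for(uint32_t bytes) { return (bytes + 4095u) / 4096u; }
```

For example sv32_vpn1(0x80000000) gives 512, which is the first level entry used for the kernel.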

Page table entries are set up using the map_page function which looks at a data page's virtual address to determine the associated 1st level table entry. If this is zero, it sets up a new 2nd level table.  It then records the data page's physical address in the matching entry of the 2nd level table.
In this way page table entries are set up for each page in a process.
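
The sketch below is my own host-side simulation of that idea, not the tutorial's map_page.  "Physical memory" is just an array starting at a made-up base address, and the sim_ names are hypothetical.  It also includes a sim_translate function that walks the two tables the way the hardware would (ignoring permission checks), so the mapping can be checked.

```c
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE 4096u
#define PAGE_V    (1u << 0)   /* entry is valid */
#define PAGE_R    (1u << 1)   /* readable */
#define PAGE_W    (1u << 2)   /* writable */
#define PAGE_X    (1u << 3)   /* executable */
#define PAGE_U    (1u << 4)   /* accessible from user mode */

/* Simulated physical RAM, so the sketch can run on an ordinary host. */
#define SIM_BASE  0x80000000u
#define SIM_PAGES 16u
static uint32_t sim_ram[SIM_PAGES * PAGE_SIZE / 4u];
static uint32_t sim_next = SIM_BASE;

/* Convert a simulated physical address into a host pointer. */
static uint32_t *sim_ptr(uint32_t paddr) {
    return &sim_ram[(paddr - SIM_BASE) / 4u];
}

/* Hand out one zeroed page (no bounds check in this sketch). */
uint32_t sim_alloc_page(void) {
    uint32_t paddr = sim_next;
    sim_next += PAGE_SIZE;
    memset(sim_ptr(paddr), 0, PAGE_SIZE);
    return paddr;
}

/* Map one virtual page to one physical page in the table rooted at table1. */
void sim_map_page(uint32_t table1, uint32_t vaddr, uint32_t paddr,
                  uint32_t flags) {
    uint32_t *t1 = sim_ptr(table1);
    uint32_t vpn1 = (vaddr >> 22) & 0x3ffu;
    if ((t1[vpn1] & PAGE_V) == 0) {
        /* No 2nd level table yet: allocate one and point the entry at it. */
        uint32_t t0_paddr = sim_alloc_page();
        t1[vpn1] = ((t0_paddr / PAGE_SIZE) << 10) | PAGE_V;
    }
    uint32_t *t0 = sim_ptr((t1[vpn1] >> 10) * PAGE_SIZE);
    uint32_t vpn0 = (vaddr >> 12) & 0x3ffu;
    t0[vpn0] = ((paddr / PAGE_SIZE) << 10) | flags | PAGE_V;
}

/* Walk the tables as the hardware would; returns 0 if unmapped. */
uint32_t sim_translate(uint32_t table1, uint32_t vaddr) {
    uint32_t e1 = sim_ptr(table1)[(vaddr >> 22) & 0x3ffu];
    if ((e1 & PAGE_V) == 0) return 0;
    uint32_t e0 = sim_ptr((e1 >> 10) * PAGE_SIZE)[(vaddr >> 12) & 0x3ffu];
    if ((e0 & PAGE_V) == 0) return 0;
    return (e0 >> 10) * PAGE_SIZE + (vaddr & 0xfffu);
}
```

Mapping virtual page 0x1000000 to a physical page and then translating an address inside it should give back the physical page address plus the same offset.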





In our implementation we use virtual addresses for the kernel which are the same as physical addresses.  This allows kernel programs to work regardless of whether virtual addressing is on or off.
When we create a process we must map all the kernel pages into the table so that they can be accessed by the process.
We must also put the pointer to the start of the page table into our process structure.

Now comes the magic.  When we context switch to a new process we simply change the satp (supervisor address translation and protection) register to point to that process's page table.  Now the processor will automatically translate the process's virtual addresses to actual physical addresses using the page table.
Note that the sfence.vma instructions placed around the write to satp make sure that earlier changes to the page table are visible and that stale cached translations are discarded, so the processor does not carry on using the old mappings.
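
As far as I understand it, the value written to satp is just the Sv32 mode bit plus the physical page number of the first level table.  The helper below is my own sketch of that calculation (make_satp is a hypothetical name); the privileged instructions themselves only assemble for a RISC-V target so they are shown as a comment.

```c
#include <stdint.h>

#define PAGE_SIZE 4096u
#define SATP_SV32 (1u << 31)   /* MODE bit: 1 selects Sv32 paging */

/* Build the value written to satp: the MODE bit plus the physical page
   number of the first level table. page_table must be page aligned. */
uint32_t make_satp(uint32_t page_table) {
    return SATP_SV32 | (page_table / PAGE_SIZE);
}

/* In the kernel the value would be installed with something like:
 *     sfence.vma
 *     csrw satp, <value>
 *     sfence.vma
 * These are left as a comment so the sketch compiles on a host. */
```

So a page table at 8022-3000 would give a satp value of 0x80080223.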
This is awesome 😁😁😁😁

Application

I confess I got a bit lost setting up the application, partly because the instructions aren't so clear and partly because my attempt didn't work.

First we set up a basic program user.c which has a function start. The linker script specifies that the program is placed at 0x1000000 and that start is the entry point.  The start function sets up the stack, calls main to do the work, then calls exit which is initially an infinite loop.

main is defined in the file shell.c and initially it just loops forever.





The build script is full of weird magic.  First we build the application as shell.elf.  Then we convert it from ELF format to a raw binary and THEN we convert it into a format which can be embedded into the kernel.
Finally the kernel is built, including shell.bin.o which is our application.

Run Application In User Mode


Now we have the program we want to run appended to the kernel, we can create a process by  allocating memory for the program, copying the program into the memory pages and adding page table entries.  The jiggery pokery in the build script gives us the information we need regarding the program location within the kernel.  We now have a process containing our application ready to run.

Finally we need to switch to user mode which is accomplished with the privileged sret instruction.
This is executed in the user_entry function, which runs as soon as the context switch to the user program is completed.  The sepc register contains the address which sret jumps to.

Our first program is pretty dull; we can only look at memory and status registers to see what it is doing.
Of course in user mode we can't read/write from/to the console as that is the supervisor's role.  We need to be able to execute system calls.



System Calls

System calls are familiar from assembly language programming.  You ask the kernel to do some work for you.  In this example, we ask the kernel to read a character from the console and return it in an integer variable.
You specify a number of parameters then issue an ecall instruction.  ecall causes an exception which the kernel must handle.
In this case the only argument we pass is the ecall identifier SYS_GETCHAR.









Within the kernel, the exception causes the  handle_trap function to be executed. 
By checking CSRs handle_trap determines that an ecall is responsible for the exception and invokes handle_syscall.


 handle_syscall looks at the identifier, and sees that it is being requested to read a character from the console.  Using its own getchar() function it makes an SBI call to see whether a console character is available so that it can return it to the user.  Note that the function loops until a character is available.  After each check it yields so that other kernel processes can take their turn.
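
Here is a rough host-side sketch of that dispatch logic.  It is not the tutorial's code: the trap_frame layout, the syscall numbers and the choice of a3 for the identifier are my assumptions, and the console and scheduler are replaced by sim_ stand-ins so it can run anywhere.  A real handler also has to advance sepc past the ecall instruction before returning to the user program.

```c
#include <stdint.h>

#define SYS_PUTCHAR 1   /* assumed numbering */
#define SYS_GETCHAR 2

/* The argument registers saved for the calling process (assumed layout). */
struct trap_frame {
    uint32_t a0, a1, a2, a3;
};

/* Stand-ins for the console and scheduler so this runs on a host. */
const char *sim_input = "";    /* characters "typed" at the console */
int sim_polls_until_ready;     /* empty polls before input arrives */
int sim_yield_count;           /* how often we gave up the CPU */
char sim_last_output;          /* last character "printed" */

static long sim_getchar(void) {
    if (sim_polls_until_ready > 0) { sim_polls_until_ready--; return -1; }
    return *sim_input ? (long)(unsigned char)*sim_input++ : -1;
}

static void sim_yield(void) { sim_yield_count++; }

void handle_syscall(struct trap_frame *f) {
    switch (f->a3) {
    case SYS_PUTCHAR:
        sim_last_output = (char)f->a0;
        break;
    case SYS_GETCHAR:
        for (;;) {
            long ch = sim_getchar();
            if (ch >= 0) { f->a0 = (uint32_t)ch; break; }
            sim_yield();   /* nothing typed yet: let other processes run */
        }
        break;
    default:
        f->a0 = (uint32_t)-1;   /* unknown call; a kernel might panic here */
        break;
    }
}
```

The result comes back to the user program in a0, which is why the frame is passed by pointer.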


User Shell

Finally we enter familiar territory.  We can write a C program to display a prompt, get some input and display a console message.
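
A minimal sketch of the two halves of such a shell, reading a line and choosing a reply, might look like this.  The function names, the "hello" command and its reply are all my own illustration, and the keyboard is faked with a string (sim_keys) so the code can run on a host rather than calling the getchar system call.

```c
#include <stddef.h>
#include <string.h>

/* Fake keyboard: returns the next character of sim_keys, then Enter. */
const char *sim_keys = "";
int sim_getchar(void) {
    return *sim_keys ? (unsigned char)*sim_keys++ : '\r';
}

/* Read one line using the supplied getchar-style function.
   Returns the length, or -1 if the line does not fit in buf. */
int shell_read_line(int (*getc_fn)(void), char *buf, size_t size) {
    for (size_t i = 0; i < size; i++) {
        int ch = getc_fn();
        if (ch == '\r' || ch == '\n') {
            buf[i] = '\0';
            return (int)i;
        }
        buf[i] = (char)ch;
    }
    return -1;
}

/* Decide what to print for a command line. */
const char *shell_respond(const char *cmdline) {
    if (strcmp(cmdline, "hello") == 0)
        return "Hello world from shell!";
    return "unknown command";
}
```

In the real shell the main loop would print a prompt, call the line reader with the system call version of getchar, print the reply and go round again.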








Outro

The tutorial has a couple more chapters demonstrating disk i/o and a simple file system.  These are the essential next step if you are writing a disk Operating System.  However I have learnt so much on this journey that I will stop here.

There is a complete working copy of "OS in 1,000 lines" on GitHub which I have tried out.
It gives you a nice warm feeling to be able to execute commands from the shell, even though they are very basic.

Looking back I have found out so many new concepts:
  • QEMU provides a pretty realistic RISC-V environment for investigations.  It needs OpenSBI to initialise the "virtual hardware" for us.
  • OpenSBI is another level of software to consider, it runs in M-mode (machine) and does all the hardware specific (memory, interfaces, peripherals) initialisation for us.
  • OpenSBI calls allow programs running in S-mode (supervisor) to use hardware functions such as writing to the console.
  • We can write our Operating System in C.  The occasional assembly language instructions we need to use for privileged instructions and initialisation can easily be specified using inline assembler.
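  • Our OS could in principle reuse code from the C standard library, but in practice we write our own simplified versions of the functions we need.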
  • Our OS is running in S-mode and can access CSRs (Control and Status Registers) to find the cause and type of exceptions and other information.
  • We define two areas of storage for a stack and free memory.  We allocate memory in 4096 byte pages.
  • A process has its own stack and memory.  We move between processes using a context switch.
  • RISC-V hardware converts virtual to physical addresses based on a two level page table.
  • Switching to user mode protects the kernel and isolates processes from each other.  It is simple to implement.
  • We can easily write a user shell to allow us to use basic functions.
As a follow on I may investigate xv6, first as a QEMU VM, followed by running it on Nezha or VisionFive 2 hardware.

Thursday, 13 February 2025

RISC-V Bare Metal OS : 2

The Story So Far

We have made an excellent start to our OS: we have loaded OpenSBI, initialised the hardware, booted the processor and communicated with SBI to print a Hello World message.  In this installment we will do some consolidation, making our environment usable.

Kernel Panic

If our kernel crashes, we want it to print some diagnostic information.  We define a short macro called PANIC which prints out an error message.
Because PANIC is a macro the text is inserted inline before compilation and the values for source file and line number are filled in so they can be printed if PANIC is called.
The while(1) specifies that the program loops forever once it has printed the message.  The do...while(0) wrapper means the block is executed only once and lets the macro be used like a single statement.
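
Here is a simplified sketch of such a macro, written so that it compiles on an ordinary host.  It is not the tutorial's exact macro: I have split the message formatting into a helper (format_panic, my own name) and it takes a plain string rather than a printf style format.

```c
#include <stdio.h>

/* Build the text of a panic message; split out so it can be checked. */
int format_panic(char *buf, size_t size, const char *file, int line,
                 const char *msg) {
    return snprintf(buf, size, "PANIC: %s:%d: %s\n", file, line, msg);
}

/* __FILE__ and __LINE__ are filled in by the preprocessor at the place
   where PANIC is used. do { ... } while (0) makes the macro behave as one
   statement, so it is safe after an unbraced if. The while (1) never exits. */
#define PANIC(msg)                                                  \
    do {                                                            \
        char panic_buf[128];                                        \
        format_panic(panic_buf, sizeof panic_buf,                   \
                     __FILE__, __LINE__, (msg));                    \
        fputs(panic_buf, stderr);                                   \
        while (1) {}                                                \
    } while (0)
```

Writing PANIC("booted!"); in a file called kernel.c at line 135 would print "PANIC: kernel.c:135: booted!" and then hang.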

When the PANIC macro is included in our kernel it causes the "booted!" message to be printed out, followed by the program source file name and line number.  This simple mechanism will be very helpful in debugging our programs if we put PANIC calls wherever necessary.

Exceptions

An exception will occur if our processor encounters serious errors such as invalid instruction, invalid memory address (page fault) or a system call.  When an exception occurs the values in various control and status registers (CSRs) provide details to help resolve the issue.
User mode applications are not able to access CSRs but if the kernel operates in supervisor mode it can read and write CSRs to deal with problems.  Thankfully OpenSBI starts our kernel in supervisor mode so our kernel can process exceptions.


RISC-V also incorporates machine mode which deals with hardware level issues.  We are not concerned with these as we are building a kernel and OpenSBI has already configured the platform for us.  I may return to M-mode if/when I look at bare metal programming on a real RISC-V processor, for example the VisionFive 2.

We will configure our kernel to print CSRs and stop when it encounters an exception.

First we can define a couple of macros with inline assembly language so that we can use privileged instructions CSRR and CSRW to respectively read and write CSRs.






Next we define a handle_trap function which obtains the values of three CSRs, prints them and uses the PANIC macro to stop processing.


We also set up a function called kernel_entry which saves registers then calls handle_trap. Near the start of our kernel we set up the exception handler by setting the stvec register to point to kernel_entry.

The kernel then executes "unimp", a pseudo instruction which triggers an illegal instruction exception.

When our kernel runs it encounters the illegal instruction, which triggers an exception.
Processing jumps to the address in stvec, which is the kernel_entry function.

kernel_entry saves registers then calls handle_trap to retrieve CSR values and display them using the PANIC macro.  These values can be used to determine what caused the exception.  In the example above we can see that the program counter sepc was at address 80200134 when the exception occurred.  scause=2 indicates an illegal instruction. Using the llvm-addr2line utility we discover that address 80200134 corresponds to line 135 in kernel.c.  Looking at kernel.c we see that the "unimp" instruction is indeed at line 135!  Magic 😀
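
To save looking the codes up each time, a small helper can turn scause into words.  This is my own sketch (scause_name is a hypothetical function, not part of the tutorial) and it only lists a handful of the standard exception codes.

```c
#include <stdint.h>

/* Translate a 32-bit scause value into a short description.
   Only a few of the standard exception codes are listed here. */
const char *scause_name(uint32_t scause) {
    if (scause & 0x80000000u)
        return "interrupt";          /* top bit set: not an exception */
    switch (scause) {
    case 2:  return "illegal instruction";
    case 8:  return "environment call from U-mode";
    case 12: return "instruction page fault";
    case 13: return "load page fault";
    case 15: return "store/AMO page fault";
    default: return "other exception";
    }
}
```

With scause=2 this returns "illegal instruction", matching what we saw for unimp.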

Allocate Memory

We use a very simple allocation system which allocates physical memory in pages of 4096 bytes (hex 0x1000) and never frees up memory after use.  Each time alloc_pages is called it returns the next free address, which starts at __free_ram, and moves that address on by 4096 bytes for each page requested.
Our test program calls alloc_pages twice: the first time two pages are allocated starting at 8022-1000 and then a page is allocated at 8022-3000.
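
The allocator is simple enough to simulate on a host.  In this sketch of mine the linker symbol __free_ram is replaced by a constant using the address quoted above, and sim_alloc_pages is a hypothetical name.

```c
#include <stdint.h>

#define PAGE_SIZE 4096u

/* Stand-in for the linker symbol __free_ram; the value is the address
   quoted in the text. */
#define SIM_FREE_RAM 0x80221000u

static uint32_t next_paddr = SIM_FREE_RAM;

/* Bump allocator: return the current free address and move past n pages.
   Nothing is ever freed. A real kernel would also zero the pages and
   stop when it runs past the end of free RAM. */
uint32_t sim_alloc_pages(uint32_t n) {
    uint32_t paddr = next_paddr;
    next_paddr += n * PAGE_SIZE;
    return paddr;
}
```

Calling sim_alloc_pages(2) then sim_alloc_pages(1) gives 8022-1000 and 8022-3000, the same addresses as the test program.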







Our kernel is loaded at 8020-0000 and we start allocating physical memory at the initial value of __free_ram, 8022-1000, which follows the program code, data, variables and the stack.











Process

Now we need the concept of a process.  The kernel needs to be able to run more than one application at a time.  Each process will have its own execution context (registers etc) and resources (address space etc).  We need the ability to set up a process for each application and to switch between the contexts to execute different applications.
A Process Control Block (PCB) structure is defined which contains pid, state, stack pointer and an 8KB stack.



Our kernel has PROCS_MAX = 8 so it can run 8 processes.
A PCB structure array procs contains 8 entries.
The create_process function is quite simple.  First it checks whether there is a free process slot.

If there is a free process slot it initialises the stack pointer and saves (initialises) 0s to the stack for register values.
These values will be restored to registers the first time the process is started.
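
Here is a host-side sketch of that setup.  It follows the description above rather than the tutorial's exact code: the field names and state constants are my assumptions, and I use uintptr_t words so it works on a 64-bit host as well as RV32.

```c
#include <stddef.h>
#include <stdint.h>

#define PROCS_MAX      8
#define PROC_UNUSED    0
#define PROC_RUNNABLE  1
#define STACK_WORDS    (8192 / sizeof(uintptr_t))   /* 8KB stack */

struct process {
    int pid;
    int state;
    uintptr_t sp;                   /* saved stack pointer */
    uintptr_t stack[STACK_WORDS];
};

static struct process procs[PROCS_MAX];

struct process *create_process(uintptr_t pc) {
    struct process *proc = NULL;
    int i;
    for (i = 0; i < PROCS_MAX; i++) {
        if (procs[i].state == PROC_UNUSED) { proc = &procs[i]; break; }
    }
    if (proc == NULL)
        return NULL;                /* no free slot; a kernel might panic */

    uintptr_t *sp = &proc->stack[STACK_WORDS];   /* one past the top */
    for (int r = 0; r < 12; r++)
        *--sp = 0;                  /* initial values for s11 down to s0 */
    *--sp = pc;                     /* ra: where the first switch returns */

    proc->pid = i + 1;
    proc->state = PROC_RUNNABLE;
    proc->sp = (uintptr_t)sp;
    return proc;
}
```

The word at the saved stack pointer is the start address, so the first context switch into the process "returns" straight into its entry function.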





The context switch is also very simple.  First it saves the callee-saved registers (ra, s0-s11) on the current stack, next it switches to the stack pointer for the new process and restores that process's saved registers.



Now comes the real magic.  We define two functions proc_a_entry and proc_b_entry which will do the work.  We keep this first test simple, each process prints a character then switches to the other process.  A delay is incorporated to stop characters coming out too quickly.
Next, processes proc_a and proc_b are created and each function's start address is passed over.  This will become the program counter (PC).
Finally proc_a_entry is invoked.  It prints 'A' and context switches to proc_b.  The switch includes restoring the return address register (ra) for proc_b so the context_switch returns into proc_b which prints 'B' then switches back to proc_a.
The processes continue to swap and a string of As and Bs is printed.
I think this is awesome.  
The actual C programming is a little complicated for me because it uses pointers so much, but the result is wonderful.😁😁

Scheduler



Our first approach to context switching was awesome but we certainly don't want each process to specify which process will run next.  Instead, at a convenient point the running process calls yield. The yield function is responsible for scheduling another task to run.  Note that functions are still running within the kernel and the programmer has responsibility to pass control to the scheduler.  I believe (not sure) that you can call yield at several points within a function.

The yield function contains the scheduler.  It steps through the Process Control Blocks until it finds another process that is ready to run and then does a context_switch to that process.  If it doesn't find another process it continues with the current process.
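
The selection step can be sketched on its own, without the context switch.  This is a simplified version of my own (pick_next is a hypothetical name and it works on an array of states rather than the real PCB array), just to show the round-robin search.

```c
#define PROCS_MAX      8
#define PROC_UNUSED    0
#define PROC_RUNNABLE  1

/* Return the index of the next runnable slot after `current`, searching
   round-robin. If no other slot is runnable, keep running `current`. */
int pick_next(const int state[PROCS_MAX], int current) {
    for (int i = 1; i < PROCS_MAX; i++) {
        int candidate = (current + i) % PROCS_MAX;
        if (state[candidate] == PROC_RUNNABLE)
            return candidate;
    }
    return current;
}
```

yield would then do a context_switch only when the chosen slot differs from the current one.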

This is beautifully simple 😁

There is a little extra work required in the exception handler to keep track of the current stack pointers but it isn't a major change (well I don't really understand it) so I won't include it here.






The Story So Far

We have made awesome progress on our kernel; we can boot it up, write output to a console, use standard library functions (ours are somewhat simplified).  We can also deal with crashes, allocate memory and run / schedule multiple processes.
So far everything is running within the kernel which is running in supervisor mode.  Next we must prepare the ability to run user mode applications and prevent them adversely affecting the kernel or other user programs.

Saturday, 25 January 2025

RISC-V Bare Metal OS

Intro

I saw this article on hackaday and, as you can imagine, it set my heart racing.

Bare-metal programming isn't trivial, it typically requires intimate knowledge of hardware and a good understanding of assembly language programming.  The thought of writing a bare metal OS is pretty stunning - it isn't something I have contemplated previously.

I don't want to try learning a new assembly language but I am familiar with RISC-V assembler so I am not put off by it.
This project is built on QEMU so it avoids the need to purchase some new hardware.  In addition real-world hardware programming often has a tortuous method to load and test programs which it is good to avoid.

Preparation

There is an excellent online "book" which shows the code required at each step and clear instructions how to build and test it.

The installation instructions didn't work on Windows (always some complication with Windows) so I set up on PI41 Linux where everything worked perfectly.

Code for the project is compiled using Clang (front-end) / LLVM (back-end).  Programs are written in C.  The small amount of assembly language required is incorporated using inline assembly instructions.  This closely integrates the C and assembly code so that they share data easily.

The only other software required in addition to QEMU, Clang and LLVM is openSBI.

QEMU includes a nice "generic" platform for RISC-V virtual machines called 'virt'.  The specification is shown on the right.  There is much more virtual hardware provided than we need.  In fact we just need an RV32 core, some memory and the UART console.  Later on we will add disk I/O for which we will attach a virtio device to the Virtual Machine.

Even though we are doing bare metal programming we need lower-level firmware SBI  (Supervisor Binary Interface, like BIOS/UEFI) to initialise the hardware prior to loading our programs.  Fortunately there is an openSBI implementation for our 'virt' machine.  This gives us everything we need.

First Boot

There is a lot of magic in our first step.

The memory layout is defined in kernel.ld.  There are four sections: text (the code), rodata (read-only data), data (variables), bss (zero-initialised variables).
Code and data are placed in memory starting at 32-bit address 80:20:00:00.  32 bit addresses provide 4GB of address space.  "80" reflects that RAM on the QEMU 'virt' machine starts at 80:00:00:00 (it is not an area reserved for supervisor mode), with OpenSBI sitting at the start of RAM, and "20" tells us that our kernel starts 2MB above that.
The entry point will be the boot function address.

Our minimal first supervisor code is very short.  Luckily most of it can be written in C; inline assembly is used where C cannot cope.
The boot function simply sets the stack pointer and jumps into the kernel_main program.  There are only two assembly instructions required, move and jump.
All the main program does is to initialise memory and then loop forever.  The important memory addresses (such as stack_top) are defined in kernel.ld and can then be referenced in kernel.c.



The program is compiled to an ELF executable kernel.elf; we can look at the assembly using llvm-objdump, as shown above.  QEMU starts OpenSBI to initialise the hardware then loads and executes kernel.elf.  Our program doesn't do much; using the QEMU console we can see it loops at 80200050.

After our exercise we have a working environment, we have initialised our machine and loaded a minimal supervisor.  Excellent!

A Diversion: online compiler

The "book" mentions that Compiler Explorer is an online tool which enables you to see how a compiler translates C into assembly language.  This can be very useful when writing a mixture of C and assembler.  Code produced by compilers is very specific to the compiler brand, its version and the switches used to invoke it.  For example optimisation switches affect the code generated.  In our case we will be using riscv 32 bit clang as the variant.  The screenshot below shows how the compiler translates a simple square function.



Hello World!

We have now mastered our memory and CPU and we have compiled, loaded and executed our simplest kernel.  An OS which doesn't communicate with the outside world is useless and our next step is to print messages to our console.  Luckily OpenSBI has set up our virtual hardware console so we don't need to control the virtual serial UART directly to write to the console.  Instead we make an "ecall" to OpenSBI including the character we want to print within the parameters and let it work out the details.  The OpenSBI specification is available on GitHub.





To write a character to the console SBI provides a "putchar" function.  I don't quite understand the syntax but we set up RISC-V registers a0-a7 as parameters to the sbi_call function: we put the character to print, ch, in a0 and the putchar function id 0x01 in a7, then make the call.

The result is a "Hello World!" message on the virtual console.


Once we have a putchar function available in our C environment we can write our own printf function so that we can print formatted numbers, strings etc.



The printf function provided here is a simplified version which prints numbers and strings but is still very powerful allowing a list of variables and expressions to be printed.
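
To show the idea, here is my own cut-down sketch of such a printf, supporting only %d, %x, %s and %%.  It is not the tutorial's code.  So that it can run on a host, characters go into a buffer through emit (a name I made up); in the kernel emit would call the SBI putchar instead.  The %x case assumes a 32-bit unsigned and always prints 8 digits.

```c
#include <stdarg.h>
#include <string.h>

/* Output is captured in a buffer so the sketch runs on a host. */
static char out_buf[256];
static size_t out_len;

static void emit(char c) {
    if (out_len + 1 < sizeof out_buf) {
        out_buf[out_len++] = c;
        out_buf[out_len] = '\0';
    }
}

/* Clear the captured output. */
void capture_reset(void) {
    memset(out_buf, 0, sizeof out_buf);
    out_len = 0;
}

void mini_printf(const char *fmt, ...) {
    va_list args;
    va_start(args, fmt);
    for (; *fmt; fmt++) {
        if (*fmt != '%') { emit(*fmt); continue; }
        fmt++;                               /* look at the conversion */
        switch (*fmt) {
        case '\0':                           /* lone '%' at end of string */
            emit('%');
            va_end(args);
            return;
        case '%':
            emit('%');
            break;
        case 's': {
            const char *s = va_arg(args, const char *);
            while (*s) emit(*s++);
            break;
        }
        case 'd': {
            int value = va_arg(args, int);
            unsigned magnitude = (unsigned)value;
            if (value < 0) { emit('-'); magnitude = 0u - magnitude; }
            unsigned divisor = 1;
            while (magnitude / divisor > 9) divisor *= 10;
            for (; divisor > 0; divisor /= 10) {
                emit((char)('0' + magnitude / divisor));
                magnitude %= divisor;
            }
            break;
        }
        case 'x': {
            unsigned value = va_arg(args, unsigned);
            for (int i = 7; i >= 0; i--)
                emit("0123456789abcdef"[(value >> (i * 4)) & 0xf]);
            break;
        }
        default:
            break;                           /* unsupported: skipped */
        }
    }
    va_end(args);
}
```

Everything here is plain C apart from the one function which outputs a character, which is why it ports so easily.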



C Standard Library

Previously we implemented a simplified version of the standard library function printf to print formatted output to the console.  Looking at the code in printf we can see it is standard C; we don't need any environment specific coding.  There is a hint we could copy functions from the standard C library into our program, saving us a lot of work and giving us many tools to help write our kernel.  However, in practice the C standard library is very complicated. I looked at vprintf.c which is the "Actual printf innards"; it says in the header "This code is large and complicated".  The C header files and code are much more advanced than me and I don't want to make this project about learning a lot about C pointers.

The instructions provided actually give us a small set of subroutines which we will need to help us.  The ones we have available are memset (initialise a range of memory), memcpy (copy a range of memory), strcpy (copy a string), strcmp (compare two strings).
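
For reference, minimal versions of these four routines can be written in a few lines each.  These are my own sketches rather than the tutorial's code, and I have given them a k_ prefix (my choice) so that they can be compiled on a host without clashing with the real C library.

```c
#include <stddef.h>
#include <stdint.h>

/* Fill n bytes of buf with the byte value c. */
void *k_memset(void *buf, int c, size_t n) {
    uint8_t *p = buf;
    while (n--) *p++ = (uint8_t)c;
    return buf;
}

/* Copy n bytes from src to dst (the areas must not overlap). */
void *k_memcpy(void *dst, const void *src, size_t n) {
    uint8_t *d = dst;
    const uint8_t *s = src;
    while (n--) *d++ = *s++;
    return dst;
}

/* Copy a string including its terminator. Like the standard strcpy,
   this does no bounds checking. */
char *k_strcpy(char *dst, const char *src) {
    char *d = dst;
    while ((*d++ = *src++) != '\0') {}
    return dst;
}

/* Returns 0 if equal, negative if s1 sorts first, positive otherwise. */
int k_strcmp(const char *s1, const char *s2) {
    while (*s1 && *s1 == *s2) { s1++; s2++; }
    return (unsigned char)*s1 - (unsigned char)*s2;
}
```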


Intermission

We are making wonderful progress with our kernel.  We have been able to use OpenSBI to set up our RISC-V VM and we are able to write C programs which communicate with the SBI API to utilise our virtual hardware, in particular to print messages on the virtual console.