Aruna

Android Device Security and GSI: A Layman’s Guide

2024-12-10T00:00:00-08:00

Android devices come with several security components that work together to protect your device and personal data. Here’s a simple explanation of who makes these components, their security implications, and how they collaborate to keep your device (data) safe.

AI Generated Image : credits to chatGPT

Key Components and Entities

When it comes to an Android device—whether it’s a mobile phone, tablet, car entertainment system, or any other form factor—there are several key components involved in its realization:

Bootloader: The initial software that runs when the device powers on, responsible for starting the kernel.
Linux kernel: The core subsystem of the Android operating system, managing communication between hardware and software.
Android system image: Includes the Android operating system and essential applications. Because Android is open source, vendors can modify the implementation to suit their own hardware.

Each of these components is developed by different entities: the bootloader is provided by the hardware manufacturer (vendor), the Linux kernel is developed by the open-source community under the Linux Foundation, and the Android system image is provided by Google. Device manufacturers can modify the Android system image to suit their own hardware as they wish.

The Chain of Trust Among Components

The bootloader, Linux kernel, and Android system image work together to create a secure environment for your device. When the device powers on, the bootloader verifies the kernel’s integrity using cryptographic signatures. Subsequently, the kernel verifies the Android system image before loading it. This sequential verification establishes a chain of trust, ensuring that each component is authentic and has not been tampered with. Additionally, Over-The-Air (OTA) updates are signed with private keys and verified using public keys, maintaining the security and integrity of the updates applied to the system image.
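
The sequential verification described above can be sketched in C. This is only a toy model: the FNV-1a checksum stands in for a real cryptographic signature check (it is not secure), and the names `digest`, `stage_t`, and `verify_chain` are hypothetical and not part of Android.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for a signature check: FNV-1a is NOT cryptographically secure. */
static uint32_t digest(const uint8_t *data, size_t len)
{
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < len; i++) {
        h ^= data[i];
        h *= 16777619u;
    }
    return h;
}

/* One boot stage: its image plus the digest the previous stage trusts. */
typedef struct {
    const uint8_t *image;
    size_t len;
    uint32_t expected; /* trusted value held by the previous stage */
} stage_t;

/* Returns -1 if every stage verifies, else the index of the first bad stage. */
static int verify_chain(const stage_t *stages, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (digest(stages[i].image, stages[i].len) != stages[i].expected) {
            return (int)i; /* stop booting: the chain of trust is broken */
        }
    }
    return -1;
}
```

Each stage is checked only after its predecessor has passed, so a single modified image stops the boot at that point.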

Generic System Images (GSI)

When manufacturers stop providing updates for their devices, users can turn to Generic System Images (GSIs) to receive the latest Android versions. GSIs are system images with adjusted configurations for Android devices, compatible with Project Treble.

What Is a GSI? A GSI is a barebones Android OS image that can run on any device compliant with Project Treble, regardless of the manufacturer.

Who Provides It? While Google provides the official Android source code via Android Open Source Project (AOSP), it is utilized by several open-source projects to run the latest Android versions on devices abandoned by their vendors. There are more generic GSI versions, such as TrebleDroid and treble_aosp, which work on many different devices, and device-specific GSI versions, like DUO-DE, designed specifically for the Microsoft Surface Duo’s unique dual-screen foldable configuration. Often, people confuse these GSI images with Pixel ROMs; however, GSIs are implemented using the barebone Android images provided by Google.

GSI and Security

To install a GSI image on an Android device, you must unlock the device’s bootloader unless you are the vendor with access to the device’s private keys. Unlocking the bootloader disables certain trust mechanisms implemented in the bootloader. However, security features such as Android Debug Bridge (ADB) authorization continue to protect your device data if the device is lost or falls into the possession of an adversary.

Since unlocking the bootloader disables the security mechanisms it provides, the trust in the GSI is established independently. To ensure this trust, GSI maintainers use their own release keys to sign the GSI packages. These keys are similar to vendor keys. They provide the same level of security for operations on the device, preventing unauthorized modifications and maintaining the integrity of the system. Once you manually flash a GSI image to your device, GSI keys ensure the security of the following components.

OTA Updates: Release keys ensure that over-the-air (OTA) updates are authentic and have not been tampered with. During installation, the device verifies the signature using the embedded public key.
Protection of System Applications: Without the proper release keys, system applications cannot be replaced or modified.

Since the Android system in a GSI originates from the Android Open Source Project, all the security features available in the latest version of Android are available to users via the GSI.

Conclusion

In this guide, we’ve explored the key components that ensure the security of Android devices, including the bootloader, Linux kernel, and Android system image. We discussed how these components interact through a chain of trust, utilizing cryptographic signatures with private and public keys to verify each stage of the boot process and Over-The-Air (OTA) updates. Additionally, we discussed Generic System Images (GSIs), highlighting their role in providing up-to-date Android versions for devices no longer receiving manufacturer updates.

Ultimately, the trustworthiness of an Android device hinges on the reliability of its manufacturers and, in the case of GSIs, the maintainers behind these open-source projects. When using a device from an unknown manufacturer, users must place a certain level of trust in that manufacturer’s commitment to security. Similarly, opting for a GSI replaces the manufacturer’s role with GSI maintainers. The advantage here is that most GSI projects are open source, allowing the public to review and contribute to their implementations, thereby enhancing transparency and security.

Key Takeaways:

Chain of Trust: Ensures that each component of the Android system is verified and secure, preventing unauthorized modifications.
GSIs as a Solution: Provide a way to keep devices secure and updated even after manufacturers cease support.
Open Source Benefits: GSIs benefit from community oversight, increasing the likelihood of identifying and addressing security vulnerabilities promptly.
Trust Considerations: Whether relying on manufacturers or GSI maintainers, users must trust that the entities managing their device’s software prioritize security and integrity.

By understanding these components and the dynamics of trust in both manufacturer-provided and community-driven updates, users can make informed decisions to maintain the security and longevity of their Android devices.

TinyOS🐞: Interrupts and Peripherals

2024-04-08T00:00:00-07:00

A computer needs to communicate with the external world to perform tasks. To facilitate this, we have peripheral hardware. When these peripherals need the operating system’s attention, they raise interrupts. In this episode of the TinyOS🐞 tutorial series, we will look at interrupts and how to use them.

Programmable Interrupt Controller

A Programmable Interrupt Controller (PIC) is a hardware component in the system that is crucial for managing interrupt requests (IRQs) from various peripherals. Its primary function is to prioritize interrupt signals from hardware peripherals based on their urgency and importance, making sure that critical tasks are handled promptly. When an interrupt occurs, the PIC signals the CPU, which suspends its current task and jumps to an interrupt handler that manages the interrupt and executes the necessary actions. The PIC also allows for interrupt masking and priority configuration, enabling system designers to customize interrupt handling according to specific requirements. While modern computer systems may utilize more advanced interrupt controllers like the Advanced Programmable Interrupt Controller (APIC), the fundamental role of prioritizing and managing interrupts remains essential for efficient system operation.

Interrupt

Let’s first review the types of interrupts in RISC-V, which can be broken down into several major categories:

Local Interrupt
- Software Interrupt
- Timer Interrupt
Global Interrupt
- External Interrupt

The exception code of each interrupt is defined in detail in the RISC-V specification. When a trap occurs, this code is recorded in the mcause register.
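
As a small illustration of how mcause is read on RV32, the sketch below separates the interrupt flag (the top bit) from the exception code. The helper names are hypothetical; the codes 3, 7, and 11 for machine software, timer, and external interrupts correspond to the MIE bit positions used in this article.

```c
#include <stdint.h>

#define MCAUSE_INTERRUPT 0x80000000u /* bit 31 on RV32: 1 = interrupt */

/* Returns 1 for an interrupt (asynchronous), 0 for an exception. */
static int mcause_is_interrupt(uint32_t mcause)
{
    return (mcause & MCAUSE_INTERRUPT) != 0;
}

/* Strips the interrupt bit, leaving the exception code. */
static uint32_t mcause_code(uint32_t mcause)
{
    return mcause & ~MCAUSE_INTERRUPT;
}
```

For example, an mcause value of 0x80000007 is an interrupt with code 7, the machine timer interrupt.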

If we want system programs running in RISC-V to support interrupt processing, we also need to set the field value of the MIE Register:

// Machine-mode Interrupt Enable
#define MIE_MEIE (1 << 11) // external
#define MIE_MTIE (1 << 7)  // timer
#define MIE_MSIE (1 << 3)  // software
// enable machine-mode timer interrupts.
w_mie(r_mie() | MIE_MTIE);

PIC in RISC-V

RISC-V has its own Programmable Interrupt Controller implementation known as Platform-Level Interrupt Controller (PLIC). As we discussed earlier, there can be multiple interrupt sources (keyboard, mouse, hard disk…) connected to PLIC of a system. PLIC will determine the priority of these interrupts and then allocate them to the processor’s Hart (the minimum hardware thread in RISC-V) for processing by the CPU.

Interrupt Request

An Interrupt Request, also known as an IRQ, is a mechanism used by hardware devices to signal the CPU that they need attention or service. When a hardware device requires the CPU to perform a task, such as processing incoming data or handling an event, it sends an interrupt request. The CPU then temporarily suspends its current operation, saves its state, and jumps to a predefined location in memory known as an interrupt handler. This handler executes the necessary actions to address the request from the device. IRQs are assigned unique numerical identifiers, typically ranging from 0 to 15 in legacy systems, to distinguish between different interrupt sources. Each IRQ is associated with specific hardware components, such as keyboards, mice, storage devices, or network cards, allowing the CPU to prioritize and handle interrupts appropriately. Taking Virt, the RISC-V virtual machine in QEMU, as an example, its source code defines IRQs for different interrupts as follows:

enum {
    UART0_IRQ = 10,
    RTC_IRQ = 11,
    VIRTIO_IRQ = 1, /* 1 to 8 */
    VIRTIO_COUNT = 8,
    PCIE_IRQ = 0x20, /* 32 to 35 */
    VIRTIO_NDEV = 0x35 /* Arbitrary maximum number of interrupts */
};

When we are writing an operating system, we can use the IRQ code to identify the type of external interrupt and solve the problems of keyboard input and disk reading and writing.
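
A minimal sketch of that identification step, reusing the IRQ numbers from the Virt enum above. The `irq_kind_t` type and `irq_kind()` function are hypothetical helpers for illustration, and the PCIe range of four lines is taken from the "32 to 35" comment.

```c
enum { UART0_IRQ = 10, RTC_IRQ = 11, VIRTIO_IRQ = 1, VIRTIO_COUNT = 8, PCIE_IRQ = 0x20 };

typedef enum { IRQ_UNKNOWN, IRQ_UART, IRQ_RTC, IRQ_VIRTIO, IRQ_PCIE } irq_kind_t;

/* Maps a claimed IRQ number to the kind of device that raised it. */
static irq_kind_t irq_kind(int irq)
{
    if (irq == UART0_IRQ) return IRQ_UART;
    if (irq == RTC_IRQ) return IRQ_RTC;
    if (irq >= VIRTIO_IRQ && irq < VIRTIO_IRQ + VIRTIO_COUNT) return IRQ_VIRTIO; /* 1 to 8 */
    if (irq >= PCIE_IRQ && irq < PCIE_IRQ + 4) return IRQ_PCIE;                  /* 32 to 35 */
    return IRQ_UNKNOWN;
}
```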

Configuring the PIC

As the name suggests, a PIC can be programmed. For this purpose, the PLIC uses memory-mapped I/O: its registers are mapped into the physical address space, so we can communicate with the PLIC by reading and writing memory. We can find these memory map definitions in Virt’s source code, which defines the location of the PLIC as follows:

static const MemMapEntry virt_memmap[] = {
    [VIRT_DEBUG] =       {        0x0,         0x100 },
    [VIRT_MROM] =        {     0x1000,        0xf000 },
    [VIRT_TEST] =        {   0x100000,        0x1000 },
    [VIRT_RTC] =         {   0x101000,        0x1000 },
    [VIRT_CLINT] =       {  0x2000000,       0x10000 },
    [VIRT_PCIE_PIO] =    {  0x3000000,       0x10000 },
    [VIRT_PLIC] =        {  0xc000000, VIRT_PLIC_SIZE(VIRT_CPUS_MAX * 2) },
    [VIRT_UART0] =       { 0x10000000,         0x100 },
    [VIRT_VIRTIO] =      { 0x10001000,        0x1000 },
    [VIRT_FW_CFG] =      { 0x10100000,          0x18 },
    [VIRT_FLASH] =       { 0x20000000,     0x4000000 },
    [VIRT_PCIE_ECAM] =   { 0x30000000,    0x10000000 },
    [VIRT_PCIE_MMIO] =   { 0x40000000,    0x40000000 },
    [VIRT_DRAM] =        { 0x80000000,           0x0 },
};

Each PLIC interrupt source is represented by a register. By adding PLIC_BASE to the offset of that register, we get the address at which the register is mapped in memory.

0xc000000 (PLIC_BASE) + offset = Mapped Address of register
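
To make the address arithmetic concrete, here is a sketch that computes a few PLIC register addresses. The offsets follow the standard RISC-V PLIC register layout; the function names are hypothetical (TinyOS uses macros such as PLIC_PRIORITY() whose definitions are not shown here), and context 0 is assumed to be hart 0 in machine mode on QEMU's Virt machine.

```c
#include <stdint.h>

#define PLIC_BASE 0x0c000000u

/* Priority register of interrupt source `irq` (4 bytes per source). */
static uint32_t plic_priority_addr(uint32_t irq)
{
    return PLIC_BASE + 4u * irq;
}

/* First enable word of interrupt context `ctx` (0x80 bytes per context). */
static uint32_t plic_enable_addr(uint32_t ctx)
{
    return PLIC_BASE + 0x2000u + 0x80u * ctx;
}

/* Priority threshold of context `ctx` (0x1000 bytes per context). */
static uint32_t plic_threshold_addr(uint32_t ctx)
{
    return PLIC_BASE + 0x200000u + 0x1000u * ctx;
}

/* The claim/complete register sits right after the threshold register. */
static uint32_t plic_claim_addr(uint32_t ctx)
{
    return plic_threshold_addr(ctx) + 4u;
}
```

With UART0_IRQ = 10, the priority register of the UART ends up at 0x0c000028.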

Interrupts to TinyOS

So far, we have looked at the background of interrupts. Let’s add this functionality to the TinyOS operating system. First, we need to initialize Virt’s PLIC controller. For that, we use the plic_init() function, which is defined in plic.c:

void plic_init()
{
  int hart = r_tp();
  // QEMU Virt machine support 7 priority (1 - 7),
  // The "0" is reserved, and the lowest priority is "1".
  *(uint32_t *)PLIC_PRIORITY(UART0_IRQ) = 1;

  /* Enable UART0 */
  *(uint32_t *)PLIC_MENABLE(hart) = (1 << UART0_IRQ);

  /* Set priority threshold for UART0. */

  *(uint32_t *)PLIC_MTHRESHOLD(hart) = 0;

  /* enable machine-mode external interrupts. */
  w_mie(r_mie() | MIE_MEIE);

  // enable machine-mode interrupts.
  w_mstatus(r_mstatus() | MSTATUS_MIE);
}

As shown in the above example, plic_init() mainly performs following initialization actions:

Set the priority of UART0_IRQ. Since the PLIC can manage multiple external interrupt sources, we must set priorities for the different interrupt sources. Then, in case of conflicting requests, the PLIC will know which IRQ to process first.
Enable the UART interrupt for the current hart.
Set threshold. IRQs whose priority is less than or equal to this threshold will be ignored by the PLIC. For example, we can raise the threshold using,
```
*(uint32_t *)PLIC_MTHRESHOLD(hart) = 10;
```
In this way, the system will not process the UART’s IRQ.
Enable external interrupts and global interrupts in Machine mode. It should be noted that this project originally used trap_init() to enable global interrupts in Machine mode. After this modification, we changed plic_init() to be responsible.
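
The interplay of priority and threshold in the list above can be modeled on a host machine. The sketch below is a simplified software model, not TinyOS code: `plic_model_t` and `plic_model_select()` are hypothetical names, and the model assumes that a source is delivered only if it is pending, enabled, and has a priority strictly above the threshold, with ties going to the lowest source ID.

```c
#include <stdint.h>

#define NSOURCES 32 /* hypothetical model size; source 0 means "no interrupt" */

typedef struct {
    uint32_t priority[NSOURCES]; /* 0 = never interrupt */
    uint32_t pending;            /* bit i set: source i is requesting */
    uint32_t enabled;            /* bit i set: source i is enabled for this hart */
    uint32_t threshold;
} plic_model_t;

/* Returns the source the model would deliver, or 0 if none qualifies. */
static uint32_t plic_model_select(const plic_model_t *p)
{
    uint32_t best = 0;
    uint32_t best_prio = p->threshold;
    for (uint32_t i = 1; i < NSOURCES; i++) {
        uint32_t bit = 1u << i;
        if ((p->pending & bit) && (p->enabled & bit) && p->priority[i] > best_prio) {
            best = i; /* strictly greater, so ties keep the lower ID */
            best_prio = p->priority[i];
        }
    }
    return best;
}
```

With the UART at priority 1, a threshold of 0 lets IRQ 10 through, while a threshold of 10 masks it.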

Note that the peripherals also need configuration. In the case of the UART, this includes settings such as the baud rate, which are handled by uart_init(), defined in lib.c.

Modify Trap Handler

We discussed the trap handler in the Preemptive Scheduling episode. You might remember the following diagram.

graph LR
    C[trap_handler] --> D[soft_handler]
    C --> E[timer_handler]
    C --> F[exter_handler]

Previously, in Preemptive Scheduling, trap_handler() only supported the processing of timer interrupts. This time we want to make it support the processing of external interrupts as well.

/* In trap.c */
void external_handler()
{
  int irq = plic_claim();
  if (irq == UART0_IRQ)
  {
    lib_isr();
  }
  else if (irq)
  {
    lib_printf("unexpected interrupt irq = %d\n", irq);
  }

  if (irq)
  {
    plic_complete(irq);
  }
}

Because the goal this time is to enable the operating system to process the UART IRQ, the handler checks whether the claimed IRQ is UART0_IRQ, as shown above, and if so invokes the function lib_isr().

/* In lib.c */
void lib_isr(void)
{
    for (;;)
    {
        int c = lib_getc();
        if (c == -1)
        {
            break;
        }
        else
        {
            lib_putc((char)c);
            lib_putc('\n');
        }
    }
}

The principle of lib_isr() is quite simple. It repeatedly checks whether the UART’s RHR register holds new data. If none is available (c == -1), it breaks out of the loop. Registers related to the UART are defined in riscv.h. Some register addresses have been added to support lib_getc(). The general definitions of the UART registers are as follows:

 #define UART 0x10000000L
 #define UART_THR (volatile uint8_t *)(UART + 0x00) // THR:transmitter holding register
 #define UART_RHR (volatile uint8_t *)(UART + 0x00) // RHR:Receive holding register
 #define UART_DLL (volatile uint8_t *)(UART + 0x00) // LSB of Divisor Latch (write mode)
 #define UART_DLM (volatile uint8_t *)(UART + 0x01) // MSB of Divisor Latch (write mode)
 #define UART_IER (volatile uint8_t *)(UART + 0x01) // Interrupt Enable Register
 #define UART_LCR (volatile uint8_t *)(UART + 0x03) // Line Control Register
 #define UART_LSR (volatile uint8_t *)(UART + 0x05) // LSR:line status register
 #define UART_LSR_EMPTY_MASK 0x40                   // LSR Bit 6: Transmitter empty; both the THR and the transmit shift register are empty
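
lib_getc() itself is not listed in this article. As a rough sketch of how such a function could poll a 16550-style UART, the code below assumes that bit 0 of the LSR signals "data ready". The name `uart_getc` and the offset macros are hypothetical; passing the register block as a pointer lets the logic be tried on a host with a fake register array.

```c
#include <stdint.h>

#define UART_RHR_OFF 0x00      /* receive holding register */
#define UART_LSR_OFF 0x05      /* line status register */
#define UART_LSR_RX_READY 0x01 /* LSR bit 0: data ready */

/* Returns the received byte, or -1 when no data is waiting.
 * `uart` points at the register block (0x10000000 on QEMU Virt). */
static int uart_getc(volatile uint8_t *uart)
{
    if (uart[UART_LSR_OFF] & UART_LSR_RX_READY) {
        return uart[UART_RHR_OFF];
    }
    return -1;
}
```

The -1 return value is what lets a loop like the one in lib_isr() stop once the receiver is drained.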

Simulation

Let’s see the TinyOS interrupt handler in action. If you have followed the tutorial series continuously, you know the steps.

cd 07-ExterInterrupt 
make

riscv32-unknown-elf-gcc -nostdlib -fno-builtin -mcmodel=medany -march=rv32ima -mabi=ilp32 -g -Wall -T os.ld -o os.elf start.s sys.s lib.c timer.c task.c os.c user.c trap.c lock.c plic.c

Next, you can run the Virt and type letters into the terminal, which will generate interrupt requests.

make qemu

In this episode, we looked at configuring external peripherals and handling the interrupts they generate. I hope this was an interesting episode, since this basic functionality is required whenever you work with embedded systems.

TinyOS🐞: Spinlocks

2024-04-07T00:00:00-07:00

In this episode of the TinyOS🐞 tutorial series, we will be looking at how to protect critical sections in processes using spinlocks.

What is a Spinlock

A spinlock is a synchronization mechanism used to protect shared resources (such as data structures) from being accessed simultaneously by multiple threads of execution. Unlike other synchronization primitives like mutexes or semaphores, which typically put threads to sleep when the resource they’re trying to access is unavailable, a spinlock causes a thread trying to acquire the lock to repeatedly “spin” in a loop (i.e., continuously checking the lock’s state) until it becomes available.

The basic idea behind a spinlock is simple: when a thread wants to acquire the lock, it checks to see if the lock is available. If it is, the thread acquires the lock and continues execution. If the lock is not available (i.e., another thread holds it), the thread continuously polls the lock until it becomes available, at which point it acquires the lock and proceeds.

Atomic operations

Atomic operations ensure that an operation cannot be interrupted by other operations before it completes. RISC-V, for example, provides the RV32A extension, whose instructions all execute atomically.

To prevent multiple threads from modifying the lock variable at the same time, a spinlock uses atomic operations to implement its locking logic correctly.

In fact, not only spinlocks but also mutexes require atomic operations in their implementation.

Simple Spinlock in C language

Consider the following code:

typedef struct spinlock{
    volatile uint lock;
} spinlock_t;
void lock(spinlock_t *lock){
    // xchg() atomically stores 1 and returns the previous value
    while(xchg(&lock->lock, 1) != 0);
}
void unlock(spinlock_t *lock){
    lock->lock = 0;
}

Through the sample code, you can notice a few points:

Keyword volatile: The volatile keyword tells the compiler that the variable may change in ways it cannot see, so it must not optimize accesses away by caching the value in a register; every read and write goes to memory.
Lock function: xchg() atomically swaps the lock’s value with 1 and returns the old value. While the returned value is not 0, the thread spins and waits until the lock becomes 0 (that is, until it can be locked).
Unlock function: Since only one thread can obtain the lock at the same time, there is no need to worry about preemption of access when unlocking. Because of this, the example does not use atomic operations.
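
The example leaves xchg() undefined. On a hosted C11 compiler, the same idea can be sketched with `<stdatomic.h>`, as below. The `c11_` names are hypothetical, and the release-ordered store in unlock is a deliberate difference from the plain store in the example.

```c
#include <stdatomic.h>
#include <stdbool.h>

typedef struct {
    atomic_uint locked; /* 0 = free, 1 = held */
} c11_spinlock_t;

/* One attempt: returns true if we took the lock. */
static bool c11_spin_trylock(c11_spinlock_t *l)
{
    return atomic_exchange_explicit(&l->locked, 1u, memory_order_acquire) == 0u;
}

static void c11_spin_lock(c11_spinlock_t *l)
{
    while (!c11_spin_trylock(l)) {
        /* spin until the previous value was 0 */
    }
}

static void c11_spin_unlock(c11_spinlock_t *l)
{
    atomic_store_explicit(&l->locked, 0u, memory_order_release);
}
```

Splitting out a trylock makes the single atomic exchange visible: the lock is acquired exactly when the value read back is 0.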

Simple Lock

First of all, since TinyOS is a Single Hart (hardware thread) operating system, in addition to using atomic operations, there is actually a very simple way to achieve the locking effect:

void basic_lock()
{
  w_mstatus(r_mstatus() & ~MSTATUS_MIE);
}

void basic_unlock()
{
  w_mstatus(r_mstatus() | MSTATUS_MIE);
}

In lock.c, we implement a very simple lock. When we invoke basic_lock() in the program, the system’s machine-mode interrupts are turned off. In this way, we can ensure that no other program accesses the shared memory, which avoids race conditions.

Spinlock Implementation

The above lock has an obvious flaw: while a program holds the lock, interrupts stay disabled, so the entire system is blocked. To ensure that the operating system can still maintain its multitasking mechanism, we must implement a slightly more complex lock:

typedef struct lock
{
  volatile int locked;
} lock_t;

void lock_init(lock_t *lock)
{
  lock->locked = 0;
}

void lock_acquire(lock_t *lock)
{
  for (;;)
  {
    if (!atomic_swap(lock))
    {
      break;
    }
  }
}

void lock_free(lock_t *lock)
{
  lock->locked = 0;
}

In fact, the above program code is basically the same as the previous example of Spinlock. When we implement it in the system, we only need to deal with one more troublesome problem, which is to implement the atomic swap action atomic_swap():

.globl atomic_swap
.align 4
atomic_swap:
        li a5, 1
        amoswap.w.aq a5, a5, 0(a0)
        mv a0, a5
        ret

As shown in the assembly above, we read the locked field of the lock structure, exchange it with the value 1, and finally return the old value held in register a5 (copied into a0). The results fall into two cases:

Case 1- Successfully acquire the lock: When lock->locked is 0, after the exchange through amoswap.w.aq, the value of lock->locked is 1 and the return value (Value of a5) is 0:
```
void lock_acquire(lock_t *lock)
{
  for (;;)
  {
    if (!atomic_swap(lock))
    {
      break;
    }
  }
}
```
When the return value is 0, lock_acquire() will successfully jump out of the infinite loop and enter Critical sections for execution.
Case 2- No lock acquired: Otherwise, continue to try to obtain the lock in an infinite loop.
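
For readers more comfortable with C than assembly, the two cases can be reproduced with the GCC/Clang `__atomic` builtins. This is a sketch of equivalent behaviour, not the TinyOS implementation: `atomic_swap_c` and `lock_free_c` are hypothetical names, and the builtins are assumed to be available in your toolchain.

```c
typedef struct lock {
    volatile int locked;
} lock_t;

/* Same contract as the assembly routine: store 1, return the old value. */
static int atomic_swap_c(lock_t *lock)
{
    return __atomic_exchange_n(&lock->locked, 1, __ATOMIC_ACQUIRE);
}

/* A release-ordered store pairs with the acquire ordering above. */
static void lock_free_c(lock_t *lock)
{
    __atomic_store_n(&lock->locked, 0, __ATOMIC_RELEASE);
}
```

The first call on a free lock returns 0 (Case 1); any further call before the lock is released returns 1 (Case 2).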

Simulation

If you have followed the TinyOS tutorial series continuously, you know how to run the simulation of the code. If you missed the first article about setting up the environment, you can check it from here.

Now let’s take a look at the system’s behaviour.

cd tinyos/06-Spinlock 
make

riscv32-unknown-elf-gcc -nostdlib -fno-builtin -mcmodel=medany -march=rv32ima -mabi=ilp32 -g -Wall -T os.ld -o os.elf start.s sys.s lib.c timer.c task.c os.c user.c trap.c lock.c

make qemu

Press Ctrl-A and then X to exit QEMU
qemu-system-riscv32 -nographic -smp 4 -machine virt -bios none -kernel os.elf
OS start
OS: Activate next task
Task0: Created!
Task0: Running...
Task0: Running...
Task0: Running...
Task0: Running...
Task0: Running...
Task0: Running...
Task0: Running...
Task0: Running...
Task0: Running...
Task0: Running...
Task0: Running...
Task0: Running...
Task0: Running...
Task0: Running...
timer interruption!
timer_handler: 1
OS: Back to OS
OS: Activate next task
Task1: Created!
Task1: Running...
Task1: Running...
Task1: Running...
Task1: Running...
Task1: Running...
Task1: Running...
Task1: Running...
Task1: Running...
Task1: Running...
Task1: Running...
Task1: Running...
Task1: Running...
Task1: Running...
Task1: Running...
timer interruption!
timer_handler: 2
OS: Back to OS
OS: Activate next task
Task2: Created!
The value of shared_var is: 550
The value of shared_var is: 600
The value of shared_var is: 650
The value of shared_var is: 700
The value of shared_var is: 750
The value of shared_var is: 800
The value of shared_var is: 850
The value of shared_var is: 900
The value of shared_var is: 950
The value of shared_var is: 1000
The value of shared_var is: 1050
The value of shared_var is: 1100
The value of shared_var is: 1150
The value of shared_var is: 1200
The value of shared_var is: 1250
The value of shared_var is: 1300
The value of shared_var is: 1350
The value of shared_var is: 1400
The value of shared_var is: 1450
The value of shared_var is: 1500
The value of shared_var is: 1550
The value of shared_var is: 1600
timer interruption!
timer_handler: 3
OS: Back to OS
OS: Activate next task
Task0: Running...
Task0: Running...
Task0: Running...
Task0: Running...
Task0: Running...
Task0: Running...
Task0: Running...
Task0: Running...
Task0: Running...
Task0: Running...
Task0: Running...
Task0: Running...
Task0: Running...
Task0: Running...
timer interruption!
timer_handler: 4
OS: Back to OS
QEMU: Terminated

Debug Mode and Breakpoint

With QEMU, you can run the simulation in debug mode and set breakpoints. The following steps run TinyOS in debug mode.

make debug

riscv32-unknown-elf-gcc -nostdlib -fno-builtin -mcmodel=medany -march=rv32ima -mabi=ilp32 -g -Wall -T os.ld -o os.elf start.s sys.s lib.c timer.c task.c os.c user.c trap.c lock.c
Press Ctrl-C and then input 'quit' to exit GDB and QEMU
-------------------------------------------------------
Reading symbols from os.elf...
Breakpoint 1 at 0x80000000: file start.s, line 7.
0x00001000 in ?? ()
=> 0x00001000: 97 02 00 00 auipc t0,0x0

Thread 1 hit Breakpoint 1, _start () at start.s:7
7 csrr t0, mhartid # read current hart id
=> 0x80000000 <_start+0>: f3 22 40 f1 csrr t0,mhartid
(gdb)

You can set a breakpoint in any C file using the following command:

(gdb) b trap.c:27

Breakpoint 2 at 0x80008f78: file trap.c, line 27.
(gdb)

As in the example above, when execution reaches trap.c, line 27 (the timer interrupt), the process is suspended automatically until you enter c (continue) or s (step).

TinyOS🐞: Preemptive Scheduling

2024-04-06T00:00:00-07:00

In the MultiTasking episode of the TinyOS🐞 tutorial series, we implemented “Cooperative Multitasking”. Next in TimerInterrupt episode, we discussed how the RISC-V time interrupt mechanism works. If you have missed them, I highly recommend going through them before proceeding.

In this episode, we plan to combine the two techniques of the above episodes to implement a “Preemptive” operating system driven by forced timer interrupts. Technically, TinyOS is going to be a real-time operating system (RTOS) at the end of this episode.

Simulation

As we discussed in earlier episodes, when multiple processes run in parallel, they need to share the same set of resources between them. With that in mind, let’s run the simulation. The simulation steps are as usual.

If you missed the first article about setting up the environment, you can check it from here.

First let’s take a look at the system’s behaviour.

cd tinyos/05-Preemptive
make

riscv32-unknown-elf-gcc -nostdlib -fno-builtin -mcmodel=medany -march=rv32ima -mabi=ilp32 -T os.ld -o os.elf start.s sys.s lib.c timer.c task.c os.c user.c

make qemu

As we can see, the system switches context between the OS, Task0, and Task1 during execution. This is very similar to the simulation in the MultiTasking episode; both have the following execution sequence.

    stateDiagram-v2
    direction LR
    State1 :OS
	State2 : Task0
	State3 : OS
	State4 :Task1
	State5 : OS
	State6 : Task0
	State7 : OS
	State8 : Task1
	
    State1 --> State2
	State2 --> State3
	State3 --> State4
	State4 --> State5
	State5 --> State6
	State6 --> State7
	State7 --> State8

The only difference is that the user process in MultiTasking episode must actively return control to the operating system through os_kernel().

void user_task0(void)
{
	lib_puts("Task0: Created!\n");
	lib_puts("Task0: Now, return to kernel mode\n");
	os_kernel();
	while (1) {
		lib_puts("Task0: Running...\n");
		lib_delay(1000);
		os_kernel();
	}
}

However, in this simulation, the user task does not need to hand control back to the OS actively. Instead, the OS forces the switch through a timer interrupt.

void user_task0(void)
{
	lib_puts("Task0: Created!\n");
	while (1) {
		lib_puts("Task0: Running...\n");
		lib_delay(1000);
	}
}

The lib_delay in lib.c is actually a delay loop and does not return control.

void lib_delay(volatile int count)
{
	count *= 50000;
	while (count--);
}

Instead, the operating system forcibly takes back control through the timer interrupt. (Because lib_delay has a long delay, the operating system usually interrupts its while (count--) loop to take back control.)

OS Kernel

The operating system in os.c will initially call user_init() to allow the user to create tasks (in this example, user_task0 and user_task1 are created in user.c).

#include "os.h"

void user_task0(void)
{
	lib_puts("Task0: Created!\n");
	while (1) {
		lib_puts("Task0: Running...\n");
		lib_delay(1000);
	}
}

void user_task1(void)
{
	lib_puts("Task1: Created!\n");
	while (1) {
		lib_puts("Task1: Running...\n");
		lib_delay(1000);
	}
}

void user_init() {
	task_create(&user_task0);
	task_create(&user_task1);
}

Then the operating system will set the time interrupt through the timer_init() function in os_start(), and then enter the main loop of os_main(), which adopts Round-Robin scheduling. In Round robin scheduling each process is assigned a fixed time slice in a cyclic manner, ensuring fairness by giving each process equal time on the CPU regardless of its priority or execution time.

#include "os.h"

void os_kernel() {
	task_os();
}

void os_start() {
	lib_puts("OS start\n");
	user_init();
	timer_init(); // start timer interrupt ...
}

int os_main(void)
{
	os_start();

	int current_task = 0;
	while (1) {
		lib_puts("OS: Activate next task\n");
		task_go(current_task);
		lib_puts("OS: Back to OS\n");
		current_task = (current_task + 1) % taskTop; // Round Robin Scheduling
		lib_puts("\n");
	}
	return 0;
}
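
The Round-Robin step in os_main() boils down to one modulo operation. A minimal sketch, with the hypothetical helper `next_task()` standing in for the inline expression and `count` playing the role of taskTop:

```c
/* Index of the task that runs after `current` when `count` tasks exist. */
static int next_task(int current, int count)
{
    return (current + 1) % count;
}
```

With two tasks the order is 0, 1, 0, 1, and so on, which matches the Task0/Task1 alternation in the state diagram above.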

In the interrupt mechanism of sys.s, we modified the interrupt vector table as below.

.globl trap_vector
# the trap vector base address must always be aligned on a 4-byte boundary
.align 4
trap_vector:
	# save context(registers).
	csrrw	t6, mscratch, t6	# swap t6 and mscratch
        reg_save t6
	csrw	mscratch, t6
	# call the C trap handler in trap.c
	csrr	a0, mepc
	csrr	a1, mcause
	call	trap_handler

	# trap_handler will return the return address via a0.
	csrw	mepc, a0

	# load context(registers).
	csrr	t6, mscratch
	reg_load t6
	mret

Essentially what it does is when an interrupt occurs, the interrupt vector table trap_vector() will call trap_handler() in trap.c.

reg_t trap_handler(reg_t epc, reg_t cause)
{
  reg_t return_pc = epc;
  reg_t cause_code = cause & 0xfff;

  if (cause & 0x80000000)
  {
    /* Asynchronous trap - interrupt */
    switch (cause_code)
    {
    case 3:
      lib_puts("software interruption!\n");
      break;
    case 7:
      lib_puts("timer interruption!\n");
      // disable machine-mode timer interrupts.
      w_mie(~((~r_mie()) | (1 << 7)));
      timer_handler();
      return_pc = (reg_t)&os_kernel;
      // enable machine-mode timer interrupts.
      w_mie(r_mie() | MIE_MTIE);
      break;
    case 11:
      lib_puts("external interruption!\n");
      break;
    default:
      lib_puts("unknown async exception!\n");
      break;
    }
  }
  else
  {
    /* Synchronous trap - exception */
    lib_puts("Sync exceptions!\n");
    while (1)
    {
      /* code */
    }
  }
  return return_pc;
}

After jumping to trap_handler(), it calls a different handler for each type of interrupt, so we can think of it as an interrupt dispatcher.

graph LR
    C[trap_handler] --> D[soft_handler]
    C --> E[timer_handler]
    C --> F[exter_handler]

trap_handler can hand over interrupt processing to different handlers according to different interrupt types. This can greatly improve the scalability of the operating system.

#include "timer.h"

// a scratch area per CPU for machine-mode timer interrupts.
reg_t timer_scratch[NCPU][5];

#define interval 20000000 // cycles; about 2 seconds in qemu.

void timer_init()
{
  // each CPU has a separate source of timer interrupts.
  int id = r_mhartid();

  // ask the CLINT for a timer interrupt.
  // int interval = 1000000; // cycles; about 1/10th second in qemu.

  *(reg_t *)CLINT_MTIMECMP(id) = *(reg_t *)CLINT_MTIME + interval;

  // prepare information in scratch[] for timervec.
  // scratch[0..2] : space for timervec to save registers.
  // scratch[3] : address of CLINT MTIMECMP register.
  // scratch[4] : desired interval (in cycles) between timer interrupts.
  reg_t *scratch = &timer_scratch[id][0];
  scratch[3] = CLINT_MTIMECMP(id);
  scratch[4] = interval;
  w_mscratch((reg_t)scratch);

  // enable machine-mode timer interrupts.
  w_mie(r_mie() | MIE_MTIE);
}

static int timer_count = 0;

void timer_handler()
{
  lib_printf("timer_handler: %d\n", ++timer_count);
  int id = r_mhartid();
  *(reg_t *)CLINT_MTIMECMP(id) = *(reg_t *)CLINT_MTIME + interval;
}

If you look at the function timer_handler() in timer.c, you can see that it resets MTIMECMP to schedule the next timer interrupt.

/* In trap_handler() */
// ...
case 7:
      lib_puts("timer interruption!\n");
      // disable machine-mode timer interrupts.
      w_mie(~((~r_mie()) | (1 << 7)));
      timer_handler();
      return_pc = (reg_t)&os_kernel;
      // enable machine-mode timer interrupts.
      w_mie(r_mie() | MIE_MTIE);
      break;
// ...

To avoid nested timer interrupts, trap_handler() disables the timer interrupt before processing it and enables it again once processing is complete.
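The disable expression used in trap_handler() can look cryptic, so here is a small sketch of what it does to the mie value. The helper names are hypothetical, and the functions work on a plain integer so they can be tried on any machine. The bit position (7) is the one used in the code above.

```c
#include <stdint.h>

#define MIE_MTIE (1u << 7) /* machine timer interrupt enable bit in mie */

/* Clears bit 7 and keeps every other bit.
 * This gives the same result as ~((~mie) | (1 << 7)). */
static uint32_t mie_disable_timer(uint32_t mie)
{
    return mie & ~MIE_MTIE;
}

/* Sets bit 7 and keeps every other bit. */
static uint32_t mie_enable_timer(uint32_t mie)
{
    return mie | MIE_MTIE;
}
```

Both forms only touch the timer bit, so software and external interrupt enables are left as they were.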

After timer_handler() runs, trap_handler() returns the address of os_kernel() as the new mepc, which is what performs the task switch. For any other interrupt, return_pc is left unchanged, so the program counter goes back to where it was before the interrupt. The original mepc value is passed to trap_handler() by trap_vector(), as shown below.

csrr	a0, mepc # a0 => arg1 (return_pc) of trap_handler()

Note: In RISC-V, function arguments are passed first in registers a0 to a7. If there are more arguments than registers, the rest are passed on the stack. Registers a0 and a1 also carry function return values.

Finally, we add the trap and timer initialization to the kernel startup, as shown below.

void os_start()
{
	lib_puts("OS start\n");
	user_init();
	trap_init();
	timer_init(); // start timer interrupt ...
}

By forcibly taking back control through timer interrupts, we don’t have to worry about a misbehaving task monopolizing the CPU and leaving the whole system stuck. This is the most important scheduling mechanism in modern operating systems.

Remarks

Although TinyOS is just a “tiny” embedded operating system, it still demonstrates the design principles of a concrete, simple “preemptible operating system” through relatively streamlined code.

Of course, there is still a long way to go in learning “Operating System Design”. In particular, TinyOS does not have a “File System”, and we haven’t touched on how RISC-V controls and switches between supervisor mode and user mode. An OS also needs virtual memory, so that one process cannot read another process’s data.

Fortunately, you can learn about these more complex mechanisms by studying xv6-riscv, a teaching operating system designed by MIT. The source code of xv6-riscv totals more than 8,000 lines. That is not small, but it is very streamlined compared to modern Linux and Windows, which run from millions to tens of millions of lines.

I hope this episode of the TinyOS tutorial series gave you a basic understanding of how preemptive multitasking works in a RISC-V environment. In the next episode, let’s discuss spinlocks in RISC-V.

TinyOS🐞: Multitasking

2023-09-08T00:00:00-07:00

In the previous episode ContextSwitch of TinyOS🐞, we introduced the context switching mechanism under the RISC-V architecture. In this episode we will be looking at Multitasking in our DIY operating system.

Cooperative Multitasking

Modern operating systems are preemptive: a timer interrupt forcibly suspends the running process, so that when a process occupies the CPU for too long, it is interrupted and the CPU is switched to another process.

However, in a system without a timer interrupt mechanism, the operating system cannot interrupt the running process, so it must rely on each process to actively return control to the operating system so that all processes get a chance to execute. A multitasking system that relies on this voluntary return of control is called a Cooperative Multitasking system. Windows 3.1, launched by Microsoft in 1992, Macintosh OS versions 8.0-9.2, and HeliOS for Arduino MCUs are all operating systems that employ cooperative multitasking.

As an illustration step in our TinyOS series, in this chapter, we will design a cooperative multitasking mechanism on a RISC-V processor.

Simulation

First, let’s run the program and see how the system behaves. Navigate to the MultiTasking folder of the cloned repository from the Docker image. If you missed the first article about setting up the environment, you can check it from here.

cd tinyos/03-MultiTasking
make qemu

Press Ctrl-A and then X to exit QEMU
qemu-system-riscv32 -nographic -smp 4 -machine virt -bios none -kernel os.elf
OS start
OS: Activate next task
Task0: Created!
Task0: Now, return to kernel mode
OS: Back to OS
OS: Activate next task
Task1: Created!
Task1: Now, return to kernel mode
OS: Back to OS
OS: Activate next task
Task0: Running...
OS: Back to OS
OS: Activate next task
Task1: Running...
OS: Back to OS
OS: Activate next task
Task0: Running...
OS: Back to OS
OS: Activate next task
Task1: Running...
OS: Back to OS
OS: Activate next task
Task0: Running...
OS: Back to OS
OS: Activate next task
Task1: Running...
OS: Back to OS
OS: Activate next task
Task0: Running...
QEMU: Terminated

You can see that the system keeps switching between two tasks Task0, Task1, but the actual switching process is as follows:

    stateDiagram-v2
    direction LR
    State1 :OS
	State2 : Task0
	State3 : OS
	State4 :Task1
	State5 : OS
	State6 : Task0
	State7 : OS
	State8 : Task1
	
    State1 --> State2
	State2 --> State3
	State3 --> State4
	State4 --> State5
	State5 --> State6
	State6 --> State7
	State7 --> State8

The operating system needs to coordinate the two tasks so that they execute in turn. In this episode, we will look at the underlying logic of this implementation.

User Tasks

In user.c, we define two tasks, user_task0 and user_task1, and finally initialize these two tasks in the user_init function.

// user.c
#include "os.h"

void user_task0(void)
{
	lib_puts("Task0: Created!\n");
	lib_puts("Task0: Now, return to kernel mode\n");
	os_kernel();
	while (1) {
		lib_puts("Task0: Running...\n");
		lib_delay(1000);
		os_kernel();
	}
}

void user_task1(void)
{
	lib_puts("Task1: Created!\n");
	lib_puts("Task1: Now, return to kernel mode\n");
	os_kernel();
	while (1) {
		lib_puts("Task1: Running...\n");
		lib_delay(1000);
		os_kernel();
	}
}

void user_init() {
	task_create(&user_task0);
	task_create(&user_task1);
}

OS Kernel

Then, in the main program os.c of the operating system, we use a while loop to arrange each process to be executed sequentially.

// os.c
#include "os.h"

void os_kernel() {
	task_os();
}

void os_start() {
	lib_puts("OS start\n");
	user_init();
}

int os_main(void)
{
	os_start();
	
	int current_task = 0;
	while (1) {
		lib_puts("OS: Activate next task\n");
		task_go(current_task);
		lib_puts("OS: Back to OS\n");
		current_task = (current_task + 1) % taskTop; // Round Robin Scheduling
		lib_puts("\n");
	}
	return 0;
}

The scheduling above follows the same order as Round Robin Scheduling. Strictly speaking, however, Round Robin Scheduling relies on a timer interrupt to enforce time slices, and the code in this episode has none, so it is better described as a cooperative-multitasking version of Round Robin Scheduling.

Cooperative multitasking must rely on each task to actively return control. For example, in user_task0, whenever the os_kernel() function is called, the context switching mechanism will be called to return control to the operating system.
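The idea can be tried on a desktop machine without any context switching. The sketch below is only a simulation under simplifying assumptions: each "task" is an ordinary function that does one slice of work and returns, which stands in for calling os_kernel(). All names (sim_task_create, sim_run, work) are hypothetical.

```c
#define MAX_TASKS 4

/* A simulated task: does one slice of work, then returns (i.e. yields). */
typedef void (*task_fn)(int *counter);

static task_fn tasks[MAX_TASKS];
static int counters[MAX_TASKS];
static int task_top = 0;

/* Register a task; returns its index, or -1 if the table is full. */
static int sim_task_create(task_fn fn)
{
    if (task_top >= MAX_TASKS)
        return -1;
    tasks[task_top] = fn;
    counters[task_top] = 0;
    return task_top++;
}

/* Example task body: count how many slices this task has received. */
static void work(int *counter)
{
    (*counter)++;
}

/* Run `rounds` scheduling steps in round-robin order.
 * Returns the index of the last task that ran, or -1 if none exist. */
static int sim_run(int rounds)
{
    int current = 0;
    int last = -1;

    if (task_top == 0)
        return -1;
    for (int i = 0; i < rounds; i++) {
        tasks[current](&counters[current]);   /* like task_go(current) */
        last = current;
        current = (current + 1) % task_top;   /* pick the next task */
    }
    return last;
}
```

If a simulated task never returned, sim_run() would never regain control. That is exactly the weakness of cooperative multitasking.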

void user_task0(void)
{
	lib_puts("Task0: Created!\n");
	lib_puts("Task0: Now, return to kernel mode\n");
	os_kernel();
	while (1) {
		lib_puts("Task0: Running...\n");
		lib_delay(1000);
		os_kernel();
	}
}

The os_kernel() function in os.c calls task_os() in task.c:

void os_kernel() {
	task_os();
}

And task_os() calls sys_switch, written in assembly in sys.s, to switch back to the operating system.

// switch back to os
void task_os() {
	struct context *ctx = ctx_now;
	ctx_now = &ctx_os;
	sys_switch(ctx, &ctx_os);
}

So the whole system runs through the cooperation of os_main(), user_task0(), and user_task1(), each handing control to the next in turn.

The os_main() function in os.c looks like this.

int os_main(void)
{
	os_start();
	
	int current_task = 0;
	while (1) {
		lib_puts("OS: Activate next task\n");
		task_go(current_task);
		lib_puts("OS: Back to OS\n");
		current_task = (current_task + 1) % taskTop; // Round Robin Scheduling
		lib_puts("\n");
	}
	return 0;
}

The user_task0() and user_task1() functions in user.c look like this.

void user_task0(void)
{
	lib_puts("Task0: Created!\n");
	lib_puts("Task0: Now, return to kernel mode\n");
	os_kernel();
	while (1) {
		lib_puts("Task0: Running...\n");
		lib_delay(1000);
		os_kernel();
	}
}

void user_task1(void)
{
	lib_puts("Task1: Created!\n");
	lib_puts("Task1: Now, return to kernel mode\n");
	os_kernel();
	while (1) {
		lib_puts("Task1: Running...\n");
		lib_delay(1000);
		os_kernel();
	}
}

The above is a concrete, miniature example of a cooperative multitasking system on a RISC-V processor. In the next episode of TinyOS🐞, let’s look at the implementation of TimerInterrupts.

TinyOS🐞: TimerInterrupts

2023-09-08T00:00:00-07:00

In the previous episode MultiTasking, we implemented an operating system with cooperative multitasking. However, without an interrupt mechanism, our system cannot support preemptive multitasking.

This episode will lay the foundation for a “Preemptive Multitasking System” by introducing the utilization of the “Time Interrupt Mechanism” in RISC-V processors. Through time interrupts, we gain the ability to regain control at predefined intervals, ensuring that a third-party application cannot indefinitely seize control of the system without yielding control back to the operating system.

Main Concepts for TimerInterrupts

Before learning how the system implements the time interruption mechanism, we must first understand a few things:

Generating Timer Interrupts
Interrupt Vector Table
Control and Status Registers (CSR)

Let’s go through each concept one by one.

Generating Timer Interrupts

The RISC-V architecture specifies that the system platform must include a timer, and this timer must feature two 64-bit registers: mtime and mtimecmp. The purpose of these registers is as follows:

mtime (Machine Time): This register is utilized to keep track of the current counter value of the timer. It serves as a continuously incrementing counter, recording the passage of time in the system.
mtimecmp (Machine Time Compare): The mtimecmp register holds the value against which mtime is compared. When the value of mtime becomes greater than or equal to the value stored in mtimecmp, a timer interrupt is triggered.

These registers are integral to time-based interrupt handling in RISC-V systems. By comparing mtime with the value stored in mtimecmp, the system can generate interrupts at specific time intervals, which enables timing-related tasks such as preemptive multitasking and real-time scheduling.

Later in this episode we will examine the code that defines the time interval (interval) between interrupts. For now, here are the memory-mapped addresses of these two registers as defined in riscv.h:

// ================== Timer Interrupt ====================

#define NCPU 8             // maximum number of CPUs
#define CLINT 0x2000000
// each hart has its own 64-bit (8-byte) mtimecmp register.
#define CLINT_MTIMECMP(hartid) (CLINT + 0x4000 + 8*(hartid))
#define CLINT_MTIME (CLINT + 0xBFF8) // cycles since boot.

Additionally, during system initialization, it’s essential to enable the Timer interrupt. This can be achieved by setting the corresponding field in the mie (Machine Interrupt Enable) register to 1.

What is the Interrupt Vector Table?

The interrupt vector table is a data structure managed by the system program. It serves as a mapping between interrupt numbers or types and their corresponding interrupt handlers. When a specific interrupt or exception occurs, the system will look up the corresponding Interrupt Handler in this table.

Here’s how it works:

When an interrupt, such as a time interrupt, occurs, the processor first stops executing the current program’s instructions.
It then looks up the interrupt or exception type in the interrupt vector table to find the associated Interrupt_Handler.
The processor transfers control to the Interrupt_Handler, which is a predefined piece of code responsible for handling that specific type of interrupt or exception.
The Interrupt_Handler performs the necessary processing, which may include saving the current context, handling the interrupt’s specific tasks, and eventually returning control to the interrupted program.
After completing the handling of the interrupt, the processor jumps back to the original instruction address in the interrupted program, allowing it to continue execution as if the interrupt never occurred.

This mechanism is essential for managing and responding to various interrupts and exceptions in a systematic and controlled manner, ensuring that the system remains stable and responsive. The interrupt vector table plays a crucial role in facilitating this process by directing the processor to the appropriate interrupt handling routines.
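The lookup described above can be sketched in C as an array of function pointers indexed by the interrupt number. This is a generic illustration with hypothetical names (vector_table, register_handler, dispatch). TinyOS itself uses a single entry point and a switch statement rather than a table like this.

```c
#include <stddef.h>

#define NUM_VECTORS 16

/* An interrupt handler returns an int so the sketch is easy to observe. */
typedef int (*irq_handler)(void);

static irq_handler vector_table[NUM_VECTORS];

/* Example handler: counts how many timer interrupts were dispatched. */
static int timer_hits = 0;
static int on_timer(void)
{
    return ++timer_hits;
}

/* Install a handler for one interrupt number (ignored if out of range). */
static void register_handler(unsigned cause, irq_handler h)
{
    if (cause < NUM_VECTORS)
        vector_table[cause] = h;
}

/* Look up and call the handler for `cause`.
 * Returns the handler's result, or -1 when no handler is registered. */
static int dispatch(unsigned cause)
{
    if (cause >= NUM_VECTORS || vector_table[cause] == NULL)
        return -1;
    return vector_table[cause]();
}
```

Registering on_timer under number 7 and calling dispatch(7) runs the timer handler, while an unregistered number falls through to the -1 case.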

Note: When an exception or interrupt occurs, the processor stops the current process, sets the program counter to the address held in mtvec, and starts executing there. This behavior is like jumping into a trap, which is why the RISC-V architecture calls it a Trap. In the xv6 (RISC-V) operating system, we can also find a series of operations for handling interrupts (mostly defined in trap.c).

Control and Status Registers (CSR)

The RISC-V architecture encompasses numerous registers, including a category known as Control and Status Registers (CSRs), as highlighted in the title. CSRs serve the crucial role of configuring and recording the processor’s operational status.

CSR (Control and Status Registers):
- mtvec: This register specifies the address that the Program Counter (PC) will jump to when an exception occurs, allowing exception handling to begin.
- mcause: It records the reason for encountering an exception or anomaly.
- mtval: This register is used to store additional information or messages related to the encountered exception.
- mepc: On entering a trap, the hardware saves the PC of the interrupted instruction here, so it can be read to resume execution after the trap is handled.
- mstatus: This register’s fields are updated by hardware when an exception is entered, reflecting various status changes.
- mie: It determines whether interrupts are enabled or disabled.
- mip: This register indicates the pending status of different types of interrupts.
Memory Address Mapped:
- mtime: Records the current value of the timer.
- mtimecmp: Stores a comparison value for the timer, against which mtime is compared to generate timer interrupts.
- msip: Used for generating or clearing software interrupts.
- Platform-Level Interrupt Controller (PLIC): This external hardware component handles and manages interrupts from various sources and devices in the system, ensuring that they are appropriately routed to the processor for handling.
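Several of these registers have to agree before a timer interrupt is actually taken. As a rough sketch for code running in machine mode, the condition can be written as a pure function over the three register values. The function name and the MIP_MTIP macro are hypothetical names for this example. The bit positions used are mstatus.MIE at bit 3 and the timer bits of mie and mip at bit 7.

```c
#include <stdbool.h>
#include <stdint.h>

#define MSTATUS_MIE (1u << 3) /* global machine-mode interrupt enable */
#define MIE_MTIE    (1u << 7) /* timer interrupt enable in mie */
#define MIP_MTIP    (1u << 7) /* timer interrupt pending in mip */

/* While running in machine mode, a timer interrupt is taken only when it
 * is pending, individually enabled, and interrupts are globally enabled. */
static bool timer_interrupt_taken(uint32_t mstatus, uint32_t mie, uint32_t mip)
{
    return (mstatus & MSTATUS_MIE) != 0 &&
           (mie & MIE_MTIE) != 0 &&
           (mip & MIP_MTIP) != 0;
}
```

This is why timer_init() in this series sets both MSTATUS_MIE and MIE_MTIE: leaving out either one keeps the interrupt from firing.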

In addition, RISC-V defines a series of instructions that allow developers to operate the CSR register:

csrs: Set the specified bits in the CSR to 1. The immediate form, csrsi, takes the bit mask as a 5-bit immediate.

csrsi mstatus, (1 << 2)

The above instruction sets the third bit of mstatus, counting from the LSB, to 1.

csrc: Clear the specified bits in the CSR to 0. The immediate form is csrci.

csrci mstatus, (1 << 2)

The above instruction clears the third bit of mstatus, counting from the LSB, to 0.

csrr: Read the value of a CSR into a general-purpose register. The csrrs and csrrc variants also set or clear bits in the CSR while reading it.

csrr t0, mscratch

csrw: Write the value of a general-purpose register to a CSR.

csrw mepc, a0

csrrw[i]: Write the old value of csr to rd and the value of rs1 (or the immediate) to csr, as one atomic operation.

csrrw rd, csr, rs1/imm

A useful special case uses the same register for rd and rs1:

csrrw t6, mscratch, t6

The above instruction swaps the values of register t6 and mscratch.
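If the read-modify-write behaviour of these instructions is hard to picture, the following sketch models three of them on an ordinary variable. The function names are hypothetical, and the model ignores privilege checks and read-only fields. It only shows which value ends up where.

```c
#include <stdint.h>

/* Model of csrrw rd, csr, rs1: rd receives the old CSR value,
 * and the CSR receives rs1. */
static uint32_t model_csrrw(uint32_t *csr, uint32_t rs1)
{
    uint32_t old = *csr;
    *csr = rs1;
    return old;
}

/* Model of csrrs rd, csr, rs1: rd receives the old CSR value,
 * and every bit set in rs1 is set in the CSR. */
static uint32_t model_csrrs(uint32_t *csr, uint32_t mask)
{
    uint32_t old = *csr;
    *csr = old | mask;
    return old;
}

/* Model of csrrc rd, csr, rs1: rd receives the old CSR value,
 * and every bit set in rs1 is cleared in the CSR. */
static uint32_t model_csrrc(uint32_t *csr, uint32_t mask)
{
    uint32_t old = *csr;
    *csr = old & ~mask;
    return old;
}
```

Writing t6 = model_csrrw(&mscratch, t6) in this model reproduces the register swap performed by csrrw t6, mscratch, t6.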

Simulation

You can clone the tinyos repository if you haven’t already. If you missed the introduction episode of this series, you can check it out from here. Then, from the Docker environment, navigate to the 04-TimerInterrupt folder in the mounted repo. After building the project with make clean and make, you can use make qemu to start the simulation. The results are as follows:

make

riscv32-unknown-elf-gcc -nostdlib -fno-builtin -mcmodel=medany -march=rv32ima -mabi=ilp32 -T os.ld -o os.elf start.s sys.s lib.c timer.c os.c

make qemu

Press Ctrl-A and then X to exit QEMU
qemu-system-riscv32 -nographic -smp 4 -machine virt -bios none -kernel os.elf
OS start
timer_handler: 1
timer_handler: 2
timer_handler: 3
timer_handler: 4
timer_handler: 5
timer_handler: 6
timer_handler: 7
timer_handler: 8
timer_handler: 9

The system prints a message like timer_handler: i about once per second, which shows that the timer interrupt mechanism has started and fires regularly.

Discussion

Before explaining time interruption, let us first take a look at the contents of the operating system main program os.c.

#include "os.h"

int os_main(void)
{
	lib_puts("OS start\n");
	timer_init(); // start timer interrupt ...
	while (1) {} // os : do nothing, just loop!
	return 0;
}

Basically, after this program prints OS start, it starts the timer interrupt and then spins forever in the while (1) loop.

But why does the system print a message like timer_handler: i later?

timer_handler: 1
timer_handler: 2
timer_handler: 3

This is of course caused by the time interruption mechanism!

Let’s take a look at the contents of timer.c. Please pay special attention to the line w_mtvec((reg_t)sys_timer). When a timer interrupt occurs, the program jumps to the sys_timer routine in sys.s.

#include "timer.h"

#define interval 10000000 // cycles; about 1 second in qemu.

void timer_init()
{
  // each CPU has a separate source of timer interrupts.
  int id = r_mhartid();

  // ask the CLINT for a timer interrupt.
  *(reg_t*)CLINT_MTIMECMP(id) = *(reg_t*)CLINT_MTIME + interval;

  // set the machine-mode trap handler.
  w_mtvec((reg_t)sys_timer);

  // enable machine-mode interrupts.
  w_mstatus(r_mstatus() | MSTATUS_MIE);

  // enable machine-mode timer interrupts.
  w_mie(r_mie() | MIE_MTIE);
}

The sys_timer routine in sys.s uses the privileged csrr instruction to copy the mepc register (which holds the address of the interruption point) into a0 and mcause into a1, as the arguments to timer_handler(). After timer_handler() returns, the value it returns in a0 is written back to mepc, and mret returns to the interruption point.

sys_timer:
	# call the C timer_handler(reg_t epc, reg_t cause)
	csrr	a0, mepc
	csrr	a1, mcause
	call	timer_handler

	# timer_handler will return the return address via a0.
	csrw	mepc, a0

	mret # back to interrupt location (pc=mepc)

Note that the RISC-V privileged architecture defines three execution modes: machine mode, supervisor mode, and user mode.

All TinyOS tutorials run in machine mode; supervisor mode and user mode are not used.

When an interrupt is taken in machine mode, the hardware automatically saves the current program counter by performing mepc=pc.

When sys_timer executes mret, the hardware will execute the action of pc=mepc, and then jump back to the original interruption point to continue execution. (As if nothing happened)
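The two hardware actions, mepc=pc on entry and pc=mepc on mret, can be modelled with a tiny struct. This is a deliberately simplified sketch with hypothetical names (sim_hart, sim_trap_enter, sim_mret). It assumes mtvec is in direct mode and leaves out the mstatus updates that real hardware also performs.

```c
#include <stdint.h>

/* Only the registers needed to follow the control flow. */
struct sim_hart {
    uint32_t pc;
    uint32_t mepc;
    uint32_t mtvec;
    uint32_t mcause;
};

/* Trap entry: remember where we were, record why, jump to the handler. */
static void sim_trap_enter(struct sim_hart *h, uint32_t cause)
{
    h->mepc = h->pc;     /* hardware performs mepc = pc */
    h->mcause = cause;
    h->pc = h->mtvec;    /* direct mode: handler address is mtvec itself */
}

/* mret: go back to the address stored in mepc. */
static void sim_mret(struct sim_hart *h)
{
    h->pc = h->mepc;     /* hardware performs pc = mepc */
}
```

If a handler overwrites mepc before mret, execution resumes at the new address instead. That is the hook a scheduler can use to switch tasks.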

I’ve provided a basic overview of the RISC-V interrupt mechanism. However, to gain a deeper understanding of the process, it’s crucial to understand the machine mode-related privilege registers of the RISC-V processor, including mhartid (processor core identifier), mstatus (status register), mie (interrupt enable register), and more.

#define interval 10000000 // cycles; about 1 second in qemu.

void timer_init()
{
  // each CPU has a separate source of timer interrupts.
  int id = r_mhartid();

  // ask the CLINT for a timer interrupt.
  *(reg_t*)CLINT_MTIMECMP(id) = *(reg_t*)CLINT_MTIME + interval;

  // set the machine-mode trap handler.
  w_mtvec((reg_t)sys_timer);

  // enable machine-mode interrupts.
  w_mstatus(r_mstatus() | MSTATUS_MIE);

  // enable machine-mode timer interrupts.
  w_mie(r_mie() | MIE_MTIE);
}

In addition, it is required to understand the memory mapping area in the RISC-V QEMU virtual machine, such as CLINT_MTIME, CLINT_MTIMECMP, etc.

The timer interrupt mechanism of RISC-V compares the two values CLINT_MTIME and CLINT_MTIMECMP. When CLINT_MTIME reaches or exceeds CLINT_MTIMECMP, an interrupt occurs.

Therefore, the timer_init() function contains the following statement:

 *(reg_t*)CLINT_MTIMECMP(id) = *(reg_t*)CLINT_MTIME + interval;

This statement sets the time of the first interrupt.

Similarly, timer_handler in timer.c must also set the next interrupt time, as shown in the code below.

reg_t timer_handler(reg_t epc, reg_t cause)
{
  reg_t return_pc = epc;
  // disable machine-mode timer interrupts.
  w_mie(~((~r_mie()) | (1 << 7)));
  lib_printf("timer_handler: %d\n", ++timer_count);
  int id = r_mhartid();
  *(reg_t *)CLINT_MTIMECMP(id) = *(reg_t *)CLINT_MTIME + interval;
  // enable machine-mode timer interrupts.
  w_mie(r_mie() | MIE_MTIE);
  return return_pc;
}

In this way, once CLINT_MTIME reaches the new CLINT_MTIMECMP value, the interrupt occurs again.

In this episode of TinyOS, we looked at the process of generating TimerInterrupts. This is a big step toward implementing preemptive multitasking. In the next episode, we will look specifically at that!
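To see the re-arming pattern in isolation, here is a small host-side simulation. It is only a sketch: sim_clint and the sim_* functions are hypothetical names, the two registers are plain struct fields rather than memory-mapped hardware, and time advances in fixed steps instead of continuously.

```c
#include <stdint.h>

#define INTERVAL 10000000ull /* same value as `interval` in timer.c */

/* Stand-ins for the two memory-mapped CLINT registers. */
struct sim_clint {
    uint64_t mtime;
    uint64_t mtimecmp;
};

/* Like timer_init(): schedule the first interrupt. */
static void sim_timer_init(struct sim_clint *c)
{
    c->mtimecmp = c->mtime + INTERVAL;
}

/* The interrupt is pending once mtime has reached mtimecmp. */
static int sim_timer_pending(const struct sim_clint *c)
{
    return c->mtime >= c->mtimecmp;
}

/* Advance mtime by `ticks` in increments of `step`.
 * Each time the timer fires, re-arm it the way timer_handler() does.
 * Returns how many times the timer fired. */
static unsigned sim_advance(struct sim_clint *c, uint64_t ticks, uint64_t step)
{
    unsigned fired = 0;

    if (step == 0)
        return 0;
    for (uint64_t t = 0; t < ticks; t += step) {
        c->mtime += step;
        if (sim_timer_pending(c)) {
            fired++;
            c->mtimecmp = c->mtime + INTERVAL; /* schedule the next one */
        }
    }
    return fired;
}
```

If the re-arm line is removed, the pending condition stays true after the first match, which on real hardware would mean the interrupt fires again immediately after every return.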

TinyOS🐞: Context-Switch

2023-08-28T00:00:00-07:00

In the previous episode HelloWorld of TinyOS🐞, we discussed how to print strings to the UART serial port for a specific processor on QEMU that utilises RISC-V architecture. This episode takes us further into the operating system territory, introducing the concept of “Context-Switching”.

Main File (os.c)

This time, in addition to what we had earlier, the main file defines a task function, user_task0, and the context structures used to switch to it.

You can find the complete code os.c from here.

#include "os.h"

#define STACK_SIZE 1024
uint8_t task0_stack[STACK_SIZE];
struct context ctx_os;
struct context ctx_task;

extern void sys_switch(struct context *old, struct context *new);

void user_task0(void)
{
	lib_puts("Task0: Context Switch Success !\n");
	while (1) {} // stop here.
}

int os_main(void)
{
	lib_puts("OS start\n");
	ctx_task.ra = (reg_t) user_task0;
	ctx_task.sp = (reg_t) &task0_stack[STACK_SIZE-1];
	sys_switch(&ctx_os, &ctx_task);
	return 0;
}

The task is simply a function, user_task0 in the main file. To switch to it, we set ctx_task.ra to the address of user_task0. Because ra is the return address register, the ret instruction copies ra into the program counter (pc), so execution jumps to user_task0 when sys_switch executes ret.

	ctx_task.ra = (reg_t) user_task0;
	ctx_task.sp = (reg_t) &task0_stack[STACK_SIZE-1];
	sys_switch(&ctx_os, &ctx_task);

However, each task needs its own stack space to make function calls in C. We therefore allocate a stack for task0 and set ctx_task.sp to point at the high end of that array, since the stack grows downward.
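The tutorial code points sp at the last byte of the array. A common variant, shown here only as a sketch, starts from the address one past the end of the array and rounds it down to a 16-byte boundary, which is the stack alignment the standard RISC-V calling convention expects at function entry. The names demo_stack and stack_top_aligned are hypothetical.

```c
#include <stdint.h>

#define STACK_SIZE 1024

static uint8_t demo_stack[STACK_SIZE];

/* Returns the highest 16-byte-aligned address that is not above the
 * end of the stack array. The stack grows downward from this address. */
static uintptr_t stack_top_aligned(uint8_t *stack, uintptr_t size)
{
    uintptr_t top = (uintptr_t)(stack + size);
    return top & ~(uintptr_t)0xF;
}
```

With this helper, the assignment would read ctx_task.sp = (reg_t) stack_top_aligned(task0_stack, STACK_SIZE). This is an assumption about how one might adapt the code, not what the repository does.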

System Switch function

Then we can use sys_switch(&ctx_os, &ctx_task) to switch from the main program to task0. sys_switch is an assembly-language function located in sys.s, with the following content:

# Context switch
#
#   void sys_switch(struct context *old, struct context *new);
# 
# Save current registers in old. Load from new.

.globl sys_switch
.align 4
sys_switch:
        ctx_save a0  # a0 => struct context *old
        ctx_load a1  # a1 => struct context *new
        ret          # pc=ra; swtch to new task (new->ra)

In RISC-V, function arguments are placed in the argument registers a0, a1, …, a7. When there are more than eight arguments, the rest are passed on the stack.

The C language function corresponding to sys_switch is as follows:

void sys_switch(struct context *old, struct context *new);

In the above program, a0 corresponds to old value (the context of the old task), and a1 corresponds to new value (the context of the new task). The function of the entire sys_switch is to store the context of the old task, and then load the context of the new task to start execution.

The last ret instruction is very important, because when the context of the new task is loaded, the ra register will also be loaded, so when ret is executed, it will set pc=ra, and then jump to the new task (such as void user_task0 (void)) that needs to be executed next.

ctx_save and ctx_load in sys_switch are two assembly macros, which are defined as follows:

# ============ MACRO ==================
.macro ctx_save base
        sw ra, 0(\base)
        sw sp, 4(\base)
        sw s0, 8(\base)
        sw s1, 12(\base)
        sw s2, 16(\base)
        sw s3, 20(\base)
        sw s4, 24(\base)
        sw s5, 28(\base)
        sw s6, 32(\base)
        sw s7, 36(\base)
        sw s8, 40(\base)
        sw s9, 44(\base)
        sw s10, 48(\base)
        sw s11, 52(\base)
.endm

.macro ctx_load base
        lw ra, 0(\base)
        lw sp, 4(\base)
        lw s0, 8(\base)
        lw s1, 12(\base)
        lw s2, 16(\base)
        lw s3, 20(\base)
        lw s4, 24(\base)
        lw s5, 28(\base)
        lw s6, 32(\base)
        lw s7, 36(\base)
        lw s8, 40(\base)
        lw s9, 44(\base)
        lw s10, 48(\base)
        lw s11, 52(\base)
.endm
# ============ Macro END   ==================

RISC-V must save ra, sp, and the callee-saved registers s0 to s11 when switching between tasks. The above code is adapted from the xv6 teaching operating system kernel and modified for 32-bit RISC-V.

Struct for Register Contents

In riscv.h header file, we have to define corresponding struct for context related registers.

// Saved registers for kernel context switches.
struct context {
  reg_t ra;
  reg_t sp;

  // callee-saved
  reg_t s0;
  reg_t s1;
  reg_t s2;
  reg_t s3;
  reg_t s4;
  reg_t s5;
  reg_t s6;
  reg_t s7;
  reg_t s8;
  reg_t s9;
  reg_t s10;
  reg_t s11;
};
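The field order in this struct has to match the offsets used by ctx_save and ctx_load (0, 4, 8, ... 52). One way to guard against the two drifting apart is a compile-time check, sketched below. It assumes reg_t is a 32-bit type, as on RV32, and it redefines the struct locally so the snippet stands alone. ctx_offset is a hypothetical helper.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t reg_t; /* assumption: 32-bit registers, as on RV32 */

struct context {
    reg_t ra;
    reg_t sp;
    reg_t s0, s1, s2, s3, s4, s5, s6, s7, s8, s9, s10, s11;
};

/* Byte offset the assembly macros use for the field at position `index`. */
static size_t ctx_offset(size_t index)
{
    return index * sizeof(reg_t);
}

/* These fail the build if the layout stops matching sys.s. */
_Static_assert(offsetof(struct context, ra) == 0, "ra must be at offset 0");
_Static_assert(offsetof(struct context, sp) == 4, "sp must be at offset 4");
_Static_assert(offsetof(struct context, s0) == 8, "s0 must be at offset 8");
_Static_assert(offsetof(struct context, s11) == 52, "s11 must be at offset 52");
_Static_assert(sizeof(struct context) == 56, "context must be 14 words");
```

Because the checks run at compile time, a reordered field shows up as a build error rather than as a corrupted register after a context switch.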

Now from the main file, we need to set the task pointers to the ra and sp, and we can use the sys_switch function to smoothly switch from os_main to user_task0.

int os_main(void)
{
	lib_puts("OS start\n");
	ctx_task.ra = (reg_t) user_task0;
	ctx_task.sp = (reg_t) &task0_stack[STACK_SIZE-1];
	sys_switch(&ctx_os, &ctx_task);
	return 0;
}

Execute with QEMU

You can run the simulation on QEMU with the archfx/rvutils:qemu Docker container, mounted with the tinyos repo, by following the steps below:

cd 03-ContextSwitch 
make 

riscv32-unknown-elf-gcc -nostdlib -fno-builtin -mcmodel=medany -march=rv32ima -mabi=ilp32 -T os.ld -o os.elf start.s sys.s lib.c os.c

make qemu

Press Ctrl-A and then X to exit QEMU
qemu-system-riscv32 -nographic -smp 4 -machine virt -bios none -kernel os.elf
OS start
Task0: Context Switch Success !
QEMU: Terminated

We looked at the basic details about the implementation of the “Context-Switch” mechanism within the RISC-V architecture. This method showcases how tasks are managed and their execution contexts transitioned, contributing to the overall functionality and efficiency of the system.

Docker for Research🐳

2023-08-28T00:00:00-07:00

Lots of researchers have been curious about my Docker setups. Docker is actually a super handy tool for research. So, in this article, I’ll walk you through the basics of Docker for engineers/researchers.

Docker has revolutionized the way we develop, ship, and run applications by encapsulating them in lightweight, portable containers. In this article, we’ll take you through the essential steps of working with Docker, from installation to building and running Docker images. We’ll also explore how to effectively mount a local folder into a Docker image.

Installing Docker

To get started, let’s set up Docker on your system. The installation process may vary depending on your operating system. For instance, on a Linux system, you can use the package manager to install Docker. On Windows and macOS, you’ll use Docker Desktop. After installation, ensure that the Docker daemon is up and running.

AI Generated Image : credits to deepAI

Building a Docker Image

Creating a Docker image is the heart of containerization. A Docker image is a standalone executable package that includes everything required to run a piece of software, including the code, runtime, system tools, and libraries.

Dockerfile: To build a Docker image, you define its configuration using a text file called a Dockerfile. This file specifies the base image, sets up the environment, copies files into the image, and more.

# Dockerfile example
FROM python:3.9
WORKDIR /app
COPY . /app
RUN pip install -r requirements.txt
CMD ["python", "app.py"]

Building the Image: Use the docker build command along with the path to the directory containing the Dockerfile. This command processes the Dockerfile and creates an image with the defined configurations.

docker build -t my-python-app .

You can find more example Dockerfiles for creating environments in my docker-decks repository.

Dockerfile Commands

In a Dockerfile, commands are used to define the step-by-step instructions for creating a Docker image. Each command performs a specific action that contributes to building the image. Here’s an explanation of some commonly used commands in a Dockerfile:

FROM: This command specifies the base image upon which your image will be built. It’s the starting point for your image’s filesystem. For example, FROM python:3.9 sets the base image to Python version 3.9.
WORKDIR: This command sets the working directory for subsequent commands in the Dockerfile. All paths specified in subsequent commands will be relative to this directory. For instance, WORKDIR /app sets the working directory to /app.
COPY / ADD: These commands copy files from the host machine to the image. COPY is preferred for simple copying, while ADD has additional features like unpacking tar archives and downloading files from URLs. Example: COPY . /app.
RUN: The RUN command executes a command in the image during the build process. This is often used to install software, update packages, and perform other setup tasks. Example: RUN pip install -r requirements.txt.
CMD: The CMD command specifies the default command to run when a container is started from the image. This is typically the command that your application needs to run. It’s worth noting that CMD can be overridden when running a container by providing a command as an argument. Example: CMD ["python", "app.py"].
EXPOSE: This command documents the port(s) that the container will listen on at runtime. It doesn’t actually publish the ports. It’s a helpful hint to the user about the container’s expected network behavior. Example: EXPOSE 80.
ENTRYPOINT: Similar to CMD, ENTRYPOINT specifies the command to run when the container starts. However, unlike CMD, it is not replaced by a command given to docker run; those arguments are appended to the ENTRYPOINT instead, and it can only be replaced with the --entrypoint flag. It’s useful for specifying a default behavior that can still be customized. Example: ENTRYPOINT ["python", "app.py"].
ENV: The ENV command sets environment variables within the image. These variables can be accessed by any process running in the container. Example: ENV MY_VARIABLE=value.

These are just a few of the key commands you’ll encounter in Dockerfiles. Understanding and effectively using these commands enables you to create well-structured and efficient Docker images tailored to your application’s needs.

Running a Docker Image

Once you have a Docker image, you can run it in a container. Containers are instances of Docker images that can be executed independently.

Running an Image: Use the docker run command followed by the image name to start a container. You can also specify additional options like port mapping, environment variables, and more.
```
docker run -p 8080:80 my-python-app
```
Interactive Mode: For interactive tasks, you can launch a container in interactive mode. This allows you to enter commands directly into the container’s terminal.
```
docker run -it my-python-app /bin/bash
```

Mounting a Folder into a Docker Image

Mounting a local folder into a Docker container is a common practice for development and testing. It allows you to update code in real-time without rebuilding the entire image.

Volume Mounting: Use the -v or --volume flag with docker run to specify a folder on your host machine to be mounted into the container. This creates a shared space between the host and the container.
```
docker run -v /path/on/host:/path/in/container my-python-app
```
Bind Mounting: When the source given to -v is a host path, as in these examples, Docker creates a bind mount that maps the host directory onto a container directory. Changes made on either side are immediately reflected on the other.
```
docker run -v /path/on/host:/path/in/container -it my-python-app /bin/bash
```

Conclusion

Docker has streamlined the process of managing and deploying applications, making them more portable and consistent across different environments. By understanding the installation process, building images, running containers, and utilizing volume mounts, you’ll be well-equipped to harness the power of Docker for your development/research needs. Start exploring and witness the magic of containerization in action!

TinyOS🐞: Operating System Tutorial

2023-08-26T00:00:00-07:00

TinyOS🐞 is a tutorial series on implementing a minimal operating system kernel, based on the comprehensive tutorial series mini-riscv-os. The kernel targets the RISC-V instruction set architecture. Credit goes to the original authors of mini-riscv-os. A fully built environment is available as a Docker image. This tutorial covers several chapters on implementing an operating system from the beginning.

Requirements

To complete the tutorial you need only two things

Beyond the technical requirements, in order to understand the concepts, I highly recommend looking at my firmware tutorial series before starting on this one.

Set Up the Environment on Docker

A fully built Docker environment with all the requirements installed, including

gcc toolchain
qemu with RISC-V simulator

can be found in the following Docker Hub repository.

You can follow the instructions below to set it up on your Docker installation.

Pull the Docker image
```
docker pull archfx/rvutils:qemu 
```

Clone this repository

git clone https://github.com/Archfx/tinyos

Mount the repo to the docker container

docker run -t -p 6080:6080 -v "${PWD}/:/tinyos" -w /tinyos --name rvutils archfx/rvutils:qemu

Open the docker environment in another terminal
```
docker exec -it rvutils /bin/bash
```

Chapters

This tutorial series contains the following chapters.

HelloWorld: Use UART to print text to the terminal
ContextSwitch: Basic switch from OS to user task
MultiTasking: Two user tasks are interactively switching
TimerInterrupt: Enable SysTick for future scheduler implementation
Preemptive: Basic preemptive scheduling
Spinlock: Lock implementation to protect critical sections
ExterInterrupt: Learning PLIC & external interruption
BlockDeviceDriver: Learning VirtIO Protocol & Device driver implementation
MemoryAllocator: Understanding how to write the linker script & how the heap works
SystemCall: Invoking a mini ecall from machine mode.

Building and Simulation

Instead of following the tutorial step by step, you can directly compile and run the simulation from the repository. To do that, navigate to the chapter folder and use the following commands from inside the Docker instance of archfx/rvutils:qemu

make # Build the OS
make qemu # Simulate the OS

Note: Press Ctrl-A and then X to exit QEMU


TinyOS🐞: HelloWorld

2023-08-26T00:00:00-07:00

Welcome to the first and simplest episode of TinyOS🐞. In this episode we will be looking at memory map settings of QEMU and writing a simple HelloWorld program. If you missed the introduction article about TinyOS🐞, you can find it here.

Main File (os.c)

Let’s start with the core code related to the HelloWorld. You can find the file here.

#include <stdint.h>

#define UART        0x10000000
#define UART_THR    (volatile uint8_t*)(UART+0x00) // THR:transmitter holding register
#define UART_LSR    (volatile uint8_t*)(UART+0x05) // LSR:line status register
#define UART_LSR_EMPTY_MASK 0x40          // LSR Bit 6: Transmitter empty; both the THR and shift register are empty

int lib_putc(char ch) {
	while ((*UART_LSR & UART_LSR_EMPTY_MASK) == 0);
	return *UART_THR = ch;
}

void lib_puts(char *s) {
	while (*s) lib_putc(*s++);
}

int os_main(void)
{
	lib_puts("Hello World!\n");
	while (1) {}
	return 0;
}

The RISC-V virtual machine provided by QEMU is called virt. Its UART is memory-mapped starting at 0x10000000, with the following registers:

UART MemoryMapped IO

0x10000000	THR	Transmitter Holding Register
0x10000000	RHR	Receive Holding Register
0x10000001	IER	Interrupt Enable Register
0x10000002	ISR	Interrupt Status Register
0x10000003	LCR	Line Control Register
0x10000004	MCR	Modem Control Register
0x10000005	LSR	Line Status Register
0x10000006	MSR	Modem Status Register
0x10000007	SPR	Scratch Pad Register

To print a character we only need to write it to the THR of the UART. Before writing, however, we must check that bit 6 of the LSR is 1, which means the UART transmitter is empty and ready to accept the next character.

LSR Bit 6: Transmitter empty; both the THR and shift register are empty if this is set.

So we write the following function to send a character to the UART, which passes it on to the host. (An embedded system usually has no display device of its own, so the output is sent back to the host for display.)

int lib_putc(char ch) {
	while ((*UART_LSR & UART_LSR_EMPTY_MASK) == 0);
	return *UART_THR = ch;
}

Once a single character can be printed, a whole string can be printed with the following lib_puts(s).

void lib_puts(char *s) {
	while (*s) lib_putc(*s++);
}

So our main program calls lib_puts to print Hello World!

int os_main(void)
{
	lib_puts("Hello World!\n");
	while (1) {}
	return 0;
}

Although our main program is only a short 22 lines, the 01-HelloOs project includes not only the main program, but also the startup program start.s, the link file os.ld, and the configuration file Makefile.

Project build configuration file Makefile

The Makefiles of the individual mini-riscv-os chapters are all similar; the following is the Makefile of 01-HelloOs.

CC = riscv32-unknown-elf-gcc
CFLAGS = -nostdlib -fno-builtin -mcmodel=medany -march=rv32ima -mabi=ilp32

QEMU = qemu-system-riscv32
QFLAGS = -nographic -smp 4 -machine virt -bios none

OBJDUMP = riscv32-unknown-elf-objdump

all: os.elf

os.elf: start.s os.c
	$(CC) $(CFLAGS) -T os.ld -o os.elf $^

qemu: os.elf
	@qemu-system-riscv32 -M ? | grep virt >/dev/null || exit
	@echo "Press Ctrl-A and then X to exit QEMU"
	$(QEMU) $(QFLAGS) -kernel os.elf

clean:
	rm -f *.elf

Some of the Makefile syntax is not easy to understand, especially the following symbols:

$@ : the target file of the rule
$< : the first prerequisite (dependency) in the prerequisite list
$^ : all prerequisites in the prerequisite list
$? : the prerequisites that are newer than the target file
$* : the stem matched by a pattern rule, that is, the target name without its file extension
?= : conditional assignment; the variable is assigned only if it is not already defined
:= : simple assignment; the right-hand side is expanded once, at the point of definition, instead of every time the variable is used (as happens with =)

So the following two lines in the above Makefile:

os.elf: start.s os.c
	$(CC) $(CFLAGS) -T os.ld -o os.elf $^

The $^ in it is replaced by start.s os.c, so the entire line $(CC) $(CFLAGS) -T os.ld -o os.elf $^ becomes the following command.

riscv32-unknown-elf-gcc -nostdlib -fno-builtin -mcmodel=medany -march=rv32ima -mabi=ilp32 -T os.ld -o os.elf start.s os.c

In the Makefile, we use riscv32-unknown-elf-gcc to compile and then qemu-system-riscv32 to execute. A complete build and run session looks as follows:

$ cd 01-HelloWorld
$ make clean

rm -f *.elf

$ make

riscv32-unknown-elf-gcc -nostdlib -fno-builtin -mcmodel=medany -march=rv32ima -mabi=ilp32 -T os.ld -o os.elf start.s os.c

$ make qemu

Press Ctrl-A and then X to exit QEMU
qemu-system-riscv32 -nographic -smp 4 -machine virt -bios none -kernel os.elf
Hello World!
QEMU: Terminated

First use make clean to remove the previous build output, then use make to invoke riscv32-unknown-elf-gcc and build the project. The complete compile command is:

 riscv32-unknown-elf-gcc -nostdlib -fno-builtin -mcmodel=medany -march=rv32ima -mabi=ilp32 -T os.ld -o os.elf start.s os.c

Here, -march=rv32ima means that we want to generate code for the 32-bit instruction set with the I, M and A extensions:

I: Base integer instruction set (Integer)
M: Integer multiplication and division (Multiply)
A: Atomic instructions (Atomic)
C: 16-bit compressed instructions (Compressed). Note: we did not add C, so the generated machine code consists purely of 32-bit instructions, none of them compressed to 16 bits, because we want every instruction to have the same length from beginning to end.

And -mabi=ilp32 selects the ABI, that is, the data model of the generated object code: int, long and pointers are all 32 bits wide.

ilp32: int, long, and pointers are all 32-bits long. long long is a 64-bit type, char is 8-bit, and short is 16-bit.
lp64: long and pointers are 64-bits long, while int is a 32-bit type. The other types remain the same as ilp32.

There is also the -mcmodel=medany parameter, which selects the medium-any code model:

-mcmodel=medany: Generate code for the medium-any code model. The program and its statically defined symbols must be within any single 2 GiB address range. Programs can be statically or dynamically linked.

More detailed RISC-V gcc parameters can be found from here

In addition, -nostdlib tells the linker not to link the standard startup files and libraries (because this is an embedded system, the library usually has to be self-made), and -fno-builtin tells the compiler not to replace calls to standard functions with its built-in versions. See here for more details.

Linker script (os.ld)

The -T os.ld parameter specifies os.ld as the linker script, shown below. (A linker script describes how to place the program's TEXT segment, its DATA segment and its uninitialized BSS segment in memory.)

OUTPUT_ARCH( "riscv" )

ENTRY( _start )

MEMORY
{
  ram   (wxa!ri) : ORIGIN = 0x80000000, LENGTH = 128M
}

PHDRS
{
  text PT_LOAD;
  data PT_LOAD;
  bss PT_LOAD;
}

SECTIONS
{
  .text : {
    PROVIDE(_text_start = .);
    *(.text.init) *(.text .text.*)
    PROVIDE(_text_end = .);
  } >ram AT>ram :text

  .rodata : {
    PROVIDE(_rodata_start = .);
    *(.rodata .rodata.*)
    PROVIDE(_rodata_end = .);
  } >ram AT>ram :text

  .data : {
    . = ALIGN(4096);
    PROVIDE(_data_start = .);
    *(.sdata .sdata.*) *(.data .data.*)
    PROVIDE(_data_end = .);
  } >ram AT>ram :data

  .bss :{
    PROVIDE(_bss_start = .);
    *(.sbss .sbss.*) *(.bss .bss.*)
    PROVIDE(_bss_end = .);
  } >ram AT>ram :bss

  PROVIDE(_memory_start = ORIGIN(ram));
  PROVIDE(_memory_end = ORIGIN(ram) + LENGTH(ram));
}

Startup program (start.s)

In addition to the main program, an embedded system usually needs a startup program written in assembly language. The content of the startup program start.s in 01-HelloOs is as follows. (Its main job is to let only hart 0 run while the other harts are asleep, which makes things simpler because we do not need to consider too many parallel processing issues.)

.equ STACK_SIZE, 8192

.global _start

_start:
    # setup stacks per hart
    csrr t0, mhartid                # read current hart id
    slli t0, t0, 13                 # hart id * 8192 (STACK_SIZE = 1 << 13)
    la   sp, stacks + STACK_SIZE    # set the initial stack pointer 
                                    # to the end of the stack space
    add  sp, sp, t0                 # move the current hart stack pointer
                                    # to its place in the stack space

    # park harts with id != 0
    csrr a0, mhartid                # read current hart id
    bnez a0, park                   # if we're not on the hart 0
                                    # we park the hart

    j    os_main                    # hart 0 jump to c

park:
    wfi
    j park

stacks:
    .skip STACK_SIZE * 4            # allocate space for the harts stacks

Execute with QEMU

When you enter make qemu, make executes the following command:

qemu-system-riscv32 -nographic -smp 4 -machine virt -bios none -kernel os.elf

This runs the os.elf kernel file with qemu-system-riscv32. -bios none means that no BIOS (basic input/output system) firmware is loaded, -nographic disables graphical output, -smp 4 gives the machine four harts, and -machine virt selects virt, the RISC-V virtual machine provided by QEMU.

So when you enter make qemu, you will see the following screen!

$ make qemu

Press Ctrl-A and then X to exit QEMU
qemu-system-riscv32 -nographic -smp 4 -machine virt -bios none -kernel os.elf
Hello World!
QEMU: Terminated

This is what the simplest Hello World! program in the TinyOS series looks like.