(Note: this and the following posts use the “nanoexec” code located at http://github.com/imagecraft and in the JumpStart C for Cortex-M demo install. However, for brevity, some details and data structure fields are omitted.)

(The second and last part is now posted here.)

Writing a basic multitasking executive is a fairly simple exercise. The basic concept is that a CPU runs a "task", which is a sequence of code (e.g. a function running an infinite loop), and at some point switches to another task. If task switching (also known as context switching) occurs only when a task explicitly yields control of the CPU, it is called "cooperative multitasking". If task switching happens periodically, i.e. through some kind of timer interrupt, it is known as "preemptive multitasking". Nominally, preemptive multitasking may be slower than cooperative multitasking, but in the world of 32-bit CPUs, there is really no reason not to use preemptive multitasking.
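To make the cooperative case concrete, here is a minimal sketch in portable C that runs on any desktop compiler. It is not nanoexec code: each "task" is a hypothetical function that does one slice of work and then returns, which stands in for explicitly yielding the CPU. The names task_fn, run_cooperative and the two counters are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical task type: a function that does one slice of work and
   returns, which stands in for "yielding" the CPU. */
typedef void (*task_fn)(void);

int blink_count;   /* how many slices task_blink has run */
int poll_count;    /* how many slices task_poll has run  */

void task_blink(void) { blink_count++; }
void task_poll(void)  { poll_count++; }

/* Run the tasks in turn for a fixed number of rounds. A real executive
   would loop forever; the bound keeps the sketch easy to try out. */
void run_cooperative(task_fn const *tasks, size_t ntasks, int rounds)
{
    assert(tasks != NULL && ntasks > 0);
    for (int r = 0; r < rounds; r++) {
        for (size_t i = 0; i < ntasks; i++) {
            tasks[i]();   /* the task runs until it returns (yields) */
        }
    }
}
```

In this scheme a task that never returns starves all the others, which is exactly the weakness that a timer-driven preemptive design avoids.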

What's the difference? A Multitasking Executive vs. a Kernel

A multitasking executive is a program that manages task switching, and a kernel is a multitasking executive that also provides other services such as timers, inter-task communications, memory management, etc.



Preserving the Task State

To preserve the state of a task so that it can be paused and resumed, you only need to preserve the CPU registers. Global objects, i.e. global variables in C, may be modified by different tasks, and are not preserved across context switching. To further localize the state of a task, each task must have its own stack for allocation of local variables and function calling.

The Task Control Block

Nanoexec uses a structure called the Task Control Block to preserve the state of a task:

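typedef struct {
    unsigned sp;     // The task's current stack pointer
    unsigned flags;  // Status flags, including activity status, etc.
    // ... other fields omitted for brevity ...
} TCB;               // type name assumed here for illustration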

You might ask: "Hey, what happened to the rest of the CPU registers?" It turns out that it is very easy to push the CPU registers onto the task’s stack during context switching! Therefore, it is sufficient to store just the stack pointer in the TCB, as the rest of the CPU register set is stored on the stack.


Cortex-M devices (e.g. M3, M4, and M7 devices) have two stack pointers: the Master Stack Pointer (MSP) and the Process Stack Pointer (PSP), of which only one is used by the system as the current stack pointer at any given time. (Note that the number of stack pointers has nothing to do with the tasks’ stacks, per se.)

There are different ways of using the two stack pointers. For a multitasking system, the most straightforward way is to use the MSP for interrupt processing and running kernel code, and use the PSP when tasks are running.

Using the SYSTICK Timer to Provide Portable Timer Interrupts

The Cortex-M architecture defines a SYSTICK timer which most (if not all) Cortex-M devices implement. Thus, a Cortex-M executive can be written portably using the SYSTICK timer as the source of the periodic interrupt, without relying on device-specific timers.

Typically, a rate of one interrupt per millisecond is used for multitasking executives: it’s fast enough to give reasonable responsiveness to the system, but slow enough not to add too much kernel overhead.
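As a rough illustration of the arithmetic behind a 1 ms tick: SYSTICK counts down from a 24-bit reload value to zero, so the reload value for a given tick rate is the counter clock frequency divided by the tick rate, minus one. The sketch below only computes that number and runs on any C compiler. The function name and the 48 MHz clock used in the example are assumptions, and writing the result to the actual SYSTICK registers is left out.

```c
#include <assert.h>
#include <stdint.h>

#define SYSTICK_MAX_RELOAD 0x00FFFFFFu   /* the reload register is 24 bits wide */

/* Hypothetical helper: reload value giving `tick_hz` interrupts per second
   from a counter clocked at `clock_hz`. Returns 0 when no usable value
   exists (tick rate too high for the clock, or too low for 24 bits). */
uint32_t systick_reload_for(uint32_t clock_hz, uint32_t tick_hz)
{
    assert(tick_hz != 0);
    if (clock_hz < tick_hz) {
        return 0;
    }
    uint32_t cycles = clock_hz / tick_hz;   /* clock cycles per tick */
    if (cycles - 1u > SYSTICK_MAX_RELOAD) {
        return 0;
    }
    return cycles - 1u;   /* the counter runs from reload down to 0 */
}
```

For example, systick_reload_for(48000000u, 1000u) gives 47999 for a 48 MHz clock and a 1 ms tick.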

Saving A Task’s Context

When an interrupt happens, the Cortex-M CPU automatically pushes the registers R0 to R3, R12, LR (Link Register, or R14), PC (Program Counter, or R15), and the program status register (xPSR) onto the stack in use at that moment, which is the stack pointed to by the PSP when a task is running. LR is then loaded with an exception code indicating whether the MSP or the PSP was in use at the time of the interrupt, and the MSP becomes the stack pointer, since the CPU is now running interrupt handler (and kernel) code.
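The automatically pushed frame can be pictured as an array of eight 32-bit words: R0 to R3, R12, LR, PC, and the program status register (xPSR), with R0 at the lowest address. The sketch below covers only the basic case without floating-point state. The enum and function names are made up for illustration and are not taken from nanoexec.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Word offsets within the basic hardware-stacked frame, lowest address
   first. Names are hypothetical. */
enum {
    FRAME_R0, FRAME_R1, FRAME_R2, FRAME_R3,
    FRAME_R12,
    FRAME_LR,     /* R14 of the interrupted code */
    FRAME_PC,     /* where the interrupted code resumes */
    FRAME_XPSR,   /* program status register */
    FRAME_WORDS   /* total number of words in the frame (8) */
};

/* Given the stack pointer value right after the hardware push, return
   the address at which the interrupted code will resume. */
uint32_t interrupted_pc(const uint32_t *stacked_sp)
{
    assert(stacked_sp != NULL);
    return stacked_sp[FRAME_PC];
}
```

On the target, stacked_sp would be the PSP value read inside the handler; here it can be any array of eight words.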

To exit an interrupt handler, make sure the PSP holds the value it had at the time of the interrupt and LR holds the exception code; the instruction “BX LR” then returns control to the interrupted code.
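The exception code held in LR is usually called EXC_RETURN. As a sketch of how a handler could tell which stack the interrupted code was using: on cores without floating-point or security extensions there are three values, 0xFFFFFFF1, 0xFFFFFFF9 and 0xFFFFFFFD, and bit 2 is set when the PSP was in use. The macro and function names below are illustrative only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EXC_RETURN_HANDLER_MSP 0xFFFFFFF1u  /* return to another handler, MSP */
#define EXC_RETURN_THREAD_MSP  0xFFFFFFF9u  /* return to thread code, MSP */
#define EXC_RETURN_THREAD_PSP  0xFFFFFFFDu  /* return to thread code, PSP */

#define EXC_RETURN_SPSEL_BIT   (1u << 2)    /* set: the frame is on the PSP stack */

/* True if the interrupted code was running on the process stack. */
bool exc_return_used_psp(uint32_t exc_return)
{
    /* sanity check: EXC_RETURN values start with 0xFFFFFF */
    assert((exc_return & 0xFFFFFF00u) == 0xFFFFFF00u);
    return (exc_return & EXC_RETURN_SPSEL_BIT) != 0;
}
```

With the MSP/PSP split described earlier, an interrupt taken while a task is running would be expected to yield 0xFFFFFFFD.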

Context Switch

Using SYSTICK interrupts, the basic technique of context switching is as follows: in the SYSTICK interrupt handler, you push the rest of the CPU registers (R4 to R11 and LR) onto the task’s stack, then store the resulting PSP, which is the task’s R13, in the task’s TCB (this is the “sp” field of the TCB above). Note that this LR contains the exception return code, whereas the LR automatically pushed by the CPU contains the value of R14/LR of the interrupted code.

To resume a task at the end of the SYSTICK interrupt handler, after a new task is selected, you load the PSP from the “sp” field of that task’s TCB, restore R4 to R11 and LR from the task’s stack, then execute the instruction “BX LR”, and execution resumes where that task was interrupted.
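The save and restore steps can be tried out on a desktop compiler with a simulated CPU. This is only a model of the idea, not the nanoexec handler, which has to be written with Cortex-M instructions: the sim_cpu and sim_tcb types and all function names are invented for the sketch, and the stack pointer is kept as a C pointer rather than the "unsigned sp" of the real TCB so that the code also works on a 64-bit host.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SW_SAVED_WORDS 9   /* R4..R11 plus LR (the exception return code) */

typedef struct {
    uint32_t *sp;          /* saved stack pointer of a paused task */
} sim_tcb;

/* The part of the CPU state that the handler must save by hand. */
typedef struct {
    uint32_t r4_r11[8];
    uint32_t lr;           /* exception return code */
    uint32_t *psp;         /* process stack pointer; the stack grows downward */
} sim_cpu;

/* Push R4-R11 and LR below the current PSP, then record PSP in the TCB.
   On real hardware the automatically pushed frame sits just above. */
void save_context(sim_cpu *cpu, sim_tcb *tcb)
{
    uint32_t *sp = cpu->psp - SW_SAVED_WORDS;
    memcpy(sp, cpu->r4_r11, sizeof cpu->r4_r11);
    sp[8] = cpu->lr;
    cpu->psp = sp;
    tcb->sp = sp;
}

/* Reload PSP from the TCB and pop R4-R11 and LR back into the CPU. */
void restore_context(sim_cpu *cpu, const sim_tcb *tcb)
{
    uint32_t *sp = tcb->sp;
    memcpy(cpu->r4_r11, sp, sizeof cpu->r4_r11);
    cpu->lr = sp[8];
    cpu->psp = sp + SW_SAVED_WORDS;
}

/* Pause `from` and resume `to`. */
void context_switch(sim_cpu *cpu, sim_tcb *from, const sim_tcb *to)
{
    assert(cpu && from && to);
    save_context(cpu, from);
    restore_context(cpu, to);
}
```

Each task needs its own stack array for this to work, since the saved registers of a paused task live on that task's stack until it is resumed.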


The algorithm to select which task to run during context switching is called the scheduling algorithm. There are many different scheduling algorithms, depending on whether the kernel supports priorities, and whether it supports “real time” operations - e.g. data coming in from a port that may need immediate attention or else it may be dropped. (We will discuss scheduling in another blog post.)
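As one example of a scheduling algorithm, here is a sketch of the simplest policy, round-robin without priorities: pick the next task in the table that is ready to run. This is not necessarily what nanoexec does. The TASK_ACTIVE flag bit, the sketch_tcb type and the function name are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

#define TASK_ACTIVE 0x1u   /* hypothetical flag bit: task is ready to run */

typedef struct {
    unsigned sp;      /* the task's current stack pointer */
    unsigned flags;   /* status flags */
} sketch_tcb;

/* Return the index of the next active task after `current`, wrapping
   around the table; returns `current` if no other task is active. */
size_t pick_next_round_robin(const sketch_tcb *tasks, size_t ntasks,
                             size_t current)
{
    assert(tasks != NULL && ntasks > 0 && current < ntasks);
    for (size_t step = 1; step < ntasks; step++) {
        size_t i = (current + step) % ntasks;
        if (tasks[i].flags & TASK_ACTIVE) {
            return i;
        }
    }
    return current;
}
```

Every active task gets an equal turn under this policy, which is why it has no way to favor data that needs immediate attention; that is where priorities come in.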


This post examines the basic theory of a multitasking executive. The next few posts will look at the actual implementation of nanoexec and other issues, such as how to create new tasks.

Happy coding! :)