
2

Real-Time Operating Systems (by Damir Isovic)

Summary: The previous chapter provided a general introduction to real-time systems. This chapter discusses the operating system level support essential for the realization of real-time applications. There exists a multitude of real-time kernels1, and they provide varying levels of support with regard to real-time systems realization. We provide an overview of commercially available RTOSs, as well as dwell upon some of the academic research initiatives. The goal of this chapter is to provide the student with awareness of the characteristics of RTOSs and the available alternatives.

2.1

Learning objectives of this chapter

After reading this chapter you should be able to:
- Understand basic concepts in operating systems, such as communication, synchronization, interrupts, I/O, memory management, time management etc., with special emphasis on their real-time implications.
- Obtain a thorough knowledge regarding classes of operating systems and features commonly supported.
- Get an overall perspective of the various commercial operating systems and academic research kernels and how they realize/implement real-time properties.
- Understand the important issues to consider when choosing a real-time operating system for your development project (especially the role of application characteristics in this selection).

2.2

Concurrent task execution

In Chapter 1, we said that one of the main characteristics of real-time systems is multitasking, where several tasks compete for execution on the same CPU. Note that this is simulated parallelism: tasks are not really executed in parallel, but the operating system creates an impression of parallelism by switching the execution of tasks frequently and quickly. This is different from true parallelism, where several processing units are used simultaneously, e.g., as in multiprocessor systems. Simulated parallelism does not come without problems; we will look into some of them. Let us consider again the example with the electrical motor from Chapter 1, see Figure 1. The application consists of an electrical motor with a sensor and an actuator, user input via a throttle, and

1 We will use the terms Operating System (OS) and kernel interchangeably, even though an OS typically consists of its central part, the kernel, and additional services. Due to the simplicity of many Real-Time OSs, they do not provide any additional services, such as a file system.

a computer system (controller). The question now is how we should design the controller software. The first question we can ask ourselves is what the controller software should do. The answer is that it calculates the control values to be sent to the motor based on sensor inputs, as described in Figure 1. Hence, the software needs to periodically read the values from the sensor and the user, compute the new control values, and actuate the motor.

Figure 1: Electric engine software interaction. (The controller program reads the sensor and the joystick, and writes to the actuator.)

So, how should we structure the software? Let's try first with a simple, naive approach, which is to put all functionality in a single loop that repeats itself periodically, as described below.

void main(void){
    /* declare all variables */
    ...
    /* repeat continuously */
    while(1) {
        sensor_val = read_sensor();              /* read sensor */
        user_val   = read_user();                /* read user input */
        control(sensor_val, user_val, &signal);  /* compute new value */
        write_actuator(signal);                  /* actuate motor */
    }
}

Can you see any problems with this solution? One big problem is blocking in execution. For example, if the function read_sensor() is blocking, i.e., it does not return any result until a new sensor value has been generated, then the whole system will be blocked: the system is busy waiting. It gets worse if the function read_user() is blocking, since it waits for even rarer events generated by the user. We see that the functions in the example above are

blocking each other, despite the fact that they are independent from each other. The lack of user input should not affect the reading of the engine sensor, and vice versa. We can avoid the blocking problem above by checking in the main loop whether the values are ready to be read, by inspecting the status registers of the input ports. For simplicity, we can assume we have input ports that automatically reset their status bits when the corresponding data registers are read, a rather common case in practice.

...
while(1) {
    if(SENSOR_VALUE_READY){
        sensor_val = read_sensor();
        control(sensor_val, user_val, &signal);
        write_actuator(signal);
    }
    if(USER_VALUE_READY){
        user_val = read_user();
    }
}

This solution is better than the first one, since no blocking will occur. However, it is not resource-efficient, because the while-loop will run all the time, i.e., it will consume all CPU time regardless of how frequently the sensor values are generated: if there is a sensor value available, the corresponding action will be performed; if there is no value, the loop will proceed right away. This is a big waste of CPU time; we should be able to execute other tasks in the system when there are no sensor values ready at the input ports. We could avoid unnecessary execution by first checking if any interrupts have been generated:

...
while(1) {
    wait_for_system_IRQ();   /* stop execution until next interrupt */
    if(SENSOR_VALUE_READY){ ... }
    if(USER_VALUE_READY){ ... }
}

This solution is called a cyclic executive. The main idea is to put all functions in a sequence that executes in a joint loop with a certain periodicity. If we do not want all functions to run at each loop iteration, this can be controlled by loop counters, as sketched below. A cyclic executive is simple and deterministic, which is why it is still a commonly used solution in simple embedded systems in industry. On the other hand, one disadvantage is that the execution schedule is handmade, i.e., it must be redone for each single change in the system.
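As a minimal sketch, a cyclic executive with a loop counter could look like this: the sensor is checked on every iteration, the user input only on every 10th (the divisor is arbitrary, and the functions and status flags are the hypothetical ones from the examples above).

int iteration = 0;

while(1) {
    wait_for_system_IRQ();        /* stop execution until next interrupt */
    if(SENSOR_VALUE_READY){
        sensor_val = read_sensor();
        control(sensor_val, user_val, &signal);
        write_actuator(signal);
    }
    if(iteration % 10 == 0 && USER_VALUE_READY){   /* lower-rate activity */
        user_val = read_user();
    }
    iteration++;
}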

However, the biggest disadvantage of the cyclic executive is that it does not consider the execution times of the independent computations in the loop. For example, what happens if the sensor values are much more frequent than the user values, while the processing time of the user values is much larger? In this case, several sensor values will be generated while the computer responds to the user input: sensor values will be lost! This situation is depicted in Figure 2.

Figure 2: Lost sensor values. (While the long user computation runs, several sensor readings arrive and are missed.)

To prevent losing any sensor values, we need to preempt the execution of the user task each time a new sensor value has been generated, process it, and then re-invoke the execution of the user task as shown in Figure 3.
Figure 3: Manual interleaving of program execution. (The user task is preempted to run the sensor task, then resumed.)

In other words, we need to do manual interleaving of execution, i.e., we need to make changes in the program so that the processing of user input is stopped whenever a new sensor value is generated. Generally, it is not easy to do code interleaving by hand, i.e., writing into the code where the preemption should occur: it usually results in many inserted if-cases in the program that check if it is time to preempt the current execution and let somebody else execute. Instead, we would like to split the application into independent tasks, implement them as separate threads2, and let the operating system (OS) take care of the interleaving, i.e., interleaving is automatic. This means that the application programmer does not have to insert any special code to make task switching happen, or take specific action to save the local context when switching occurs. On the other hand, the programmer must be prepared for switching to occur at any time. In our electrical motor example, we would create two tasks, one that takes care of the sensor values and calculates new control values, and one that checks the user input, and implement them as the real-time threads described below:

void Sensor_Task(){
    while(1) {
        sensor_val = read_sensor();
        control(sensor_val, user_val, &signal);
        write_actuator(signal);
        sleep();
    }
}

void User_Task(){
    while(1) {
        user_val = read_user();
        sleep();
    }
}

When activated by the operating system, both tasks will perform their actions and, when done, they will call a sleep function that waits until it is time to repeat the computation. By calling sleep, the execution of the task is suspended for a specified time interval, which means that other tasks in the system can use the CPU. If a sensor value is generated while the user input is processed (User_Task executes), the operating system will preempt it and start executing Sensor_Task. When Sensor_Task is done, the OS will continue the execution of User_Task at the exact point where it was interrupted. So, as application programmers, we need to structure our application as separate tasks, assign appropriate timing constraints to them (e.g., deadlines and periods) and let the operating system take care of all multitasking issues, such as interleaving, resource allocation, scheduling etc. We will now start looking at a special type of operating system that is suitable for handling real-time tasks: real-time operating systems.

2 We said in Chapter 1 that a thread is an implementation of a task, and that the two terms are usually used as synonyms. Hence, we will use only the term task in the future to avoid confusion.

2.3

What is a Real-Time Operating System?

A real-time operating system (RTOS) is an operating system capable of guaranteeing certain functionality within specified time boundaries. Special functions available in an RTOS simplify the development of software for real-time systems and make it more efficient. We can say that an RTOS is a platform suitable for the development of real-time applications. Figure 4 illustrates a simplified model of an RTOS used to provide services to a real-time application. At the bottom of the figure there is the physical hardware, i.e., the CPU itself with I/O devices, registers, memory, communication circuits, analog/digital converters, etc. The Hardware Adaptation Layer (HAL) contains the hardware-dependent code needed to communicate with the underlying hardware, i.e., device drivers, register handling code, interrupt handling code etc. The RTOS itself uses the functionality provided by the HAL, e.g., for scheduling, communication and synchronization. The RTOS could also communicate directly with the hardware, but it is better to use a HAL, since it makes the application easier to port to different platforms: we only need to change the HAL when moving applications to other platforms, not the application or RTOS code. Finally, the real-time application uses the services provided by the RTOS. Those services are usually called system calls.

Figure 4: A Real-Time Operating System in its environment. (The application software runs on top of the RTOS, which sits on the Hardware Adaptation Layer above the hardware.)

2.4

RTOS characteristics

It is certainly possible to implement real-time applications without using a real-time operating system, but an RTOS makes the work much easier, mainly due to the following properties:

Task management: By using an RTOS, the development of real-time application software becomes easier and more efficient. We could see in the previous examples that if we need to develop an application consisting of several independent computations, it is a good idea to split the application into independent tasks and let the operating system manage their execution. Services in this category include the ability to launch tasks and assign priorities to them. The most important service in this category is the scheduling of tasks, which makes the tasks execute in a timely and responsive fashion. We will talk about different real-time scheduling policies in Chapter 3.

Resource management: An RTOS provides a uniform framework for organizing and accessing the hardware devices that are typical of an embedded real-time system. This includes services for the management of I/O devices, memory, disks, etc.

Communication and synchronization: These services make it possible for tasks to pass data from one to another without danger of that data being damaged. They also make it possible for tasks to coordinate, so that they can productively cooperate. Without the help of these RTOS services, tasks might well communicate corrupted data or otherwise interfere with each other.

Time services: Obviously, good time services are essential to real-time applications. Since many embedded systems have stringent timing requirements, most RTOS kernels also provide some basic timer services, such as task delays and time-outs.

Homogeneous programming model: An RTOS provides a number of well-defined system calls, which makes it easier to understand and maintain the application code. Furthermore, this reduces the development effort and risk, since a homogeneous programming model usually implies the usage of a streamlined set of tools and methods to get a quality product into production as quickly as possible. By using an RTOS, we use existing, reliable, proven building blocks, which substantially increases the quality of the product.

Portability: An RTOS simplifies porting between different platforms. Since most of the available RTOSs can be adjusted by the manufacturer to support different platforms, using an RTOS makes it simpler for the customer to change the hardware platform. Besides, standards cover all the possible interactions and interchanges between subsystems, components and building blocks.

In general, we can say that using an RTOS makes it much easier to implement and maintain real-time applications compared with doing everything from scratch. Hence, an RTOS should be used whenever possible when developing real-time systems.

2.5

RTOS vs GPOS

What is the difference between an RTOS and a general-purpose operating system, such as Windows or Linux? Many non-real-time operating systems also provide kernel services similar to those of an RTOS, so why use an RTOS? The key difference between general-purpose operating systems (GPOS) and real-time operating systems is the need for deterministic timing behavior in real-time operating systems. Deterministic means that the provided OS services consume only known and expected amounts of time. General-purpose operating systems are often quite non-deterministic: their services can inject random delays into application software and thus cause slow responsiveness of an application at unexpected times.

Hence, the fundamental difference between an RTOS and a GPOS is the view on results, i.e., the temporal aspect is very important in an RTOS, which means, among other things, that:
- Service calls must be predictable, with a known upper bound on execution time.
- Task execution switching has to be done by some algorithm that can be analyzed for its timing.
- The delay spent waiting for shared resources must be possible to determine.
- The maximum time that interrupts can be disabled must be known.

Another thing that differs is the clock resolution, which is finer in an RTOS than in a GPOS. Each operating system has a system clock used for the scheduling of activities. To generate a time reference, a timer circuit is programmed to interrupt the processor at a fixed rate. The internal system time is represented by an integer variable, which is reset at system initialization and is incremented at each timer interrupt. The interval of time with which the timer is programmed to generate interrupts defines the unit of time in the system (the time resolution). The unit of time in the system is called a system clock tick, see Figure 5.
Figure 5: The system clock. (The time axis is divided into clock ticks 0, 1, 2, 3, 4, ...; the clock resolution is the interval between two consecutive ticks.)
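A minimal sketch of this mechanism, with illustrative names (the timer circuit is assumed to be programmed to interrupt once per clock tick, and check_task_transitions() stands for whatever the kernel does at each tick):

volatile unsigned long sys_ticks = 0;   /* reset at system initialization */

void timer_interrupt_handler(){
    sys_ticks++;               /* one more tick of system time has elapsed */
    check_task_transitions();  /* hypothetical: release periodic tasks etc. */
}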

The value to be assigned to the tick depends on the specific application. In general, small values of the tick improve system responsiveness and allow the handling of periodic activities with higher activation rates. On the other hand, a very small tick causes a large run-time overhead due to the timer handling routine: the smaller the tick value, the more CPU time is needed for time administration. For example, if the tick handler takes 10 microseconds and the tick is 1 millisecond, time administration alone consumes 1% of the CPU. Typical values used for the time resolution in an RTOS are on a millisecond level or less, while in a GPOS the tick is an order of magnitude coarser, i.e., tens of milliseconds.

2.6

Types of RTOSs

We said before that one classification of real-time systems is into event-triggered and time-triggered, based on whether the system activities are carried out as they come, or at predefined points in time. Real-time operating systems are likewise divided into these two types. In an event-triggered RTOS, each task is assigned a priority relative to the other tasks in the system. High priority values represent the need for quicker responsiveness, i.e., if there are several tasks competing to execute on a single CPU, the task with the highest priority will be executed first. Priorities can be assigned before run-time of the system (static priority assignment), or at run-time (dynamic priority assignment). Synchronization between tasks is usually done in an asynchronous way, by sending messages during execution. Although we use priorities to determine the order of task execution, it is not always guaranteed that this order will be preserved at run-time. That is because of shared resources, such as shared variables or shared I/O devices. If a low-priority task is currently using a resource that is requested by a high-priority task, the high-priority task will need to wait until the resource becomes free. However, in an RTOS, the waiting time can be calculated and guaranteed (which is not the case in a general-purpose operating system). In a time-triggered RTOS, tasks are executed according to a schedule determined before the execution. Time acts as a means of synchronization. Since all decisions about task execution, synchronization and communication are made before run-time, the run-time mechanism is quite simple: it just reads the schedule and executes the tasks according to it. You can compare this type of RTOS with a bus schedule: a bus company makes a schedule for a bus that is valid during some time period. Every day the bus drives according to the schedule, which makes it possible for people to know when the bus arrives at a certain bus stop. The schedule is repeated over and over. This is exactly what happens in time-triggered real-time systems: we use some scheduling algorithm to create a schedule before putting the system into use, and then, at run-time, we just follow the schedule. For example, the schedule could say: at time 5 run task A until time 8, at time 8 run task B until 12, etc. There are also real-time operating systems that support both the event-triggered and time-triggered paradigms. A difficulty with those hybrid systems is the communication delays between the two parts, as well as the CPU sharing between event-triggered and time-triggered tasks.

2.7

Event-triggered Real-Time Operating Systems

Most commercial real-time operating systems are priority-driven; that is why we will put the emphasis on event-triggered RTOSs in this book. Here we describe some of the most common services provided by an event-triggered RTOS. We will not focus on a specific RTOS, but discuss some general mechanisms common to most of them. A comparison between different commercial RTOSs will be presented at the end of this chapter.

Preemption and context switch
In most event-triggered real-time operating systems, a task with a higher assigned priority is able to preempt the execution of the currently running lower-priority task.

Figure 6: Preemption between tasks. (The low-priority task 1 starts at t1; at t2 the middle-priority task 2 becomes ready and preempts it; at t3 the high-priority task 3 preempts task 2; the preempted tasks then resume in priority order at t4 and t5.)

Figure 6 illustrates what happens when preemption occurs. We see in the figure that the RTOS will stop the execution of a task if there is a higher-priority task that wants to execute. When task 2 becomes ready to execute at time t2, it preempts the lower-priority task 1, and when the highest-priority task 3 gets ready, it preempts task 2 (and hence, indirectly, also task 1). Each time the priority-based preemptive RTOS is alerted by an external-world trigger (such as a switch closing) or a software trigger (such as a message arrival), it must determine whether the currently running task should continue to run. If not, the following steps are taken:
1. Determine which task should run next.
2. Save the environment of the task that was stopped (so it can continue later).
3. Set up the running environment of the task that will run next.
4. Allow this task to run.

These steps are together called task switching (or context switch). The time it takes to do task switching is of interest when evaluating an operating system. A general-purpose operating system might do task switching only at timer tick times, which could be tens of milliseconds apart. Such a delay would be unacceptable in most real-time systems. For this reason, most real-time operating systems do not rely on system clock scheduling alone. Rather, it is used in combination with other events in the system, such as a new task being released, a task getting blocked, an external interrupt occurring etc.

Task structure
A real-time task in an event-triggered RTOS consists of:
- Task Control Block (TCB): a data structure that contains the task ID, the task state, the start address of the task code, and some registers, such as the program counter and the status register.
- Program code: the binary representation of the code to be executed by the task, originally implemented in some programming language, e.g., the C language.
- Data area: stack and heap.


Figure 7: The structure of a task. (The TCB holds the task ID, task state, status register and program counter, and points to the program code and the data area.)
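As a sketch, a TCB could be declared along the following lines (the field names are illustrative; a real kernel saves more registers and bookkeeping data):

typedef enum { DORMANT, READY, EXECUTING, WAITING, BLOCKED } task_state_t;

typedef struct tcb {
    int            task_id;       /* task ID */
    task_state_t   state;         /* task state */
    unsigned long  status_reg;    /* saved status register */
    void          *prog_counter;  /* saved program counter */
    void         (*code)(void);   /* start address of the task code */
    void          *data_area;     /* pointer to the task's stack and heap */
    struct tcb    *next;          /* link used in, e.g., the ready queue */
} tcb_t;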

When a real-time kernel creates a task, it allocates memory space to the task and brings the code to be executed by the task into memory. In addition, it instantiates the Task Control Block and uses that structure to keep all the information it will need to manage and schedule the task. When a task is executing, its context changes continuously. When the task stops executing, the kernel keeps its context at that time in the task's TCB. When we say that the RTOS inserts a task in a queue (e.g., the ready queue), we mean that it inserts a pointer to the TCB of the task into the queue. The kernel terminates a task by deleting its TCB and deallocating its memory space. The TCB has pointers to the task code and data area, see Figure 7. The separation between task code and task data is done because we want to be able to store the code and the data in different types of memory. In embedded systems, the task code is usually stored in an EPROM3, especially in system-on-chip computers, where the objective is to avoid the usage of external memory as much as possible. In such systems, the resources are usually limited, not allowing the code to reside in read-write memory. Another reason for separating the program code from the data area is to be able to reuse the same code for different tasks. For example, assume a PID4 controller that has been used and tested for a while. We want to add an additional PID controller that will perform the same action but with different input values, e.g., with different control parameters and periodicity. In this case we can use the same program code for both PID controllers, as illustrated in Figure 8.

3 EPROM: Erasable Programmable Read-Only Memory is a type of memory that can be erased and rewritten, usually by using ultraviolet light.
4 PID: A proportional-integral-derivative controller is a generic control loop feedback mechanism widely used in industrial control systems. It attempts to correct the error between a measured process variable and a desired setpoint by calculating and then outputting a corrective action that can adjust the process accordingly and rapidly, to keep the error minimal.

Figure 8: Shared program code. (Two TCBs, one per PID task, each with its own parameters and data area, both pointing to the same program code.)
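A minimal sketch of this idea, assuming a hypothetical create_task() kernel call and an illustrative parameter type: both tasks share one function body, but each is created with its own parameter block.

/* Illustrative sketch: one task body shared by two PID controller tasks.
   pid_params_t and create_task() are hypothetical, not from a real API. */
typedef struct { double kp, ki, kd; int period; } pid_params_t;

void pid_task(pid_params_t *p){     /* shared program code */
    while(1){
        /* read input, compute the PID output using p->kp, p->ki, p->kd */
        ...
        sleep(p->period);
    }
}

pid_params_t pid1 = { 1.0, 0.1, 0.01, 50 };    /* parameters (PID 1) */
pid_params_t pid2 = { 2.0, 0.2, 0.05, 100 };   /* parameters (PID 2) */

void system_init(){
    /* two tasks, two TCBs, two data areas, one shared code body */
    create_task(pid_task, &pid1);
    create_task(pid_task, &pid2);
}

Because the shared body only touches its parameters and local variables, it is reentrant, which is the property discussed next.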

Reentrant code
To be able to reuse the same code in different tasks, as in the PID example above, the code must be preemptable in the middle of its execution without any side effects, i.e., the code must be reentrant. A reentrant piece of code can be simultaneously executed by two or more tasks. Example: is the function swap, which exchanges the values of two variables, reentrant?

int temp;

void swap(int *x, int *y){
    temp = *x;
    *x = *y;
    *y = temp;
}

The answer is no, the function swap is not reentrant. The reason for this is the usage of the global variable temp. Assume a low-priority task L and a high-priority task H that both use the function swap. Assume H does not want to execute for the moment, and L starts to execute. After L has executed the code line temp = *x, task H becomes ready and preempts task L. During its execution, H will change the value of temp, so when L resumes its execution, the value of *y will be wrong. The whole scenario is illustrated below.

task_L(){
    int x=1, y=2;
    swap(&x, &y);
      temp = *x;      /* temp=1 */
      (H preempts L) ------------>  task_H(){
                                        int z=3, t=4;
                                        swap(&z, &t);
                                          temp = *z;   /* temp=3 */
                                          *z = *t;     /* z=4 */
                                          *t = temp;   /* t=3 */
      (L continues)  <------------  }
      *x = *y;        /* x=2 */
      *y = temp;      /* y=3 (WRONG! it should be 1) */
    ...
}

When swap() is interrupted, temp contains the value 1. The high-priority task H sets temp to 3 and swaps the contents of its local variables correctly (i.e., z=4 and t=3). After finishing its execution, task H gives control back to the low-priority task L, which is then resumed. Note that at this point temp is still set to 3 (since it is a global variable). When L resumes execution, it sets its local variable y to 3 instead of 1, which is obviously wrong. Reentrant code cannot use global variables without protection; otherwise, different tasks may update the same memory location in non-deterministic order. There are several ways to avoid this problem and make the code reentrant, such as declaring temp as a local variable, disabling interrupts until task L is done so that task H cannot preempt it, or protecting the global variable temp from simultaneous access, e.g., by using semaphores (we will talk about semaphores later in this chapter). The simplest of these fixes is sketched below.
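Making temp local is enough on its own, because each task then keeps its own copy on its own stack:

void swap(int *x, int *y){
    int temp;      /* local variable: lives on the calling task's stack */
    temp = *x;
    *x = *y;
    *y = temp;
}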

Task states
A task changes its state during its lifetime: e.g., a task that is executing can become blocked by another task that uses a shared resource. Another example: when a task is done with its execution, it calls a sleep function that puts it in the waiting state. In general, we can identify the following states for a task:

Figure 9: Task states and state transitions. (The states are Dormant, Ready, Executing, Waiting and Blocked.)

- Dormant: The task is not yet consuming any resources in the system. The task is registered in the system but is either not activated yet or has terminated.
- Executing: A task enters this state as it starts to run its code on the processor. Only one task can be executing at a time (on a single-core processor).
- Ready: By entering the ready state, a task expresses its wish to gain access to the processor, i.e., it wants to execute. A ready task cannot gain control of the CPU until all higher-priority tasks in the ready or executing state either complete or become dormant.
- Waiting: A task enters this state when it waits for an event, e.g., a timeout expiration or a synchronization signal from another task. The previously mentioned sleep function puts a task in the waiting state.
- Blocked: A task is blocked when it is released but cannot continue its execution for some reason. For example, it may be blocked waiting for a shared resource to become free.

Not all states have to be supported by an RTOS, but any kernel that supports the execution of concurrent tasks on a single processor has at least the states executing, ready and waiting. The next obvious question we can ask ourselves is which state transitions are valid. This is summarized in Figure 9.
- Dormant -> Ready: This transition occurs when a task is activated.
- Ready -> Executing: The task that has the highest priority among all ready tasks at the moment will start to execute.
- Executing -> Ready: If another task, with higher priority than the currently executing task, has become ready, it will preempt the current task and become executing itself. The preempted task then goes to the ready state (where it has to compete again with all other ready tasks).
- Executing -> Waiting: An executing task becomes waiting by, e.g., invoking the system call sleep at the end of its execution in the current period. When the waiting time has elapsed, the task becomes ready again.

- Executing -> Blocked: An executing task becomes blocked when it comes to a point in its execution where it cannot proceed, since it cannot get access to a necessary resource that is locked by some other task. It is important to note the difference between blocked and waiting: a task is forced to enter the blocked state, while it voluntarily enters the waiting state.
- Executing -> Dormant: When a task has terminated or completed its execution, it becomes dormant. Tasks in this state may be destroyed.
- Waiting -> Ready: When a task has spent the desired time period in the waiting state, it goes back to ready. Why can't it enter the executing state directly? The answer is: there might be other tasks in the system that are also ready, and some of them might have higher priority. Hence, all transitions to the executing state go through the ready state, i.e., all ready tasks are placed in the ready queue and sorted for execution based on their priorities.
- Blocked -> Ready: When the resource that caused a task to become blocked has been freed, the task can proceed to execute. However, it must go via the ready queue, for the same reason as discussed above.

When does a state transition occur? The answer is:
- At system clock timer interrupts. At each clock tick, the RTOS increases the system time and then checks if there are any task transitions to be made. For example, if there is a ready task with higher priority than the currently executing one, a task switch occurs.
- When a task invokes a system call, such as sleep.
- At external interrupts, i.e., when an interrupt routine invokes a system call that causes a task switch.

The kernel invokes the scheduler to update the ready queue whenever it wakes up or releases a task, finds a task unblocked, creates a new task, and so on. Thus, a task is placed in its proper place in the ready queue as soon as it becomes ready.

Time handling functions
We mentioned before that an RTOS has a system clock for the scheduling of activities. Let's have a look at some functions that operate on the system clock.
- getTime(): get the system time
- setTime(t): set the system time
- adjustTime(t): adjust the system time

The first two services are pretty self-explanatory, but why do we need adjustTime(t)? Can't we just use setTime(t) if we want to change the system time? Yes, in many cases we can, but consider the following case: assume that we have scheduled a number of tasks to run at different clock ticks that belong to a time interval [t1, t2]. Assume that, at time t1, we call setTime(t2). If we jump from t1 to t2, all tasks scheduled in the interval will be released at once, instead of at their different release times. This will create a temporary overload in the system that can crash it. If we instead use the adjustTime function, it adjusts the time in discrete steps, i.e., it either speeds up or slows down the system clock during a certain period until we reach the desired time. This avoids multiple simultaneous task releases.
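An illustrative sketch of the idea behind adjustTime, reusing the sys_ticks counter sketched earlier (the double-count/skip scheme is just one possible design, not a standard implementation):

static long remaining_adjust = 0;   /* signed ticks still to be absorbed */

void adjustTime(long delta_ticks){
    remaining_adjust += delta_ticks;
}

void timer_interrupt_handler(){
    if (remaining_adjust > 0) {          /* clock is behind: count double */
        sys_ticks += 2;
        remaining_adjust--;
    } else if (remaining_adjust < 0) {   /* clock is ahead: skip this tick */
        remaining_adjust++;
    } else {
        sys_ticks++;                     /* normal tick */
    }
}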

Another problem that you should be aware of when using time functions is the risk of wrong timestamps. Consider a task where we read some sensor data and timestamp it, i.e., we record at which point in time the reading occurred. Assume that the task code looks like this:

void Task_T() {
    struct { int value; int timeStamp; } a;   /* sample plus its timestamp */
    ...
    a.value = read(sensor);
    a.timeStamp = getTime();
    ...
}

Figure 10-a illustrates the case when the task executes without any interruption. The timestamp value will be correct. But what if the task gets preempted after reading the sensor value, but before timestamping it, as depicted in Figure 10-b? In this case, the timestamp will not reflect the actual reading time: the recorded timestamp can be much later than when the actual reading took place, because the preemption delayed the reading of the current system time.

Figure 10: Wrong timestamp due to preemption. (a) The task runs uninterrupted: read(sensor) and getTime() execute back to back. (b) A high-priority task preempts between read(sensor) and getTime(), so the timestamp is taken long after the reading.

One possible solution to this problem could be to disable interrupts while reading the sensor data and timestamping it. However, this is a quite drastic solution, because disabling interrupts also means not being able to respond to external events within that time, which may be a problem in the case of urgent events. A better solution is to protect the code with some resource access mechanism, such as semaphores, which are described next.

Semaphores
A critical region is a sequence of statements in the code that must appear to be executed indivisibly (or atomically). In the timestamp example above, the critical region contains both reading the sensor value and putting a timestamp on it. A semaphore is a data structure used for the protection of critical regions. In more general terms, we can say that semaphores are used for synchronization between tasks, by providing mutual exclusion when several tasks access the same resources. Mutual exclusion means that only one task uses the resource at a time. Here is a code example of how to use semaphores:

void Task_T(){
    ...
    if (lockSemaphore(S)) {       /* try to get semaphore S */
        /* critical region entered */
        a.value = read(sensor);
        a.timeStamp = getTime();
        unlockSemaphore(S);       /* critical region exited */
    }
    else
        /* failed to lock semaphore S */
    ...
}

The typical semaphore mechanism used in traditional operating systems is not suited for implementing real-time applications, because it is subject to priority inversion, which occurs when a high-priority task is blocked by a low-priority task for an unbounded interval of time. Priority inversion must absolutely be avoided in real-time systems, since it introduces non-deterministic delays in the execution of critical tasks. Priority inversion can be avoided by adopting particular protocols that must be used every time a task wants to enter a critical region. We will talk about those real-time resource access protocols later, in the scheduling chapter.

Interrupt handling
If not handled properly, interrupts generated by external devices can cause a serious problem for the predictability of a real-time system, since they can introduce unbounded delays in task executions. The objective of the interrupt handling mechanism of an RTOS is to provide service to the interrupts generated by attached devices, such as the keyboard, serial ports, sensor interfaces, etc. This service consists of the execution of a dedicated routine (device driver) that transfers data from the device to the main memory, or vice versa. In classical operating systems, application tasks can always be preempted by drivers, at any time. In real-time systems, however, this approach may cause some hard task deadlines to be missed. Hence, in real-time systems, the interrupt handling mechanism must allow the most critical tasks to execute without interference. This can be done by using one of the following techniques:
- Disable all external interrupts. This is the most radical approach, where all peripheral devices must be handled by the application tasks, which have direct access to the registers of the interfacing boards. Since no interrupt is generated, data transfer takes place through polling, i.e., periodically checking if any new event has occurred. The main disadvantage of this approach is low processor efficiency on I/O operations, due to the polling.
- Manage external devices by dedicated kernel routines. All external interrupts are disabled, but the devices are handled by dedicated kernel routines rather than application tasks. The advantage of this approach with respect to the previous one is that all hardware details of the peripheral devices can be encapsulated into kernel procedures and do not need to be known to the application tasks. A major problem of this approach is that the kernel has to be modified when some device is replaced or added, since the device handling routines are part of the kernel.
- Allow all external interrupts, but reduce the drivers to the least possible size. In this approach, the only purpose of each driver is to activate a proper task that will take care of the device management, see Figure 11. Interrupt handling is integrated with the scheduling mechanism, so that the task that handles an interrupt event can be scheduled like any other task in the system. This way, we control the execution order by assigning priorities to tasks, rather than allowing random interrupts to preempt high-priority tasks. Thus, an application task can have a higher priority than a device handling task. Besides, this approach has high CPU efficiency on I/O operations, since the interrupts are handled when they occur (no polling). A sketch of this technique is given after the figure below.

Figure 11: A technique for handling interrupts. (An external interrupt from the hardware reaches an interrupt routine in the HAL, which only activates the application task that handles the interrupt.)
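A minimal sketch of this technique; sendSignal()/waitSignal() are the kind of hypothetical synchronization primitives discussed later in this chapter, and SERIAL_EVENT is an illustrative event identifier.

void serial_interrupt_routine(){
    /* keep the routine as short as possible: just wake the handler task */
    sendSignal(SERIAL_EVENT);
}

void serial_handler_task(){      /* an ordinary task with its own priority */
    while(1){
        waitSignal(SERIAL_EVENT);   /* suspended until the interrupt fires */
        /* do the actual device management here */
        ...
    }
}

The interrupt routine itself stays tiny; all the real work is done in serial_handler_task, which competes for the CPU under its assigned priority like any other task.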

Almost every system allows you to disable interrupts, usually in a variety of ways. For example, most I/O chips allow a program to tell them not to interrupt, and microprocessors allow your program to tell them to ignore incoming signals on their interrupt request pins, either by writing a value in a special register in the processor, or with a single assembly language instruction. As mentioned before, disabling interrupts in real-time systems might lead to serious consequences. In this type of system, the question of how fast the system responds to each interrupt is crucial. In other words, we must know the longest time the RTOS can keep interrupts disabled, which is known as the interrupt latency. Low interrupt latency is not only necessary for hard real-time systems. Even in soft real-time systems it is needed for reasonable overall performance, particularly when processing audio and video. In order to have reasonable soft real-time performance (for example, the performance of multimedia applications), the interrupt latency caused by every device driver must be both small and bounded. Interrupt latency is one of the most important factors when choosing an RTOS for an application. If there is an interrupt in the system that must be served faster than the length of time that interrupts can be disabled in the RTOS, then we cannot use that RTOS. This is because the handling of the interrupt can be delayed by an amount of time that, in the worst case, is equal to the interrupt latency. This is illustrated in the example of Figure 12. A task that handles an interrupt gets preempted by the RTOS kernel, which disables interrupts just before the interrupt occurs. In that case, the task will not be able to handle the interrupt until the kernel has enabled interrupts again.
Figure 12: Interrupt latency. (The event occurs while the RTOS kernel runs with interrupts disabled; the interrupt routine and the handling task can only run once interrupts are re-enabled, so the interrupt handling is delayed by up to the RTOS interrupt latency.)

Task execution mechanisms
Most real-time tasks are periodic, i.e., they perform the same computation again and again, with a specified period of time between two consecutive invocations (e.g., reading a sensor value every 100 milliseconds). An individual occurrence of a periodic task is called a task instance (also known as a job), see Figure 13.

Figure 13: Periodic task instances. (Instances k, k+1, k+2, and so on, of the task are invoked one period apart.)

It is clearly inefficient if the task is created and destroyed repeatedly every period. In an operating system that supports periodic tasks, the kernel itself re-initializes such a task and puts it to sleep when the task completes. We just need to assign periods to tasks before run-time, and the kernel keeps track of the passage of time and releases the task (i.e., moves it to the ready queue) again at the beginning of the next period. It is very similar to regular functions that are called repeatedly. The task can be re-invoked indefinitely, i.e., during the entire lifetime of the system, or it can be terminated after a finite number of instances. Here is an example:

int period_time = 50;
...
void task_(){
    ...
    /* do task work */
    ...
    /* kernel takes over when the task is done */
}

Most commercial RTOSs, however, do not have an implicit mechanism for periodic tasks at the kernel level. Since many real-time tasks are of a periodic nature (e.g., sampling), there must exist some other mechanism in those RTOSs to implement periodic tasks explicitly, at the user level.

We can implement a periodic task at user level as a thread that alternately executes the code of the task and sleeps until the beginning of the next period. In other words, the task does its own re-initialization and keeps track of the time for its own next release, e.g.:

void task_1(){
    int period_time = 50;
    while(1){                            /* do forever */
        ...                              /* do task work */
        /* wait some time and re-invoke */
        sleep(period_time);
    }
}

The infinite while-loop ensures that the task instances will be invoked repeatedly, and the sleep function makes sure that there is some time interval between consecutive invocations. In other words, at the end of its execution, a task suspends itself for some time (goes from the executing into the waiting state), allowing lower-priority tasks to use the processor. Without sleep, the task would run all the time, consuming all CPU time, and no lower-priority tasks could execute (higher-priority tasks can still preempt it). The sleep function can be implemented in different ways, providing relative or absolute delays. Relative delay means that the next instance is released when a specified time, relative to the call time, has elapsed. The sleep function in the code example above provides a relative delay, i.e., it puts the task into the waiting state for the number of clock ticks specified by period_time, i.e., 50 in this example. However, an alert mind will notice directly that the implementation above will not really achieve the desired period, since it does not take into consideration the execution time of the task. If, for example, it takes 10 clock ticks to execute the task, then the next instance will be released at 10+50 ticks, which is not what we want. We can solve this by subtracting the execution time from the desired period, e.g.:

...
/* do task work for 10 clock ticks */
...
sleep(period_time - 10);
...

The solution above is not general; the period will be 50 only if the task is either the only task in the system, or the one with the highest priority. Otherwise, if there are other higher-priority tasks in the system, they might preempt the task just before it calls the sleep function. Assume, for example, such a higher-priority task 2 that has an execution time of 20 clock ticks. Preemption after 10 ticks of execution will cause task 1 to invoke its next instance after 70 ticks, not the desired 50 ticks, as illustrated in Figure 14.

Figure 14: Relative delay. (Task 1 executes for 10 ticks, is preempted by task 2 for 20 ticks, and calls sleep with wait_time=40 at t+30; the next instance of task 1 is therefore released at t+70 instead of t+50.)

Hence, we need to include the preemption time in the measured execution time of task 1, which can be done by using timestamps, as follows:

void task_1(){
    ...
    while(1) {
        start_time = getTime();
        /* do task work */
        ...
        stop_time = getTime();
        sleep(period_time - (stop_time - start_time));
    }
}

An absolute delay suspends the task execution until the system clock has reached a specified time (counted from the start of the system). The system call used to provide an absolute delay is usually called sleepUntil, delayUntil or waitUntil. Here is an example of how we can implement the task above by using an absolute instead of a relative delay:

void task_1(){
    ...
    period_time = 50;
    next_time = getTime();
    while(1) {
        /* do task work */
        ...
        next_time = next_time + period_time;
        sleepUntil(next_time);
    }
}

If the task starts to invoke its instances at time t, i.e., the initial next_time is equal to t, then all consecutive values of next_time will be t+50, t+100, t+150, etc., regardless of whether the task gets preempted or not.

Jitter
We have shown above how to implement periodic tasks with the help of relative and absolute delays. A correctly calculated period, however, does not necessarily mean that the distance between the executions of consecutive task invocations will be constant. There can be variations in the actual execution, caused by high-priority tasks. Those variations are called jitter. Consider the following example: assume two periodic tasks, 1 and 2, with execution times 2 and 1, and periods 4 and 10, respectively. Assume also that task 1 has higher priority than task 2. Figure 15 shows what happens if both tasks are released at the same time. Although we have defined the period of task 2 to be 10 (i.e., task 2 is released at times 0, 10, 20, etc.), the time between the executions of its instances will vary between 8 and 12 clock ticks, based on whether preemption by task 1 occurs or not.

Figure 15: Jitter in periodic execution. (Task 2 is released every 10 ticks, but due to preemption by task 1 the time between its executions varies between 8 and 12 ticks.)

The objective is to minimize the jitter for each task (ideally, jitter = 0). The smaller the jitter, the better the periodicity of a task's execution. But that is not easy; the only task for which we can guarantee jitter-free execution is the highest-priority one, and even then only under the condition that no external interrupts take place. In all other cases, tasks can experience jitter. We will show later, in the scheduling chapter, how to calculate the effect of jitter when predicting the system behavior.

Communication and synchronization mechanisms
Often, tasks execute asynchronously, i.e., at different speeds, but may need to interact with each other, e.g., to communicate data, or to access shared resources. Most real-time operating systems offer a variety of mechanisms for handling task interactions. These mechanisms are necessary in a preemptive environment with many tasks, because without them tasks might communicate corrupted information or otherwise interfere with each other. For instance, we saw in the swap example above that preemption in the middle of the operation can cause wrong values to be assigned to the variables that are to be swapped. Another example is two tasks sharing the same display, where one task measures the current air temperature and displays it as, e.g., "10°C", while the second one displays the current time, e.g., "23:15". If we do not protect the access to the display device, one task preempting the other might produce strange output. For example, the temperature task writes "10" and, before it writes the rest of the display text, the second task preempts it and writes its own text "23:15", resulting in the display output "1023:15". We can, for example, use semaphores to protect the access to the shared resource and solve the problem. The simplest way of communication between tasks is through shared memory, where each communicating task may update pieces of shared information/data, as illustrated in Figure 16.

Figure 16: Communication through shared variables. (Task 1 writes the shared variable v; task 2 reads it.)

Communication through shared memory is an easy and efficient way of communicating. It provides a low-level, high-bandwidth and low-latency means of inter-task communication. It is commonly used for communication among tasks that run on one processor, as well as among tasks that run on tightly coupled multiprocessors. The disadvantage of this approach is data overwriting, i.e., old values are overwritten by new ones, since no buffering is provided. Another difficulty is synchronizing the accesses to the shared memory. The application developer must make sure that the data access is atomic, i.e.,

tasks must not be interrupted while updating the shared memory space. This can be achieved by protecting the shared variable with a semaphore, as shown in the code example below:

void sender_task(){
    ...
    while(1){
        ...
        lockSemaphore(S);     /* enter critical region */
        v = getValue();
        unlockSemaphore(S);   /* exit critical region */
        ...
    }
}

void receiver_task(){
    while(1) {
        ...
        lockSemaphore(S);     /* enter critical region */
        local = v;
        unlockSemaphore(S);   /* exit critical region */
        ...
    }
}

An alternative is to use wait- and lock-free communication, a method to accomplish non-blocking communication between tasks. Non-blocking means that if two or more tasks want to read from a wait- and lock-free channel (WLFC), none of the readers is delayed by another task (compare this to the shared variables approach, where tasks get blocked if some other task is updating the variable). This is done by assigning one buffer to each reader of the WLFC. One extra buffer is added to ensure that there always exists one free buffer in the WLFC, see Figure 17.

Figure 17: Wait- and lock-free communication. (Task 1, the producer/writer, and task 2, the consumer/reader, each have their own buffer slot; a third slot is always kept free.)

Both tasks get their own buffer slot. Task 1 starts by writing some data to buffer slot 1. When task 2 starts to read the data from slot 1, task 1 continues writing to slot 3. This way, there is always one free buffer for writing. The formula for calculating the number of needed buffers is: nbuffers = nwriters + nreaders + 1. In Figure 17, with one writer and one reader, this gives 1 + 1 + 1 = 3 buffer slots.

Since writers do not share buffer slots, there is no need for an atomic write operation. This makes the method well suited for communicating large amounts of data that change continuously. The disadvantage is that wait- and lock-free communication requires more memory than shared variables. Here is an example of how wait- and lock-free communication can be used in task code:

void task_Producer(){
    ...
    while(1){
        ...
        /* get the pointer to the buffer to write to */
        buff_ptr = writeWLFC(buf_ID);
        ...
    }
}

void task_Consumer(){
    ...
    while(1){
        ...
        /* get the pointer to the buffer to read from */
        buff_ptr = readWLFC(buf_ID);
        ...
    }
}

Both the read and write functions return a pointer to the buffer to operate on. Since the buffers are user-defined, the user is responsible for filling the buffer with data. A WLFC contains an array of buffers, pointers to the oldest and the newest values in the buffer, and a list of all tasks that can use the buffers. Every time a task becomes READY (due to a new period), the kernel assigns it a pointer to a buffer within the WLFC. If the task is a reader, it gets the most recently written buffer; if the task is a writer, it gets the pointer to the first free buffer with the oldest value. Message passing is the most popular technique for transferring data between tasks in a multitasking software environment. Most real-time operating systems use "indirect" message passing. In this approach, messages are not sent straight from task to task, but rather through message queues. The idea is that one task sends messages into the queue, and then, perhaps later on, another task fetches the messages from the queue, see Figure 18.
Figure 18: Communication through a message queue. (Task 1 sends messages into the queue; task 2 fetches them.)

Before a task can send a message to another task, the message queue needs to be created. Any task can ask to create the queue; it does not have to be the message sender task or the message receiver task. But both the message sender and the message receiver tasks need to be informed of the identity of the queue, called a queue identifier, in order for them to communicate through the queue. Here is example code for inter-task communication via message queues:

void task_Sender(){
    ...
    while(1){
        ...
        /* send message */
        if(send(MSGQ, msg))
            /* message sent */
        else
            /* something is wrong, e.g., queue full */
        ...
    }
}

void task_Receiver(){
    ...
    while(1){
        ...
        /* receive message */
        receive(MSGQ, &msg);
        ...
    }
}

A message queue can be either global or local. Global means that all tasks in the system can read messages from the message queue, while a local queue is connected to a specific pair of tasks (sender and receiver). A message queue is usually implemented as a first-in-first-out (FIFO) queue. However, some RTOSs use priority queues instead, which is a better choice: the sending task can specify the priority of its message, which results in faster de-queuing, i.e., the message is received faster than lower-priority messages. When creating message queues for inter-task communication, we need to allocate memory to store the messages in the queue. A common problem when building embedded systems is the lack of memory. Embedded systems usually have very limited memory and CPU resources, which should be used in the most efficient way. So, when allocating memory for the message queues, we should be careful not to waste more memory than necessary. For example, there is no point in allocating memory for 50 messages if the queue will contain at most one message at a time, i.e., as soon as a message arrives, the receiver task removes it from the queue. Here is an example. Assume two tasks that communicate through a message queue. The sender task, task 1, has a period of 500 and an execution time of 200. The receiver task, task 2, has a period of 300 and an execution time of 100. The sender task sends three messages to the receiver task in each instance. The receiver task reads two messages in each instance. How must the message queue be dimensioned? To be able to answer this question, we first need to look at how the sender and the receiver task interleave during their execution. Since the receiver task has higher priority, it will preempt the sender task whenever both are ready at the same time.

The execution trace is shown in Figure 19. We need to analyze the trace for the worst-case scenario, in which both tasks are released simultaneously at time 0. How long should we analyze the trace? The answer is: until the next point in time when the tasks are released at the same time again, which is easily obtained by calculating the least common multiple (lcm) of the task periods, i.e., lcm(300, 500) = 1500. After time 1500, the execution pattern of the tasks will be exactly the same as the one between 0 and 1500; hence we just need to consider the trace up to the lcm of the task periods. This interval is also known as the hyperperiod.
Figure 19: Example communication via message queues. (Task 2 is released at 0, 300, 600, 900 and 1200; task 1 at 0, 500 and 1000. Both tasks are ready at time 0 and again at time 1500.)
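The hyperperiod can be computed mechanically, e.g., with a small helper like this (lcm via Euclid's algorithm; for this example, hyperperiod(300, 500) returns 1500):

long gcd(long a, long b){
    while (b != 0) { long r = a % b; a = b; b = r; }
    return a;
}

long hyperperiod(long p1, long p2){
    return p1 / gcd(p1, p2) * p2;   /* lcm(p1, p2), dividing first */
}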

We can use the following table to illustrate what happens during the execution:

Time | Queue before execution | Task execution                                               | Queue after execution
0    | (empty)                | Task 2 starts (high priority). No messages to read yet.      | (empty)
100  | (empty)                | Task 1 starts and sends 3 messages.                          | m1, m2, m3
300  | m1, m2, m3             | Second instance of task 2 runs. It reads two messages.       | m3
500  | m3                     | Second instance of task 1 executes. It sends three more.     | m3, m4, m5, m6
600  | m3, m4, m5, m6         | Third instance of task 2 preempts. It reads two messages.    | m5, m6
900  | m5, m6                 | Fourth instance of task 2 runs. It reads two messages.       | (empty)
1000 | (empty)                | Third instance of task 1 runs. It sends three new messages.  | m7, m8, m9
1200 | m7, m8, m9             | Fifth instance of task 2 runs. It reads two messages.        | m9
1500 | m9                     | Start of new hyperperiod. Task 2 runs and reads one message. | (empty)

We see that the maximum number of messages contained in the queue at any given time is four; hence it is enough to dimension the queue to hold four messages. Tasks synchronize in order to ensure that their exchanges occur at the right times and under the right conditions. In other words, in order to carry out the required activities, a task may need the ability to say "stop", "go" or "wait a moment" to itself, or to another task. Synchronization between two tasks can be implemented by the following service calls (a usage sketch follows the list):
- sendSignal(event): announces that an event has occurred. Its action is to place event information in a channel or pool. This in turn may enable a waiting task to continue.
- waitSignal(event): causes the task to suspend activity as soon as the wait operation is executed; it remains suspended until notification of the event is received.
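A sketch of two tasks synchronizing this way (DATA_READY is an illustrative event identifier; the calls are the primitives listed above):

void task_A(){
    while(1){
        /* produce data */
        ...
        sendSignal(DATA_READY);   /* may enable the waiting task to continue */
        sleep(period_time);
    }
}

void task_B(){
    while(1){
        waitSignal(DATA_READY);   /* suspended until the event occurs */
        /* consume data */
        ...
    }
}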

Signals can be sent directly to a specific task, or they can be broadcast to all tasks in the system. If sent directly, we need to include the receiver task in the sendSignal(..) call. Another way of implementing synchronization is to use semaphores, which we already discussed.

Memory management
Many general-purpose operating systems offer memory allocation services from what is called a heap. The famous malloc and free services, known to C programmers, use the heap: tasks can temporarily borrow some memory from the operating system's heap by calling malloc, and return it when done by calling free. Heaps suffer from external memory fragmentation, which may cause the heap services to degrade. This fragmentation is caused by the fact that when a buffer is returned to the heap, it may in the future be broken into smaller buffers when malloc requests for smaller buffer sizes occur. This results in small fragments of memory appearing between memory buffers that are in use by tasks. These fragments are so small that they are useless to tasks, yet they cannot be merged into bigger, useful buffer sizes. This eventually results in situations where tasks ask for memory buffers of a certain size and are refused by the operating system, even though the operating system has enough available memory in its heap. The fragmentation problem can be solved by so-called garbage collection (defragmentation) software. Unfortunately, garbage collection causes random, non-deterministic delays in the heap service, making it unsuitable for real-time systems (where we want to be able to predict all delays). So, what to do in real-time systems? Real-time operating systems offer non-fragmenting memory allocation techniques instead of heaps. They do this by limiting the variety of memory chunk sizes they make available to application tasks. For example, the pool memory allocation mechanism allows application tasks to allocate chunks of memory of perhaps 4 or 8 different buffer sizes per pool, see Figure 20.

[Figure: three pools (Pool 1, Pool 2, Pool 3), each drawn as a set of memory blocks; within a pool, all blocks have the same block size.]

Figure 20: Memory allocation through pools.
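To make the mechanism of Figure 20 concrete, here is a minimal sketch of one such pool, for a single block size, in C. All names and sizes are our own illustration; a real RTOS exposes its own pool API. Free blocks are kept on a linked list, so both operations are constant-time:

#define BLOCK_SIZE 64       /* all blocks in this pool have the same size */
#define NUM_BLOCKS 32

typedef union block {
    union block *next;               /* valid while the block is free     */
    char         data[BLOCK_SIZE];   /* payload while the block is in use */
} block_t;

static block_t  pool[NUM_BLOCKS];
static block_t *free_list;

void pool_init(void){
    int i;
    for(i = 0; i < NUM_BLOCKS - 1; i++)
        pool[i].next = &pool[i + 1];     /* chain all blocks into the free list */
    pool[NUM_BLOCKS - 1].next = NULL;
    free_list = &pool[0];
}

void *pool_alloc(void){                  /* O(1): unlink the first free block */
    block_t *b = free_list;
    if(b != NULL)
        free_list = b->next;
    return b;                            /* NULL if the pool is exhausted */
}

void pool_free(void *p){                 /* O(1): relink at its original size */
    block_t *b = (block_t *)p;
    b->next = free_list;
    free_list = b;
}

Because a freed block is simply relinked at its original size, no splitting or merging ever occurs, which is why the timing stays deterministic.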

Pools avoid external memory fragmentation by not permitting a buffer that is returned to the pool to be broken into smaller buffers in the future. Instead, when a buffer is returned to the pool, it is put onto a free-buffer list of buffers of its own size, available for future reuse at the original buffer size. Memory is allocated and de-allocated from a pool with deterministic, often constant, timing.

Device drivers
When constructing larger systems, where the probability of replacing a hardware component in the future is high, it is a good idea to encapsulate all hardware-dependent software into device drivers. Device drivers provide an interface between software and hardware: they manage hardware devices, and they have more privileges than regular tasks. The interface towards the application software should not contain any details specific to the underlying hardware device, because it should be possible to replace the device without changing the application software. Figure 21 gives an example of a device driver for a circuit that sends and receives serial data (a UART).

[Figure: a task calls the device driver through the operations INIT, OPEN, READ, WRITE and CLOSE.]

Figure 21: Example device driver interface.

When the system is started, INIT is called, typically with some input parameters that define the speed of sending bits, the number of start and stop bits, 7- or 8-bit characters, and even or odd parity. The device is exclusively reserved by calling OPEN, so that no other task can use it at the same time. Characters are then read and written using the calls READ and WRITE. When we are done with the sending, CLOSE is called, which releases the device and makes it available to other tasks. A device driver can be implemented in several ways. If it is supposed to do buffering, a device driver is usually implemented as one or several tasks. If no buffering is needed, it can be implemented using only semaphores.
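For illustration, such a UART driver interface might be declared along the following lines. The names and parameter choices here are our own, not taken from any particular RTOS:

/* Hypothetical interface declarations for the UART driver */
void UART_init(int baudrate, int databits,    /* 7 or 8 data bits       */
               int stopbits, int parity);     /* even, odd or no parity */
int  UART_open(void);                         /* reserve the device     */
int  UART_read(char *buf, int len);           /* read received chars    */
int  UART_write(const char *buf, int len);    /* send characters        */
void UART_close(void);                        /* release the device     */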

[Figure: a task uses the device driver through INIT, OPEN, READ, WRITE and CLOSE. The driver communicates with an interrupt routine through an in-buffer and an out-buffer, and the interrupt routine talks to the hardware circuit. Labels a1-a3 and b1-b3 mark the steps of the read path; c1-c3 and d1-d3 mark the steps of the write path.]

Figure 22: Example implementation of a device driver.

Figure 22 depicts a possible implementation of the device driver in the example above. As the figure shows, the interrupt routine communicates with the driver via buffers. We now describe what happens upon read and write operations.

Read: When a character is received by the hardware circuit, an interrupt is generated (a1 in the figure). The interrupt routine reads the character (a2) and puts it into the input buffer (a3). If there is a new incoming character in the input buffer, the driver reads it (b1) and delivers it to the application task (b3). If there is no new character (b2), the driver waits until one arrives.

Write: The driver receives a write request from the task (d1) and puts the character into the output buffer (d2). The driver then activates the hardware circuit by sending a write-request interrupt (d3). The circuit starts the interrupt routine (c1), which reads the character from the output buffer (c2) and puts it into the hardware register (c3).

The interface to the device driver can be implemented something like this:

void user_task(void){
    ...
    DD_init();                          /* initiate the device driver (DD) */
    ...
    while(1){
        if( DD_open() ){                /* try to allocate the driver */
            DD_write("Print this...");  /* send a string to the driver */
            DD_write("...and this...");
            ...
            DD_close();                 /* release the driver when done */
        }
        else{
            /* driver is currently used by some other task */
            ...
        }
    }
}

int DD_open(){
    if( lockSemaphore(S) ) return OK;
    return FAILED;
}

int DD_close(){
    if( unlockSemaphore(S) ) return OK;
    return FAILED;
}

void DD_write(char *str){
    /* a task sends its strings to the driver via a message queue
       (msg and MSGQ are assumed to be declared elsewhere) */
    msg.text = str;
    send(MSGQ, msg);
}

void task_DD(){
    /* the driver task just reads all received strings and prints them */
    while(1){
        receive(MSGQ, msg);
        printf("%s", msg.text);
    }
}
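The interrupt side of the read path (steps a1-a3 and b1-b3 in Figure 22) is not shown above. A possible sketch, assuming a simple ring buffer and a counting semaphore with hypothetical signalSemaphore/waitSemaphore calls (not the exact API of any particular RTOS), is:

#define BUF_SIZE 64

static char in_buf[BUF_SIZE];    /* the in-buffer of Figure 22          */
static int  head = 0, tail = 0;
/* chars_available: counting semaphore, assumed created at DD_init time */

void uart_rx_interrupt(void){            /* a1: a character has arrived  */
    char c = read_hw_register();         /* a2: read it from the device  */
    in_buf[head] = c;                    /* a3: store in the in-buffer   */
    head = (head + 1) % BUF_SIZE;        /* overflow handling omitted    */
    signalSemaphore(chars_available);    /* wake up the driver task      */
}

char DD_read_char(void){                 /* called by the driver task    */
    char c;
    waitSemaphore(chars_available);      /* b2: block if buffer is empty */
    c = in_buf[tail];                    /* b1: fetch the next character */
    tail = (tail + 1) % BUF_SIZE;
    return c;                            /* b3: deliver to the task      */
}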

2.8

Time-triggered RTOSs

So far we have mostly talked about event-triggered systems. As mentioned before, this type of RTOS is the most common in industry, since most commercial RTOSs are event-triggered. However, there is another type: time-triggered real-time operating systems, where all activities are carried out at points in time that are known a priori. One reason for introducing support for time-triggered execution is the design of safety-critical systems, for which it must be possible to prove, or at least show, that the system behavior is correct. Verification of correctness is facilitated by the time-triggered approach due to its reproducible behavior (the execution order is static). Many control systems require timely execution which can be guaranteed pre-run-time.

Task execution
Tasks in a time-triggered RTOS are activated according to a time table (schedule), at predefined times, e.g., at time t=5 run task A, at time t=12 run task B, etc. The schedule is a table that is created before the system is started; during run-time it repeats itself after some time (the cycle time), see Figure 23.

[Figure: a time line divided into cycles; after each cycle the schedule is repeated.]

Figure 23: Time-triggered execution.

There are two ways of implementing time-triggered systems. The simplest approach is to activate only one task at each clock tick, as illustrated in Figure 24-a. The problem is that if a task has a very short execution time (shorter than the length of the clock tick), it will still occupy an entire clock tick, since only one task is released per tick. The rest of the tick is unused, which results in poor utilization of the system. We could increase the clock resolution, but that would also increase the overhead for handling ticks (more clock interrupts). A better approach is to allow activation of several tasks per clock tick, see Figure 24-b. Tasks are defined as successors of each other, and we release a sequence (chain) of tasks instead of just one task per tick. When the first task in the chain has completed its execution, the next task in the chain is released at once, without waiting for the next clock tick. Task chains thus make it possible for several tasks to execute within one system clock tick. Another advantage of this approach is easy implementation of preemption: whenever several task chains are ready to execute, the one with the latest start time gets the highest priority, while the chain with the oldest start time gets the lowest priority. A sketch of a tick handler that releases task chains is given after Figure 24.

[Figure: part a) shows one task released per clock tick, with unused time at the end of each tick; part b) shows task chains, where several tasks execute back-to-back within one tick.]

Figure 24: Example task scheduling in time-triggered systems.
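The sketch below shows how a tick handler might release task chains (approach b). The schedule-table format and all names are our own invention, and preemption between chains, as described above, is omitted for brevity:

typedef void (*task_fn)(void);

struct chain {
    int      release_tick;    /* when, within the cycle, to release the chain */
    int      num_tasks;
    task_fn *tasks;           /* tasks of the chain, run back-to-back in order */
};

#define CYCLE_TICKS 100       /* hypothetical cycle length, in ticks */
extern struct chain schedule[];   /* created offline, before the system starts */
extern int num_chains;

void clock_tick_handler(void){
    static int tick = 0;
    int i, j;
    for(i = 0; i < num_chains; i++){
        if(schedule[i].release_tick == tick){
            /* release the whole chain: each task starts as soon as
               its predecessor completes, within the same tick */
            for(j = 0; j < schedule[i].num_tasks; j++)
                schedule[i].tasks[j]();
        }
    }
    tick = (tick + 1) % CYCLE_TICKS;  /* wrap around: the schedule repeats */
}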

Task structure
From the user's point of view, a task in a time-triggered system is just a function. All task parameters (e.g., period, execution time) are defined in the schedule, so the application programmer only needs to write the task code. The user does not need to worry about mutual exclusion and concurrency control, since all conflicts are resolved in the schedule. Usually, tasks share the same memory stack, which results in a very memory-efficient system. For this to work, there must not be any blocking primitives (like semaphore locks). Furthermore, a task that preempts another task must terminate and clean its data from the stack before the preempted task can be resumed.

Communication and synchronization
With time-triggered scheduling there is no need to worry about concurrency control. Tasks run sequentially, one after the other, and therefore mutual exclusion is guaranteed. All conflicts between tasks are resolved in the schedule, before the system starts to run: we simply separate accesses to shared resources in time, or we put tasks that share resources in the same chain. For example, if two or more tasks access the same resource, we construct the schedule so that the executions of the conflicting tasks are separated in time. This way, those tasks cannot access the shared resource at the same time. We can draw a parallel from day-to-day life: time tables for trains. Assuming that no delays or breakdowns of trains can occur, we could eliminate the whole train signaling system, since, due to the construction of the time table, no two trains can ever reside on the same railway section.

Time-triggered scheduling is also easy to implement, and for many simple systems this kind of scheduling approach is perfect. On the other hand, while it is good for cyclic tasks, sporadic (non-periodic) activities really mess things up, especially those with short deadlines. For example, an event may occur at most once every ten seconds, but need to be handled within 2 milliseconds. Such an event has to be handled by polling, and to meet the 2 ms deadline in the example, a poll in each 1 ms slot is necessary. This wastes a lot of CPU time: if, say, each poll costs 50 microseconds (a hypothetical figure), polling every 1 ms permanently consumes 5% of the CPU, while the event itself needs at most 2 ms of handling every ten seconds, i.e., about 0.02%. Another drawback of the time-triggered approach is poor flexibility when new activities (tasks) are added to the system: once a schedule is made, it is usually fixed, and if we want to add something, we need to reschedule the entire system. We will talk more about scheduling of both time-triggered and event-triggered systems in the scheduling chapter.

2.9

Example commercial RTOSs

Here we provide a brief summary of the features of some popular real-time operating systems, presented in three groups: event-triggered commercial RTOSs, time-triggered RTOSs and research RTOSs.

Event-triggered RTOSs

VxWorks
This is one of the most widely used RTOSs on the market, developed by Wind River Systems. It supports many popular hardware platforms and has been used in diverse applications over the past two decades. It is built around a large number of APIs, and customizability is one of its strong features. It supports multitasking with 256 priority levels and has deterministic context switching. Preemptive scheduling and priority inheritance are supported. Support for multi-core processors, symmetric and asymmetric multi-processing (SMP & AMP), an IPv6 network stack, and special development platforms tuned for safety-critical domains are key features of the latest versions.

Windows CE
Windows CE is a small-footprint kernel, supported on Intel x86 and compatibles, MIPS, ARM, and Hitachi SuperH processors. It supports 256 priority levels and priority inheritance. All threads are enabled to run in kernel mode, and CPU time is sliced between threads. Execution time for non-preemptable code is reduced by breaking the non-preemptable parts of the kernel into small sections. Kernel objects like processes, threads and semaphores are dynamically allocated in virtual memory.

QNX
This POSIX-compliant RTOS first appeared in 1982. It is centered around a minimal microkernel and a host of user servers that can be shut down as needed. It supports most modern CPUs, such as MIPS, PowerPC, SH4, StrongARM, xScale, and x86, and it scales from constrained embedded platforms to multiprocessor platforms. The architecture provides multitasking, priority-driven preemptive scheduling, synchronization, and a TCP/IP protocol stack. Synchronous message passing, nested interrupts and a fixed upper bound on interrupt latencies are some of the other features. Adaptive partitioning technology helps system designers guarantee responses to events, for example, a minimum CPU budget for the user interface of a device. QNX Neutrino is the latest version, and its source code is available from 2007 onwards.

pSOS
This OS is built around the concept of object orientation; typical objects include tasks, semaphores and memory regions. It supports EDF as well as preemptive priority-based scheduling. The priority ceiling and priority inheritance protocols are supported to avoid the priority inversion problem. Application-level control over interrupt handling, supervisory-mode execution of user tasks and dynamic loading of device drivers are some of the features of pSOS.

OS-9
Originally developed for the Motorola processors during the early 80s. It has a clear separation of kernel mode and user mode and the ability to run on 8-, 16- and 32-bit processors.

RT Linux
RT Linux (or RTCore) is a microkernel that runs the entire Linux operating system as a fully preemptable process. It originated from research at the New Mexico Institute of Mining and Technology and is currently available in two versions: a free version and a paid version from Wind River Systems. It supports hard real-time operation through interrupt control between the hardware and the operating system. Interrupts needed for deterministic processing are processed by the real-time core, while other interrupts are forwarded to Linux, which runs at a lower priority than the real-time threads. First-in-first-out pipes (FIFOs) or shared memory can be used to share data between the operating system and RTCore.

There are several other commercial RTOSs of this category, such as RTEMS, Palm O/S, XP Embedded, DSP/BIOS, RTX, uC/OS, OSEK, etc.

Time-triggered RTOSs

TTP OS
Based on the time-triggered protocol, TTP OS combines a small footprint and fast context switching. It supports priority-based, cooperative preemptive scheduling, synchronization to a global time base, and error detection features to support fault tolerance. Deadline monitoring for tasks and interrupt service handlers for aperiodic requests are provided. Multiple time bases, such as the global fault-tolerant TTP time and local time, are supported.

Rubus
The Rubus RTOS has evolved from Basement, a distributed real-time architecture developed in the automotive industry and in research at Mälardalen University. The Rubus methods and tools have been used by the Swedish automotive industry for more than a decade. The key constituents of the Basement concept are:

- resource sharing (multiplexing) of processing and communication resources,
- a guaranteed real-time service for safety-critical applications,
- a best-effort service for non-safety-critical applications,
- a communication infrastructure providing efficient communication between distributed devices, and
- a program development methodology and tools allowing resource-independent and application-oriented development of application software.

To guarantee the real-time behavior of safety-critical applications, static scheduling in combination with time-triggered execution is utilized. Dynamic scheduling is utilized for safety-critical as well as non-safety-critical applications. Three categories of run-time services are provided by the Rubus OS (each by a kernel with a name matching the color of the service):

Green Run-Time Services: external event-triggered execution (interrupts).

Red Run-Time Services: time-triggered execution, mainly to be used for applications that have hard real-time requirements.

Blue Run-Time Services: internal event-triggered execution, to be used for applications that have hard real-time requirements as well as for those that have soft real-time requirements.

Research kernels

Spring
The Spring kernel was developed at the University of Massachusetts, Amherst, with the aim of providing scheduling support for distributed systems. It can dynamically schedule tasks based on their execution times and resource constraints. Safety-critical tasks are scheduled using a static table. The kernel retains enough application semantics to improve fault tolerance and performance during overloads. It supports both application-level and system-level predictability. Spring supports an abstraction for process groups, which provides a high level of granularity, and a real-time group communication mechanism. It supports both synchronous and asynchronous multicasting groups, and it achieves predictable low-level distributed communication via globally replicated memory. It provides abstractions for reservation, planning and end-to-end timing support. Admission control, planning-based scheduling and reflection are notable features of the Spring kernel.

MaRTE
This is a research kernel developed by the University of Cantabria. It is written in Ada, follows the minimal Real-Time POSIX.13 subset and supports the Ada 2005 real-time features. Concurrency at the thread level (the whole program is a single process), a single memory space (threads, drivers and OS) and static linking (the output is a single bootable image) are its main features. Tools are provided for waiting and synchronization, mutual exclusion, measuring time, efficient triggering of events, and offline (e.g., MAST) and online (FRESCOR) scheduling.

2.10

Exercises

1. Answer the following questions about real-time operating systems:
a) What is a real-time operating system (RTOS)? Explain the difference between an RTOS and a general-purpose operating system.
b) Each RTOS has something called interrupt latency. What is that?
c) Explain at least three different states of a real-time task. Also, explain which transitions between the given states can take place.
d) Explain briefly the mechanisms provided by an RTOS to support shared resources.
e) What does re-entrant code mean?
f) Can Windows NT be used as an operating system for real-time applications? If no, motivate why not. If yes, motivate why, and give an example of a real-time application that can run on Windows NT.

2. Assume two periodic tasks τ1 and τ2 that communicate with each other by sending messages. Task τ1 has an execution time of 200 ms and a period of 500 ms. τ1 sends 3 messages to a message queue during each period (i.e., it sends 3 messages in each instance). Task τ2 has an execution time of 100 ms and a period of 300 ms. τ2 manages to read 2 messages during its period (i.e., it reads 2 messages in each instance). Assume τ2 has a higher priority than τ1, and that τ1 is allowed to send its messages at any point in time during its execution. Also, when a task reads a message from the message queue, the message is removed from the queue. The entire situation is depicted below:

[Figure: τ1 (low priority), period = 500 ms, execution time = 200 ms, sends 3 messages during its period (i.e., in each instance) to a message queue; τ2 (high priority), period = 300 ms, execution time = 100 ms, reads 2 messages from the queue during its period.]

Since we do not want to allocate more memory than necessary for the message queue, we would like to minimize its size. What is the minimum possible size of the message queue (counted in number of messages) such that we can guarantee there will always be enough space in the queue for τ1 to insert its messages? Motivate your answer. Hint: think of the system behavior in the worst case, which is when both tasks are released simultaneously and τ2 preempts τ1. It helps a lot if you draw the execution trace of the tasks.

3. Assume the following three periodic tasks:
Task_1(void){
    while(1){
        /* do something */
        ...
        sleep(42);
    }
}

Task_2(void){
    while(1){
        /* do something */
        ...
        sleep(24);
    }
}

Task_3(void){
    while(1){
        /* do something */
        ...
        /* no sleep */
    }
}

a) Assign task priorities so that all tasks will be able to execute (i.e., no task must wait forever because of some other task). Motivate your answer!
b) Give an example of when a) is not fulfilled.
c) If you remove sleep(24) from Task_2, is it possible to set priorities so that a) is fulfilled?
