MC104 - Sistemas Operacionais

7º Projeto de MC104 - SISTEMAS OPERACIONAIS

Scheduling, Priorities, and Priority Inversion

Introduction

When developing a real-time system, it will soon become apparent that some tasks are more important than others. Some must be run on-time, every time, while others can run whenever the processor has a few cycles to spare. For example, if we return to the mythical nuclear reactor that has been described in many of the previous experiments, we can assign priorities to the different processes. Looking at the nuclear reactor system, the process that reads the data off of the data aquisition card 1000 times a second is time critical. It must be run on-time, every time. Therefore it would have a high priority. On the other hand, the archive system, which is simply writing information to disk, can be designed to have a large buffer. As data is read it is thrown into the buffer, when the system is not busy, the archive program can run, empty the buffer, and write the data to disk. The other processes would probably fall somewhere between these two in terms of importance.

Scheduling is the term given to the method by which processes are run on a UNIX system. In this experiment, the methods by which processes can be scheduled are examined including how to change the scheduler and the priority of a process. In addition, some of the pitfalls of scheduling will be noted. The example program demonstrates one of these pitfalls, "priority inversion".

Objectives

The following are the primary objectives of this experiment:

to learn how to schedule processes on a UNIX system;

to understand why processes need to be scheduled and know when to change the default scheduler;

to identify the pitfalls of scheduling and know ways to avoid them;

to know how to change the scheduler and priority of a process.

Theory Aspects

As stated in the introduction, there are many circumstances where the designer of a real-time system will wish to have one process be "more important" than another. In another words, if process A is able to run, it should run in place of any other process. In this case, process A would have a high priority. The operating system decides when a process should or should not get to run based on a set of rules it has. Scheduling is the functionality of the operating system deciding what should run and when. POSIX.1b gives a set of calls through which the programmer can modify how and when a set of processes will run. The calls to change a processes priority and scheduling will be described in the next section. The rest of this section will focus on some of the theory behind the scheduling.

POSIX.1b allows the user to choose between three different scheduling methods. The three scheduling methods in POSIX.1b are:

Round-Robin Scheduling

FIFO (First in first out) Scheduling

Other Scheduling

First, lets look at the Other Scheduling algorithm. The other scheduling algorithm is the default scheduler that the system normally operates under. This is the time-share scheduler that attempts to give all processes a "fair share" of the processor. This algorithm makes no guarantees and it is difficult to predict what will run and when. Process "priorities" can be modified by the operating system in an attempt to more evenly distribute the processor between processes. The Other Scheduler is good for user processes that don't have any specific time constraints. It is designed for time-sharing systems and makes no guarantees. For real-time performance, something else will have to be used. The two "real-time" scheduling algorithms defined by POSIX.1b are round-robin and FIFO. Both work of a fixed priority set, where each process running under one of these scheduling algorithms can be assigned a different priority. This priority will determine when and if a program will be able to run. Before examining the rules that govern these schedulers in detail, it will help to clear up some terminology.

A process can execute in several states. The most common states a process can be in are executing, ready, sleeping, and waiting. Executing processes are those that are currently running on the processor. Ready processes are those that are able to run but are not currently assigned to any processor. Sleeping processes are those that don't need to run at this time (ie because of a sleep() call). Waiting processes are typically waiting on some event (such as a disk read to complete) before they become ready. The states of the processes in a system will affect who can run when.

A process can yield the processor via several conditions. The first, it can be performing a device wait (ie waiting for read from the disk). In this case, it will yield the processor and allow someone else to run while it waits for its data. Second, a process can voluntarily yield via a sched_yield() call, a sleep() call, or several others. Third, a processes time quantum can expire and the OS can force it to yield (NOTE: The OS sets a timer signal that is fired off periodically. When this signal is received, the scheduler is run and determines what to do. At this time, the scheduler can force a process to yield if it has been running for longer that a predetermined amount of time.) When a process yields the processor, the scheduler reevaluates who can run and task-in the highest priority process that meets the criteria.

Now that the terminology is out of the way, lets look at the rules that govern the round-robin scheduler:

The OS keeps a list of processes at each priority that are ready.

The OS chooses the process with the highest priority that is at the front of the list to run.

That process will run until it yields. If it does not voluntarily yield within a predetermined amount of time, the OS will force it to yield and place it at the end of the list at its priority level. The OS will then attempt to find another process that is able to run.

At any time, when a process with a higher priority than the one that is currently running is able to run, the higher priority process should be allowed to run.

For the FIFO scheduler, the following rules apply:

The OS keeps a list of processes at each priority that are ready.

The OS chooses the process with the highest priority that is at the front of the list to run.

That process will run until it yields. The OS will not force a FIFO process to yield, it will run for as long as it wants to.

At any time, when a process with a higher priority than the one that is currently running is able to run, the higher priority process should be allowed to run.

The primary difference between the two real-time schedulers is simple. FIFO will allow a process to run indefinitely while round-robin will force a process to yield and allow another process of equal or higher priority to run if it is able.

WARNING WARNING WARNING -- Did I say I'm going to warn you now? If the following line of C code while(1); or any other piece of code that causes the machine to enter an infinite loop, appears in the highest priority process running either FIFO or round-robin, the process will never yield and will never allow another process to run. In another words, the machine will LIVELOCK. It is easy to do, I did it twice while writing the sample program. It is a good idea to keep a terminal window open at a priority higher than any real-time program you are running during debug to prevent this from occurring. To accomplish this, write a simple program that takes three arguments, a pid, a priority, and a scheduler. This program will then assign the specified priority and scheduler to the given pid.

Systems Calls to Deal with Scheduling

Now that you have a bit of a background with how the different schedulers work under POSIX.1b it is time to look at the system calls that will allow you to change your processes priority and scheduler. The following calls are available under POSIX.1b:

sched_getparam(...): allows a process to query the priority of a process (based on pid)

sched_setparam(...): allows a process to set the priority of a process (based on pid)

sched_getscheduler(...): allows a process to query a processes's scheduler (based on pid)

sched_setscheduler(...): allows a process to set a processes's scheduler and priority (based on pid)

sched_get_priority_max(...): returns the maximum priority for a specified scheduling policy

sched_get_priority_min(...): returns the minimum priority for a specified scheduling policy

The commands sched_get_priority_max(...) and sched_get_priority_min(...) gives a portable way for a process to determine what the maximum and minimum priorities are for any single scheduling policy. They are used to determine what priorities are valid. The sched_getparam(...) and sched_getscheduler(...) allow a process to query the priority and/or scheduler of any process in the system (permissions permitting). The sched_setparam(...) and sched_setscheduler(...) allow a process to change the priority and/or scheduler of any process in the system (permissions permitting). The following code segments show how to use the scheduling calls.

Several of the above commands use the sched_param structure. According to POSIX.1b standards, this structure only needs to contain one member, "sched_priority". However, it can contain other members. The following is the definition for "sched_param":

  
  struct sched_param {
    ...
    int sched_priority;
    ...
  };

The following code segment queries the maximum and minimum priorities for the round-robin scheduler:

  
  int min_rr_prio;
  int max_rr_prio;
  ...
  if( ( min_rr_prio = get_sched_priority_min( SCHED_RR ) ) == -1 ) {
    fprintf(stderr,"Error querying minimum scheduling priority!\n");
    exit(1);
  }

  if( ( max_rr_prio = get_sched_priority_max( SCHED_RR ) ) == -1 ) {
    fprintf(stderr,"Error querying maximum scheduling priority!\n");
    exit(1);
  }
  ...

The following code segment queries the priority and scheduler of process 1234:

  
  pid_t pid = 1234;
  struct sched_param param;
  int policy;
  int priority;
  ...
  if( ( policy = sched_getscheduler( pid ) ) == -1 ) {
    fprintf(stderr,"Error querying scheduler!\n");
    exit(1);
  }

  if( sched_getparam( pid, &param ) == -1 ) {
    fprintf(stderr,"Error querying priority!\n");
    exit(1);
  }
  priority = param.sched_priority;
  ...

The final code segment sets the priority and scheduler of process 1234 to FIFO 18:

  
  pid_t pid = 1234;
  struct sched_param param;
  int policy;
  ...
  policy = SCHED_FIFO;
  param.sched_priority = 18;

  if( sched_setscheduler( pid, policy, &param ) == -1 ) {
    fprintf(stderr,"Error setting scheduler/priority!\n");
    exit(1);
  }
  ...

Priority Inversion

Priority Inversion is a problem that will likely occur in any real-time system that has multiple processes running at varying priorities attempting to access locks. Priority Inversion occurs when a low priority process is holding a lock that a high priority process needs. If spin locks (spin locks are a type of lock that performs an atomic test and set operation continuously until it succeeds) are being used, a deadlock will occur. The high priority process needs to get the lock. The low priority process holds the lock and must run to release the lock. The high priority process will never yield the processor to the low priority process due to the scheduling algorithm, thus it will never release the lock and the high priority process will never progress. The example program will show you an example of priority inversion using three processes. In this example program, semaphores are used to avoid the deadlock described above.

Although the example program does not implement a solution to the priority inversion problem, several solutions do exist. The first, and probably best, is to make sure that the above situation does not occur. This can be done by stating the simple rule that all high priority processes cannot access locks. In this way, the priority inversion problem is void. The second solution is one that the programmer will not be able to implement, but if present in the operating system, can take advantage of. Priority inheritance will detect when a priority inversion has occurred and attempt to raise the priority of the low priority process to that of the high priority process so that it may run. Once it releases its resource, it is lowered to its original priority. The high priority process can now get the lock, run, and complete its work. Again, priority inheritance must be implemented in the operating system and the programmer has no control over it. If it is there, take advantage of it; if it is not, work around the problem.

Example Program

The example program will show you how priority inversion can have negative effects on a real-time application. Suppose a simple real-time application exists with the following processes:

The following scheme will eventually happen if the above three process are running as described:

The low priority process will obtain the lock.

The medium priority process will start to run.

The high priority process will attempt to obtain the lock, but will be unable to since low priority process has the lock. The high priority process will suspend.

The medium priority process will run again and continue to run until it has completed its work (NOTE: this could take an extended period of time.)

Once the medium priority process has finished, the low priority process will run and then release the lock.

After the lock is released, the high priority process will finally get the processor back, obtain the lock, and perform the work that needs to be done. However, by this time, it is probably too late.

The sample program will demonstrate the above scenario. In fact, the example program is setup in such a way as to predictably have the above scenario occur every time it runs. This is done by using a shared memory segment which contains a flag which will let the medium priority process know when the low priority process has the lock and by placing a series of micro-sleep calls in specific locations within the code.

Compile and run the sample program and watch the output. You will see something similar to the following on the screen:

  
  Child 1 started ...
  Child 2 started ...
  Child 3 started ...
  High priority process started ...
  Middle priority process started ...
  Low priority process started ...
  High trying to get the lock!
  High has the lock!
  High release the lock!
  Low has the lock!
  High trying to get the lock!
  Low has lock, Middle entering long loop!
  Low has lock, Middle exiting long loop!
  Low release the lock!
  High has the lock!
  High release the lock!
  Low has the lock!
  High trying to get the lock!
  Low has lock, Middle entering long loop!
  Low has lock, Middle exiting long loop!
  Low release the lock!
  High has the lock!
  High release the lock!
  Low has the lock!
  High trying to get the lock!
  Low has lock, Middle entering long loop!

The key point to note is the time that it takes to go from the line that states, High trying to get the lock! to the line that states, High has the lock!. Under ideal circumstances, this time should be minimized. However, when running this program, you will see that an extended period of time is spent between the time where the middle priority process enters its long loop and the time where it exits its long loop. This is a classic example of priority inversion.

The details of the code are available, but are not discussed here. There is nothing new in the code, the forking of processes, the use of a shared memory segment, and the changing of priority have all been discussed previously. Therefore, it is left to the reader to examine and analyze the code.

Be sure to run the above program on several of the machines in the real-time lab and examine what happens. Priority inversion is an important real-time concept and should be thoroughly understood.

Due Date

31 de Setembro

Luís Fernando Faina
Last modified: Wed Oct 23 16:17:05 2002