Kernel versus User-Mode threads
• We have been talking about Kernel threads
– Native threads supported directly by the kernel
– Every thread can run or block independently
Critical Section Problem – One process may have several threads waiting on different things
• Downside of kernel threads: a bit expensive
– Need to make a crossing into kernel mode to schedule
• Even lighter weight option: User Threads
– User program provides scheduler and thread package
– May have several user threads per kernel thread
– User threads may be scheduled non-premptively relative to each
other (only switch on yield())
– Cheap
• Downside of user threads:
– When one thread blocks on I/O, all threads block
Some slides and/or pictures are adapted from – Kernel cannot adjust scheduling among all threads
• Operating System Concepts, 9th edition, by Silberschatz, Galvin, – Option: Scheduler Activations
Gagne, John Wiley & Sons, 2013
• Lecture notes by Dr. Prof. John Kubiatowicz (Berkeley) • Have kernel inform user level when thread blocks…
1
1 2
Multithreading Models Many-to-One
• Many user-level threads mapped
to single kernel thread
• Many-to-One • One thread blocking causes all
to block
• Multiple threads may not run in
parallel on muticore system
• One-to-One because only one may be in
kernel at a time
• Few systems currently use this
• Many-to-Many model
• Examples:
– Solaris Green Threads
– GNU Portable Threads
3 4
One-to-One Many-to-Many Model
• Each user-level thread maps to
kernel thread • Allows many user level
• Creating a user-level thread creates threads to be mapped to
a kernel thread many kernel threads
• More concurrency than many-to-one • Allows the operating
• Number of threads per process system to create a
sometimes restricted due to sufficient number of
overhead kernel threads
• Examples
– Windows
• Solaris prior to version 9
– Linux • Windows with the
– Solaris 9 and later ThreadFiber package
5 6
1
Correctness for systems with concurrent
Multiprocessing vs Multiprogramming threads
• Remember Definitions: • If dispatcher can schedule threads in any way,
– Multiprocessing Multiple CPUs programs must work under all circumstances
– Multiprogramming Multiple Jobs or Processes – Can you test for this?
– Multithreading Multiple threads per Process – How can you know if your program works?
• What does it mean to run two threads “concurrently”? • Independent Threads:
– Scheduler is free to run threads in any order and – No state shared – separate address space
interleaving: FIFO, Random, … – Deterministic Input state determines results
– Dispatcher can choose to run each thread to completion – Reproducible Can recreate Starting Conditions, I/O
or time-slice in big chunks or small chunks
– Scheduling order doesn’t matter (if switch() works!!!)
A • Cooperating Threads:
Multiprocessing B – Same address space, shared state between multiple
C
threads
– Non-deterministic
A B C
– Non-reproducible
• Non-deterministic and Non-reproducible means that
Multiprogramming A B C A B C B bugs can be intermittent
– Sometimes called “Heisenbugs”
7 8
Problem is at the lowest level Example of Concurrent Program
• Most of the time, threads are working on separate int balance = 100;
data, so scheduling doesn’t matter: void deposit (int amount) {
Thread A Thread B balance += amount;
x = 1; y = 2; exit(0);
• However, What about (Initially, y = 12): }
Thread A Thread B
x = 1; y = 2;
x = y+1; y = y*2;
int main () {
– What are the possible values of x? threadCreate(deposit, 10);
• Or, what are the possible values of x below? threadCreate(deposit, 20);
Thread A Thread B waitForAllDone();/*make sure all children finish*/
x = 1; x = 2; printf (“The balance is %d”, balance);
– X could be 1 or 2 (non-deterministic!)
– Could even be 3 for serial processors: }
• Thread A writes 0001, B writes 0010. What are the possible output?
• Scheduling order ABABABBA yields 3!
10
9 10
More Concurrent Programming:
Race Condition
Linked lists (head is shared)
insert (head, elem) { void *removed (head) { • This kind of bug, which only occurs under
Elem->next := head; void *tmp; certain timing conditions, is called a race
head := elem; tmp := head; condition.
} head := head->next; • Output depends on ordering of thread
return tmp; execution
}
• More concretely:
(1) two or more threads access a shared variable with
no synchronization, and
(2) at least one of the threads writes to the variable
Assume one thread calls insert and one calls
remove.
11 12
11 12
2
Critical Section Critical Section (CS) Problem
• Section of code that: • Provide entry and exit routines
– Must be executed by one thread at a time – All threads must call entry before executing
– If more than one thread executes at a time, CS.
have a race condition – All threads must call exit after executing CS
– Ex: linked list from before – Thread must not leave entry rounte until it’s
• Insert/Delete code forms a critical section safe.
• What about just the Insert or Delete code?
13 14
13 14
Structure of threads for Critical Properties of Critical Section
Section Problem Solution
• Threads do the following • Mutual Exclusion: at most one thread is
executing CS.
While (1) {
• Absence of Deadlock: two or more
call entry(); threads trying to get into CS => at least one
critical section succeeds.
call exit(); • Absence of Unnecessary Delay: if only
one thread trying to get into CS, it succeeds
do other stuff;
• Bounded Waiting: thread eventually gets
} into CS.
15 16
15 16
Software Solution for Critical Section
Critical-Section Handling in OS Problem
• Two thread solution
Two approaches depending on if kernel • Assume that the load and store machine-
language instructions are atomic; that is,
is preemptive or non- preemptive cannot be interrupted
– Preemptive – allows preemption of process • The two threads share two variables:
when running in kernel mode – int turn;
– Non-preemptive – runs until exits kernel – Boolean flag[2]
mode, blocks, or voluntarily yields CPU • The variable turn indicates whose turn it is to
enter the critical section
• Essentially free of race conditions in kernel mode • The flag array is used to indicate if a thread
is ready to enter the critical section. flag[i]
= true implies that threadi is ready!
17 18
3
Critical Section Solution Attempt #1 Critical Section Solution Attempt #2
(2 thread solution) (2 thread solution)
Intially, turn == 0 Intially, flag[0] == flags[1] == false
entry (id) { entry (id) {
while (turn !=id); /* if not my turn, spin */ flag [id] = true; /* I want to go in */
while (flag[1-id]); /* proceed if other not trying */
}
}
exit(id) { exit(id) {
turn = 1 - id; /* other thread’s turn*/ flag[id] = false; /* I am out*/
} }
19 20
19 20
Critical Section Solution Attempt #3
Satisfying the 4 properties
(2 thread solution)
Intially, flag[0] == flags[1] == false, turn • Mutual exclusion
– turn must be 0 or 1 => only one thread can be in CS
entry (id) { • Absence of deadlock
flag [id] = true; /* I want to go in */ – turn must be 0 or 1 => one thread will be allowed in
turn = 1 – id; • Absence of unnecessary delay
while (flag[1-id] && turn == 1-id); /* proceed if – only one thread trying to get into CS => flag[other] is
other not trying */ false => will get in
} • Bounded Waiting
exit(id) { – spinning thread will not modify turn
flag[id] = false; /* I am out*/ – thread trying to go back in will set turn equal to
} spinning thread
21 22
21 22
Critical Section Problem Hardware Support
multiple threads solutions
• Provide instruction that is:
– Atomic
• Bakery algorithm – Fairly easy for hardware designer to
implement
• It’s a mess • Read/Modify/Write
– Trying to prove it correct is a headache – Atomically read value from memory, modify it
in some way, write it back to memory
• Use to develop simpler critical section
solution for any number of threads.
23 24
23 24
4
Atomic Operation Test-and-Set
• An operation that, once started, runs to • Many machines have it
completion
• Indivisible bool TestAndSet(bool &target) {
• Ex: loads and stores bool b = target;
– Meaning: if thread A stores “1” into variable x target = true;
and thread B stores “2” into variable x about return b;
the same time, result is either “1” or “2”. }
25
• Executes atomically 26
25 26
CS solution with Test-and-Set Basic Idea With Atomic Instructions
Initially, s == false;
• Each thread has a local flag
entry () { • One variable shared by all threads
bool spin;
spin = TestAndSet(s); • Use the atomic instruction with flag,
while (spin) shared variable
spin = TestAndSet(s);
– On a change, one thread to go in
}
– Other threads will not see this change
exit() { • When done with CS, set shared variable
s = false;
} back to initial state.
27 28
27 28
Problems with busy-waiting CS
Other Atomic Instructions
solution
The definition of the Swap instruction • Complicated
• Inefficient
void Swap(bool &a, bool &b) { – Consumes CPU cycles while spinning
bool temp = a; • Priority inversion problem
a = b; – Low priority thread in CS, high priority thread
spinning can end up causing deadlock
b = temp;
}
Solution: block when waiting for CS
29 30
29 30