Chapter 4: Threads
Overview
Multicore Programming
Multithreading Models
Thread Libraries
Implicit Threading
Threading Issues
Operating System Examples
Objectives
To introduce the notion of a thread—a fundamental
unit of CPU utilization that forms the basis of
multithreaded computer systems
To discuss the APIs for the Pthreads, Windows,
and Java thread libraries
To explore several strategies that provide implicit
threading
To examine issues related to multithreaded
programming
To cover operating system support for threads in
Windows and Linux
Motivation
Most modern applications are multithreaded
Threads run within application
Multiple tasks within the application can be
implemented by separate threads
Update display
Fetch data
Spell checking
Answer a network request
Process creation is heavy-weight while thread
creation is light-weight
Can simplify code, increase efficiency
Kernels are generally multithreaded
A thread of execution is the smallest sequence of
programmed instructions that can be managed
independently by a scheduler
Multithreaded Server Architecture
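A typical design: (1) a client sends a request to the server,
(2) the server creates a new thread to service the request, and
(3) the server resumes listening for additional client requests.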
Benefits
Responsiveness – may allow continued execution if
part of process is blocked, especially important for
user interfaces
Resource Sharing – threads share resources of
process, easier than shared memory or message
passing
Economy – cheaper than process creation, thread
switching lower overhead than context switching
Scalability – process can take advantage of
multiprocessor architectures
Multicore Programming
Multicore and multiprocessor systems put pressure on
programmers; challenges include:
Dividing activities
Balance
Data splitting
Data dependency
Testing and debugging
Parallelism implies a system can perform more than one
task simultaneously
Concurrency supports more than one task making
progress
Single processor / core, scheduler providing
concurrency
Multicore Programming (Cont.)
Types of parallelism
Data parallelism – distributes subsets of the
same data across multiple cores, same operation
on each
Task parallelism – distributing threads across
cores, each thread performing unique operation
As # of threads grows, so does architectural support
for threading
CPUs have cores as well as hardware threads
Consider Oracle SPARC T4 with 8 cores, and 8
hardware threads per core
Concurrency vs. Simultaneous Execution
Concurrency means executing multiple tasks over the same
period of time, but not necessarily simultaneously. When two
tasks execute concurrently on a 1-core CPU, the CPU decides
how to interleave them: run one task to completion and then
the other, run half of one and then half of the other, and so
on. The two tasks can start, run, and complete in overlapping
time periods; Task 2 can start even before Task 1 completes.
It all depends on the system architecture.
Parallelism means that an application splits its tasks into
smaller subtasks that can be processed in parallel, for
instance on multiple CPUs at the exact same time.
Parallelism requires multiple processors (or cores), while
concurrency does not necessarily require them; a single
processor can provide concurrency.
Concurrency vs. Parallelism
Concurrent execution on single-core system:
Parallelism on a multi-core system:
Note: Concurrency improves responsiveness, whereas parallelism
increases throughput and speedup
Single and Multithreaded Processes
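A single-threaded process has one thread, with its code, data,
and files plus a single set of registers and a single stack. In a
multithreaded process, all threads share the code, data, and open
files, but each thread has its own registers and stack.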
Amdahl’s Law
Identifies performance gains from adding additional cores to an
application that has both serial and parallel components
S is serial portion
N processing cores
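Speedup on N cores is bounded by

    speedup <= 1 / (S + (1 - S) / N)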
That is, if application is 75% parallel / 25% serial, moving from 1 to
2 cores results in speedup of 1.6 times
As N approaches infinity, speedup approaches 1 / S
Serial portion of an application has disproportionate effect on
performance gained by adding additional cores
But does the law take into account contemporary multicore
systems?
User Threads and Kernel Threads
User threads - management done by user-level threads library
Three primary thread libraries:
POSIX Pthreads
Windows threads
Java threads
Kernel threads - Supported by the Kernel
Examples – virtually all general purpose operating systems,
including:
Windows
Solaris
Linux
Tru64 UNIX
Mac OS X
Multithreading Models
Many-to-One
One-to-One
Many-to-Many
Many-to-One
Many user-level threads mapped
to single kernel thread
One thread blocking causes all
to block
Multiple threads may not run in
parallel on a multicore system
because only one may be in the
kernel at a time
Few systems currently use this
model
Examples:
Solaris Green Threads
GNU Portable Threads
One-to-One
Each user-level thread maps to kernel
thread
Creating a user-level thread creates a
kernel thread
More concurrency than many-to-one
Number of threads per process
sometimes restricted due to overhead
Examples
Windows
Linux
Solaris 9 and later
Many-to-Many Model
Allows many user level threads
to be mapped to many kernel
threads
Allows the operating system
to create a sufficient number
of kernel threads
Solaris prior to version 9
Windows with the ThreadFiber
package
Two-level Model
Similar to M:M, except that it allows a user
thread to be bound to a kernel thread
Examples
IRIX
HP-UX
Tru64 UNIX
Solaris 8 and earlier
Lightweight process
Many systems that implement either the many-to-many or the
two-level model place an intermediate data structure between
the user and kernel threads. This data structure, typically
known as a lightweight process (LWP), appears to the
user-thread library as a virtual processor on which the
application can schedule a user thread to run. Each LWP is
attached to a kernel thread, and it is kernel threads that the
operating system schedules to run on physical processors. If a
kernel thread blocks (such as while waiting for an I/O
operation to complete), the LWP blocks as well, and the
user-level thread attached to that LWP also blocks.
Thread Libraries
Thread library provides programmer with API
for creating and managing threads
Two primary ways of implementing
Library entirely in user space
Kernel-level library supported by the OS
Pthreads
May be provided either as user-level or kernel-level
A POSIX standard (IEEE 1003.1c) API for thread
creation and synchronization
Specification, not implementation
API specifies behavior of the thread library;
implementation is up to the developers of the library
Common in UNIX operating systems (Solaris, Linux,
Mac OS X)
Pthreads Example
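A minimal sketch of the kind of program this example shows: the
parent creates one thread that sums the integers from 1 to a
command-line value into a shared global, then joins it. Compile
with gcc -pthread.

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

int sum;                   /* data shared by the thread(s) */
void *runner(void *param); /* the thread's start function */

int main(int argc, char *argv[])
{
    pthread_t tid;       /* the thread identifier */
    pthread_attr_t attr; /* set of thread attributes */

    if (argc != 2) {
        fprintf(stderr, "usage: a.out <integer value>\n");
        return -1;
    }

    pthread_attr_init(&attr);                     /* get the default attributes */
    pthread_create(&tid, &attr, runner, argv[1]); /* create the thread */
    pthread_join(tid, NULL);                      /* wait for the thread to exit */

    printf("sum = %d\n", sum);
    return 0;
}

/* the thread begins control in this function */
void *runner(void *param)
{
    int upper = atoi(param);
    sum = 0;
    for (int i = 1; i <= upper; i++)
        sum += i;
    pthread_exit(0);
}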
Pthreads Code for Joining 10 Threads
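A sketch of the joining pattern, assuming a hypothetical worker
function and thread array like these:

#include <pthread.h>
#include <stdio.h>

#define NUM_THREADS 10

/* hypothetical worker: report its id and exit */
void *worker(void *param)
{
    printf("thread %d running\n", *(int *)param);
    return NULL;
}

int main(void)
{
    pthread_t workers[NUM_THREADS];
    int ids[NUM_THREADS];

    for (int i = 0; i < NUM_THREADS; i++) {
        ids[i] = i;
        pthread_create(&workers[i], NULL, worker, &ids[i]);
    }

    /* join each of the 10 threads in turn */
    for (int i = 0; i < NUM_THREADS; i++)
        pthread_join(workers[i], NULL);

    return 0;
}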
Windows Multithreaded C Program
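A minimal sketch of a Windows equivalent of the summation
program; CreateThread() and WaitForSingleObject() take the
place of pthread_create() and pthread_join().

#include <windows.h>
#include <stdio.h>
#include <stdlib.h>

DWORD Sum; /* data shared by the thread(s) */

/* the thread begins execution in this function */
DWORD WINAPI Summation(LPVOID Param)
{
    DWORD Upper = *(DWORD *)Param;
    for (DWORD i = 1; i <= Upper; i++)
        Sum += i;
    return 0;
}

int main(int argc, char *argv[])
{
    DWORD ThreadId;
    HANDLE ThreadHandle;
    DWORD Param;

    if (argc != 2) {
        fprintf(stderr, "usage: %s <integer value>\n", argv[0]);
        return 1;
    }
    Param = atoi(argv[1]);

    /* create the thread */
    ThreadHandle = CreateThread(
        NULL,       /* default security attributes */
        0,          /* default stack size */
        Summation,  /* thread function */
        &Param,     /* parameter to thread function */
        0,          /* default creation flags */
        &ThreadId); /* returns the thread identifier */

    if (ThreadHandle != NULL) {
        WaitForSingleObject(ThreadHandle, INFINITE); /* wait for it to finish */
        CloseHandle(ThreadHandle);
        printf("sum = %lu\n", Sum);
    }
    return 0;
}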
Java Threads
Java threads are managed by the JVM
Typically implemented using the thread model
provided by the underlying OS
Java threads may be created by:
Extending Thread class
Implementing the Runnable interface
Java Multithreaded Program
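A minimal sketch using the Runnable approach: the worker computes
the sum in run(), and join() waits for it (the one-element array
is just a simple way to pass the result back).

public class Driver {
    public static void main(String[] args) throws InterruptedException {
        final int upper = Integer.parseInt(args[0]);
        final int[] result = new int[1]; // holds the sum computed by the worker

        // Runnable whose run() method executes in the new thread
        Thread worker = new Thread(new Runnable() {
            public void run() {
                int sum = 0;
                for (int i = 1; i <= upper; i++)
                    sum += i;
                result[0] = sum;
            }
        });

        worker.start(); // begin execution of the thread
        worker.join();  // wait for the worker to finish

        System.out.println("sum = " + result[0]);
    }
}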
Implicit Threading
Growing in popularity as the number of threads
increases; program correctness is more difficult
to ensure with explicit threads
Creation and management of threads done by
compilers and run-time libraries rather than
programmers
Three methods explored
Thread Pools
OpenMP
Grand Central Dispatch
Other methods include Intel Threading
Building Blocks (TBB), java.util.concurrent
package
Thread Pools
Create a number of threads in a pool where they
await work
Advantages:
Usually slightly faster to service a request with
an existing thread than to create a new thread
Allows the number of threads in the
application(s) to be bound to the size of the
pool
Separating the task to be performed from the
mechanics of creating the thread allows
different strategies for running the task
e.g., tasks could be scheduled to run
periodically
Windows API supports thread pools:
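For example, QueueUserWorkItem() hands a function to a thread
from the process's default pool; a minimal sketch:

#include <windows.h>
#include <stdio.h>

/* work item executed by a thread from the process's default pool */
DWORD WINAPI PoolFunction(LPVOID Param)
{
    printf("running as a thread-pool work item\n");
    return 0;
}

int main(void)
{
    /* queue the work item; the pool assigns it to an available thread */
    QueueUserWorkItem(PoolFunction, NULL, 0);

    Sleep(1000); /* crude wait so the item can run; real code would synchronize */
    return 0;
}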
OpenMP
Set of compiler directives
and an API for C, C++,
FORTRAN
Provides support for parallel
programming in shared-
memory environments
Identifies parallel regions –
blocks of code that can run
in parallel
#pragma omp parallel
Create as many threads as there are cores
#pragma omp parallel for
for (i = 0; i < N; i++) {
    c[i] = a[i] + b[i];
}
Run for loop in parallel
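A complete, minimal version of the vector-add fragment (compile
with gcc -fopenmp):

#include <omp.h>
#include <stdio.h>

#define N 16

int main(void)
{
    double a[N], b[N], c[N];

    /* initialize the input vectors */
    for (int i = 0; i < N; i++) {
        a[i] = i;
        b[i] = 2.0 * i;
    }

    /* OpenMP divides the loop iterations among the threads */
    #pragma omp parallel for
    for (int i = 0; i < N; i++)
        c[i] = a[i] + b[i];

    printf("c[N-1] = %f\n", c[N - 1]);
    return 0;
}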
Grand Central Dispatch
Apple technology for Mac OS X and iOS operating
systems
Extensions to C, C++ languages, API, and run-time
library
Allows identification of parallel sections
Manages most of the details of threading
A block is code enclosed in ^{ }, e.g., ^{ printf("I am a block"); }
Blocks placed in dispatch queue
Assigned to available thread in thread pool when
removed from queue
Grand Central Dispatch
Two types of dispatch queues:
serial – blocks removed in FIFO order, queue is
per process, called main queue
Programmers can create additional serial
queues within program
concurrent – removed in FIFO order but several
may be removed at a time
Three system-wide queues with priorities low,
default, high
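A minimal sketch submitting a block to the default-priority
system-wide concurrent queue and waiting for it via a dispatch
group:

#include <dispatch/dispatch.h>
#include <stdio.h>

int main(void)
{
    /* the default-priority, system-wide concurrent queue */
    dispatch_queue_t queue =
        dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0);

    dispatch_group_t group = dispatch_group_create();

    /* submit a block to the queue; GCD runs it on a pool thread */
    dispatch_group_async(group, queue, ^{
        printf("I am a block\n");
    });

    /* wait until the block has finished */
    dispatch_group_wait(group, DISPATCH_TIME_FOREVER);
    return 0;
}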
Threading Issues
Semantics of fork() and exec() system calls
Signal handling
Synchronous and asynchronous
Thread cancellation of target thread
Asynchronous or deferred
Thread-local storage
Scheduler Activations
Semantics of fork() and exec()
Does fork() duplicate only the calling thread or
all threads?
Some UNIX systems have two versions of fork()
exec() usually works as normal: it replaces the
entire running process, including all threads
Signal Handling
Signals are used in UNIX systems to notify a
process that a particular event has occurred.
A signal handler is used to process signals
1. Signal is generated by particular event
2. Signal is delivered to a process
3. Signal is handled by one of two signal
handlers:
1. default
2. user-defined
Every signal has a default handler that the kernel
runs when handling the signal
User-defined signal handler can override the
default (see the sketch below)
For single-threaded programs, the signal is
delivered to the process
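A minimal sketch of installing a user-defined handler that
overrides the default action for SIGINT:

#include <signal.h>
#include <string.h>
#include <unistd.h>

/* user-defined handler: overrides the default action for SIGINT */
void handler(int sig)
{
    (void)sig;
    /* write() is async-signal-safe; printf() is not guaranteed to be */
    write(STDOUT_FILENO, "caught SIGINT\n", 14);
}

int main(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = handler;      /* user-defined handler replaces the default */
    sigaction(SIGINT, &sa, NULL); /* install the handler */

    pause();                      /* wait for a signal to arrive */
    return 0;
}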
Signal Handling (Cont.)
Where should a signal be delivered in a
multi-threaded process?
Deliver the signal to the thread to which the
signal applies
Deliver the signal to every thread in the
process
Deliver the signal to certain threads in the
process
Assign a specific thread to receive all signals
for the process
Thread Cancellation
Terminating a thread before it has finished
Thread to be canceled is target thread
Two general approaches:
Asynchronous cancellation terminates the target
thread immediately
Deferred cancellation allows the target thread to
periodically check if it should be cancelled
Pthread code to create and cancel a thread:
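A minimal sketch: the worker loops, polling pthread_testcancel()
as its cancellation point, while main() requests deferred
cancellation:

#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

void *worker(void *param)
{
    (void)param;
    while (1) {
        pthread_testcancel(); /* a cancellation point: honor pending requests */
        /* ... perform a unit of work ... */
    }
    return NULL; /* never reached */
}

int main(void)
{
    pthread_t tid;

    pthread_create(&tid, NULL, worker, NULL); /* create the thread */
    sleep(1);                                 /* let it run briefly */
    pthread_cancel(tid);                      /* request (deferred) cancellation */
    pthread_join(tid, NULL);                  /* wait for it to terminate */
    printf("worker cancelled\n");
    return 0;
}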
Thread Cancellation (Cont.)
Invoking thread cancellation requests cancellation,
but actual cancellation depends on thread state
If thread has cancellation disabled, cancellation
remains pending until thread enables it
Default type is deferred
Cancellation only occurs when thread reaches
cancellation point
e.g., pthread_testcancel()
Then cleanup handler is invoked
On Linux systems, thread cancellation is handled
through signals
Thread-Local Storage
Thread-local storage (TLS) allows each thread to
have its own copy of data
Useful when you do not have control over the
thread creation process (e.g., when using a thread
pool)
Different from local variables
Local variables visible only during single
function invocation
TLS visible across function invocations
Similar to static data
TLS is unique to each thread
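A minimal sketch using the __thread storage class (a gcc/clang
extension; C11 spells it _Thread_local): each thread increments
its own copy of counter, so both print 1.

#include <pthread.h>
#include <stdio.h>

/* one copy of this variable per thread */
static __thread int counter = 0;

void *worker(void *param)
{
    (void)param;
    counter++; /* touches only this thread's copy */
    /* cast is for display only; pthread_t is an opaque type */
    printf("thread %lu: counter = %d\n",
           (unsigned long)pthread_self(), counter);
    return NULL;
}

int main(void)
{
    pthread_t t1, t2;
    pthread_create(&t1, NULL, worker, NULL);
    pthread_create(&t2, NULL, worker, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return 0;
}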
Scheduler Activations
Both M:M and Two-level models require
communication to maintain the appropriate
number of kernel threads allocated to the
application
Typically use an intermediate data
structure between user and kernel threads
– lightweight process (LWP)
Appears to be a virtual processor on
which process can schedule user thread
to run
Each LWP attached to kernel thread
How many LWPs to create?
Scheduler activations provide upcalls - a
communication mechanism from the kernel
to the upcall handler in the thread library
This communication allows an application
to maintain the correct number of kernel
threads
Operating System Examples
Windows Threads
Linux Threads
Windows Threads
Windows implements the Windows API – primary
API for Win 98, Win NT, Win 2000, Win XP, and Win
7
Implements the one-to-one mapping, kernel-level
Each thread contains
A thread id
Register set representing state of processor
Separate user and kernel stacks for when
thread runs in user mode or kernel mode
Private data storage area used by run-time
libraries and dynamic link libraries (DLLs)
The register set, stacks, and private storage area
are known as the context of the thread
Windows Threads (Cont.)
The primary data structures of a thread include:
ETHREAD (executive thread block) – includes
pointer to process to which thread belongs
and to KTHREAD, in kernel space
KTHREAD (kernel thread block) – scheduling
and synchronization info, kernel-mode stack,
pointer to TEB, in kernel space
TEB (thread environment block) – thread id,
user-mode stack, thread-local storage, in user
space
Windows Threads Data Structures
Linux Threads
Linux refers to them as tasks rather than threads
Thread creation is done through clone() system call
clone() allows a child task to share the address
space of the parent task (process)
Flags control behavior (see the table below)
struct task_struct points to process data
structures (shared or unique)
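Some of the flags and what the child shares with the parent
when each is set:

    CLONE_FS       File-system information is shared
    CLONE_VM       The same memory space is shared
    CLONE_SIGHAND  Signal handlers are shared
    CLONE_FILES    The set of open files is shared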
End of Chapter 4
Key points
Threading/multithreading in operating system
Benefits of threading
Benefits of multicore processors
Concurrency vs. Parallelism
Amdahl’s Law
Multithreading Models
Lightweight process