
Dot Net 4.0 Parallelism

The document discusses parallelism features in .NET 4.0. It introduces parallel extensions which include data parallelism, Parallel LINQ (PLINQ), and task parallelism. Data parallelism allows dividing data across processors using Parallel.For and Parallel.ForEach. PLINQ enables parallel LINQ queries by using AsParallel() on data sources. Task parallelism breaks programs into tasks that can run concurrently using the Task Parallel Library.


.Net 4.0 Parallelism
25/02/2012

INS 2

SUDHA SHANMUGAM
sudha.shanmugam@tcs.com
Contents
1. Introduction
2. Parallel Extensions in .NET 4.0
3. Data Parallelism
4. Parallel LINQ (PLINQ)
5. Task Parallelism
6. Conclusion
1. Introduction
Most processors today provide multiple cores or multiple hardware threads, and taking advantage of them requires a change in how we tackle problems in code. Prior to .NET Framework 4.0, the best way to use the extra cores was to create threads or utilize the ThreadPool. For many developers, working with threads or the ThreadPool directly can be a little daunting. The .NET 4.0 Framework drastically simplifies the process of utilizing the extra processing power through the Task Parallel Library (TPL).

The parallel programming support in .NET 4.0 is broken down into three different types of parallelism:

• Data Parallelism,
• Parallel LINQ (PLINQ) and
• Task Parallelism.

These three different types of parallelism are designed to make it easier to tackle the common areas within your applications where you can take better advantage of the processor. Before we can take advantage of the TPL we first need to add the base using directive to the class as shown below.

using System.Threading.Tasks;

Traditionally, computer software has been written for serial computation. To solve a
problem, an algorithm is constructed and implemented as a serial stream of instructions.
These instructions are executed on a central processing unit on one computer. Only one
instruction may execute at a time—after that instruction is finished, the next is executed.
Parallel computing, on the other hand, uses multiple processing elements simultaneously
to solve a problem. This is accomplished by breaking the problem into independent parts
so that each processing element can execute its part of the algorithm simultaneously with
the others.

The processing elements can be diverse and include resources such as a single computer
with multiple processors, several networked computers, specialized hardware, or any
combination of the above.

2. Parallel Extensions in .NET 4.0


Parallel Extensions in .NET 4.0 provide a set of libraries and tools to achieve parallelism in computation. Two paradigms of parallel computing supported by .NET 4.0 are:
• Data Parallelism – This refers to dividing the data across multiple processors for parallel execution. E.g. if we are processing an array of 1000 elements, we can distribute the data between two processors, say 500 elements each. This is supported by the Parallel.For/Parallel.ForEach methods and by Parallel LINQ (PLINQ) in .NET 4.0.
• Task Parallelism – This breaks down the program into multiple tasks which can be parallelized and executed on different processors. This is supported by the Task Parallel Library (TPL) in .NET 4.0.
The figure below shows the framework:

The key components are given below:


• Coordinated Data Structures (CDS) – These are a set of APIs added to the BCL which provide thread-safe collection classes, lightweight synchronization classes and lazy initializers. The concurrent collections live in the new System.Collections.Concurrent namespace and the others are added to the System.Threading namespace. These are cross-cutting components used by the other three components shown in the figure above.
• Task Parallel Library (TPL) – These are a set of APIs in the System.Threading and System.Threading.Tasks namespaces which provide facilities for task parallelism. The Parallel class (in System.Threading.Tasks) provides an Invoke method which can accept an array of delegates, enabling parallel execution of multiple methods. The methods Parallel.For and Parallel.ForEach enable parallel execution of loops, thus supporting data parallelism. These APIs are also used by PLINQ.
• Parallel LINQ (PLINQ) – PLINQ provides a parallel implementation of LINQ and supports all the standard query operators. This is achieved by the System.Linq.ParallelEnumerable class, which provides a set of extension methods on the IEnumerable<T> interface. Developers opt in to parallelism by invoking the AsParallel extension method on the data source. Using PLINQ we can combine sequential and parallel queries, and it also supports ordered/unordered execution.
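As a sketch of the Parallel.Invoke usage described above, the following example passes several anonymous delegates that run potentially in parallel; the console messages are placeholders, not part of any real API:

```csharp
using System;
using System.Threading.Tasks;

class InvokeDemo
{
    public static void Main()
    {
        // Parallel.Invoke accepts an array of Action delegates and runs
        // them, potentially in parallel, returning when all have finished.
        Parallel.Invoke(
            delegate() { Console.WriteLine("Loading customers"); },
            delegate() { Console.WriteLine("Loading orders"); },
            delegate() { Console.WriteLine("Loading products"); }
        );
        Console.WriteLine("All work complete");
    }
}
```

Note that the three messages may appear in any order, since the delegates can execute concurrently; only the final line is guaranteed to print last.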
The following figure gives the overview of parallel programming architecture in .NET:
3. Data Parallelism
One of the most common areas where you can take advantage of parallel processing is data processing. Often within code you need to perform a for loop or a foreach loop on a set of data. These loops execute serially and can be a bottleneck to performance. As the size of the data set increases, the impact on performance can become quite noticeable.
Data Parallelism essentially allows you to modify your for and foreach loops to execute in parallel. The following code snippet shows a simple traditional for loop.

for (int i = 0; i < 10; i++)
{
    //Perform Work
}

This code can be easily modified to use TPL, perform the same function and take
advantage of the processor as shown below:

Parallel.For(0, 10, delegate(int i)
{
    //Perform Work
});
As you can see, this code snippet does not use the for keyword; instead it uses the Parallel.For method. Parallel.For accepts three parameters: the inclusive start index, the exclusive end index, and the Action to execute. The Action in this case is an inline delegate. It is also entirely possible to have Parallel.For call a named method instead of an inline delegate, as shown in the following snippet.

Parallel.For(0, 10, ForWork);

static void ForWork(int i)
{
    //Perform Work
}

This performs the same function as the previous snippet; however, it allows you to
externalize work performed by the delegate to a separate method. As you can see
converting For loops to better take advantage of the processor is a relatively simple
process.

The foreach loop is really a specialized version of the for loop designed to iterate through a list. As such, the conversion of a foreach loop is similar, as shown in the following snippet:

//Create a list of Employee
List<Employee> Employees = new List<Employee>();

//Serial foreach loop
foreach (Employee e in Employees)
{
    //Perform Work on Employee
}

//Parallel ForEach loop
Parallel.ForEach(Employees, delegate(Employee e)
{
    //Perform Work on Employee
});

Parallel.ForEach works as you would expect, accepting the list as its source parameter and the Action to execute on each element. It is important to note that while it is easy in theory to convert to the parallel loops, we do need to give some consideration to what the Action being executed actually does. For instance, you need to watch out for operations which are not thread safe, such as inserting into a list. Operations which are not thread safe need special attention. To prevent collisions with other threads, you should use the lock keyword as shown below:

lock (Employees)
{
    Employees.Add(e);
}

When locking an object such as a list, you should include only those statements which are essential. The lock statement essentially serializes access to the Employees list in the example above; thus, if you serialize a significant portion of the code within the loop, you cancel out the advantages of parallelism.
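Where frequent additions are needed, an alternative to explicit locking is one of the thread-safe collections from the Coordinated Data Structures described in section 2. A minimal sketch using ConcurrentBag&lt;T&gt;, where the squaring work stands in for the real per-element work:

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

class ConcurrentAddDemo
{
    // Squares the numbers 0..count-1 in parallel, collecting results in
    // a ConcurrentBag<int>, which is safe to Add to from many threads.
    public static int SquareAll(int count)
    {
        var results = new ConcurrentBag<int>();

        Parallel.For(0, count, delegate(int i)
        {
            results.Add(i * i); // no lock needed
        });

        return results.Count;
    }

    static void Main()
    {
        Console.WriteLine(SquareAll(1000)); // prints 1000
    }
}
```

Because ConcurrentBag handles its own synchronization internally, the loop body stays lock-free and no portion of the parallel work is serialized by the caller.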

4. Parallel LINQ (PLINQ)


LINQ at its lowest level is essentially a series of operations performed within a foreach loop. It makes sense, then, that LINQ can be altered to take advantage of the Data Parallelism described above. Before talking about enabling PLINQ, we will start with the following simple query:

var query = from e in Employees
            where e.Name.StartsWith("Adal")
            select e;

In this example we have a very simple LINQ query to scan through the list of Employees
and return all items where Name starts with "Adal". To turn on the PLINQ operation we
make a small change to this query as shown below:

var query = from e in Employees.AsParallel()
            where e.Name.StartsWith("Adal")
            select e;
As you can see in this snippet, adding AsParallel() onto the end of the IEnumerable source enables the parallel operation. There are other extension methods we can apply to the source: AsOrdered() to preserve ordering, Aggregate() to perform aggregations, and others. Keep in mind that PLINQ is not a performance solution for all LINQ operations. LINQ to SQL and LINQ to Entities queries are executed on the respective database, so PLINQ isn't going to be able to affect those queries.
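The AsOrdered() operator mentioned above can be sketched as follows; the squaring projection is just an illustrative placeholder:

```csharp
using System;
using System.Linq;

class OrderedPlinqDemo
{
    // AsOrdered() tells PLINQ to preserve the ordering of the source
    // sequence in the results, at some cost to parallel performance.
    public static int[] SquaresInOrder(int[] numbers)
    {
        return numbers.AsParallel()
                      .AsOrdered()
                      .Select(n => n * n)
                      .ToArray();
    }

    static void Main()
    {
        int[] squares = SquaresInOrder(new[] { 5, 1, 4, 2, 3 });
        Console.WriteLine(string.Join(", ", squares)); // 25, 1, 16, 4, 9
    }
}
```

Without AsOrdered(), the results of a parallel query may come back in any order, since partitions complete independently.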

5. Task Parallelism
Task Parallelism allows you to create a task and launch that task asynchronously. In its
simplest form we can create a single task and execute the operation as shown below:

Task.Factory.StartNew(delegate()
{
//Perform Work
});

This example creates a single task and executes it; however, since it is only a single task it is
not terribly useful. It would be far more useful to provide a list of tasks you need to
accomplish in parallel and wait for the execution to complete as is the case in the following
snippet:

List<Task> Tasks = new List<Task>()
{
    Task.Factory.StartNew(delegate()
    {
        //Perform Work
    }),
    Task.Factory.StartNew(delegate()
    {
        //Perform Work
    })
};
Here we create a list of tasks; Task.Factory.StartNew both creates and starts each task as it is added to the list. To make this useful, we then use the Task.WaitAll method to pause the current thread until all of the provided tasks complete.
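Putting the pieces together, the complete pattern including the Task.WaitAll call might look like this; the console messages are placeholders for real work:

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

class WaitAllDemo
{
    public static void Main()
    {
        List<Task> Tasks = new List<Task>()
        {
            Task.Factory.StartNew(delegate()
            {
                Console.WriteLine("Task 1 working");
            }),
            Task.Factory.StartNew(delegate()
            {
                Console.WriteLine("Task 2 working");
            })
        };

        // Block the current thread until every task in the list completes.
        Task.WaitAll(Tasks.ToArray());
        Console.WriteLine("All tasks finished");
    }
}
```

Task.WaitAll takes a Task[] rather than a List&lt;Task&gt;, hence the ToArray() call; the two "working" messages may print in either order, but "All tasks finished" always prints last.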

6. Conclusion

Advantages of Parallel Computing:

• Saves time & money: Using more resources for a particular task results in cost savings, and parallel clusters can be built from cheap commodity components.
• Solve very large problems: Some problems are so large and complicated that it is practically impossible to run them on a single computer, given its limited resources. Parallel computing enables us to do large calculations easily and in less time.
• Provide concurrency: A single compute resource can only do one thing at a time.
Multiple computing resources can be doing many things simultaneously.
• Use of non-local resources: Using computing resources on a wide area network, or
even the Internet when local compute resources are scarce.
• Limits to serial computing: Both physical and practical reasons pose significant
constraints to simply building ever faster serial computers:
- Transmission speeds - the speed of a serial computer is directly dependent
upon how fast data can move through hardware. Absolute limits are the
speed of light (30 cm/nanosecond) and the transmission limit of copper wire
(9 cm/nanosecond). Increasing speeds necessitate increasing proximity of
processing elements.
- Limits to miniaturization - processor technology is allowing an increasing
number of transistors to be placed on a chip. However, even with molecular or
atomic-level components, a limit will be reached on how small components
can be.
- Economic limitations - it is increasingly expensive to make a single processor
faster. Using a larger number of moderately fast commodity processors to
achieve the same (or better) performance is less expensive.

Parallel computer programs are more difficult to write than sequential ones because
concurrency introduces several new classes of potential software bugs, of which race
conditions are the most common. Communication and synchronization between the
different subtasks are typically among the greatest obstacles to getting good parallel
program performance.
