CyberLab Training Division:
Intel VTune Amplifier is a commercial application for software performance analysis on 32-bit and 64-bit x86-based machines, and has both GUI and command-line interfaces. It is available for both Linux and Microsoft Windows operating systems. Although basic features work on both Intel and AMD hardware, advanced hardware-based sampling requires an Intel-manufactured CPU.
Whether you are tuning for the first time or doing advanced performance optimization, Intel® VTune Amplifier provides a rich set of insights into CPU and GPU performance, threading performance and scalability, bandwidth, caching and much more. Analysis is faster and easier because VTune Amplifier understands common threading models and presents information at a higher level that is easier to interpret. Use its analysis features to sort, filter and visualize results on the timeline and on your source.
It is available as part of Intel Parallel Studio or as a stand-alone product.
VTune Amplifier assists in various kinds of code profiling, including stack sampling, thread profiling and hardware event sampling. The profiler result includes details such as the time spent in each subroutine, which can be drilled down to the instruction level. The time taken by individual instructions can indicate stalls in the pipeline during instruction execution. The tool can also be used to analyze thread performance. The new GUI can filter data based on a selection in the timeline.
For more details, visit: http://www.cyberlabzone.com
09 intel v_tune_session_13
1. Slide 1 of 17
Code Optimization and Performance Tuning Using Intel VTune
In this session, you will learn to:
Identify the benefits of multithreading
Design applications using threads
Objectives
2. Slide 2 of 17
Multithreading can increase the performance of your application
by:
Performing multiple tasks in parallel
Making better use of system resources
Increasing the speed of your application
Identifying the Benefits of Multithreading
3. Slide 3 of 17
A thread is a sequential flow of control within a program.
It is a sequence of instructions executed within a program.
Every program consists of at least one thread, called the
main thread.
The main thread is responsible for initializing the program.
Identifying the Benefits of Multithreading (Contd.)
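The idea of a main thread plus additional threads can be sketched in a few lines. This is a minimal illustration in Python, not code from the session; the function name worker and the thread name worker-1 are hypothetical, chosen only for this example.

```python
import threading

def worker(results):
    # Runs in a second thread and records which thread executed this code.
    results.append(threading.current_thread().name)

results = []

# The program starts in the main thread, which performs this initialization.
main_name = threading.main_thread().name

t = threading.Thread(target=worker, args=(results,), name="worker-1")
t.start()  # begin a second sequential flow of control
t.join()   # the main thread waits for it to finish

print(main_name, results)
```

The main thread creates and starts the second thread, then waits for it with join() before using its result.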
4. Slide 4 of 17
What is a thread?
Just a minute
Answer:
A thread is a sequential flow of control within a program.
5. Slide 5 of 17
The main benefits of threads are:
Increased performance
Better resource utilization
Simpler communication
Identifying the Benefits of Multithreading (Contd.)
6. Slide 6 of 17
The two inherent conditions of a multithreaded application
are:
Concurrency: refers to a situation in which two or more threads are in progress at the same time. On a single processor, the processor switches from one thread to another during the execution of the application.
Parallelism: refers to the simultaneous execution of multiple tasks. In a multithreaded application running on a multiprocessor system, threads execute in parallel.
Identifying the Benefits of Multithreading (Contd.)
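The difference between the two conditions can be made concrete with a small sketch. This is an illustrative Python example, not part of the session material; the names task, started and finished are hypothetical. A barrier is used so that neither thread can continue until both are in progress, which demonstrates concurrency. Whether the threads also run in parallel depends on the number of processors and on the runtime (in standard CPython builds, the global interpreter lock allows only one thread at a time to execute Python bytecode).

```python
import os
import threading

barrier = threading.Barrier(2)  # releases only when two threads are waiting
started = []
finished = []

def task(name):
    started.append(name)
    # Neither thread can pass this point until the other has also started,
    # so passing it shows that both were in progress at the same time.
    barrier.wait(timeout=5)
    finished.append(name)

threads = [threading.Thread(target=task, args=(n,)) for n in ("A", "B")]
for t in threads:
    t.start()
for t in threads:
    t.join()

# More than one processor is a precondition for true parallel execution.
processors = os.cpu_count()
print(sorted(finished), processors)
```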
7. Slide 7 of 17
The best time for threading an application is during its
design phase.
During the design phase, you can accommodate all the data
and code restructuring related to threading.
This reduces the overall effort during application
development.
A program consisting of multiple independent activities can
be redesigned so that each activity is defined as a
separate thread.
This enables you to decompose your work into simple
independent activities and improve the functionality and
performance of your application.
Designing Applications Using Threads
8. Slide 8 of 17
You can improve the functionality of your application by
assigning different threads to different functions of the
application.
By assigning a separate thread to each function, all the
functions can execute independently of each other.
This approach can make the application more efficient because
running each function in its own thread is easier than
switching between functions within serial code.
Designing Applications Using Threads (Contd.)
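Assigning one thread per function might look like the following sketch. It is written in Python purely for illustration; the functions count_words and count_vowels and the sample text are hypothetical and stand in for any two functions that do not depend on each other.

```python
import threading

def count_words(text, out):
    # First independent function: number of whitespace-separated words.
    out["words"] = len(text.split())

def count_vowels(text, out):
    # Second independent function: number of vowels in the text.
    out["vowels"] = sum(ch in "aeiou" for ch in text.lower())

text = "Threads let independent functions run on their own"
out = {}

# One thread per function; each writes to its own key in the shared dict.
threads = [
    threading.Thread(target=count_words, args=(text, out)),
    threading.Thread(target=count_vowels, args=(text, out)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(out)
```

Because the two functions write to different keys and never read each other's results, they need no coordination beyond the final join().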
9. Slide 9 of 17
The performance of an application depends on a
combination of factors, such as speed and utilization
of system resources.
A multithreaded application running on a single processor
system can lead to better utilization of system resources.
A multithreaded application running on a multiprocessor
system can lead to better utilization of system resources and
increased speed of the application.
Designing Applications Using Threads (Contd.)
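Better resource utilization can be observed even without multiple processors when threads spend time waiting. The sketch below is an assumption-laden Python illustration, not a measurement from the session: time.sleep stands in for a blocking operation such as a disk or network wait, and the helper names wait_for_io, run_serial and run_threaded are hypothetical. CPU-bound work may behave differently, particularly in standard CPython, where the global interpreter lock limits the speed-up threads can give to pure Python computation.

```python
import threading
import time

def wait_for_io(delay):
    # Stand-in for a blocking operation such as a disk or network wait.
    time.sleep(delay)

def run_serial(n, delay):
    # Perform the waits one after another in a single thread.
    start = time.perf_counter()
    for _ in range(n):
        wait_for_io(delay)
    return time.perf_counter() - start

def run_threaded(n, delay):
    # Perform the same waits in n threads so that they overlap.
    start = time.perf_counter()
    threads = [threading.Thread(target=wait_for_io, args=(delay,))
               for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start

serial = run_serial(4, 0.1)
threaded = run_threaded(4, 0.1)
print(f"serial {serial:.2f}s, threaded {threaded:.2f}s")
```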
10. Slide 10 of 17
How does the use of threads improve the functionality of an
application?
Just a minute
Answer:
You can improve the functionality of an application by
assigning different threads to different functions. This makes it
easier to control the execution of multiple functions within an
application.
11. Slide 11 of 17
Decomposition refers to the process of:
Breaking down a program into logical chunks or individual
tasks.
Identifying the dependencies between the tasks.
The two types of decomposition methods are:
Task decomposition: refers to the process of decomposing a program on the basis of the functions it performs. It is also known as functional decomposition. In this case, you can assign separate threads to the independent functions in your program.
Data decomposition: refers to the process of decomposing an application so that the same operation is performed repeatedly on different data.
Designing Applications Using Threads (Contd.)
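Data decomposition can be sketched as splitting the input into chunks and applying the same operation to each chunk in a separate thread. This is a minimal Python illustration under assumed names (sum_of_squares, chunk_size); the session itself does not prescribe an implementation.

```python
from concurrent.futures import ThreadPoolExecutor

def sum_of_squares(chunk):
    # The same operation, applied to one slice of the data.
    return sum(x * x for x in chunk)

data = list(range(1, 101))
chunk_size = 25

# Break the data into independent chunks with no dependencies between them.
chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]

# Each chunk is handled by a worker thread from the pool.
with ThreadPoolExecutor(max_workers=4) as pool:
    partials = list(pool.map(sum_of_squares, chunks))

# Combine the partial results into the final answer.
total = sum(partials)
print(total)
```

The only dependency between the tasks is the final combining step, which runs after all the chunks have been processed.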
12. Slide 12 of 17
In a multithreaded application, you need to take care of
various complexities, which may arise during thread
interaction. These complexities are:
Race conditions: occur when the output of the program depends upon which thread reaches a particular block of code first. Race conditions can lead to different results from one run of the program to the next.
Critical region: refers to those portions of your application that access shared variables.
Mutual exclusion: allows only one thread to be executing in a critical region at a given time. When a thread is executing code that accesses a shared resource in a critical region, any other thread that wants to enter the critical region must wait.
Synchronization: controls the relative order of thread execution and resolves any conflicts among threads. Synchronization is based on the concept of monitoring.
Deadlocks: refer to a situation in which a thread waits for a condition that can never occur. A deadlock halts the execution of your application, preventing it from continuing.
Identifying the Complexities Involved in Multithreaded Applications
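A critical region protected by mutual exclusion can be sketched as follows. This Python example is illustrative only; the class name SafeCounter is hypothetical. The statement that updates the shared value is a read-modify-write, so without the lock, updates from different threads could be lost and the final count could vary between runs, which is the race condition described above.

```python
import threading

class SafeCounter:
    """A shared counter whose update is guarded by a lock."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Critical region: read-modify-write of a shared variable.
        # The lock provides mutual exclusion, one thread at a time.
        with self._lock:
            self.value += 1

def add_many(counter, n):
    for _ in range(n):
        counter.increment()

counter = SafeCounter()
threads = [threading.Thread(target=add_many, args=(counter, 10_000))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter.value)
```

Any thread that reaches increment() while another thread holds the lock waits until the lock is released.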
13. Slide 13 of 17
List the complexities involved in a multithreaded application.
Just a minute
Answer:
The complexities involved in a multithreaded application are as
follows:
Race conditions
Critical region
Mutual exclusion
Synchronization
Deadlocks
14. Slide 14 of 17
Problem Statement:
John has written C# code in which he acquires locks
on two resources. However, on execution, the application
comes to a halt after some time. He wants to analyze the
processor utilization on his system using the counter monitor
feature of VTune. Help John to accomplish his task.
Activity: Analyzing the Processor Activity During Deadlock
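The kind of deadlock described in the problem statement can be reproduced in a small sketch. This is not John's C# code and it does not use VTune; it is a hypothetical Python illustration in which two threads acquire two locks in opposite order. A timeout is used on the second acquire so that the sketch reports the deadlock instead of hanging, and all names (lock_a, lock_b, worker, ordered_worker) are assumptions made for this example.

```python
import threading

lock_a = threading.Lock()
lock_b = threading.Lock()
sync = threading.Barrier(2)
outcome = {}

def worker(name, first, second):
    with first:
        # Make sure each thread holds its first lock before continuing.
        sync.wait(timeout=5)
        # A plain second.acquire() would block forever at this point.
        got_second = second.acquire(timeout=0.2)
        outcome[name] = got_second
        if got_second:
            second.release()
        # Keep holding the first lock until both attempts are over.
        sync.wait(timeout=5)

# Opposite lock order in the two threads: each waits for the other's lock.
t1 = threading.Thread(target=worker, args=("t1", lock_a, lock_b))
t2 = threading.Thread(target=worker, args=("t2", lock_b, lock_a))
for t in (t1, t2):
    t.start()
for t in (t1, t2):
    t.join()

ordered = {}

def ordered_worker(name):
    # Same lock order in every thread, so no circular wait can form.
    with lock_a:
        with lock_b:
            ordered[name] = True

t3 = threading.Thread(target=ordered_worker, args=("t3",))
t4 = threading.Thread(target=ordered_worker, args=("t4",))
for t in (t3, t4):
    t.start()
for t in (t3, t4):
    t.join()

print(outcome, ordered)
```

While the two threads are stuck waiting for each other they do no useful work, which is why low processor utilization is a typical symptom to look for when an application halts in this way.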
15. Slide 15 of 17
Solution
To analyze the performance of the application, you need to
perform the following tasks:
1. Configure counter monitor using the counter monitor configuration
wizard.
2. Analyze the processor utilization on the system.
Activity: Analyzing the Processor Activity During Deadlock (Contd.)
16. Slide 16 of 17
In this session, you learned that:
You can improve the speed of your application to a great extent
by using multiple threads in your application.
A thread is a sequential flow of control within a program.
The main benefits of threads are as follows:
Increased performance
Better resource utilization
Simpler communication
In a multithreaded application, threads run concurrently or in
parallel.
You can improve the functionality of your application by
assigning different threads to different functions of the
application. These functions may or may not be dependent on
each other.
Summary
17. Slide 17 of 17
Threads enable you to improve the performance of your
application.
The process of breaking down a program into logical chunks or
individual tasks and identifying the dependencies between
them is referred to as decomposition.
In a multithreaded application, you need to take care of various
complexities that may arise during thread interaction. These
complexities are as follows:
Race conditions
Critical region
Mutual exclusion
Synchronization
Deadlocks
Summary (Contd.)