# An eternal question of timing

It seemed that the long forum debates about how to measure an algorithm's running time, which functions to use, and what precision to expect were over. Unfortunately, we have to return to this question once again. Today we will discuss how to measure the speed of a parallel algorithm.


Author: Andrey Karpov

Date: 30.03.2011

I want to say right away that I will not give you a concrete recipe. I have only recently faced the problem of measuring the speed of parallel algorithms myself, so I am not an expert on the subject. This post is therefore more of a research note. I would appreciate it if you shared your opinions and recommendations with me. I think that together we can work through the problem and arrive at an optimal solution.

The task is to measure the running time of a fragment of user code. Until now I used the following class for this task:

```cpp
#include <windows.h>  // GetThreadTimes, GetCurrentThread, FILETIME

class Timing {
public:
  void StartTiming();
  void StopTiming();
  // FILETIME values count 100-nanosecond intervals,
  // so dividing by 10^7 converts them to seconds.
  double GetUserSeconds() const {
    return double(m_userTime) / 10000000.0;
  }
private:
  __int64 GetUserTime() const;
  __int64 m_userTime;
};

__int64 Timing::GetUserTime() const {
  FILETIME creationTime;
  FILETIME exitTime;
  FILETIME kernelTime;
  FILETIME userTime;
  GetThreadTimes(GetCurrentThread(),
                 &creationTime, &exitTime,
                 &kernelTime, &userTime);
  // Combine the two 32-bit halves into one 64-bit value.
  __int64 curTime;
  curTime = userTime.dwHighDateTime;
  curTime <<= 32;
  curTime += userTime.dwLowDateTime;
  return curTime;
}

void Timing::StartTiming() {
  m_userTime = GetUserTime();
}

void Timing::StopTiming() {
  __int64 curUserTime = GetUserTime();
  m_userTime = curUserTime - m_userTime;
}
```

This class is based on the GetThreadTimes function, which lets you separate the time spent running user code from the time spent in system functions. The class is intended to estimate the running time of a thread in user mode, so we use only the returned lpUserTime parameter.

Now consider a code sample that calculates some number. We will use the Timing class to measure its running time.

```cpp
#include <cstdio>   // printf
#include <cstring>  // strchr

void UseTiming1()
{
  Timing t;
  t.StartTiming();
  unsigned sum = 0;

  for (int i = 0; i < 1000000; i++)
  {
    char str[1000];
    for (int j = 0; j < 999; j++)
      str[j] = char(((i + j) % 254) + 1);
    str[999] = 0;
    for (char c = 'a'; c <= 'z'; c++)
      if (strchr(str, c) != NULL)
        sum += 1;
  }

  t.StopTiming();
  printf("sum = %u\n", sum);
  printf("%.3G seconds.\n", t.GetUserSeconds());
}
```

In this form, the timing mechanism behaves as expected and reports, say, 7 seconds on my machine. The result is correct even on a multi-core machine, since it does not matter which cores are used while the algorithm is running (see Figure 1).

Figure 1 - Work of one thread on a multi-core computer

Now imagine that we want to use the capabilities of multi-core processors in our program and estimate the benefit of parallelizing the algorithm with OpenMP. Let's parallelize our code by adding one line:

```cpp
#pragma omp parallel for reduction(+:sum)
for (int i = 0; i < 1000000; i++)
{
  char str[1000];
```