Estimation of the minimum time of executing tasks at optimal distribution of load among
processors.
Author: Andrey Karpov
Annotation. The article briefly reviews methods of estimating the minimum time of executing tasks under optimal distribution of
load among processors. The methods can be used for both homogeneous and heterogeneous computer systems.
To the reader
This document is part of a series of articles devoted to creating high-quality and efficient software solutions for
modern 64-bit multi-core systems. Other articles are available at http://www.viva64.com.
Introduction
Despite the great computational power of modern computers, there are tasks whose solution in sequential mode takes a long
time. The time needed to solve such tasks can be greatly reduced by using the capabilities of modern multi-core processors.
To take full advantage of these processors, the algorithms must be reworked to allow for parallel data processing by
several processors simultaneously. It is also important to distribute the calculations so that each processor is used as
fully as possible and the total time of solving a task tends to a minimum.
The article reviews methods of estimating the minimum time of executing tasks under their optimal distribution
among computational nodes. It also covers situations in which several parallel tasks run on one system and take up
some of the resources of the computational nodes. In this case the system is considered heterogeneous (anisotropic) with
respect to the program we are interested in.
1. Independent calculations of equal difficulty on homogeneous computational nodes
Suppose we have N independent calculations of equal difficulty. We need to distribute them among P processors which
have equal computational power (figure 1).
The natural solution of this task is to assign $N/P$ calculations to each processor.
Figure 1. Distribution of independent calculations of equal difficulty on homogeneous computational nodes.
But this solution is correct only if N is divisible by P, that is, $N \bmod P = 0$. Otherwise from 1 to $P - 1$ calculations
remain unassigned. It would be a mistake to assign all the remaining calculations to one processor, because in this case the
time of completing all the calculations equals $\lfloor N/P \rfloor + (N \bmod P)$, where $\lfloor N/P \rfloor$ is the integer
part of the quotient. It is better to give one remaining calculation to each of $N \bmod P$ processors. In this case the time
of completing all the calculations equals $\lfloor N/P \rfloor + 1$. Since
$$\left\lfloor \frac{N}{P} \right\rfloor + 1 \le \left\lfloor \frac{N}{P} \right\rfloor + (N \bmod P),$$
the second method can be much better.
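The comparison above can be checked with a small sketch. The function names `distribute_evenly` and `finish_time` are illustrative and not part of the article; time is measured in units of one calculation.

```python
def distribute_evenly(n, p):
    """Return how many of n equal calculations each of p equal processors gets."""
    base, extra = divmod(n, p)
    # The first `extra` processors each take one of the remaining calculations.
    return [base + 1 if i < extra else base for i in range(p)]


def finish_time(counts):
    # With calculations of unit cost, the finish time is the largest group size.
    return max(counts)


counts = distribute_evenly(11, 3)
print(counts, finish_time(counts))  # [4, 4, 3] 4

# Assigning all remaining calculations to one processor instead:
base, extra = divmod(11, 3)
print(base + extra)  # 5
```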
2. Independent calculations of equal difficulty on heterogeneous computational nodes
Suppose we have N independent calculations of equal difficulty. We need to distribute them among P processors which
have different computational powers $p_i$, $i = 1, \dots, P$ (figure 2).
Figure 2. Distribution of independent calculations of equal difficulty on heterogeneous nodes.
The time the processor with computational power $p_i$ spends on executing one calculation equals $1/p_i$. Thus, we need to
split the calculations into P groups with $n_i$, $i = 1, \dots, P$, calculations in each so that the time of completing all
the calculations is minimal, i.e.:
$$\max_{i = 1, \dots, P} \frac{n_i}{p_i} \to \min, \qquad \sum_{i=1}^{P} n_i = N.$$
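One possible way to obtain such a split is a greedy procedure: each calculation goes to the processor that would finish it earliest. This is a minimal sketch under that assumption, not a method given in the article; the names `distribute_equal_tasks` and `finish_time` are hypothetical.

```python
import heapq


def distribute_equal_tasks(n, powers):
    """Greedily assign n equal calculations to processors with powers p_i."""
    counts = [0] * len(powers)
    # Heap entries: (finish time if processor i receives one more calculation, i).
    heap = [(1.0 / p, i) for i, p in enumerate(powers)]
    heapq.heapify(heap)
    for _ in range(n):
        _, i = heapq.heappop(heap)
        counts[i] += 1
        heapq.heappush(heap, ((counts[i] + 1) / powers[i], i))
    return counts


def finish_time(counts, powers):
    # Value of max(n_i / p_i) for the given split.
    return max(c / p for c, p in zip(counts, powers))


counts = distribute_equal_tasks(11, [2, 1])
print(counts, finish_time(counts, [2, 1]))  # [8, 3] 4.0
```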
3. Independent calculations of different difficulty on homogeneous computational nodes
Suppose we have N independent calculations of different difficulty $c_i$, $i = 1, \dots, N$. We need to distribute them among P
processors which have equal computational power (figure 3).
Figure 3. Distribution of independent calculations of different difficulty on homogeneous computational nodes.
To minimize the time of completing all the calculations, all P processors must be loaded as evenly as possible, that
is, each processor should be assigned calculations of approximately equal total difficulty.
Thus, the task reduces to the following: split the calculations into P groups with $n_i$ calculations in group $i$
($i = 1, \dots, P$), so that:
$$\max_{i = 1, \dots, P} \sum_{j=1}^{n_i} c_{ij} \to \min,$$
where $c_{ij}$ is the difficulty of the j-th calculation in the i-th group.
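Finding the exact minimum of this criterion is a hard combinatorial problem in general, so heuristics are commonly used in practice. The sketch below uses the well-known "longest processing time first" heuristic, which is an assumption of this example rather than the article's method: calculations are sorted by decreasing difficulty and each one goes to the currently least loaded processor. The name `lpt_schedule` is illustrative.

```python
import heapq


def lpt_schedule(difficulties, p):
    """Heuristically split calculations among p equal processors."""
    groups = [[] for _ in range(p)]
    heap = [(0, i) for i in range(p)]  # (current load, processor index)
    for c in sorted(difficulties, reverse=True):
        load, i = heapq.heappop(heap)  # least loaded processor
        groups[i].append(c)
        heapq.heappush(heap, (load + c, i))
    return groups


groups = lpt_schedule([7, 5, 4, 3, 3, 2], 3)
print(groups)                       # [[7, 2], [5, 3], [4, 3]]
print(max(sum(g) for g in groups))  # 9
```

The result is not guaranteed to be optimal for every input, although in this small example no split does better than 9.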
4. Independent calculations of different difficulty on heterogeneous computational nodes
Suppose we have N independent calculations of different difficulty $c_i$, $i = 1, \dots, N$. We need to distribute them among P
processors which have different computational powers $p_i$, $i = 1, \dots, P$ (figure 4).
Figure 4. Distribution of independent calculations of different difficulty on heterogeneous computational nodes.
The time the processor with computational power $p_i$ spends on executing one calculation with difficulty $c_j$ equals
$c_j / p_i$.
To minimize the time of completing all the calculations, all the processors must finish their calculations
at approximately the same time.
Thus, the task reduces to the following: split the calculations into P groups with $n_i$ calculations in group $i$
($i = 1, \dots, P$), so that:
$$\max_{i = 1, \dots, P} \left( \frac{1}{p_i} \sum_{j=1}^{n_i} c_{ij} \right) \to \min,$$
where $c_{ij}$ is the difficulty of the j-th calculation in the i-th group.
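As an illustration, the criterion can be approached with a greedy heuristic, assumed here for the example and not taken from the article: calculations are taken in order of decreasing difficulty, and each is given to the processor on which it would finish earliest. The names `greedy_schedule` and `finish_time` are hypothetical, and the result is not guaranteed to be optimal in general.

```python
def greedy_schedule(difficulties, powers):
    """Heuristically split calculations among processors with powers p_i."""
    loads = [0.0] * len(powers)
    groups = [[] for _ in powers]
    for c in sorted(difficulties, reverse=True):
        # Pick the processor that would complete this calculation earliest.
        i = min(range(len(powers)), key=lambda k: (loads[k] + c) / powers[k])
        groups[i].append(c)
        loads[i] += c
    return groups


def finish_time(groups, powers):
    # Value of max(sum_j c_ij / p_i) for the given split.
    return max(sum(g) / p for g, p in zip(groups, powers))


groups = greedy_schedule([6, 4, 3, 2, 1], [2, 1])
print(groups, finish_time(groups, [2, 1]))  # [[6, 3, 2], [4, 1]] 5.5
```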
5. Dependent calculations of equal difficulty on homogeneous computational nodes
Suppose we have N dependent calculations such that step k of the i-th calculation requires the result of step k of the
(i-1)-th calculation, i.e. $f_k(i) = g(f_k(i-1))$. Suppose also we have P processors which have equal computational power.
Such calculations can be performed in parallel if we split them into P groups, within each of which the calculations are
performed sequentially and in the same order as in the original sequence.
There are no illustrations for this and the following sections because it is difficult to make them clear.
The task of distributing calculations among processors in this case can be formulated as follows: an ordered set of
calculations C should be split into P non-overlapping ordered subsets $c_i$, preserving the sequence of elements, in such a way that:
$$\max_{i = 1, \dots, P-1} \bigl|\, |c_i| - |c_{i+1}| \,\bigr| \to \min,$$
where $|c_i|$ is the cardinality (number of elements) of the subset $c_i$.
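A minimal sketch of such a split: the ordered sequence is cut into P consecutive blocks whose sizes differ by at most one, so the criterion value is 0 when N is divisible by P and 1 otherwise. The names `split_in_order` and `max_neighbour_gap` are illustrative, and at least two processors are assumed.

```python
def split_in_order(items, p):
    """Cut an ordered list into p consecutive blocks of nearly equal size."""
    base, extra = divmod(len(items), p)
    groups, start = [], 0
    for i in range(p):
        size = base + 1 if i < extra else base
        groups.append(items[start:start + size])
        start += size
    return groups


def max_neighbour_gap(groups):
    # Largest difference in size between neighbouring subsets (needs p >= 2).
    sizes = [len(g) for g in groups]
    return max(abs(a - b) for a, b in zip(sizes, sizes[1:]))


groups = split_in_order(list(range(1, 12)), 3)
print(groups)                     # [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11]]
print(max_neighbour_gap(groups))  # 1
```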
6. Dependent calculations of equal difficulty on heterogeneous computational nodes
Suppose we have N dependent calculations such that step k of the i-th calculation requires the result of step k of the
(i-1)-th calculation, i.e. $f_k(i) = g(f_k(i-1))$. Suppose also we have P processors which have different computational powers
$p_i$, $i = 1, \dots, P$.
The task of distributing calculations among processors in this case can be formulated as follows:
An ordered set of calculations C should be split into P non-overlapping ordered subsets $c_i$, preserving the sequence of
elements, in such a way that:
$$\max_{i = 1, \dots, P-1} \left| \frac{|c_i|}{p_i} - \frac{|c_{i+1}|}{p_{i+1}} \right| \to \min,$$
where $|c_i|$ is the cardinality (number of elements) of the subset $c_i$. That is, the maximum difference in the time of
performing calculations in neighboring subsets must be minimal.
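For small N the criterion can simply be evaluated for every admissible split. The sketch below is an exhaustive search written for illustration only; it assumes that every subset is non-empty and that there are at least two processors, and the name `best_ordered_sizes` is hypothetical.

```python
from itertools import combinations


def best_ordered_sizes(n, powers):
    """Try every ordered split of n calculations and keep the best sizes."""
    p = len(powers)
    best_sizes, best_gap = None, float("inf")
    # Choose p-1 cut positions in 1..n-1, so that every subset is non-empty.
    for cuts in combinations(range(1, n), p - 1):
        bounds = (0,) + cuts + (n,)
        sizes = [b - a for a, b in zip(bounds, bounds[1:])]
        times = [s / pw for s, pw in zip(sizes, powers)]
        gap = max(abs(a - b) for a, b in zip(times, times[1:]))
        if gap < best_gap:
            best_sizes, best_gap = sizes, gap
    return best_sizes, best_gap


print(best_ordered_sizes(12, [1, 2, 3]))  # ([2, 4, 6], 0.0)
```

The number of candidate splits grows quickly with N and P, so this form is only suitable for checking small cases.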
7. Dependent calculations of different difficulty on homogeneous computational nodes
Suppose we have N dependent calculations such that step k of the i-th calculation requires the result of step k of the
(i-1)-th calculation, i.e. $f_k(i) = g(f_k(i-1))$. Each calculation has an associated difficulty $w_i$. Suppose also we have
P processors which have equal computational power.
The task of distributing calculations among processors in this case can be formulated as follows:
An ordered set of calculations C should be split into P non-overlapping ordered subsets $c_i$, preserving the sequence of
elements, in such a way that:
$$\max_{i = 1, \dots, P-1} \left| \sum_{j \in c_i} w_j - \sum_{j \in c_{i+1}} w_j \right| \to \min,$$
where $\sum_{j \in c_i} w_j$ is the total difficulty of the calculations that belong to the subset $c_i$.
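The following sketch evaluates this criterion for every way of cutting a short weighted sequence into P consecutive blocks, using prefix sums to obtain the block totals. It is an illustrative brute-force search, not an algorithm from the article; non-empty subsets and at least two processors are assumed, and the name `best_ordered_split` is hypothetical.

```python
from itertools import accumulate, combinations


def best_ordered_split(weights, p):
    """Find the ordered split into p blocks with the smallest neighbour gap."""
    n = len(weights)
    prefix = [0] + list(accumulate(weights))  # prefix[k] = w_1 + ... + w_k
    best_cuts, best_gap = None, float("inf")
    for cuts in combinations(range(1, n), p - 1):
        bounds = (0,) + cuts + (n,)
        sums = [prefix[b] - prefix[a] for a, b in zip(bounds, bounds[1:])]
        gap = max(abs(a - b) for a, b in zip(sums, sums[1:]))
        if gap < best_gap:
            best_cuts, best_gap = cuts, gap
    bounds = (0,) + best_cuts + (n,)
    groups = [weights[a:b] for a, b in zip(bounds, bounds[1:])]
    return groups, best_gap


print(best_ordered_split([3, 1, 2, 2, 4, 1, 3, 2], 3))
# ([[3, 1, 2], [2, 4], [1, 3, 2]], 0)
```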
8. Dependent calculations of different difficulty on heterogeneous nodes
Suppose we have N dependent calculations such that step k of the i-th calculation requires the result of step k of the
(i-1)-th calculation, i.e. $f_k(i) = g(f_k(i-1))$. Each calculation has an associated difficulty $w_i$.
Suppose also we have P processors which have different computational powers $p_i$, $i = 1, \dots, P$.
The task of distributing calculations among processors in this case is formulated as follows: an ordered set of calculations
C should be split into P non-overlapping ordered subsets $c_i$, preserving the sequence of elements, in such a way that:
$$\max_{i = 1, \dots, P-1} \left| \frac{1}{p_i} \sum_{j \in c_i} w_j - \frac{1}{p_{i+1}} \sum_{j \in c_{i+1}} w_j \right| \to \min,$$
where $\sum_{j \in c_i} w_j$ is the total difficulty of the calculations that belong to the subset $c_i$.
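To compare candidate splits against this criterion, it is enough to compute the execution time of each subset and take the largest difference between neighbours. The helper below is a small sketch with a hypothetical name, `neighbour_time_gap`, and made-up example data; it assumes at least two processors.

```python
def neighbour_time_gap(groups, powers):
    """Largest difference in execution time between neighbouring subsets."""
    times = [sum(g) / p for g, p in zip(groups, powers)]
    return max(abs(a - b) for a, b in zip(times, times[1:]))


powers = [2, 3, 1]
# Two ordered splits of the same sequence of difficulties 4, 2, 6, 3, 2, 1.
balanced = [[4, 2], [6, 3], [2, 1]]
skewed = [[4], [2, 6, 3], [2, 1]]

print(neighbour_time_gap(balanced, powers))  # 0.0
print(neighbour_time_gap(skewed, powers) > neighbour_time_gap(balanced, powers))  # True
```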
Additional sources:
1. M.V. Yakobovskiy, S.A. Sukov. Dynamic load balancing // Materials of the conference "High-performance calculations
and their applications", Chernogolovka, 2000, pp. 34-39.
2. V.P. Ivannikov, N.S. Kovalevskiy, V.M. Metelskiy. On the minimum time of implementing competitive processes in
synchronous operations // Programming. 2000, № 5, pp. 44-52.
3. E. Tanenbaum. Distributed systems. Principles and paradigms. St. Petersburg: Piter, 2003. 877 pp.
4. A.A. Bukatov, V.N. Datsuk, A.I. Zhegulo. Programming of multi-processor computer systems. Rostov-on-Don: Publishing
House OOO "VCRU", 2003. 208 pp.
5. S.A. Nemnyugin, O.L. Stesik. Parallel programming for multi-processor computer systems. St. Petersburg:
BHV-Peterburg, 2002. 400 pp.
About the Author
Andrey Nikolaevich Karpov, http://www.viva64.com
Develops program solutions in the sphere of resource-intensive applications' quality and performance increase. One of the
developers of Viva64 static analyzer for verifying 64-bit software. Participates in developing VivaCore open library for working
with C/C++ code.
