To start with, I would like everyone to recall parallel programming concepts. Maybe some of you have worked with the .NET Framework Parallel Extensions, or done some hardcore multithreading. Or maybe for some of us it was only something practiced in university assignments. But let’s take a look. First of all, by the term parallel programming here, I mean both multi-threaded and multi-process applications.
So if we look at a very simple PC, no matter whether it is single-core or multi-core, with today’s tools and frameworks we can easily develop applications that run in parallel: multithreaded, multi-core-aware, and so on. Microsoft, with the Parallel Extensions, or even with the plain threading functionality/namespaces in the framework itself, gives us enough tools to achieve whatever we need on a single station.
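The stack in this talk is .NET, but the single-machine idea can be sketched in a few language-neutral lines. Here is a minimal, purely illustrative Python sketch of the same pattern the Parallel Extensions give you: fan a loop body out over a pool of worker threads and collect the results.

```python
# Single-machine data parallelism: the same idea as Parallel.For /
# PLINQ in .NET, sketched with Python's standard thread pool.
from concurrent.futures import ThreadPoolExecutor

def square(n):
    return n * n

with ThreadPoolExecutor(max_workers=4) as pool:
    # map() distributes the calls over the pool and keeps input order
    results = list(pool.map(square, range(8)))

print(results)  # [0, 1, 4, 9, 16, 25, 36, 49]
```

Nothing here leaves the machine; the point of the rest of the talk is what happens when it has to.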
But if we start talking about more distributed applications, which run on multiple nodes in parallel, perhaps exchanging messages across their memory boundaries, over the network, and so on, then we have a much more complicated structure than the parallel framework or multithreading supports. For sure, throughout the history of operating systems and frameworks, countless mechanisms were developed to support message passing interfaces and interprocess communication. But the problem with a cluster is not only processes talking to each other; perhaps a more important problem is the utilization of the cluster. Therefore many questions arise, like:
- What is the current cluster load?
- What is the load per node? Which node is idle, which node is overloaded?
- Do processes (tasks/jobs) have enough resources to run?
- And not only runtime questions: how easy is the management of the cluster? Adding/removing nodes, upgrading them, etc.?
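To make "messages across memory boundaries" concrete, here is a tiny illustrative sketch using two OS processes on one machine. It is the same idea that MPI generalizes to processes on different nodes separated by a network:

```python
# Message passing across a process memory boundary: the two sides share
# nothing; all communication goes through explicit send/recv calls.
from multiprocessing import Process, Pipe

def worker(conn):
    task = conn.recv()       # receive a message from the parent process
    conn.send(task * 2)      # reply back across the boundary
    conn.close()

if __name__ == "__main__":
    parent, child = Pipe()
    p = Process(target=worker, args=(child,))
    p.start()
    parent.send(21)
    print(parent.recv())     # 42
    p.join()
```

Once those processes sit on different machines, you also inherit all the cluster-level questions listed above.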
Keep these questions in mind, and add a requirement of “capacity on demand”: a cluster which grows or shrinks according to the demand. Now all of these questions become even more important.
To address these questions for applications running in parallel on clusters, Microsoft has developed HPC Server and, with all the tools and SDKs around it, the HPC Suite.
Let’s see a cluster topology to better understand the HPC Cluster structure.
As in every client–server drawing, let’s start with a client, so that we can fill this blank slide.
The entry point for a cluster is the Head Node. The head node is basically responsible for being that entry point and for load distribution between the broker nodes. What I mean by that: the client comes in and tells the head node, “Hey, I want to submit a job,” and the head node replies, “OK, submit your job to this broker node.”
Talking about broker nodes, let’s add them to the drawing as well. Simply put, a broker node is responsible for the overall monitoring of the cluster’s compute nodes. As broker nodes are also responsible for passing client requests to the compute nodes (and for waiting on the answer and communicating it back to the client), the traffic on a broker node can get heavy. Therefore HPC supports multiple broker nodes, although in small clusters one broker node is generally enough. So let’s continue our little theatre between the client and the cluster. The last thing that happened was the head node saying, “Submit your job to this broker node”; let’s say the one on the left. So the client this time goes to the broker node on the left and says, “Hey, I want to submit a job, with these details.” The broker node checks the job details and selects a proper compute node with enough resources available, allocates the resources on that node, and passes the job definition to that particular compute node (or nodes).
Let’s add the compute nodes to the picture as well. The compute node does the calculation and returns the results back to the broker node, and the broker node passes the result on to the client. Of course, there are queueing mechanisms and so on working in the background when the requested resources are not available, but this is the basic scenario. What is the form of the job request, the reply, etc.? We’ll discuss that soon.
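The round trip described above can be sketched as a toy simulation. All class and field names here are illustrative, not the actual HPC object model: the head node hands out a broker, the broker picks a compute node with enough free cores, the node runs the job, and the result flows back.

```python
# Toy simulation of the submission flow: client -> head node -> broker
# -> compute node, with the result travelling back through the broker.

class ComputeNode:
    def __init__(self, name, free_cores):
        self.name, self.free_cores = name, free_cores

    def run(self, job):
        return job["work"](*job["args"])

class BrokerNode:
    def __init__(self, compute_nodes):
        self.compute_nodes = compute_nodes

    def submit(self, job):
        # pick the first node with enough free cores
        node = next(n for n in self.compute_nodes
                    if n.free_cores >= job["cores"])
        node.free_cores -= job["cores"]      # allocate resources
        try:
            return node.run(job)
        finally:
            node.free_cores += job["cores"]  # release them afterwards

class HeadNode:
    def __init__(self, brokers):
        self.brokers = brokers

    def assign_broker(self):
        return self.brokers[0]  # trivial load distribution for the sketch

head = HeadNode([BrokerNode([ComputeNode("cn1", 4), ComputeNode("cn2", 8)])])
broker = head.assign_broker()
result = broker.submit({"work": sum, "args": ([1, 2, 3],), "cores": 2})
print(result)  # 6
```

The real product of course adds queueing, retries, and security on top of this skeleton, but the flow of responsibility is the same.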
This drawing shows the compute nodes on premises. But the beauty of HPC is that it supports multiple compute node types. For example, rather than hosting a cluster on premises, I can place my compute nodes on to …..
Azure! HPC comes with built-in functionality that can create and manage compute nodes on Azure as if they were on premises. The nice thing is, HPC also supports sizing the Azure compute nodes based on demand (schedules). So, HPC supports one more type of compute node. What can it be? Any idea?
We all have laptops in Avanade, but generally our clients have large farms of desktops. And whenever a user is not working on his or her workstation, the CPU power of that workstation is simply wasted. To utilize these idle workstations, HPC can use them as compute nodes, too. So you can create a nice computing grid by investing practically nothing! So these are the main topologies supported by HPC. Let’s see a little about the management tools.
HPC comes with a nice set of tools, all managed via the HPC Management Console, which supports:
- Cluster deployment
- Cluster management
- Job scheduling
- Health, performance, and diagnostics monitoring
What we see here is the node management tab on one of the clusters we work on. On the left-hand side, you can see the nodes by their health status, their groups, by template (role), etc. In the middle, you see the list of nodes, filtered by your selection in the left menu.
We can also see the current heat map of the cluster. The boxes represent the nodes in the cluster, and the numbers in the boxes represent the current CPU usage. Obviously, the cluster is not loaded at the moment. There are times when the CPUs go full throttle and this map turns dark red!
We can also monitor the job queue. As you see here, there is a job at the top in the Configuring state, which means it is either waiting for resources or the resources are being allocated. There are 8–10 jobs currently running, and all the way down are the completed jobs with the Finished status. We can also see the requested resources on the right-hand side. I think it is a good time to talk a little about resource allocation, too. HPC supports three units of resource allocation:
- An entire compute node. Why do that? Maybe the application requires all the resources (cores, memory) on the compute node, or maybe you can only run one instance of this model type per node.
- A single core. Why do that? For single-threaded processes.
- A socket, i.e. the physical socket the CPU chip is plugged into. This way a multithreaded application works within a single chip, but across its cores.
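The relationship between the three units can be sketched with hypothetical hardware numbers (a node with 2 sockets of 4 cores each; real clusters vary, of course):

```python
# Illustrative only: how many cores each allocation unit reserves,
# for a hypothetical node with 2 sockets x 4 cores.
CORES_PER_SOCKET = 4
SOCKETS_PER_NODE = 2

def cores_granted(unit):
    """Cores reserved for a job per allocated unit."""
    return {"core": 1,
            "socket": CORES_PER_SOCKET,
            "node": SOCKETS_PER_NODE * CORES_PER_SOCKET}[unit]

print(cores_granted("core"))    # 1
print(cores_granted("socket"))  # 4
print(cores_granted("node"))    # 8
```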
Let’s talk a little about the applications running on HPC. Basically, HPC is able to run message passing interface (MPI) applications, meaning processes that communicate with each other across the cluster. On the other hand, according to Microsoft’s research, one of the most direct gains on an existing product would be in Excel, just by enabling parallel computing on a remote machine. So Excel 2010, with the HPC client installed, comes with direct support for executing computations not in the Excel engine on the workstation but on the HPC cluster. Another type of application is the simple executable (DLL, EXE, XLSX, or whatever the model type is), where you provide input parameters and, after the execution, collect the output parameters. There is no interaction between the running instances.
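That last style, often called a parametric sweep, can be sketched in a few illustrative lines: many independent instances of the same program, each with its own input, outputs collected afterwards, and no communication between the instances.

```python
# Parametric-sweep style: independent runs of the same "model", no
# interaction between instances, outputs collected at the end.
from concurrent.futures import ProcessPoolExecutor

def model(param):
    # stand-in for the executable/model the cluster would actually run
    return param ** 2

if __name__ == "__main__":
    sweep = [1, 2, 3, 4]          # one run per input parameter
    with ProcessPoolExecutor() as pool:
        outputs = list(pool.map(model, sweep))
    print(outputs)  # [1, 4, 9, 16]
```

Because the runs are independent, this style scales almost linearly with the number of cores the cluster grants you.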
Finally, let’s say a couple of words about clients. To submit jobs, we need clients, of course. These clients can basically be:
- The HPC client console, but this is really a console that no one likes using
- As we discussed already, Excel is an important client for HPC
- But in general, API calls via the HPC SDK are the main tools we use