Provisioning in Microsoft HPC
Suppose you have five machines under your desk and need to build a small HPC cluster for development out of them, but you are too lazy to do it manually, machine by machine. Or, even better, you are in charge of a large HPC cluster with thousands of nodes spanning multiple racks, installed in an area of over 2000 square meters. Obviously, manual installation is not an option here.
What to do? The answer is simple: use automatic provisioning. In this post, I'll share my recent experience with provisioning in Microsoft Windows HPC Server 2008 (HPC 2008 for short).
Good news: provisioning is an intrinsic feature of HPC 2008, which uses Microsoft Windows Deployment Services (WDS) technology under the hood. WDS builds on PXE (Preboot Execution Environment), which is a "must have" feature of all modern network cards.
This solution completely shields me from the complexity of WDS. Like many MS products, HPC Cluster Manager provides a wizard for creating a template. The template is the central notion of the whole system. It's merely an installation image, bootable over the network, accompanied by a script that performs the additional steps needed to get a node ready to join the cluster.
However, just having an image is rarely sufficient. It lets you install just the basic OS without anything on top of it, which is usually not what we want. That is where the script comes into play, with its ability to execute (almost) arbitrary OS commands, including executables from network shares. Coding those commands by hand would be somewhere between boring and very boring. Fortunately, the designers of HPC Cluster Manager put a lot of effort into making things simple. Just follow the wizard and you'll get a fully operational template in almost no time.
Hey, but what if you wanna add some specifics, something not provided by default? Well, nothing is lost: you can create a default template with the wizard and then run the edit tool from its context menu. In the editor that opens, you can add new commands or delete existing ones. When the template is ready, you invoke another wizard, which controls the installation process. The only thing left to do is to choose a template and turn on all the machines you want installed with it. Simple.
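To make the idea of "an image plus an editable list of commands" a bit more concrete, here is a minimal Python sketch. It is only a toy model of the concept, not the real template format or any HPC Cluster Manager API: the class `NodeTemplate`, the image name `base-os.wim`, the step names and the share `\\headnode\share` are all made up for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class NodeTemplate:
    """Toy model: one network-bootable image plus an ordered list of commands."""
    image: str                                   # hypothetical image name
    steps: List[str] = field(default_factory=list)

    def add_step(self, command: str, position: Optional[int] = None) -> None:
        # Append by default, or insert at a given position in the script.
        if position is None:
            self.steps.append(command)
        else:
            self.steps.insert(position, command)

    def remove_step(self, command: str) -> None:
        # Drop a step that the default template included but we do not want.
        self.steps.remove(command)

    def plan(self) -> List[str]:
        # The image is always deployed first; the script steps follow in order.
        return [f"deploy image {self.image}"] + list(self.steps)


# A "default" template as a wizard might produce it (all steps are invented).
template = NodeTemplate(
    image="base-os.wim",
    steps=["join the domain", "install cluster components", "install sample apps"],
)

# Editing the template: delete one step, add a custom one from a network share.
template.remove_step("install sample apps")
template.add_step(r"run \\headnode\share\install_tools.cmd")

for line in template.plan():
    print(line)
```

The point of the sketch is only the shape of the thing: the image stays fixed, while the list of commands is what you tweak in the editor.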
Finally, if you don't have an installation image, you can create one from the distribution media using another wizard built into HPC Cluster Manager. I'm not sure why this feature is needed, because Microsoft installation media has included installation images for years, but it might be valuable for image developers.
So far so good, and it sounds like magic, but there are some pitfalls I ran into when playing with MS HPC auto-provisioning.
- First of all, never ever try to run this kind of installation on a network that is not under your control, and that goes above all for its DHCP service. Obviously, HPC will need to add a record to DHCP for each new node.
- Always use multicast mode when you are dealing with more than one node. It significantly reduces the time to provision, although even with multicast it may take a long time in some cases. In one of my experiments, I tried to provision 3 virtual nodes on the same VMware server with 5 additional VMs running. It took about 150 min on my not-so-old server (a Supermicro SuperServer 6015B-UR with 2 x quad-core Xeon E5430 @ 2.66 GHz and 16 GB RAM).
- And last but not at all least: do not use the clusrun utility in your template scripts. Obviously, commands from a template will be executed on every node for which provisioning is requested. If you run clusrun (whose task is to run its argument command on every node in the cluster) on each node, you may get unpredictable results.
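The clusrun pitfall is easy to see with a toy simulation. The Python sketch below does not call clusrun or any HPC API; `run_everywhere` is a made-up stand-in for "run this command on every node", `setup.cmd` is a hypothetical command, and the model optimistically assumes every node is reachable, which will not hold in a real provisioning run where some nodes are still installing.

```python
def run_everywhere(nodes, command, log):
    """Stand-in for a cluster-wide runner: executes the command once per node."""
    for target in nodes:
        log.append((target, command))


def provision(nodes, script_uses_cluster_wide_run):
    """Run the (hypothetical) template script once on each provisioned node."""
    log = []
    for node in nodes:
        if script_uses_cluster_wide_run:
            # Every node's script fans the command out to the whole cluster.
            run_everywhere(nodes, "setup.cmd", log)
        else:
            # Every node's script runs the command locally, once.
            log.append((node, "setup.cmd"))
    return log


nodes = ["node1", "node2", "node3"]
local_runs = provision(nodes, script_uses_cluster_wide_run=False)
fanout_runs = provision(nodes, script_uses_cluster_wide_run=True)

print(len(local_runs))   # 3: one execution per node
print(len(fanout_runs))  # 9: each of 3 nodes triggers 3 executions
```

Even in this idealized model the command runs N x N times instead of N. In a real cluster, where nodes finish provisioning at different moments, which of those runs succeed is anyone's guess.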
Enjoy!
Labels: .NET, Microsoft HPC, ~Kirill Shileev
