@wenming
Created June 16, 2012 14:32
Resources for Azure scheduler
Definition of HPC:
High Performance Computing (HPC) is the use of servers, clusters, and supercomputers – plus associated software, tools, components, storage, and services – for scientific, engineering, or analytical tasks that are particularly intensive in computation, memory usage, or data management. HPC is used by scientists and engineers both in research and in production across industry, government, and academia. Within industry, HPC can frequently be distinguished from general business computing in that companies generally will use HPC applications to gain advantage in their core endeavors – e.g., finding oil, designing automobile parts, or protecting clients’ investments – as opposed to non-core endeavors such as payroll management or resource planning.
The Azure HPC Scheduler is a great way to run batch workloads, including but not limited to HPC.
The Azure HPC Scheduler includes three programming models: MPI, SOA, and parametric sweep.
MPI is the traditional HPC programming model, and it is well documented online. It is tightly coupled and requires a fast interconnect such as InfiniBand. It is mostly used for science and engineering code.
SOA is Microsoft's implementation of a scalable SOA service farm. It suits financial services workloads that need sub-millisecond responses to calls. A SOA/WCF call is made to a proxy, and the proxy load-balances the calls across many running instances of the web service.
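The proxy idea above can be illustrated with a small Python sketch. It is not the HPC SOA or WCF API: the class name `RoundRobinProxy`, the helper `make_instance`, and the round-robin policy itself are assumptions made for illustration, since the notes do not say how the real proxy picks an instance.

```python
from itertools import cycle


class RoundRobinProxy:
    """Hypothetical proxy that spreads calls evenly over service instances."""

    def __init__(self, instances):
        instances = list(instances)
        if not instances:
            raise ValueError("at least one service instance is required")
        # cycle() yields the instances in order, forever.
        self._next_instance = cycle(instances)

    def call(self, request):
        # Pick the next instance in rotation and forward the request to it.
        instance = next(self._next_instance)
        return instance(request)


def make_instance(name):
    """Build a stand-in for one running copy of the web service."""
    def handle(request):
        return f"{name} handled {request}"
    return handle


if __name__ == "__main__":
    proxy = RoundRobinProxy([make_instance("node-0"), make_instance("node-1")])
    for i in range(4):
        print(proxy.call(f"price-request-{i}"))
```

The client only ever talks to the proxy, so instances can be added or removed without changing the calling code.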
The parametric sweep engine is probably the easiest way to scale your embarrassingly parallel workload. The best way to explain it is that the cluster automatically allocates resources for jobs like: compute_intense.exe file 1 -- file 100000
You basically run the same compute intensive program with slightly varying parameters.
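As a rough local sketch of what a sweep does, the Python below expands one command template into one command line per index and hands each to a worker. It does not call the scheduler: the `*` placeholder, the `file*` argument form, and the names `expand_sweep`, `run_sweep`, and `dry_run` are assumptions for illustration, and a thread pool stands in for the cluster's compute nodes.

```python
from concurrent.futures import ThreadPoolExecutor


def expand_sweep(template, start, end, placeholder="*"):
    """Return one command line per sweep index in [start, end]."""
    return [template.replace(placeholder, str(i)) for i in range(start, end + 1)]


def run_sweep(commands, worker, max_workers=4):
    """Run worker(command) for every command; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(worker, commands))


def dry_run(command):
    # Stand-in for launching the real program on a compute node.
    return "would run: " + command


if __name__ == "__main__":
    commands = expand_sweep("compute_intense.exe file*", 1, 5)
    for line in run_sweep(commands, dry_run):
        print(line)
```

Because each command is independent of the others, the work can be split across as many workers as are available, which is what makes this kind of workload easy to scale.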
The best way to get started:
There are slides in the training kit. For a simple demo that deploys a cluster, download http://www.microsoft.com/en-us/download/details.aspx?id=28015 and just run the sample.
You will need the client tools: http://www.microsoft.com/en-us/download/details.aspx?id=17017
And of course the Windows Azure SDK 1.7.
I've built a live weather service based on the Azure HPC Scheduler:
Go to weatherservice.cloudapp.net, request a new forecast, then look for the forecast in the queue (it might take a minute or so to show up).
The service runs for 8 hours for a single model, so it is extremely compute intensive. It is integrated with Bing Maps and uses MVC 4 for its web front end.
And here's a more advanced paper:
Creating HPC Cloud Solutions with Windows HPC Server 2008 R2 and Windows Azure: Application Models and Data Considerations
http://www.microsoft.com/en-us/download/details.aspx?id=12006
There are a lot of real world demos and content in the training kit. You should look for them.
Demos include:
Parametric sweep rendering farm with a Windows Phone 7 client.
BLAST (parametric sweep with data movement).
Examples of SOA and MPI rendering engines (Tachyon).
WindowsAzure-TrainingKit/Tutorial-HPCSOAapps
WindowsAzure-TrainingKit/Tutorial-HPCPowershellDeployment
WindowsAzure-TrainingKit/Tutorial-HPCMPIIntro
WindowsAzure-TrainingKit/Tutorial-HPCBasicParametricSweepApps
WindowsAzure-TrainingKit/Tutorial-TPLAzureScaleOut
WindowsAzure-TrainingKit/Tutorial-HPCImageRendering
WindowsAzure-TrainingKit/Tutorial-HPCBLAST
WindowsAzure-TrainingKit/Tutorial-HPCDeployToExistingCluster