MATLAB/DCT
MathWorks' MATLAB Distributed Computing Toolbox (DCT) has an interface to batch queuing systems. Given the appropriate MATLAB commands, it will submit jobs to your queuing system and perform MATLAB operations in a parallel or distributed manner. For more on this see http://www.mathworks.com/help/toolbox/distcomp/index.html.
For MATLAB to be able to do this, it needs some understanding of the queuing system and it requires that the end-user of MATLAB DCT follow some simple steps to stage the distributed jobs.
Parallel Job Types
Distributed Jobs
Parallel Jobs
Parfor
Single Program Multiple Data (SPMD)
Parallel Configurations
Choosing a Parallel Configuration
There are three parallel configurations: Local, MPIEXEC, and Torque. Each has its own benefits and downsides.
- Most users will want to use MPIEXEC.
- Local is the simplest config to use but also the most restrictive.
- The Torque config should be used in rare cases. Contact the CAC before using.
MPIEXEC Parallel Configuration (Video Tutorial)
If you wish to use more than 8 cores with a parallel config (not distributed or parallel jobs) for parfor or SPMD, you need to set up the MPIEXEC parallel config. MPIEXEC holds the entire parent MATLAB and its parallel workers (tasks) inside a single PBS job.

MPIEXEC Setup
MPIEXEC cannot be mixed with other parallel configs that are already running. A user who wishes to use a different config must wait for all other parallel MATLAB jobs to finish before switching config type.
To set up MPIEXEC, run the following once on the cluster.
mkdir ~/matlab/
cd ~/matlab
#this tells matlab to load the correct MPI library
wget http://cac.engin.umich.edu/resources/software/matlabdct/mpiLibConf.m
Once the mpiLibConf.m file is in place, users need to set up the parallel configuration in their M-code.
sched= findResource('scheduler', 'type', 'mpiexec')
set(sched, 'MpiexecFileName', '/home/software/rhel6/mpiexec/bin/mpiexec')
set(sched, 'EnvironmentSetMethod', 'setenv')
%use the 'sched' object when calling matlabpool
%the syntax for matlabpool must use the (sched, N) format
matlabpool (sched, 12)
Local Parallel Configuration
The local config requires no setup but has many limitations. Specifically, the local config can only use a single node, up to 12 cores. This parallel config also works on CAEN lab machines with multiple cores, allowing for testing of parallel MATLAB codes.
%example local config job on 8 cores
matlabpool open local 8
x=0;
parfor i=1:100
x=x+i;
end
%close the pool
matlabpool close
Torque Parallel Configuration
The Torque config hosts the parent MATLAB on the login node. Most users should not do this: the login nodes restrict resource use and are subject to reboots, which cause jobs to fail. Users get the same results with MPIEXEC without these downsides. If you think your job should use the Torque config, please contact flux-support@umich.edu.

Distributed Jobs
A distributed job is one whose tasks do not directly communicate with each other. The tasks do not need to run simultaneously, and a worker might run several tasks of the same job in succession. Typically, all tasks perform the same or similar functions on different data sets in an embarrassingly parallel configuration. Distributed jobs do not use a parallel config such as Torque or MPIEXEC.
The CAC recommends distributed jobs over parallel jobs because the cluster runs many single-CPU jobs more readily than larger multiple-CPU jobs. Users whose tasks must communicate with each other will need to use parallel jobs instead.
Distributed jobs are submitted from MATLAB, so MATLAB creates and submits your PBS scripts for you. Distributed jobs allow you to shut down the parent MATLAB, wait for the jobs to finish, and read in the values from the jobs at a later time.
%create the sched object
sched=findResource('scheduler', 'type', 'torque')
%set PBS options 'man qsub' for valid options
%Do NOT set nodes, ppn or tpn
set(sched, 'SubmitArguments', '-l walltime=24:00:00 -q cac')
%Create a job
job1=createJob(sched)
%add a few tasks, each task will be a one-CPU job in PBS
createTask(job1, @rand, 1, {3,3});
createTask(job1, @rand, 1, {3,3});
get(job1, 'Tasks')
%Submit job and its tasks to the cluster
submit(job1)
%you may now exit matlab
Check on the status of your jobs using qstat -u uniquename. When some or all of the tasks have finished, start MATLAB again, create the sched object, and use functions such as findJob and getAllOutputArguments to get the results from the finished tasks. See the MATLAB DCT documentation for usage.
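The retrieval step described above might look like the following sketch. It assumes the new MATLAB session is started from the same directory as the submitting session, so the scheduler object finds the same job data, and that at least one job has finished. The variable names are illustrative only.

```matlab
%recreate the sched object in the new MATLAB session
sched=findResource('scheduler', 'type', 'torque')
%list the jobs known to this scheduler, grouped by state
[pending, queued, running, finished]=findJob(sched);
%collect the outputs of every finished job
%each call returns a cell array with one row per task
for k=1:numel(finished)
    results=getAllOutputArguments(finished(k))
end
```

Jobs that are still pending, queued, or running are left untouched, so the same lines can be rerun later to pick up the remaining results.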
Parallel Jobs
Note: Parallel jobs require that MATLAB be in your default modules.
Parallel jobs are those in which the workers (or labs) can communicate with each other during the evaluation of their tasks.
The example will use the file colsum.m. You will need to place this on the cluster or your lab machine.
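The contents of colsum.m are not reproduced on this page. As a stand-in, the sketch below follows the column-sum example from MathWorks' parallel job documentation. Treat it as an assumption about what the file contains, not as the CAC's exact copy.

```matlab
function total_sum = colsum
%COLSUM sum a magic square, one column per lab (assumed contents)
if labindex == 1
    %lab 1 builds the matrix and sends it to the other labs
    A = labBroadcast(1, magic(numlabs));
else
    %all other labs receive the matrix from lab 1
    A = labBroadcast(1);
end
%each lab sums the column matching its own index
column_sum = sum(A(:, labindex));
%combine the per-lab sums, every lab gets the total
total_sum = gplus(column_sum);
```

Because the labs exchange data through labBroadcast and gplus, this function only makes sense inside a parallel job, not as a task of a distributed job.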
%create parallel sched object
sched=findResource('scheduler', 'type', 'torque')
%set PBS options 'man qsub' for valid options
%Do NOT set nodes, ppn or tpn
set(sched, 'SubmitArguments', '-l walltime=24:00:00 -q cac -M YourEmailAddressGoesHere@umich.edu -m abe')
%create parallel job, set the number of CPUs to use
pjob=createParallelJob(sched);
set(pjob, 'MaximumNumberOfWorkers', 4)
set(pjob, 'MinimumNumberOfWorkers', 4)
%create parallel tasks using colsum.m from above
set(pjob, 'FileDependencies', {'colsum.m'})
t=createTask(pjob, @colsum, 1, {})
%submit a 4 CPU PBS job (matches the worker count set above)
submit(pjob)
Parallel jobs submitted from MATLAB also allow the submitting MATLAB process, run on the login node, to shut down while the tasks run. Results are recovered from a new MATLAB session in the same way as for distributed jobs.
matlabpool (parfor)
The most common use of matlabpool is for the MATLAB command parfor. For an introduction to parfor, see MathWorks' Getting Started with parfor.
"A parfor-loop is useful in situations where you need many loop iterations of a simple calculation, such as a Monte Carlo simulation. parfor divides the loop iterations into groups so that each worker executes some portion of the total number of iterations. parfor-loops are also useful when you have loop iterations that take a long time to execute, because the workers can execute iterations simultaneously."
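The Monte Carlo case mentioned in the quote can be sketched as follows. This is a minimal illustration and is not taken from the CAC examples; the sample count n and the variable names are assumptions. If no matlabpool is open, the loop still runs, but serially in the parent MATLAB.

```matlab
%estimate pi by sampling random points in the unit square
n=100000;     %number of samples (illustrative value)
hits=0;       %reduction variable, combined across workers
parfor i=1:n
    p=rand(1,2);
    %count the point if it falls inside the quarter circle
    hits=hits+(sum(p.^2)<=1);
end
piEstimate=4*hits/n
```

Each iteration is independent of the others, which is what lets parfor hand groups of iterations to different workers.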
matlabpool and PBS
Be sure to set the number of cores to be used by MATLAB in your PBS file. Follow the example on our PBS page but change the resources line.
#PBS -l procs=4,gres=matlab:1%matlab_distrib_comp_engine:4
The values of procs=N and matlab_distrib_comp_engine:N must both equal the number of labs (workers in MATLAB).
%matlab example for matlab pool
%this example assumes the MPIEXEC parallel config
sched= findResource('scheduler', 'type', 'mpiexec')
set(sched, 'MpiexecFileName', '/home/software/rhel6/mpiexec/bin/mpiexec')
set(sched, 'EnvironmentSetMethod', 'setenv')
%use the 'sched' object when calling matlabpool
%the size of the pool (4) should equal procs in PBS
matlabpool (sched, 4)
%
clear A
parfor i=1:100
A(i)=i;
end
A
%
%reduction example
x=0;
parfor i=1:100
x=x+i;
end
x
%close the pool
matlabpool close
matlabpool (SPMD)
Single Program Multiple Data (SPMD) does work on the CAC systems. It is very similar to parfor in configuration. Please contact the CAC at flux-support@umich.edu for help.
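As a rough illustration of how SPMD differs from parfor, the sketch below has each lab sum its own share of 1:100 and then combine the partial sums. It assumes a pool has already been opened with matlabpool using one of the configs above; the variable names are illustrative.

```matlab
%assumes a pool is already open, e.g. matlabpool (sched, 4)
spmd
    %each lab takes every numlabs-th value starting at its own index
    myPart=labindex:numlabs:100;
    partial=sum(myPart);
    %combine the partial sums, every lab receives the total
    total=gplus(partial);
end
%outside the spmd block, total holds one value per lab
%read the copy held by lab 1
total{1}
```

The value read back is the sum of 1 through 100, the same quantity as the parfor reduction example, but here each lab explicitly chooses its own share of the work using labindex and numlabs.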
For more specific instructions on using MATLAB at the CAC, please see our MATLAB page.



