Matlab
Matlab in Serial
To access Matlab on the clusters, you must first type module load matlab.
A sample PBS script that shows a 1-CPU Matlab job looks like this:
#PBS -N matlab1
#PBS -q route
#PBS -l nodes=1,walltime=24:00:00
#PBS -M your-email-address
#PBS -m abe
#PBS -V
#
echo "I ran on:"
cat $PBS_NODEFILE
#
#cd to your execution directory first
cd /home/your-user-name/your-matlab-directory
#
matlab < file.m
#
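The script above still has to be saved to a file and handed to PBS. The following is a minimal sketch of that step, assuming the script is stored under the hypothetical name matlab1.pbs and that the standard PBS/Torque commands qsub and qstat are on your path. The submission is guarded, so on a machine without PBS the sketch only writes the file.

```shell
#!/bin/sh
# Hypothetical file name for the job script; pick whatever suits you.
script=matlab1.pbs

# Write the sample 1-CPU job script to a file. The quoted 'EOF' keeps
# $PBS_NODEFILE from being expanded now; PBS sets it when the job runs.
cat > "$script" <<'EOF'
#PBS -N matlab1
#PBS -q route
#PBS -l nodes=1,walltime=24:00:00
#PBS -M your-email-address
#PBS -m abe
#PBS -V
#
echo "I ran on:"
cat $PBS_NODEFILE
#
#cd to your execution directory first
cd /home/your-user-name/your-matlab-directory
#
matlab < file.m
#
EOF

# Submit only if PBS is available on this machine.
if command -v qsub >/dev/null 2>&1; then
    jobid=$(qsub "$script")    # qsub prints the ID of the new job
    echo "Submitted job $jobid"
    qstat "$jobid"             # show the job's current state
else
    echo "qsub not found; $script was written but not submitted"
fi
```

Replace your-email-address, the directory, and file.m with your own values before submitting.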
Plotting in Batch Sample
y=1:100;
plot(y,y)
% save the plot as yvsy.eps (color Encapsulated PostScript);
% run 'help print' for other options (-djpeg, -dpng, etc.)
print -depsc yvsy
% Matlab must exit at the end of every batch job
exit
Note that your Matlab job will not have any access to a graphical display or terminal, so all of your input must be handled by one .m file (which can call other .m files) and all of your output must be written to either a file or the screen (standard out) by Matlab or saved with the print command. It is slightly faster to write your output to the screen (standard out) and let PBS deliver the results to you at the end. If you send your output to the screen, you can monitor it while it is running using the qpeek command; type qpeek --help for more information.
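As an illustration of the "one .m file, output to a file or standard out" rule, here is a minimal sketch. The file names driver.m and results.mat are hypothetical, and the -nodisplay and -nosplash startup options are an optional addition that is not part of the sample job script above. The Matlab run is guarded, so on a machine without Matlab the sketch only writes driver.m.

```shell
#!/bin/sh
# Write a single driver .m file (hypothetical name) that does all the work.
cat > driver.m <<'EOF'
% driver.m: the only file the batch job reads; it may call other .m files
y = 1:100;
fprintf('sum of y = %d\n', sum(y));   % written to standard out
save('results.mat', 'y');             % written to a file
exit                                  % batch jobs must always exit
EOF

# Inside a PBS job, standard out is collected by PBS and delivered
# when the job ends, so no redirection is needed here.
if command -v matlab >/dev/null 2>&1; then
    matlab -nodisplay -nosplash < driver.m
else
    echo "matlab not found; driver.m was written but not run"
fi
```

In a real job the matlab line would replace the matlab < file.m line of the PBS script.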
To use the Symbolic Math Toolbox you need the 32-bit version of Matlab; start your session with the -glnx86 option (matlab -glnx86).
Matlab in Parallel Using the Distributed Computing Toolbox
Loading the Environment
To use Matlab's Distributed Computing Toolbox (DCT) you must first load Matlab with the command module load matlab/2006b.
References
For information on Matlab's DCT please see:
- Mathworks' Distributed Computing Toolbox User's Guide
- http://www.mathworks.com/access/helpdesk/help/pdf_doc/distcomp/distcomp.pdf
- http://www.mathworks.com/access/helpdesk/help/toolbox/distcomp/
Example 1: The distcompdemo_optim Demo
Running the distcompdemo_optim_[dist|seq] demonstrations that come with the DCT looks like this:
[acaird@nyx-login matlab]$ module load matlab/2006b
[acaird@nyx-login matlab]$ matlab
< M A T L A B >
Copyright 1984-2006 The MathWorks, Inc.
Version 7.2.0.283 (R2006a)
January 27, 2006
To get started, type one of these: helpwin, helpdesk, or demo.
For product information, visit www.mathworks.com.
>> distcompdemo_optim_seq
Elapsed time is 78.2 seconds
>> sched=findResource('scheduler', 'type', 'generic')
sched =
distcomp.genericscheduler
>> set (sched, 'SubmitFcn', {@cac_submitfcn, '00:10:00', 'route'})
>> distcompdemoconfig('scheduler', sched);
>> distcompdemo_optim_dist
This demo will submit a job with 4 task(s) to the scheduler.
Submitting job
15108.nyx.engin.umich.edu
Elapsed time is 144.1 seconds
>>
As you can see, running in parallel doesn't always make the code run faster: here the distributed run took 144.1 seconds against 78.2 seconds for the sequential one. Whether you see a speedup depends on many factors, among them the amount of time your sub-jobs wait in the queue, how much communication there is between the parent and child processes, how much total work there is to be done, and how finely you can divide it up. The best approach is to try a few different settings (number of worker nodes, size of your problem) and see what works best for you.
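To put a number on the comparison, the two elapsed times from the transcript above can be turned into a speedup figure. This is a minimal sketch; the efficiency line assumes that each of the 4 tasks ran on its own worker, which the transcript does not state.

```shell
#!/bin/sh
# Elapsed times in seconds and the task count, copied from the transcript.
t_seq=78.2
t_dist=144.1
tasks=4

# Speedup = sequential time / distributed time; below 1 means a slowdown.
speedup=$(awk -v s="$t_seq" -v d="$t_dist" 'BEGIN { printf "%.2f", s / d }')

# Efficiency = speedup / number of tasks (assumes one worker per task).
efficiency=$(awk -v s="$t_seq" -v d="$t_dist" -v n="$tasks" \
    'BEGIN { printf "%.2f", s / d / n }')

echo "speedup:    $speedup"
echo "efficiency: $efficiency"
```

For this run the speedup comes out at about 0.54, that is, the distributed version took nearly twice as long. Repeating the calculation for each setting you try makes the comparison between settings straightforward.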
Example 2: The Random Number Demo
>> sched=findResource('scheduler', 'type', 'generic')
sched =
distcomp.genericscheduler
>> set (sched, 'SubmitFcn', @Torque_submitfcn)
>> get (sched)
Type: 'generic'
DataLocation: '/home/acaird/matlab'
HasSharedFilesystem: 1
Jobs: [0x1 double]
ClusterMatlabRoot: ''
MatlabCommandToRun: 'matlab -dmlworker -nodisplay -r distcomp_evaluate_filetask'
SubmitFcn: @Torque_submitfcn
Configuration: ''
>> j = createJob(sched)
j =
distcomp.job: 1-by-1
>> get (j)
Name: 'Job1'
ID: 1
UserName: 'acaird'
Tag: ''
State: 'pending'
CreateTime: 'Wed Aug 16 11:49:34 EDT 2006'
SubmitTime: ''
StartTime: ''
FinishTime: ''
Tasks: [0x1 double]
FileDependencies: {0x1 cell}
PathDependencies: {0x1 cell}
JobData: []
Parent: [1x1 distcomp.genericscheduler]
UserData: []
Configuration: ''
>> get (sched)
Type: 'generic'
DataLocation: '/home/acaird/matlab'
HasSharedFilesystem: 1
Jobs: [1x1 distcomp.simplejob]
ClusterMatlabRoot: ''
MatlabCommandToRun: 'matlab -dmlworker -nodisplay -r distcomp_evaluate_filetask'
SubmitFcn: @Torque_submitfcn
Configuration: ''
>> createTask(j, @rand, 1, {3,3});
>> createTask(j, @rand, 1, {3,3});
>> createTask(j, @rand, 1, {3,3});
>> createTask(j, @rand, 1, {3,3});
>> createTask(j, @rand, 1, {3,3});
>> get (j, 'Tasks')
ans =
distcomp.task: 5-by-1
>> submit(j)
Submitting job
Required Matlab Commands
For every Matlab/DCT job you must start with these two lines:
sched=findResource('scheduler', 'type', 'generic')
set (sched, 'SubmitFcn', {@cac_submitfcn, '00:10:00', 'route'})
These set up the proper scheduler type so that Matlab knows how to submit jobs to the scheduler. You should change the time limit (00:10:00) to whatever you think is accurate for your job, and you should change the queue (route) if you want to use a specific queue; otherwise you'll get routed wherever PBS thinks you should go.
If you have long-running jobs and don't want to wait for them to complete, you can submit your Matlab jobs without the waitForState command that the documentation suggests; simply let the job start and return you to the >> prompt. At that point you can quit Matlab and log out of the system if you want. To find your jobs, check on their status, or retrieve the results, you have several options after restarting Matlab and running the two "DCT required" commands above (sched=... and set (sched,...)):
- If you know the name of the job, you can type:
myjob = findJob (sched,'Name','Job2')
Then you can work with "myjob" just like a job that you had waited for.
- If you don't know the name of the job, you can type:
myjob = findJob (sched,'UserName','YourUserName')
Then you can look at each job like:
get(myjob(1))
get(myjob(2))
etc.
Then you can either do another "findJob" with the exact name to get it to be a 1x1 job matrix, or you can work with the elements of myjob like:
results = getAllOutputArguments(myjob(2))
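The retrieval steps above can also be collected into one file and run non-interactively. The following is a minimal sketch, assuming a job named Job2 as in the example above; the file name retrieve.m is hypothetical, and the -nodisplay option is an optional addition. The Matlab run is guarded, so on a machine without Matlab the sketch only writes retrieve.m.

```shell
#!/bin/sh
# Write the retrieval commands to a hypothetical file, retrieve.m.
cat > retrieve.m <<'EOF'
% The two "DCT required" commands
sched = findResource('scheduler', 'type', 'generic');
set(sched, 'SubmitFcn', {@cac_submitfcn, '00:10:00', 'route'});

% Look the job up by name and report its state
myjob = findJob(sched, 'Name', 'Job2');
state = get(myjob, 'State')

% Fetch the results only once the job has finished
if strcmp(state, 'finished')
    results = getAllOutputArguments(myjob);
    celldisp(results)
end
exit
EOF

# Run it only where Matlab is installed.
if command -v matlab >/dev/null 2>&1; then
    matlab -nodisplay < retrieve.m
else
    echo "matlab not found; retrieve.m was written but not run"
fi
```

If the job is still pending or running, only its state is printed, and the script can simply be run again later.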
For the technical details of how the Matlab DCT works in our environment, please see our Matlab DCT page.
