MATLAB

Introduction

MATLAB handles a range of computing tasks in engineering and science, from data acquisition and analysis to application development. The MATLAB environment integrates mathematical computing, visualization, and a powerful technical language. It is especially well-suited to vectorized calculations and has a Parallel Computing Toolbox (not included in all licenses) that streamlines parallelization of code.

Availability

MATLAB is available on several ARC systems. ARC maintains a MATLAB Distributed Computing Server license for parallel MATLAB in cooperation with the university's IT Procurement and Licensing Solutions, which also offers discounted licenses to departments and students (note that MATLAB is also included in some of the Student Bundles).

Interface

There are two types of environments in which the MATLAB application can be used on ARC resources:

  • Graphical interface via OnDemand
  • Command-line interface. You can also start MATLAB from the command line on Unix systems where MATLAB is installed. Note that the command line runs on the login node, so large computations should be submitted as jobs, either via a traditional job submission or from within MATLAB.

Parallel Computing in MATLAB

There are two primary means of obtaining parallelism in MATLAB:

  • parfor: Replacing a for loop with a parfor loop splits the loop iterations among a group of processors. This requires that the loop iterations be independent of each other.
  • spmd: Single program multiple data (spmd) allows multiple processors to execute a single program (similar to MPI).

Slides and example programs for both parfor and spmd are available in the Resources section.
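
As a minimal sketch of the two constructs (not taken from the ARC course materials), the script below sums squares with parfor and then computes a global sum with spmd. It assumes the Parallel Computing Toolbox is licensed and a pool can be opened. The names labindex, numlabs, and gplus are those used in releases such as R2021a; newer releases prefer spmdIndex, spmdSize, and spmdPlus.

```matlab
% parfor: iterations are independent, so they can be split among workers.
n = 1000;
sumSquares = 0;
parfor i = 1:n
    sumSquares = sumSquares + i^2;   % reduction variable, combined across workers
end
fprintf('Sum of squares 1..%d: %d\n', n, sumSquares);

% spmd: every worker runs the same block on its own slice of the data.
spmd
    localValues = labindex:numlabs:100;   % strided slice owned by this worker
    localSum = sum(localValues);
    globalSum = gplus(localSum);          % add the partial sums across all workers
end
% globalSum is a Composite; index it to read the value held by worker 1.
fprintf('Sum of 1..100: %d\n', globalSum{1});
```

The parfor loop works because each iteration only contributes to a reduction variable; a loop in which iteration i reads a value written by iteration i-1 could not be converted this way.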

Submitting Remote Batch Jobs

In order to run large jobs on ARC's systems, you will need to submit your job to that system's queue. MathWorks provides functionality to do so from within MATLAB via the batch command; please see this documentation for details. More general information on jobs on ARC machines is available here and in the video tutorials.

MATLAB also comes with a Job Monitor for tracking remote jobs via a graphical interface. Right-clicking a job lets you show its output, load its variables, delete it, etc.
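
The batch workflow can be sketched as follows. This is an illustration under stated assumptions, not ARC's official recipe: it assumes a cluster profile for the ARC system has already been configured and set as the default, and that a script named myscript.m (hypothetical) is available to the job. The pool size of 4 is arbitrary.

```matlab
c = parcluster();   % default cluster profile; assumed to point at the ARC cluster

% Submit the hypothetical script myscript.m. 'Pool', 4 requests four pool
% workers in addition to the worker that runs the script itself.
job = batch(c, 'myscript', ...
    'Pool', 4, ...
    'CurrentFolder', '.', ...          % workers stay in their default folder
    'AutoAddClientPath', false);       % do not push the local path to the cluster

wait(job);      % block until the job finishes
diary(job);     % display the command-window output captured from the job
load(job);      % load the script's workspace variables into this session

% delete(job)   % uncomment only after any needed output has been saved
```

For a function rather than a script, the results would be retrieved with fetchOutputs(job) instead of load(job).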

Remote Output Files

Remote MATLAB jobs start in the directory specified by the CurrentFolder parameter to batch(). Output files written by remote jobs are saved in this location. Alternatively, you may specify the full path where you want the file saved, e.g.

save('/home/johndoe/output')

Note that if you submit from your personal machine, these files will not be copied back to your local machine; you will need to log in to the cluster yourself to get them. Alternatively, you can tell MATLAB to change to the directory on the ARC cluster where job information is stored; MATLAB will automatically mirror this location to your local machine when the job completes. Here is an example command for switching to the job directory:

cd(sprintf('%s/%s',getenv('MDCE_STORAGE_LOCATION'),getenv('MDCE_JOB_LOCATION')));

Note that once the job completes, you will need to look in its local job directory to get the output files; this location can be configured in your local cluster profile. Be sure to copy any output files you need out of that directory before deleting your job (e.g. with the delete command).
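
Inside a job script, the job-directory approach might look like the sketch below. The variable result and the file name output.mat are hypothetical, and the isempty checks are an added safeguard so the same script still runs outside a cluster job, where the two environment variables are assumed to be unset.

```matlab
% Switch to the job's storage directory when running as a cluster job.
storageDir = getenv('MDCE_STORAGE_LOCATION');
jobDir = getenv('MDCE_JOB_LOCATION');
if ~isempty(storageDir) && ~isempty(jobDir)
    cd(fullfile(storageDir, jobDir));
end

result = magic(4);              % placeholder for the real computation
save('output.mat', 'result');   % written to the current (job) directory
```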

Changing MATLAB's Path

To add a folder to MATLAB's path on ARC's systems, edit the MATLABPATH environment variable. This can be made permanent by setting it in your .bashrc file. For example, this line would add the folder mydir in your Home directory to MATLAB's path any time it opens in your account:

echo "export MATLABPATH=\$HOME/mydir:\$MATLABPATH" >> ~/.bashrc

An alternative is to create a pathdef.m file in the directory where MATLAB starts. This will add folders to MATLAB's path whenever it starts in the folder where pathdef.m is located. For example, the following at the MATLAB command line would add mydir to the path when MATLAB opens in your Home directory:

addpath('/home/johndoe/mydir');
savepath('/home/johndoe/pathdef.m')
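
To confirm that either approach took effect, a quick check such as the following can be run at the MATLAB prompt. It is a sketch that assumes the mydir folder from the examples above lives in your Home directory; adjust the name as needed.

```matlab
% Build the expected folder name and compare it against each path entry.
target = fullfile(getenv('HOME'), 'mydir');
folders = strsplit(path, pathsep);
if any(strcmp(target, folders))
    fprintf('%s is on the MATLAB path\n', target);
else
    fprintf('%s is NOT on the MATLAB path\n', target);
end
```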

Using the MATLAB Compiler (mex)

To compile C/C++ or Fortran code in MATLAB, load the compiler module that you want to use before you open MATLAB. Here is an example of compiling MatConvNet, which in this case requires the GCC compiler, available via the foss module:

#load modules
module reset; module load foss/2020b matlab/R2021a

#open matlab and do the install 
#(vl_compilenn is the installer script in this case)
matlab -nodisplay
[matlab starts]
>> vl_compilenn
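
Before running a longer build, it can help to confirm which compiler mex picked up from the loaded modules. The sketch below uses mex.getCompilerConfigurations; the choice of 'C++' is an assumption and should match the language of the code being compiled.

```matlab
% Query the compiler that mex has selected for C++ sources.
cfg = mex.getCompilerConfigurations('C++', 'Selected');
if isempty(cfg)
    disp('No C++ compiler is selected; load a compiler module and restart MATLAB.');
else
    fprintf('mex will use %s (version %s)\n', cfg(1).Name, cfg(1).Version);
end
```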

Examples

Prime Numbers:
This example uses parfor to count in parallel the prime numbers between 1 and 10,000,000. (The correct answer is 664,579.) There are a few ways it can be run on ARC resources:

  • A submission script to submit it as a job from the command line is provided here. More general information on jobs on ARC machines is available here and in the video tutorials.
  • A full example to set up job submission from within MATLAB (from a cluster login node) and to submit this job with batch() is provided at the bottom of this documentation.
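
For orientation, a minimal parfor prime counter might look like the sketch below. This is not the ARC example code itself; it uses the built-in isprime and assumes the Parallel Computing Toolbox is available (without the toolbox, parfor falls back to running the loop serially, which is much slower for this range).

```matlab
n = 10000000;
primeCount = 0;
parfor i = 1:n
    % isprime returns a logical; adding it accumulates a count (reduction).
    primeCount = primeCount + isprime(i);
end
fprintf('Primes between 1 and %d: %d\n', n, primeCount);
```

With n = 10,000,000 the reported count should match the 664,579 quoted above.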

Resources

The following resources should be helpful to users trying to get started with MATLAB or Parallel MATLAB.

  • The Interdisciplinary Center for Applied Mathematics (ICAM) and ARC have teamed up to offer a number of Parallel Matlab classes, including an introduction/overview and more advanced topics in PARFOR and SPMD. For slides and examples, see ARC's Matlab Course Archive.
  • MathWorks provides a number of excellent tutorials on MATLAB here. Their Parallel and GPU computing tutorials and sample codes will help users get started with parallel computing constructs. See also their webinars for more in-depth looks at more advanced topics.