HPC Frequently Asked Questions
Question: Why can't I log in?
Answer: Login problems can occur for a number of reasons. If you cannot log in to one of ARC's systems, please check the following:
- Is your PID password expired? Try logging into my.vt.edu. If you cannot log in there, then your PID password has likely expired and needs to be changed. (Contact 4Help for help with this issue.)
- Are you on-campus? If you are not on-campus, you will need to connect to the Virginia Tech VPN in order to access ARC's systems.
- Is the hostname correct? Please check the name of the login node(s) for the system you are trying to access. For example, for login to Cascades, the hostname is not cascades.arc.vt.edu but rather cascades1.arc.vt.edu or cascades2.arc.vt.edu.
- Do you have an account? You must request an account on a system before you can log in.
- Is there a maintenance outage? ARC systems are occasionally taken offline for maintenance purposes. Users are typically notified via email well ahead of maintenance outages.
- If you are a Windows user, are you using PuTTY? Please make sure that you have downloaded and are using PuTTY if you are trying to log in from a Windows machine.
If you have checked all of the above and are still not sure why you cannot log in, please submit a help ticket.
Question: How much does it cost to use ARC's systems?
Answer: ARC's systems are free, though privileged access can be purchased through the Investment Program. For most systems, this means that Virginia Tech researchers can simply request an account to get access. Use of the clusters (submitting and running jobs) does require an approved allocation, which in turn requires some basic information to be provided, but getting an allocation does not require monetary payment of any kind. More information on how to get started with ARC is here. More information on the Investment Program is here.
Question: Why is my job not starting?
Cascades, Dragonstooth, or Huckleberry (clusters with Slurm scheduler)
Answer: Typically the squeue command will provide the reason a job isn't starting. By default it shows information about every job in the queue, so it may be helpful to query only your own jobs with squeue -u <your pid> or only a particular job with squeue -j <jobid>. For example:
[brownm12@calogin2 ~]$ squeue -u brownm12
 JOBID PARTITION NAME     USER ST TIME NODES NODELIST(REASON)
310926  normal_q bash brownm12 PD 0:00    64 (PartitionNodeLimit)
This job was submitted with a request for 64 nodes, which exceeds the per-job node limit on the partition (hence the reason PartitionNodeLimit).
Other common reasons:
|Reason|Meaning|
|Priority, Resources|These two are the most common reasons given for a job being pending (PD). They simply mean that the job is waiting in the queue for resources to become available.|
|QOSMaxJobsPerUserLimit|The QOS applied to the partition restricts users to a maximum number of concurrent running jobs. As your jobs complete, queued jobs will be allowed to start.|
|QOSMaxCpuMinutesPerJobLimit|The QOS applied to the partition restricts jobs to a maximum number of CPU-minutes. To run, the job must request either fewer CPUs or less time.|
|PartitionTimeLimit|The requested time limit exceeds the maximum for the partition.|
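The reason can also be pulled out of the squeue output with a short filter. The following is a minimal sketch, not an ARC-provided tool: the function name pending_reasons is hypothetical, and it assumes the default squeue column layout shown in the example above (state in the fifth column, reason in the last) and job names that contain no spaces.

```shell
# Hypothetical helper: list pending jobs and the reason each one is waiting.
# Assumes the default squeue columns shown in the example above:
#   JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
pending_reasons() {
    awk 'NR > 1 && $5 == "PD" {
        reason = $NF
        gsub(/[()]/, "", reason)   # strip the parentheses around the reason
        print $1, reason
    }'
}

# Sample input copied from the example above. On a cluster you would pipe
# real output instead:  squeue -u <your pid> | pending_reasons
sample='JOBID PARTITION NAME USER ST TIME NODES NODELIST(REASON)
310926 normal_q bash brownm12 PD 0:00 64 (PartitionNodeLimit)'

printf '%s\n' "$sample" | pending_reasons   # prints: 310926 PartitionNodeLimit
```

Running jobs are skipped because their state is R rather than PD, so only jobs that are still waiting appear in the result.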
Newriver only (Torque/Moab scheduler)
Answer: Typically the command checkjob -v <job id> will provide the reason. There are many different reasons that a submitted job will not start, though they often fall into one of the following categories:
- The job is missing header information. Job submissions must include a set of flags for the scheduler; if your job does not include one of these flags (or includes incorrect information), it may wind up stuck or rejected outright. Sample job submission scripts are included in the Examples section on each system page; compare your submitted job with the #PBS lines in that script to ensure that you have all of the required information. Examples of errors caused by missing or incorrect flags:
- If the #PBS -q <queue name> flag is missing, it will produce the error message "qsub: Unknown queue MSG=cannot locate queue".
- If the #PBS -W group_list=<group name> flag is missing, it will produce the error message "qsub: Unauthorized Request MSG=group ACL is not satisfied".
- If the #PBS -A flag is incorrect, it will produce the error message "Defer:InvalidAccount" or "Insufficient funds: There are no valid allocations against which to make the lien."
- The job violates system policies. Each system has a set of policies that govern how users may use its resources (e.g., how long jobs may run or how many cores a given user may consume at one time). These policies are described in the Policies section on each system page. If your job violates one of these policies, it may wind up stuck and never run (look for "job violates constraints" or "job violates...limit" in the checkjob output) or it may be rejected with an immediate error message. Please ensure that your job is within the policies for the given system and queue that you are trying to use.
- The required resources are not available (yet). When you submit a job, you request a set of resources. This typically includes the number of nodes and number of cores that you require, but it may also include information about the type of those resources. (BlueRidge and Ithaca, for example, each have some "highmem" nodes with twice as much memory as normal nodes.) If those resources are not available at the time that you submit your job (that is, they are being used by another user), then your job will remain in the queue until the resources that you have requested are available for the amount of time that you require.
- The system will soon have an outage. ARC periodically takes systems offline to perform maintenance. When this occurs, all system resources will be reserved starting on the date and time that the maintenance will begin. So if your job is scheduled to run for 100 hours (4 days, 4 hours) and is submitted four days before the start of a maintenance outage, your job will remain in the queue until after the maintenance is complete. (Note that ARC will occasionally place non-maintenance reservations on a subset of a system's resources, such as for training classes.) The command showres will, on most systems, show information about the size and date/time of any reservations scheduled on a system. When a maintenance outage on a given system is scheduled, ARC sends an email notification to all users of that system to notify them of when the system will be unavailable.
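The arithmetic behind the maintenance case can be sketched in a few lines of shell. This is only an illustration of the reasoning: the function name fits_before_outage is hypothetical, and it assumes that the job could otherwise start immediately and that the scheduler compares the full requested walltime with the time remaining before the reservation.

```shell
# Hypothetical helper: succeeds if a job requesting walltime_hours can finish
# before a reservation that begins hours_until_outage hours from now.
fits_before_outage() {
    walltime_hours=$1
    hours_until_outage=$2
    [ "$walltime_hours" -le "$hours_until_outage" ]
}

# The example from the text: a 100-hour job submitted four days (96 hours)
# before the start of the outage.
if fits_before_outage 100 $((4 * 24)); then
    echo "job can run before the outage"
else
    echo "job will wait until after the outage"
fi
# prints: job will wait until after the outage
```

Requesting a shorter walltime (96 hours or less in this example) would allow the job to be scheduled before the outage, provided the other resources are free.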
Question: When will my job start?
Answer: The command showstart <job id> will provide the system's best guess as to when the job will start. If showstart returns something like "Estimated Rsv based start in INFINITY", then either the system is about to undergo maintenance or something is wrong with the job. See "Why is my job not starting?" for more information.
Question: How do I submit an interactive job?
Answer: A user can request an interactive session on a compute node (e.g., for debugging purposes) using interact, a wrapper on qsub -I. By default, this script will request one full node for one hour. If an allocation is provided, the request typically goes to dev_q; if not, it goes to open_q. The request can be customized with standard job submission flags used by qsub. Examples include:
- Request two hours:
interact -l walltime=2:00:00
- Request two hours with allocation "yourallocation":
interact -l walltime=2:00:00 -A yourallocation
- Request two hours on one core and one GPU with allocation "yourallocation":
interact -lnodes=1:ppn=1:gpus=1 -l walltime=2:00:00 -A yourallocation
(The flags for requesting resources may vary from system to system; please see the documentation for the system that you want to use.)
Once the job has been submitted, the system will respond with "qsub: waiting for job 14156.master.cluster to start". Once the requested resources are available, the system will say "qsub: job 14156.master.cluster ready" and then show a prompt on a compute node. You can issue commands on the compute node as you would on the login node or any other system. To exit the interactive session, simply type exit.
Note: As with any other job, if all resources on the requested queue are being used by running jobs at the time an interactive job is submitted, it may take some time for the interactive job to start.
Question: How do I change a job's stack size limit?
Answer: If your code needs a larger stack size, the limit has to be raised in the shell that starts each process. To do this when launching your program across multiple nodes, use the following command:
mpirun -bind-to-core -np $PBS_NP /bin/bash -c "ulimit -s unlimited; ./your_program"
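The same idea, setting the limit in a child shell immediately before the program starts, can be wrapped in a small function. This is a sketch rather than an ARC-provided tool: the name with_stack_limit is hypothetical, and raising the limit to "unlimited" only works where the hard limit on the node allows it.

```shell
# Hypothetical helper: run a command with a given stack limit (in KB, or the
# word "unlimited") without changing the limit of the calling shell.
with_stack_limit() {
    limit=$1
    shift
    # The parentheses start a subshell, so ulimit only affects the command.
    ( ulimit -s "$limit" && exec "$@" )
}

# Demonstration with a small limit. For the case above you would pass
# "unlimited" and your own program instead.
with_stack_limit 1024 sh -c 'ulimit -s'   # prints the limit seen by the child: 1024
```

Because the limit is changed inside a subshell, the login or job shell keeps its original setting, which can be confirmed by running ulimit -s before and after the call.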
Question: How do I check my running job's resource usage?
Answer: The command checkjob -v <job id> provides some basic resource utilization information for running jobs. For example, this job has allocated 12 cores on each of three nodes and is using them almost perfectly (a load average of 11.64):
Req TaskCount: 36 Partition: NORMAL
Opsys: --- Arch: --- Features: standard
Utilized Resources Per Task: PROCS: 0.33 MEM: 214M SWAP: 28G
Avg Util Resources Per Task: PROCS: 0.33
Max Util Resources Per Task: PROCS: 0.33 MEM: 214M SWAP: 28G
Average Utilized Memory: 213.33 MB
Average Utilized Procs: 11.64
TasksPerNode: 12 NodeCount: 3
Allocated Nodes:
If your job is occupying one or more entire nodes and you would like node-by-node information, the jobload command will report core and memory usage for each node of a given job. Example output is:
[jkrometi@brlogin2 ~]$ jobload 123456
Basic job information:
Job ID                 Username Queue    Jobname SessID NDS TSK Memory Time     S Time
123456.master.cluster  johndoe  normal_q job.sh  111452 4   64  --     60:00:00 R 07:31:13
Job is running on nodes:
br190 br255 br332 br355
Node utilization is:
node  cores load pct   mem    used   pct
br190 16    16.7 104.3 62.9GB 7.1GB  11.2
br255 16    16.8 104.8 62.9GB 10.2GB 16.3
br332 16    16.6 103.4 62.8GB 7.5GB  12.0
br355 16    16.4 102.2 62.8GB 7.3GB  11.7
This BlueRidge job is using all 16 cores and 7-10GB memory on each of its four nodes. If more information is required about a given node, the status line from the command pbsnodes <node id> can provide it.
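For jobs on many nodes, the node table can be filtered for nodes that are doing less work than expected. The sketch below is an illustration only: the function name underused_nodes is hypothetical, and it assumes node rows laid out exactly as in the sample jobload output above (node, cores, load, load percentage, memory, memory used, memory percentage).

```shell
# Hypothetical helper: print nodes whose load percentage is below a threshold.
# Assumes node rows laid out as in the sample output above:
#   node cores load pct mem used pct
underused_nodes() {
    awk -v min="$1" 'NF == 7 && $2 ~ /^[0-9]+$/ && $4 + 0 < min + 0 { print $1, $4 }'
}

# Node rows copied from the sample above. On a cluster you would instead run
#   jobload <job id> | underused_nodes 90
sample='node cores load pct mem used pct
br190 16 16.7 104.3 62.9GB 7.1GB 11.2
br255 16 16.8 104.8 62.9GB 10.2GB 16.3
br332 16 16.6 103.4 62.8GB 7.5GB 12.0
br355 16 16.4 102.2 62.8GB 7.3GB 11.7'

# The threshold 103 is chosen only so that the sample produces a result.
printf '%s\n' "$sample" | underused_nodes 103   # prints: br355 102.2
```

With a more realistic threshold such as 90, the sample job produces no output, which matches the observation that all four nodes are fully used.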
Question: I need a software package for my research. Can you install it for me?
Answer: At any given time, ARC staff is trying to balance many high-priority tasks to improve, refine, or augment our systems. Unfortunately, this means that we typically cannot install all or even most of the software that our users require to do their research. As a result, the set of applications on each system does not typically change unless a new software package is requested by a large number of users. However, users are welcome to install software that they require for their research in their Home directory. This generally involves copying the source code into your home directory and then following the directions provided with the software to build that source code into an executable. If the vendor does not provide source code and just provides an executable (which is true of some commercial software packages), then you need to select the right executable for the system hardware and copy that into your home directory.
Question: What does a "Disk quota exceeded" error mean?
Answer: If you run jobs out of your Home directory, you may occasionally encounter an error that looks like the following:
Unable to copy file /opt/torque/4.1.2/spool/spool/25111.hosched.arc.vt.edu.hpc.OU to /home/yourpid/CCl3.o25111
*** error from copy
/bin/cp: cannot create regular file `/home/yourpid/CCl3.o25111': Disk quota exceeded
*** end error output
Output retained on that host in: /opt/torque/4.1.2/spool/undelivered/25111.hosched.arc.vt.edu.hpc.OU
This means that your Home directory has exceeded the maximum allowable size. You will need to reduce the size of your Home directory in order to run jobs successfully again. (This is one reason that we encourage running jobs out of Work or Scratch.)
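One way to decide what to remove or move is to list the largest items in the directory. The following is a minimal sketch using the standard du, sort, and head utilities; the function name largest_entries is hypothetical, and it does not consult the quota system itself, so the sizes are only a guide.

```shell
# Hypothetical helper: list the n largest entries directly under a directory,
# largest first, with sizes in kilobytes. Hidden files are not included.
largest_entries() {
    dir=$1
    n=${2:-10}
    du -sk -- "$dir"/* 2>/dev/null | sort -rn | head -n "$n"
}

# Example: the ten largest files and directories in your Home directory.
if [ -d "${HOME:-}" ]; then
    largest_entries "$HOME" 10
fi
```

Large results that are still needed can then be moved to Work or Scratch rather than deleted.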
Question: How do I add a user to an allocation?
Answer: To add a user to an existing allocation, follow these steps:
- Go to your allocations page. (You may be prompted for a password.)
- You will see a list of your allocations. Click on the one you want to modify, then add the user to it.
- The page will refresh and the user's PID should be included in the 'Users' column. They are now added to the project.
- Once the user has been added to the allocation, they will be able to run jobs against it. You can check that they have been added successfully using the quota command at the command line.
Question: How do I attach to my process for debugging?
Answer: Debuggers like gdb make software development much more efficient. Attaching to a process for debugging requires that the targeted process and the user's current process be in the same group. Processes launched through the scheduler are in the group specified by the group_list qsub parameter. When you ssh to a node to debug a running job, the sg command can be used to change your current process to the same group as the target process. For example:
Start a job through the scheduler.
[user@brlogin1 ~]$ gcc -g -o loop loop.c
$ qsub -q normal_q -W group_list=blueridge -A yourAllocation loopSubmissionScript
Using checkjob -v we determine the job is running on br296. Ssh to the node.
[user@brlogin1 ~]$ ssh br296
$ pidof loop
4028
Confirm the current process and target process are in different groups
[user@br296 ~]$ id -g -n user
user
$ ps -o "pid,group" 4028
PID GROUP
4028 blueridge
Attach to the Process
[user@br296 ~]$ sg - blueridge -c gdb
(gdb) attach 4028
Question: How can I submit a job that depends on the completion of another job?
Answer: Sometimes it may be useful to split one large computation into multiple jobs (e.g., due to queue limits), but submit those jobs all at once. Jobs can be made dependent on each other using the -Wdepend=afterok:<job id> flag to qsub, which holds a job until the named job has completed successfully. For example, here we submit three jobs, each of which depends on the preceding one:
[johndoe@brlogin2 ~]$ qsub test.sh
71804.master.cluster
$ qsub -Wdepend=afterok:71804 test.sh
71805.master.cluster
$ qsub -Wdepend=afterok:71805 test.sh
71806.master.cluster
The first job starts right away, but the second doesn't start until the first one finishes and the third job doesn't start until the second one finishes. This allows the user to split their job up into multiple pieces, submit them all right away, and then just monitor them as they run one after the other to completion.
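The chain above can also be submitted from a script by capturing the job ID that qsub prints. The sketch below is one possible way to do this, not an ARC-provided tool: the helper name depend_flag is hypothetical, test.sh is the script from the example above, and the submission loop is guarded so that it does nothing on a machine without qsub.

```shell
# Hypothetical helper: turn the ID printed by qsub (e.g. 71804.master.cluster)
# into the dependency flag for the next submission.
depend_flag() {
    job_id=$1
    echo "-Wdepend=afterok:${job_id%%.*}"   # keep only the numeric part
}

depend_flag 71804.master.cluster    # prints: -Wdepend=afterok:71804

# Submit test.sh three times, each job depending on the previous one.
# Guarded so that nothing is submitted on a machine without qsub.
if command -v qsub >/dev/null 2>&1; then
    prev=$(qsub test.sh)
    for i in 2 3; do
        prev=$(qsub "$(depend_flag "$prev")" test.sh)
    done
fi
```

Because afterok requires the earlier job to finish successfully, a failure in one job prevents the later jobs in the chain from starting.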
Question: How can I run multiple serial tasks inside one job?
Answer: Many ARC systems are "dedicated node", meaning that even a serial job will reserve a full node. For this reason, users with serial (sequential) programs are encouraged to "package" multiple serial tasks into a single job submitted to the scheduler. This can be done with third-party tools (GNU Parallel is a good one) or using a loop within the job submission script. (A similar structure can be used to run multiple short, parallel tasks inside a job.) The basic structure is to loop through the number of tasks using while or for, start each task in the background using the & operator, and then use the wait command to wait for the tasks to finish:
# Define variables
numtasks=16
np=1
# Loop through numtasks tasks
while [ $np -le $numtasks ]
do
  # Run the task in the background with input and output depending on the variable np
  ./a.out $np > $np.out &
  # Increment task counter
  np=$((np+1))
done
# Wait for all of the tasks to finish
wait
To ensure that the same program (with the same inputs) isn't being run multiple times, users should make sure that the loop variable (np, above) is used to specify input files or parameters. A complete submission script for running 16 tasks on a BlueRidge node, with each task using a separate tarball as input, is here.
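A bare wait does not report which tasks failed. One possible variation on the loop above records each background process ID and waits on them one at a time so that failures can be counted. This is a sketch under stated assumptions: run_task is a hypothetical stand-in for ./a.out, made to fail for task 3 so that the example has something to report.

```shell
# Stand-in for ./a.out (hypothetical): task 3 fails, all others succeed.
run_task() {
    [ "$1" -ne 3 ]
}

numtasks=4
np=1
pids=""
while [ "$np" -le "$numtasks" ]; do
    run_task "$np" &          # start the task in the background
    pids="$pids $!"           # remember its process ID
    np=$((np + 1))
done

# Wait for each task in turn and count the ones that failed.
failed=0
for pid in $pids; do
    wait "$pid" || failed=$((failed + 1))
done
echo "$failed of $numtasks tasks failed"   # prints: 1 of 4 tasks failed
```

In a real job script the count could be used to set the job's exit status, so that a dependent job submitted with afterok does not start after a partial failure.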
Question: How can I run multiple short, parallel tasks inside one job?
Answer: Sometimes users have a parallel application that runs quickly, but that they need to run many times. In this case, it may be useful to package multiple parallel runs into a single job. This can be done using a loop within the job submission script. An example structure:
# Specify the list of tasks
tasklist="task1 task2 task3"
# Loop through the tasks
for tsk in $tasklist; do
  # run the task $tsk
  mpirun -np $PBS_NP ./a.out $tsk
done
To ensure that the same program (with the same inputs) isn't being run multiple times, users should make sure that the loop variable (tsk, above) is used to specify input files or parameters. Note that, unlike when running multiple serial tasks at once, in this case each task will not start until the previous one has finished.
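Since the tasks run one after another, it can be useful to stop as soon as one of them fails rather than spend the remaining walltime on later tasks. The following sketch shows one way to do that; the function name run_tasklist is hypothetical, and echo stands in for the real launcher so that the example runs anywhere.

```shell
# Hypothetical helper: run a launcher command once per task, in order, and
# stop at the first task that fails.
# Usage: run_tasklist "task1 task2 task3" launcher [launcher args...]
run_tasklist() {
    tasklist=$1
    shift
    for tsk in $tasklist; do
        if ! "$@" "$tsk"; then
            echo "task $tsk failed; skipping the remaining tasks" >&2
            return 1
        fi
    done
}

# Demonstration with echo standing in for the real launcher. In a job script
# the call would be: run_tasklist "$tasklist" mpirun -np $PBS_NP ./a.out
run_tasklist "task1 task2 task3" echo "would run"
```

Whether to stop at the first failure or continue with the remaining tasks is a design choice; if the tasks are independent, replacing return 1 with a counter would let the loop run to the end.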