====== Torque Common Commands useful for Managers and Operators ======

Information about command usage can be obtained using the links below or via the man or info commands.

Example:

  info qdel
  man qdel

===== Commands: =====

== qmgr ==

[[http://docs.adaptivecomputing.com/torque/4-1-4/Content/topics/commands/qmgr.htm|Qmgr]] is the main application used to manage PBS Torque. It is used to create queues, modify queue settings, and print the configuration. Please review the man page or the link for information about this command. Any changes implemented using this command are permanent and are written to the running configuration file.

The two commands below are equivalent; the second uses abbreviated keywords. Both print the configuration of the batch queue.

  user@polyp1:~$ qmgr -c "print queue batch"
  user@polyp1:~$ qmgr -c "p q batch"
  #
  # Create queues and set their attributes.
  #
  #
  # Create and define queue batch
  #
  create queue batch
  set queue batch queue_type = Execution
  set queue batch resources_max.walltime = 01:00:00
  set queue batch resources_default.mem = 2gb
  set queue batch resources_default.ncpus = 1
  set queue batch resources_default.nodes = 1
  set queue batch resources_default.pmem = 2gb
  set queue batch resources_default.vmem = 2gb
  set queue batch resources_default.walltime = 01:00:00
  set queue batch enabled = True
  set queue batch started = True

== qstat ==

[[http://docs.adaptivecomputing.com/torque/4-1-4/Content/topics/commands/qstat.htm|qstat]] - Shows the status of jobs and queues in the queuing system.

  mjm519@polyp1:~$ qstat -q
  server: polyp1
  Queue            Memory CPU Time Walltime Node Run Que Lm State
  ---------------- ------ -------- -------- ---- --- --- -- -----
  MOSEK            --     --       --       --     0   0 48 E R
  AMPL             --     --       --       --     0   0  8 E R
  long             --     --       72:00:00 --     0   0 30 E R
  gpu              --     --       --       --     0   0  4 E R
  verylong         --     --       240:00:0 --     0   0 20 E R
  medium           --     --       04:00:00 --     0   0 10 E R
  coraverylong     --     --       --       --     0   0 -- E R
  special          --     --       72:00:00 --     0   0 24 E R
  batch            --     --       01:00:00 --     0   0 -- E R
  short            --     --       02:00:00 --     0   0 -- E R
  urgent           --     --       --       --     0   0 -- D S
  background       --     --       --       --     0   0 -- E R
  mediumlong       --     --       24:00:00 --     0   0 60 E R
                                               ----- -----
                                                   0     0

  mjm519@polyp1:~$ qstat -Q
  Queue            Max  Tot Ena Str Que Run Hld Wat Trn Ext T Cpt
  ---------------- --- ----  --  -- --- --- --- --- --- --- - ---
  MOSEK             48    0 yes yes   0   0   0   0   0   0 E   0
  AMPL               8    0 yes yes   0   0   0   0   0   0 E   0
  long              30    0 yes yes   0   0   0   0   0   0 E   0
  gpu                4    0 yes yes   0   0   0   0   0   0 E   0
  verylong          20    0 yes yes   0   0   0   0   0   0 E   0
  medium           100    0 yes yes   0   0   0   0   0   0 E   0
  coraverylong       0    0  no  no   0   0   0   0   0   0 E   0
  special           24    0 yes yes   0   0   0   0   0   0 E   0
  batch              0    0 yes yes   0   0   0   0   0   0 E   0
  short              0    0 yes yes   0   0   0   0   0   0 E   0
  urgent             0    0  no  no   0   0   0   0   0   0 E   0
  background         0    0 yes yes   0   0   0   0   0   0 E   0
  mediumlong        60    0 yes yes   0   0   0   0   0   0 E   0

== qalter ==

[[http://docs.adaptivecomputing.com/torque/4-1-4/Content/topics/commands/qalter.htm|qalter]] - Alter a non-running queued job.

  qalter

The line below changes the node specification for job id 1398668. The original job submission did not specify a node; this command assigns a specific node while keeping the same ppn (processors per node).

  qalter 1398668 -l nodes=polyp14:ppn=1

== qdel ==

[[http://docs.adaptivecomputing.com/torque/4-1-4/Content/topics/commands/qdel.htm|qdel]] - Delete a batch job.

  qdel
  qdel 1186460

== qhold ==

[[http://docs.adaptivecomputing.com/torque/4-1-4/Content/topics/commands/qhold.htm|qhold]] - Place a non-running job on hold so that it will not run. If the job is running and checkpointing is not enabled, the job will terminate.
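
As a minimal sketch of placing a hold on a queued job and releasing it later: the helper names ''run'', ''hold_job'', and ''release_job'' and the ''DRY_RUN'' variable are hypothetical conveniences, not part of Torque. ''qrls'' is the Torque command that releases a hold. The job id reuses the one from the qdel example purely as a placeholder.

```shell
# Hypothetical helpers (not part of Torque) wrapping qhold and qrls.
# With DRY_RUN=1 the command is printed instead of executed, which
# allows checking what would be sent to the server first.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$@"
    else
        "$@"
    fi
}

# Place a hold on a queued job so the scheduler will not start it.
hold_job() {
    run qhold "$1"
}

# Release the hold so the job becomes eligible to run again.
release_job() {
    run qrls "$1"
}
```

For example, ''DRY_RUN=1 hold_job 1186460'' prints ''qhold 1186460'' without contacting the server; dropping ''DRY_RUN'' runs the command for real.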

== qmove ==

[[http://docs.adaptivecomputing.com/torque/4-1-4/Content/topics/commands/qmove.htm|qmove]] - Move a submitted job to another queue.

== qrun ==

[[http://docs.adaptivecomputing.com/torque/4-1-4/Content/topics/commands/qrun.htm|qrun]] - Command used to take a non-running job and make it run. When I use this, I look at the running jobs first and then use the qalter command to specify the node I want the job to run on (based on available resources).

== qstart ==

[[https://www.mankier.com/8/qstart|qstart]] - The qstart command directs the Torque server to process batch jobs. It can be applied to the entire server or to a single queue.

== qstop ==

[[https://www.mankier.com/8/qstop|qstop]] - The qstop command directs the Torque server to stop processing batch jobs. It can be applied to the entire server or to just one queue.

== qenable ==

[[https://www.mankier.com/8/qenable|qenable]] - The qenable command directs the destination (server / queue) to accept jobs for processing.

== qdisable ==

[[https://www.mankier.com/8/qdisable|qdisable]] - The qdisable command directs the destination (server / queue) to stop accepting jobs for processing.

== pbsnodes ==

[[https://docs.adaptivecomputing.com/torque/4-1-3/Content/topics/commands/pbsnodes.htm|pbsnodes]] - This command can be used to enable and disable nodes in the queuing system.

  manager@polyp1:~$ pbsnodes -l
  polyp5               down
  polyp14              offline
  polyp15              offline
  polyp30              offline

Take a node offline:

  manager@polyp1:~$ pbsnodes -o polyp15

Put a node back online:

  manager@polyp1:~$ pbsnodes -c polyp15

Current normal output from pbsnodes -l:

  manager@polyp1:~$ pbsnodes -l
  polyp5               down,offline
  polyp30              offline
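
When checking which nodes still need to be brought back online, it can help to filter the ''pbsnodes -l'' listing by state. The following is a rough sketch, assuming each line has the form shown above (node name, then a comma-separated state list); the function name ''nodes_in_state'' is hypothetical and not part of Torque.

```shell
# Hypothetical helper: print the names of nodes whose state list
# contains the requested state. Reads "name state[,state...]" lines,
# as printed by `pbsnodes -l`, on standard input.
nodes_in_state() {
    awk -v want="$1" '
        NF >= 2 {
            # A node may carry several states, e.g. "down,offline".
            n = split($2, states, ",")
            for (i = 1; i <= n; i++) {
                if (states[i] == want) {
                    print $1
                    break
                }
            }
        }'
}
```

A typical call would be ''pbsnodes -l | nodes_in_state offline''. Matching on the individual comma-separated states means a node listed as ''down,offline'' is reported for either state.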