tutorial:torque

Differences

This shows you the differences between two versions of the page.

tutorial:torque [2016/10/07 10:11]
sertalpbilal
tutorial:torque [2016/11/10 14:59]
sertalpbilal Change of Order
Line 2: Line 2:
  
 TORQUE provides control over batch jobs and distributed computing resources. It is an advanced open-source product based on the original PBS project and incorporates the best of both community and professional development. It incorporates significant advances in the areas of scalability, reliability, and functionality and is currently in use at tens of thousands of leading government, academic, and commercial sites throughout the world. TORQUE may be freely used, modified, and distributed under the constraints of the included license.
 +
 +
 +
 +===== Prerequisite =====
 +To retrieve your output and error results from Torque, you need a password-less connection between nodes. If you have not set this up yet, execute the following commands. They create a public and private key pair so that a node can transfer a file to your home folder without asking for a password.
 +After connecting to polyps, enter:
 +
 +<code bash>
 +ssh-keygen -N ""
 +</code>
 +
 +Press ENTER at every prompt, then run the following commands:
 +
 +<code bash>
 +touch ~/.ssh/authorized_keys2
 +chmod 600 ~/.ssh/authorized_keys2
 +cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys2
 +</code>
 +Now, you will get the error log and output log files for your jobs.
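
A quick way to confirm the key setup above is to check that the public key line actually appears in the authorized-keys file. The following is only a sketch: the function name ''key_is_authorized'' is hypothetical, and the default paths assume the ''id_rsa.pub'' and ''authorized_keys2'' file names used in the commands above.

```shell
# Hypothetical helper: succeed only if the public key in $1 appears
# as a whole line in the authorized-keys file $2.
# Defaults assume the file names used in the commands above.
key_is_authorized() {
    pub=${1:-$HOME/.ssh/id_rsa.pub}
    auth=${2:-$HOME/.ssh/authorized_keys2}
    # -F fixed string, -x whole-line match, -f read the pattern from the key file
    [ -r "$pub" ] && [ -r "$auth" ] && grep -qxFf "$pub" "$auth"
}
```

Calling ''key_is_authorized && echo "key installed"'' on the login node reports whether the append step took effect. It does not test the SSH connection itself.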
 +
 +
 +
  
 ===== Hardware =====
Line 11: Line 33:
  
 ===== Submitting Jobs =====
- 
-Check [[#prerequisite|prerequisite]] before using Torque. 
  
 Jobs can be submitted either with a submission file or directly from the command line. First we explain how it is done, and then we discuss the options.
Line 79: Line 99:
 Queue            Memory CPU Time Walltime Node  Run Que Lm  State
 ---------------- ------ -------- -------- ----  --- --- --  -----
+gpu                --      --       --      --    0   0 --   E R
 medium             --      --       --      --    0   0 --   E R
 short              --      --       --      --    0   0 --   E R
 long               --      --       --      --    0   0 --   E R
 batch              --      --       --      --    0   0 --   E R
-verylong           --      --       --      --    0   --   E R
+verylong           --      --       --      --    0   50   E R
-                                               ----- -----
+AMPL               --      --       --      --      0 10   E R
-                                                       0
+MOSEK              --      --       --      --      50   E R
 </code>
 +
 +If you want to use AMPL or MOSEK, you must submit to the queue of the same name (''AMPL'' or ''MOSEK''), because we have a limited number of licenses for them.
 +
 +
  
 You can see the queue limits with the command ''qstat -f -Q''.
Line 156: Line 182:
 </code>
  
-Allocating more than one CPU under PBS can be done in a number of ways, using the -l flag and the following resource descriptions:
+Allocating more than one CPU under PBS can be done in a number of ways, using the ''-l'' flag and the following resource descriptions:
  
   * nodes - specifies the number of separate nodes that should be allocated
Line 162: Line 188:
   * ppn - how many processes to allocate for each node
  
-The allocation made by pbs will be reflected in the contents of the nodefile, which can be accessed via the $PBS_NODEFILE environment variable.
+The allocation made by pbs will be reflected in the contents of the nodefile, which can be accessed via the ''$PBS_NODEFILE'' environment variable.
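
To see what an allocation looks like, the nodefile can be summarized inside a job script. This is a minimal sketch that assumes the usual nodefile layout of one hostname per line for each allocated slot. The function name ''slots_per_node'' is hypothetical.

```shell
# Hypothetical helper: print "hostname slots" for each node listed in
# the nodefile $1, assuming one hostname per line per allocated slot.
slots_per_node() {
    # sort groups identical hostnames, uniq -c counts them,
    # awk swaps the columns so the hostname comes first
    sort "$1" | uniq -c | awk '{print $2, $1}'
}
```

Inside a job, ''slots_per_node "$PBS_NODEFILE"'' prints one line per node, which makes it easy to check that a ''nodes'' and ''ppn'' request produced the layout you expected.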
  
 The difference between ncpus and ppn is a bit subtle. ppn is used when you actually want to allocate multiple processes per node. ncpus is used to qualify the sort of nodes you want, and only secondarily to allocate multiple slots on a CPU. Some examples should help.
Line 196: Line 222:
 c2
 </code>
 +
 +===== Mass Operations =====
 +
 +==== Submitting multiple jobs ====
 +An easy way to submit multiple jobs via PBS is to use a batch script. Suppose we want to pass every file with the MPS extension in a folder to our solver. We can write a PBS script such as
 +<code bash submit.pbs>
 +cd /home/sec312/
 +/usr/local/cplex/bin/x86-64_linux/cplex ${FILENAME}
 +</code>
 +and a BASH script:
 +<code bash bashloop.sh>
 +# Submit one job per MPS file, quoting the name so unusual characters survive
 +for f in dataset/*.mps
 +do
 +    qsub -q batch -v FILENAME="$f" submit.pbs
 +done
 +</code>
 +Here, the ''-v'' option passes the variables that we define (''FILENAME'' in our example) to the job as environment variables, so the PBS script can read them. You can pass several variables by separating them with commas. Do not put spaces between them.
 +
 +With these two files in place, running
 +<code>
 +./bashloop.sh
 +</code>
 +submits all jobs to Torque.
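
Before submitting a large batch, it can help to print the commands the loop would run. The sketch below does that and also shows the comma-separated form of the ''-v'' list. The function names ''join_vars'' and ''print_qsub_cmds'' and the extra variable ''THREADS'' are hypothetical. The ''batch'' queue and ''submit.pbs'' are taken from the example above.

```shell
# Join all arguments with commas and no spaces, the form -v expects.
join_vars() {
    ( IFS=,; printf '%s\n' "$*" )
}

# Print (do not run) one qsub command per .mps file in directory $1.
# Any further arguments are extra NAME=value pairs added to the -v list.
print_qsub_cmds() {
    dir=$1
    shift
    for f in "$dir"/*.mps; do
        [ -e "$f" ] || continue   # skip when the glob matches nothing
        printf 'qsub -q batch -v %s submit.pbs\n' \
            "$(join_vars "FILENAME=$f" "$@")"
    done
}
```

For example, ''print_qsub_cmds dataset THREADS=4'' lists the submissions without touching the queue. Once the output looks right, run the real loop.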
 +
 +==== Cancelling all jobs ====
 +You can call
 +<code bash>
 +qselect -u <username> -s R | xargs qdel
 +</code>
 +to cancel all of your running jobs.
 +
 +<code bash>
 +qselect -u <username> | xargs qdel
 +</code>
 +will cancel all of your jobs (both running and queued).
 +
  
 ===== Advanced =====
Line 221: Line 283:
   * **PBS_WALLTIME** (the walltime requested by the user or default walltime allotted by the scheduler)
  
- 
-===== Prerequisite ===== 
-In order to extract your output and error results in Torque, you need to have password-less connection between nodes. If you have not set it once, execute the following commands. These commands create a public and private key so that when a node want to transfer a file to your home folder, it does not require the password. 
-After connecting to polyps enter: 
- 
-<code> 
-ssh-keygen -N "" 
-</code> 
- 
-Then just press ENTER for any question. After that type the following commands: 
- 
-<code> 
-touch ~/.ssh/authorized_keys2 
-chmod 600 ~/.ssh/authorized_keys2 
-cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys2 
-</code> 
-Now, you will get the error log and output log files for your jobs. 
  
  
tutorial/torque.txt · Last modified: 2024/02/28 13:12 by mjm519