This is an old revision of the document!


What is CONDOR

CONDOR is a job manager to schedule computational jobs.

Check the following link for an introduction to CONDOR.


Submitting A Single Job

To submit a job via CONDOR, you need to create a .sub file. This .sub file must name the program that you will execute (e.g., matlab, cplex, etc.) along with the arguments for the program (such as your file to be executed). It is an automated way to run programs.

A case study: Matlab

Suppose that we want to run MATLAB code on Polyps. Here is an example .sub file that submits the MATLAB file 'test.m' to CONDOR and saves the output of the code to 'out.txt', while CONDOR errors and logs are stored in 'error.txt' and 'log.txt', respectively.

# Specify the executable software, i.e. mathematica, mosek, etc
Executable = /usr/local/matlab/latest/bin/matlab
Universe   = vanilla
getenv     = true
# Specify argument file
arguments  = -nosplash -nodesktop -logfile test.log -r test
#request_cpus = 16
#request_memory = 2
# name output file 
output     = ./out.txt
# name error file
error      = ./error.txt
#name log file
log      = ./log.txt
transfer_executable = false
# Submit to queue
queue

After making sure all the files you specified exist in the correct directory, use

condor_submit myexp.sub

to submit the file to condor.
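
The same kind of submit file can also be produced from a script. The following is a minimal Python sketch, not the lab's official tooling: the helper name render_submit_file and its default file names are assumptions made for illustration, while the submit commands themselves are the ones used in the MATLAB example. Note the closing queue statement, which is what tells CONDOR to actually enqueue a job.

```python
def render_submit_file(executable, arguments, output="out.txt",
                       error="error.txt", log="log.txt"):
    """Return the text of a single-job CONDOR submit file (hypothetical helper)."""
    lines = [
        "Executable = %s" % executable,
        "Universe   = vanilla",
        "getenv     = true",
        "arguments  = %s" % arguments,
        "output     = %s" % output,
        "error      = %s" % error,
        "log        = %s" % log,
        "transfer_executable = false",
        # Without a queue statement no job is submitted.
        "queue",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Same executable and arguments as the MATLAB case study.
    print(render_submit_file(
        "/usr/local/matlab/latest/bin/matlab",
        "-nosplash -nodesktop -logfile test.log -r test",
    ))
```

The printed text can be saved as myexp.sub and submitted with condor_submit as described above.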

You can find the “Executable” path of a program by running which followed by the program name (for example, which matlab).
Frequently used executables on Polyps:
  • Matlab: /usr/local/matlab/latest/bin/matlab
  • Cplex: /usr/local/cplex/bin/x86-64_linux/cplex
  • Mosek: /usr/local/mosek/7.1/tools/platform/linux64x86/bin/mosek
  • Ampl: /usr/local/ampl/ampl

Submitting Multiple Jobs

There are multiple ways to submit a set of experiments (multiple jobs). Here we have two different ways to achieve the same result.

1. Via Bash Script

A simple example to demonstrate the use of nested loops in multiple-job submission. In this example, the executable “test” takes two integers as arguments (inputs), and “test” is in the same directory as the submit file.

We want to run “test” with two parameters i and j, as in the following nested loop:

for(int i=0; i< ilimit; ++i)
   for(int j=0; j< jlimit; ++j)
       test -i -j;

The following example demonstrates a two-layer nested loop with ilimit=5 and jlimit=10. Nested loops with more than two layers can be handled with the same logic.

getenv = TRUE
Universe = vanilla
## ilimit=5, jlimit=10
## N=(ilimit)*(jlimit)=50
## ilimit is implicitly included in "N"
jlimit = 10
N = 50
## I and J are recovered from the process ID (0 to 49)
I = $$([ $(Process) / $(jlimit) ])
J = $$([ $(Process) % $(jlimit) ])
Executable = test
arguments = "$(I) $(J)"
Output = test$(Process).txt
Error = test.err
Log = test.log
queue $(N)

Output Correspondence

## test -i=0 -j=0 -> test0.txt
## test -i=0 -j=1 -> test1.txt
## ......
## test -i=4 -j=9 -> test49.txt
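
The mapping from the flat process ID to the loop indices is plain integer division and remainder. The Python sketch below reproduces it so the correspondence can be checked before submitting; the function name process_to_indices is hypothetical, and the limits are the ilimit=5, jlimit=10 values from the example.

```python
def process_to_indices(process, jlimit):
    """Map a flat process ID to the (i, j) pair of the two-layer nested loop."""
    # i = process / jlimit (integer division), j = process % jlimit
    return divmod(process, jlimit)


if __name__ == "__main__":
    ilimit, jlimit = 5, 10
    for process in range(ilimit * jlimit):
        i, j = process_to_indices(process, jlimit)
        print("test %d %d -> test%d.txt" % (i, j, process))
```

For a third loop layer the same idea applies twice: divide by the product of the two inner limits to get the outermost index, then split the remainder as above.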

A simple example to demonstrate the use of variables in multiple-job submission. In this example, the executable “test” takes a single integer as its argument (input), and “test” is in the same directory as the submit file.

Executable “test” will be run 5 times with inputs 0 to 4, respectively. The corresponding output files are test0.txt to test4.txt. $(Process) is a macro that supplies the process ID, 0 to 4 in this case, so it can be used as an iteration counter.

## Executable "test" will be run 5 times with input 0 to 4
getenv = TRUE
Universe = vanilla
Executable = test
arguments = $(Process)
Output = test$(Process).txt
Error = test.err
Log = test.log
queue 5

## Executable "test" will be run with input 0 to 4
## A variable N is defined to specify the number of jobs
N = 5
Executable = test
arguments = $(Process)
Output = test$(Process).txt
Error = test.err
Log = test.log
queue $(N)

## Executable "test" will be run with input 5 to 9
## The corresponding output files are "test0.txt" to "test4.txt"
## Variable I is defined based on the macro $(Process)
I = $$([ $(Process) + 5 ])
Executable = test
arguments = $(I)
Output = test$(Process).txt
Error = test.err
Log = test.log
queue 5

2. Via Python (Script)

You can reuse the same executable, options, etc. and change only some of them to create new jobs. When you submit your file using condor_submit, all of the jobs are queued at once.

For your experiments, you can create a script to generate multiple jobs. Below, you will find an example Python script that generates multiple experiments with a changing argument.
# This script creates a condor submission file (condor.sub) that runs
# the same problem with different arguments
# Open file and write common part
cfile = open('condor.sub','w')
common_command = \
'Executable = ../test/portfolio \n\
Universe   = vanilla\n\
getenv     = true\n\
transfer_executable = false \n\n'
cfile.write(common_command)
# Loop over various values of an argument and create different output file for each
# Then put it in the queue
for a in range(5,8):
    run_command =  \
'arguments  = -a %d\n\
output     = out.%d.txt\n\
queue 1\n\n' %(a,a)
    cfile.write(run_command)
# Close the file so that everything is written to disk
cfile.close()

This script will generate the following condor file

Executable = ../test/portfolio
Universe   = vanilla
getenv     = true
transfer_executable = false
arguments  = -a 5
output     = out.5.txt
queue 1
arguments  = -a 6
output     = out.6.txt
queue 1
arguments  = -a 7
output     = out.7.txt
queue 1
Be sure to provide the output argument in your Condor submissions. Otherwise, you may not be able to see the results of your tasks.
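
A generated submit file can be checked for this before it is submitted. The sketch below is a simplified checker, not a full parser of the submit file syntax: it assumes one command per line in the key = value form shown above, and the function name queue_statements_without_output is hypothetical.

```python
def queue_statements_without_output(submit_text):
    """Return 1-based line numbers of queue statements reached with no output set."""
    output_set = False
    missing = []
    for number, raw in enumerate(submit_text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        lowered = line.lower()
        if lowered == "queue" or lowered.startswith("queue "):
            if not output_set:
                missing.append(number)
        elif "=" in line:
            key = line.split("=", 1)[0].strip().lower()
            if key == "output":
                output_set = True
    return missing


if __name__ == "__main__":
    with open("condor.sub") as handle:
        print(queue_statements_without_output(handle.read()))
```

An empty list means every queue statement had an output file in effect. The check does not detect several jobs sharing one output file, so distinct names such as out.5.txt, out.6.txt are still your responsibility.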

Checking Jobs

To check the job progress, use command

condor_q -global   #this checks all the jobs on condor
condor_q -run      #this checks all running jobs
condor_q userid    #this checks all jobs under specific user name
If you think your jobs are not being processed, you can investigate the reasons by running condor_q userid -analyze.
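
These checks can be wrapped in a script, for example to poll the queue while a batch of experiments is running. A minimal sketch using Python's subprocess module; the helper names are hypothetical, only the userid argument and the -analyze option from the commands above are used, and run_condor_q assumes condor_q is on the PATH of the machine where it is called.

```python
import subprocess


def condor_q_command(userid=None, analyze=False):
    """Build the condor_q argument list for a user, optionally with -analyze."""
    command = ["condor_q"]
    if userid:
        command.append(userid)
    if analyze:
        command.append("-analyze")
    return command


def run_condor_q(userid=None, analyze=False):
    """Run condor_q and return its standard output as text."""
    completed = subprocess.run(condor_q_command(userid, analyze),
                               capture_output=True, text=True, check=False)
    return completed.stdout


if __name__ == "__main__":
    print(run_condor_q("sec312"))
```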

Removing Jobs

First find the ID of the job you will terminate

condor_q userid

Then call

condor_rm ID

Example: I call condor_q sec312 to list all jobs belonging to my username. This gives a list similar to this:

-- Submitter: : <> :
42989.0   sec312         10/25 19:56   0+00:00:29 R  0   0.0  symphony -F air04.
42989.1   sec312         10/25 19:56   0+00:00:29 R  0   0.0  symphony -F air05.
42989.5   sec312         10/25 19:56   0+00:00:28 R  0   0.0  symphony -F dsbmip

Now let's say I want to terminate 42989.5. I call condor_rm 42989.5. CONDOR confirms by saying
Job 42989.5 marked for removal

You can remove all your jobs using command condor_rm username.
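
To remove only a selected subset of jobs, the job IDs can be extracted from the condor_q listing and passed to condor_rm one by one. The sketch below assumes the listing has the layout of the sample above (job ID in the first column, user name in the second); other CONDOR versions may print different columns, and the function name job_ids_for_user is hypothetical.

```python
import re

# A job line starts with an ID such as 42989.5 followed by the owner.
JOB_LINE = re.compile(r"^\s*(\d+\.\d+)\s+(\S+)\s")


def job_ids_for_user(condor_q_text, userid):
    """Return the job IDs in a condor_q listing that belong to `userid`."""
    ids = []
    for line in condor_q_text.splitlines():
        match = JOB_LINE.match(line)
        if match and match.group(2) == userid:
            ids.append(match.group(1))
    return ids


SAMPLE = """-- Submitter: : <> :
42989.0   sec312         10/25 19:56   0+00:00:29 R  0   0.0  symphony -F air04.
42989.1   sec312         10/25 19:56   0+00:00:29 R  0   0.0  symphony -F air05.
42989.5   sec312         10/25 19:56   0+00:00:28 R  0   0.0  symphony -F dsbmip
"""

if __name__ == "__main__":
    for job_id in job_ids_for_user(SAMPLE, "sec312"):
        print("condor_rm", job_id)
```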

Frequently Used CONDOR Commands

A summary of frequently used commands in CONDOR:

Command           Action                      Basic Usage                  Example
condor_submit     submit a job                condor_submit [submit file]  $ condor_submit job.condor
condor_q          show status of jobs         condor_q [cluster]           $ condor_q 1170
condor_rm         remove jobs from the queue  condor_rm [cluster]          $ condor_rm 1170
condor_rm userid  remove all jobs of user


Some other CONDOR commands

Command          Action
condor_userprio  show the user priority (usage: condor_userprio)
condor_status    show the current status of CONDOR nodes

Running MPI Jobs with Condor

To submit MPI jobs to our condor pool, see Dr. Takac's MPI tutorial.

Using AMPL with Condor

COR@L Lab has a limited AMPL license that allows at most 10 simultaneous AMPL jobs. If you use AMPL in your experiments, you can let CONDOR know, and it will schedule all jobs that need AMPL within the license limit. To do this, add the following line to your condor submission file.

concurrency_limits = AMPL

condor.1452620241.txt.gz · Last modified: 1998/12/03 12:11 (external edit)