HTC Job Submission

HTCondor Job Command File

The following shows the simplest HTCondor job command file, assuming your program is abc.exe. If you are using MATLAB or Java, you will find links to templates at the end of this page.

Basic Structure

Here is a minimal HTCondor submission file that takes the program abc.exe and runs it.

Universe = vanilla
Executable  = abc.exe

Requirements = ( Target.OpSys == "WINDOWS" && ( Target.Arch == "INTEL" || Target.Arch == "X86_64" ) && ( strcmp(substr(Target.Name,6,1),"N") =?= 0 ) )
request_cpus = 1

Error   = err.$(Cluster).$(Process)
Output  = out.$(Cluster).$(Process)
Log = log.$(Cluster).$(Process)

queue

Three files are produced automatically at submission time. Their names are prefixed with:

  1. out: the output file, which holds the content of the stdout generated by the process.
  2. err: the error file, which holds the content of the stderr generated by the process.
  3. log: the log file, which records the events of the job itself throughout its life cycle.

The suffix is the cluster ID and process ID, which we will discuss below.

Automatic Variables

$(Cluster): Cluster ID. A new ID is assigned on each condor_submit call.
$(Process): Job ID (the process ID within the cluster), starting from 0. A new ID is assigned to each job generated by the job submission script.
HTCondor usually displays the two IDs together, separated by a dot (.), such as 123.0, where 123 is the cluster ID and 0 is the job ID.
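
As a small illustration of how the two variables expand, the fragment below queues three jobs in one submission. The cluster ID 123 is hypothetical; the real value is whatever condor_submit assigns.

```
# Sketch only: assume this submission is assigned cluster ID 123 (hypothetical).
Universe = vanilla
Executable = abc.exe

Error   = err.$(Cluster).$(Process)
Output  = out.$(Cluster).$(Process)
Log = log.$(Cluster).$(Process)

queue 3

# Resulting jobs and their output files under that assumption:
#   job 123.0 writes out.123.0, err.123.0, log.123.0
#   job 123.1 writes out.123.1, err.123.1, log.123.1
#   job 123.2 writes out.123.2, err.123.2, log.123.2
```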

Command Line Arguments

Command line arguments to be passed to the executable (either .exe or .bat) can be specified with the Arguments command. For example, the following passes the job ID as the command line argument:

Universe = vanilla
Executable  = abc.exe
Arguments = $(Process)

Requirements = ( Target.OpSys == "WINDOWS" && ( Target.Arch == "INTEL" || Target.Arch == "X86_64" ) && ( strcmp(substr(Target.Name,6,1),"N") =?= 0 ) )
request_cpus = 1

Error   = err.$(Cluster).$(Process)
Output  = out.$(Cluster).$(Process)
Log = log.$(Cluster).$(Process)

queue

Input/Output Files

Any files, other than the executable itself, that the executable needs to access have to be sent using the input file mechanism. Adding the following lines sends the 7-Zip program files and a data archive to the remote machine, so that the job can extract the archive and use the data within.

should_transfer_files = YES
transfer_input_files = 7z.dll, 7-zip.dll, 7z.exe, data.7z

If your executable generates output files, make sure they are written to the same folder as the executable itself. (When in doubt, write them to the folder pointed to by the environment variable %CD% or %TEMP%.) You can then specify the paths of the output files; wildcards cannot be used. When you submit the jobs, empty placeholder files are created with the same names as your output files. The actual content is copied over after a machine has run the code. If a specified output file does not exist on the target machine when the job completes, HTCondor reports an error and holds the job.

transfer_output_files = abc.data
when_to_transfer_output = ON_EXIT_OR_EVICT
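
The pieces of this section might be combined as in the sketch below. It assumes a wrapper script named run.bat, which is hypothetical and not part of the original example. The wrapper is assumed to extract data.7z and then start abc.exe. Because abc.exe is no longer the Executable here, it is listed among the input files.

```
# Sketch only: run.bat is a hypothetical wrapper, assumed to contain two commands:
#   7z.exe x data.7z
#   abc.exe
Universe = vanilla
Executable  = run.bat

Requirements = ( Target.OpSys == "WINDOWS" && ( Target.Arch == "INTEL" || Target.Arch == "X86_64" ) && ( strcmp(substr(Target.Name,6,1),"N") =?= 0 ) )
request_cpus = 1

# Files sent to the remote machine along with the executable
should_transfer_files = YES
transfer_input_files = abc.exe, 7z.dll, 7-zip.dll, 7z.exe, data.7z

# File copied back from the remote machine
transfer_output_files = abc.data
when_to_transfer_output = ON_EXIT_OR_EVICT

Error   = err.$(Cluster).$(Process)
Output  = out.$(Cluster).$(Process)
Log = log.$(Cluster).$(Process)

queue
```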

Requirements Strings

You can set the Requirements string to force the job to land on a certain type of machine.

Normal slots (run jobs after the Learning Commons closes)

  • Any Windows
    Requirements = ( Target.OpSys == "WINDOWS" && ( Target.Arch == "INTEL" || Target.Arch == "X86_64" ) && ( strcmp(substr(Target.Name,6,1),"N") =?= 0 ) )
  • Windows 10 (~160 PCs)
    Requirements = ( Target.OpSysAndVer == "WINDOWS1000" && ( Target.Arch == "INTEL" || Target.Arch == "X86_64" ) && ( strcmp(substr(Target.Name,6,1),"N") =?= 0 ) )

Debug/24-hour slots (for debugging and development)

  • Windows 10 (10 VMs)
    Requirements = ( Target.OpSysAndVer == "WINDOWS1000" && ( Target.Arch == "INTEL" || Target.Arch == "X86_64" ) && ( strcmp(substr(Target.Machine,0,7),"HTCW10C") =?= 0 ) )

Local universe (i.e., htclogin.hku.hk; abusive use is prohibited)

  • CentOS 6, htclogin (1 VM)
    Requirements = ( Target.OpSysAndVer == "CentOS6" && ( Target.Arch == "INTEL" || Target.Arch == "X86_64" ) && ( strcmp(substr(Target.Machine,0,8),"htclogin") =?= 0 ) )

Other Tricks

The queue keyword can appear multiple times in a submission script. Each time it appears, the values the variables hold at that point are used to generate jobs. For example, the following is a valid job description:

Universe = vanilla
Executable  = abc.exe

Requirements = ( Target.OpSys == "WINDOWS" && ( Target.Arch == "INTEL" || Target.Arch == "X86_64" ) && ( strcmp(substr(Target.Name,6,1),"N") =?= 0 ) )
request_cpus = 1

Error   = err.$(Cluster).$(Process)
Output  = out.$(Cluster).$(Process)
Log = log.$(Cluster).$(Process)

Arguments = abc
queue

Arguments = def
queue

Arguments = ghi
queue
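
The same three jobs might also be written with a single queue statement that iterates over a list. This is a sketch that assumes the installed HTCondor release is recent enough to support the "queue ... in" form; the variable name word is made up for illustration.

```
# Sketch only: assumes the pool's HTCondor version supports "queue <var> in <list>".
# "word" is a hypothetical variable name; it takes each list item in turn.
Universe = vanilla
Executable  = abc.exe
Arguments = $(word)

Requirements = ( Target.OpSys == "WINDOWS" && ( Target.Arch == "INTEL" || Target.Arch == "X86_64" ) && ( strcmp(substr(Target.Name,6,1),"N") =?= 0 ) )
request_cpus = 1

Error   = err.$(Cluster).$(Process)
Output  = out.$(Cluster).$(Process)
Log = log.$(Cluster).$(Process)

# One job per item: abc, def, ghi
queue 1 word in abc, def, ghi
```

If this form is rejected by condor_submit, the repeated queue statements shown above remain the safe choice.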

Another option is to generate multiple jobs with the same set of parameters, which is useful if your code is a randomized algorithm. For example, we could run the program abc.exe as 100 different jobs:

Universe = vanilla
Executable  = abc.exe

Requirements = ( Target.OpSys == "WINDOWS" && ( Target.Arch == "INTEL" || Target.Arch == "X86_64" ) && ( strcmp(substr(Target.Name,6,1),"N") =?= 0 ) )
request_cpus = 1

Error   = err.$(Cluster).$(Process)
Output  = out.$(Cluster).$(Process)
Log = log.$(Cluster).$(Process)

queue 100
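
When the 100 jobs need to differ from each other, for instance in their random seed, the job ID can be passed in as an argument. The fragment below is a sketch that assumes abc.exe reads its first command line argument and uses it as the seed; that behaviour depends on your program, not on HTCondor.

```
# Sketch only: each of the 100 jobs receives its own job ID (0 to 99) as its argument.
# Assumption: abc.exe uses that argument as a random seed.
Universe = vanilla
Executable  = abc.exe
Arguments = $(Process)

Requirements = ( Target.OpSys == "WINDOWS" && ( Target.Arch == "INTEL" || Target.Arch == "X86_64" ) && ( strcmp(substr(Target.Name,6,1),"N") =?= 0 ) )
request_cpus = 1

Error   = err.$(Cluster).$(Process)
Output  = out.$(Cluster).$(Process)
Log = log.$(Cluster).$(Process)

queue 100
```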

Further Reading

http://www.iac.es/sieinvens/siepedia/pmwiki.php?n=HOWTOs.CondorSubmitFile

User Applications

  1. MATLAB jobs
  2. Java jobs
  3. Other Packages
  • We are working on some other packages. They will be added here once available.