HTC User Commands

HTCondor commands

The commands below help you track and manage your jobs. In the examples, 123 stands for a cluster ID and 123.0 for a full job ID (the cluster ID followed by the job's number within that cluster).

condor_q — To list jobs in the queue

  • List all jobs in the queue: condor_q
    • Status codes: I — Idle, R — Running, C — Completed, X — Cancelled, H — Held (to handle held jobs, see the Examples section below)
  • Check whether job 123.0 can run. If it is held, the command also gives the reason: condor_q -analyze 123.0
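The status letters above can also be derived in a script from the job's numeric JobStatus attribute. The following is a minimal sketch, assuming the usual HTCondor numbering (1 = Idle, 2 = Running, 3 = Removed, 4 = Completed, 5 = Held); the function names status_letter and cluster_status are hypothetical helpers, not part of HTCondor.

```shell
# Map HTCondor's numeric JobStatus value to the letter shown in condor_q listings.
status_letter() {
    case "$1" in
        1) echo "I" ;;  # Idle
        2) echo "R" ;;  # Running
        3) echo "X" ;;  # Cancelled (removed)
        4) echo "C" ;;  # Completed
        5) echo "H" ;;  # Held
        *) echo "?" ;;  # any other state
    esac
}

# Hypothetical helper: print "<job ID> <status letter>" for each job in a cluster.
cluster_status() {
    condor_q "$1" -format "%d." ClusterId -format "%d " ProcId -format "%d\n" JobStatus |
    while read -r job_id code; do
        echo "$job_id $(status_letter "$code")"
    done
}
```

For example, cluster_status 123 would print one line per job in cluster 123.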

condor_rm — To cancel jobs

  • Cancel job 123.0: condor_rm 123.0
  • Cancel all jobs in cluster 123: condor_rm 123

condor_status — To list all PCs (nodes) and show whether they are available

  • List all available nodes: condor_status
  • List all available 24-hour nodes: condor_status -constraint "strcmp(substr(Machine,0,3),\"HTC\")=?=0"

condor_history — To show the history of completed and cancelled jobs

  • List all history: condor_history
  • List history of your own jobs: condor_history -y hku_portal_ID
  • Check the wall clock time of the job at the node: condor_history 123.0 -format "%f\n" RemoteWallClockTime
  • Check the last node which processed the job: condor_history 123.0 -format "%s\n" LastRemoteHost
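RemoteWallClockTime is reported in seconds, and the "%f" format prints it as a decimal number. As a rough sketch, the value can be converted to hours, minutes and seconds like this; format_wallclock and job_wallclock are hypothetical helper names.

```shell
# Convert a number of seconds (e.g. "3661.000000") to HH:MM:SS.
format_wallclock() {
    total=${1%.*}   # drop the fractional part printed by %f
    if [ -z "$total" ]; then
        echo "no wall clock time found" >&2
        return 1
    fi
    hours=$(( total / 3600 ))
    minutes=$(( (total % 3600) / 60 ))
    seconds=$(( total % 60 ))
    printf '%02d:%02d:%02d\n' "$hours" "$minutes" "$seconds"
}

# Hypothetical helper: look up a finished job and print its wall clock time.
job_wallclock() {
    format_wallclock "$(condor_history "$1" -format "%f\n" RemoteWallClockTime)"
}
```

Usage: job_wallclock 123.0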

condor_hold — To hold jobs manually

  • Hold job 123.0: condor_hold 123.0
  • Hold all jobs in cluster 123: condor_hold 123

condor_release — To release jobs manually

  • Release job 123.0: condor_release 123.0
  • Release all jobs in cluster 123: condor_release 123

condor_qedit — To edit the attributes of a submitted job

  • Reset the Requirements string to the one we recommend: condor_qedit 123.0 Requirements "( Target.OpSys == \"WINDOWS\" && ( Target.Arch == \"INTEL\" || Target.Arch == \"X86_64\" ) && ( strcmp(substr(Target.Name,6,1),\"N\") =?= 0 ) )"
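The escaped quotes in the command above are easy to mistype. One possible way around this, shown here only as a sketch, is to keep the recommended expression in a quoted here-document so that no backslashes are needed; default_requirements and reset_requirements are hypothetical helper names.

```shell
# Print the recommended Requirements expression exactly as written
# (the quoted 'EOF' prevents the shell from altering the text).
default_requirements() {
    cat <<'EOF'
( Target.OpSys == "WINDOWS" && ( Target.Arch == "INTEL" || Target.Arch == "X86_64" ) && ( strcmp(substr(Target.Name,6,1),"N") =?= 0 ) )
EOF
}

# Reset the Requirements attribute of the given job or cluster.
reset_requirements() {
    condor_qedit "$1" Requirements "$(default_requirements)"
}
```

Usage: reset_requirements 123.0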

Examples

  1. You list all your jobs: condor_q -y hku_portal_ID
  2. You find that one of your jobs is held (say, job 123.0 has status “H” in the condor_q listing) and you want to investigate: condor_q -analyze 123.0
    1. If the error is “file not found”, put the file at the indicated path and release the job: condor_release 123.0
    2. If the error is related to the Requirements string, you may want to reset it to the default one on our HTCondor system: condor_qedit 123.0 Requirements "( Target.OpSys == \"WINDOWS\" && ( Target.Arch == \"INTEL\" || Target.Arch == \"X86_64\" ) && ( strcmp(substr(Target.Name,6,1),\"N\") =?= 0 ) )"
    3. If you decide to cancel the job altogether: condor_rm 123.0
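When several jobs are held, the investigation step above can be repeated for each of them. The sketch below assumes that held jobs have JobStatus equal to 5 and selects them with a -constraint expression; analyze_held_jobs is a hypothetical helper name, and the output of condor_q -analyze still has to be read by hand to decide whether to release, edit or cancel each job.

```shell
# Run "condor_q -analyze" on every held job in the queue.
analyze_held_jobs() {
    # Print the ID (cluster.job) of each held job, one per line.
    condor_q -constraint 'JobStatus == 5' -format "%d." ClusterId -format "%d\n" ProcId |
    while read -r job_id; do
        echo "=== $job_id ==="
        # Keep the inner command from consuming the list of job IDs on stdin.
        condor_q -analyze "$job_id" </dev/null
    done
}
```

Usage: analyze_held_jobs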

Further Reading

http://www.iac.es/sieinvens/siepedia/pmwiki.php?n=HOWTOs.CondorUsefulCommands