1581382 : Your held jobs
Created: 2026-03-04T08:16:42Z - current status: new
Summary of the Issue
A user has 874 held jobs in the National Analysis Facility (NAF) HTCondor system. The jobs were placed on hold because they exceeded the default runtime limit (3 hours for "lite-class" jobs). The hold reason for each job is listed as:
"Job runtime longer than reserved".
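With several hundred held jobs, it can help to confirm that they all share the same hold reason before acting on them. The following is a minimal sketch, not an official NAF tool: `count_hold_reasons` is a hypothetical helper that tallies hold reasons read from standard input, one per line. The `condor_q` call in the comment shows the intended use and assumes it is run on a submit node.

```bash
#!/usr/bin/env bash

# Hypothetical helper: read one hold reason per line and print
# "count<TAB>reason", most frequent reason first.
count_hold_reasons() {
    awk '{ n[$0]++ } END { for (r in n) printf "%d\t%s\n", n[r], r }' | sort -rn
}

# Intended use on a submit node (requires HTCondor; JobStatus 5 means "held"):
#   condor_q -constraint 'JobStatus == 5' -af HoldReason | count_hold_reasons
```

If more than one reason shows up, each group of jobs is best handled separately, since a longer runtime request only addresses the runtime hold.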
Suggested Solution
- Check Job Details
  To verify the runtime and other resource usage for a specific held job, run:

      condor_q [JOB_ID] -af HoldReason RequestRuntime RemoteWallClockTime

  (Replace [JOB_ID] with an actual job ID from the list.)
- Options to Resolve the Issue
  - Option 1: Delete and Resubmit
    Delete the held jobs and resubmit them with updated runtime requirements (e.g., using `RequestRuntime` in the submit file).

        condor_rm [JOB_ID]    # Delete a single job
        condor_rm -constraint 'HoldReason == "Job runtime longer than reserved"'    # Delete all held jobs with this reason

  - Option 2: Edit and Release Jobs
    Adjust the runtime limit for held jobs and release them:

        condor_qedit [JOB_ID] "RequestRuntime = [NEW_RUNTIME_IN_SECONDS]"
        condor_release [JOB_ID]

    (Example: `RequestRuntime = 14400` for 4 hours.)
- Prevent Future Issues
  - Test Jobs First: Run a small batch of jobs to verify runtime/memory requirements before submitting large numbers.
  - Use Dedicated Runtime Classes: For jobs exceeding 3 hours, specify a longer runtime class (e.g., `bide`) in the submit file.
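Once a small test batch has finished, its measured wall-clock times can be turned into a runtime request for the full submission. The sketch below is one possible approach rather than a NAF recommendation: `suggest_runtime` is a hypothetical helper, and the default 50% safety margin is an assumption to adjust as needed. It reads wall-clock times in seconds, one per line, and prints the largest value plus the margin, rounded up to a full hour.

```bash
#!/usr/bin/env bash

# Hypothetical helper: read wall-clock times in seconds (one per line) and
# print a suggested runtime request in seconds.
# $1 = safety margin in percent (assumed default: 50).
suggest_runtime() {
    local margin_percent="${1:-50}"
    awk -v m="$margin_percent" '
        $1 + 0 > max { max = $1 + 0 }          # track the longest observed run
        END {
            if (NR == 0) exit 1                # no input: report failure
            padded = max * (100 + m) / 100     # add the safety margin
            hours = int(padded / 3600)
            if (hours * 3600 < padded) hours++ # round up to a full hour
            print hours * 3600
        }'
}

# Intended use after a test batch has completed (requires HTCondor;
# [CLUSTER_ID] is a placeholder for the test batch):
#   condor_history [CLUSTER_ID] -af RemoteWallClockTime | suggest_runtime
```

For example, test jobs that ran for at most 9000 seconds yield 14400 seconds (4 hours) with the default margin, which is above the 3-hour limit mentioned above and therefore needs an explicit runtime request.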