wrench::JobManager
-
class wrench::JobManager : public wrench::Service
A helper daemon (co-located with and explicitly started by an execution controller), which is used to handle all job executions.
Public Functions
-
std::shared_ptr<CompoundJob> createCompoundJob(std::string name)
Create a compound job.
- Parameters
name – the job’s name (if empty, a unique job name will be picked for you)
- Returns
the job
-
std::shared_ptr<PilotJob> createPilotJob()
Create a pilot job.
- Throws
std::invalid_argument –
- Returns
the pilot job
Create a standard job.
- Parameters
task – a task (which must be ready)
- Throws
std::invalid_argument –
- Returns
the standard job
Create a standard job.
- Parameters
task – a task (which must be ready)
file_locations – a map that specifies locations where input/output files should be read/written. When unspecified, it is assumed that the ComputeService’s scratch storage space will be used.
- Throws
std::invalid_argument –
- Returns
the standard job
Create a standard job.
- Parameters
task – a task (which must be ready)
file_locations – a map that specifies, for each file, a list of locations, in preference order, where input/output files should be read/written. When unspecified, it is assumed that the ComputeService’s scratch storage space will be used.
- Throws
std::invalid_argument –
- Returns
the standard job
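The "list of locations, in preference order" idea can be illustrated with a small standalone sketch. This is not WRENCH code: files and locations are represented by plain strings, and `PreferenceMap`, `pickLocation`, and the `is_usable` predicate are hypothetical names introduced only for illustration. It requires C++17 for `std::optional`.

```cpp
#include <functional>
#include <map>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-ins: a file and a location are each identified by a string.
using FileID = std::string;
using LocationID = std::string;

// For each file, candidate locations listed from most to least preferred.
using PreferenceMap = std::map<FileID, std::vector<LocationID>>;

// Return the first location in the file's list for which `is_usable` holds,
// or std::nullopt if the file has no entry or none of its locations qualifies.
std::optional<LocationID> pickLocation(
        const PreferenceMap &prefs,
        const FileID &file,
        const std::function<bool(const LocationID &)> &is_usable) {
    auto it = prefs.find(file);
    if (it == prefs.end()) return std::nullopt;
    for (const auto &loc : it->second) {
        if (is_usable(loc)) return loc;
    }
    return std::nullopt;
}
```

In this sketch a missing entry simply yields no location; the documented fallback to the ComputeService's scratch space is not modeled.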
Create a standard job.
- Parameters
tasks – a list of tasks (which must be either READY, or children of COMPLETED tasks or of tasks also included in the list)
- Throws
std::invalid_argument –
- Returns
the standard job
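The constraint on the task list (each task is READY, or is a child of COMPLETED tasks or of tasks in the same list) can be expressed as a simple check. The sketch below uses a hypothetical, simplified `Task` struct and `State` enum rather than the WRENCH task class, and it encodes one reading of the rule: every parent of a non-READY task must be COMPLETED or part of the list.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified task model (not the WRENCH task class).
enum class State { NOT_READY, READY, COMPLETED };

struct Task {
    std::string id;
    State state;
    std::vector<const Task *> parents;
};

// A task is acceptable if it is READY, or if every one of its parents is
// either COMPLETED or itself part of the submitted list.
bool tasksFormValidJob(const std::vector<const Task *> &tasks) {
    std::set<const Task *> in_job(tasks.begin(), tasks.end());
    for (const Task *t : tasks) {
        if (t->state == State::READY) continue;
        for (const Task *p : t->parents) {
            if (p->state != State::COMPLETED && in_job.count(p) == 0) {
                return false;
            }
        }
    }
    return true;
}
```

Whether the real implementation applies additional checks (for example, rejecting tasks that are already COMPLETED) is not stated here, so the sketch does not model them.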
Create a standard job.
- Parameters
tasks – a list of tasks (which must be either READY, or children of COMPLETED tasks or of tasks also included in the list)
file_locations – a map that specifies locations where files, if any, should be read/written. When empty, it is assumed that the ComputeService’s scratch storage space will be used.
- Throws
std::invalid_argument –
- Returns
the standard job
Create a standard job.
- Parameters
tasks – a list of tasks (which must be either READY, or children of COMPLETED tasks or of tasks also included in the standard job)
file_locations – a map that specifies locations where input/output files, if any, should be read/written. When empty, it is assumed that the ComputeService’s scratch storage space will be used.
pre_file_copies – a vector of tuples that specify which file copy operations should be completed before task executions begin. The ComputeService::SCRATCH constant can be used to mean “the scratch storage space of the ComputeService”.
post_file_copies – a vector of tuples that specify which file copy operations should be completed after task executions end. The ComputeService::SCRATCH constant can be used to mean “the scratch storage space of the ComputeService”.
cleanup_file_deletions – a vector of file tuples that specify file deletion operations that should be completed at the end of the job. The ComputeService::SCRATCH constant can be used to mean “the scratch storage space of the ComputeService”.
- Throws
std::invalid_argument –
- Returns
the standard job
Create a standard job.
- Parameters
tasks – a list of tasks (which must be either READY, or children of COMPLETED tasks or of tasks also included in the list)
file_locations – a map that specifies, for each file, a list of locations, in preference order, where input/output files should be read/written. When unspecified, it is assumed that the ComputeService’s scratch storage space will be used.
- Throws
std::invalid_argument –
- Returns
the standard job
Create a standard job.
- Parameters
tasks – a list of tasks (which must be either READY, or children of COMPLETED tasks or of tasks also included in the standard job)
file_locations – a map that specifies, for each file, a list of locations, in preference order, where input/output files should be read/written. When unspecified, it is assumed that the ComputeService’s scratch storage space will be used.
pre_file_copies – a vector of tuples that specify which file copy operations should be completed before task executions begin. The ComputeService::SCRATCH constant can be used to mean “the scratch storage space of the ComputeService”.
post_file_copies – a vector of tuples that specify which file copy operations should be completed after task executions end. The ComputeService::SCRATCH constant can be used to mean “the scratch storage space of the ComputeService”.
cleanup_file_deletions – a vector of file tuples that specify file deletion operations that should be completed at the end of the job. The ComputeService::SCRATCH constant can be used to mean “the scratch storage space of the ComputeService”.
- Throws
std::invalid_argument –
- Returns
the standard job
-
simgrid::s4u::Mailbox *getCreatorMailbox()
Return the mailbox of the job manager’s creator.
- Returns
a mailbox
-
unsigned long getNumRunningPilotJobs() const
Get the number of currently running pilot jobs.
- Returns
the number of running pilot jobs
-
void kill()
Kill the job manager (brutally terminates the daemon and clears all jobs).
-
virtual void stop() override
Stop the job manager.
- Throws
std::runtime_error –
Submit a compound job to a compute service.
- Parameters
job – a compound job
compute_service – a compute service
service_specific_args – arguments specific for compute services:
to a BareMetalComputeService: {{“actionID”, “[hostname:][num_cores]”}, …}
If no value is provided for a task, then the service will choose a host and use as many cores as possible on that host.
If a “” value is provided for a task, then the service will choose a host and use as many cores as possible on that host.
If a “hostname” value is provided for a task, then the service will run the task on that host, using as many of its cores as possible.
If a “num_cores” value is provided for a task, then the service will run that task with this many cores, but will choose the host on which to run it.
If a “hostname:num_cores” value is provided for a task, then the service will run that task with the specified number of cores on that host.
to a BatchComputeService: {{“-t”:”<int>” (requested number of minutes)},{“-N”:”<int>” (number of requested hosts)},{“-c”:”<int>” (number of requested cores per host)}[,{“actionID”:”[node_index:]num_cores”}] [,{“-u”:”<string>” (username)}]}
to a VirtualizedClusterComputeService: {} (jobs should not be submitted directly to the service)
to a CloudComputeService: {} (jobs should not be submitted directly to the service)
to a HTCondorComputeService:
For a “grid universe” job that will be submitted to a child BatchComputeService: {{“-universe”:”grid”}, {“-t”:”<int>” (requested number of minutes)},{“-N”:”<int>” (number of requested hosts)},{“-c”:”<int>” (number of requested cores per host)}[,{“-service”:”<string>” (BatchComputeService service name)}] [, {“actionID”:”[node_index:]num_cores”}] [, {“-u”:”<string>” (username)}]}
For a “non-grid universe” job that will be submitted to a child BareMetalComputeService: {}
- Throws
std::invalid_argument –
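As a rough illustration of the argument sets listed above, the following sketch builds the key/value pairs for a BatchComputeService and for an HTCondor "grid universe" submission. It assumes the arguments are carried in a string-to-string map, as the notation suggests; `ServiceArgs`, `makeBatchArgs`, and `makeGridUniverseArgs` are hypothetical helper names, not part of the documented API.

```cpp
#include <map>
#include <string>

// Assumption: service-specific arguments are a string-to-string map.
using ServiceArgs = std::map<std::string, std::string>;

// Build the three batch arguments: minutes, hosts, and cores per host.
ServiceArgs makeBatchArgs(int minutes, int hosts, int cores_per_host) {
    return {{"-t", std::to_string(minutes)},
            {"-N", std::to_string(hosts)},
            {"-c", std::to_string(cores_per_host)}};
}

// Extend batch arguments for a "grid universe" job. The "-service" entry is
// optional, so it is only added when a service name is given.
ServiceArgs makeGridUniverseArgs(ServiceArgs batch_args,
                                 const std::string &batch_service_name) {
    batch_args["-universe"] = "grid";
    if (!batch_service_name.empty()) {
        batch_args["-service"] = batch_service_name;
    }
    return batch_args;
}
```

For example, `makeGridUniverseArgs(makeBatchArgs(60, 2, 4), "")` yields a map with the four keys "-t", "-N", "-c", and "-universe".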
Submit a pilot job to a compute service.
- Parameters
job – a pilot job
compute_service – a compute service
service_specific_args – arguments specific for compute services:
to a BatchComputeService: {{“-t”:”<int>” (requested number of minutes)},{“-N”:”<int>” (number of requested hosts)},{“-c”:”<int>” (number of requested cores per host)}}
to a BareMetalComputeService: {} (pilot jobs should not be submitted directly to the service)
to a VirtualizedClusterComputeService: {} (pilot jobs should not be submitted directly to the service)
to a CloudComputeService: {} (pilot jobs should not be submitted directly to the service)
to a HTCondorComputeService: {} (pilot jobs should be submitted directly to the service)
- Throws
std::invalid_argument –
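Before submitting a pilot job to a BatchComputeService, it can be useful to confirm that the three batch keys are present. The sketch below assumes that "-t", "-N", and "-c" are all required (they appear without optional brackets above) and that the arguments are a string-to-string map; `missingBatchKeys` is a hypothetical helper, not a WRENCH function.

```cpp
#include <map>
#include <string>
#include <vector>

// Return the batch keys ("-t", "-N", "-c") that are absent from `args`,
// in that order. An empty result means all three are present.
std::vector<std::string> missingBatchKeys(
        const std::map<std::string, std::string> &args) {
    std::vector<std::string> missing;
    for (const char *key : {"-t", "-N", "-c"}) {
        if (args.find(key) == args.end()) {
            missing.emplace_back(key);
        }
    }
    return missing;
}
```

A caller could report the returned keys in an error message instead of waiting for the submission to fail with `std::invalid_argument`.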
Submit a standard job to a compute service.
- Parameters
job – a standard job
compute_service – a compute service
service_specific_args – arguments specific for compute services:
to a BareMetalComputeService: {{“taskID”, “[hostname:][num_cores]”}, …}
If no value is provided for a task, then the service will choose a host and use as many cores as possible on that host.
If a “” value is provided for a task, then the service will choose a host and use as many cores as possible on that host.
If a “hostname” value is provided for a task, then the service will run the task on that host, using as many of its cores as possible.
If a “num_cores” value is provided for a task, then the service will run that task with this many cores, but will choose the host on which to run it.
If a “hostname:num_cores” value is provided for a task, then the service will run that task with the specified number of cores on that host.
to a BatchComputeService: {{“-t”:”<int>” (requested number of minutes)},{“-N”:”<int>” (number of requested hosts)},{“-c”:”<int>” (number of requested cores per host)}[,{“taskID”:”[node_index:]num_cores”}] [,{“-u”:”<string>” (username)}]}
to a VirtualizedClusterComputeService: {} (jobs should not be submitted directly to the service)
to a CloudComputeService: {} (jobs should not be submitted directly to the service)
to a HTCondorComputeService:
For a “grid universe” job that will be submitted to a child BatchComputeService: {{“-universe”:”grid”}, {“-t”:”<int>” (requested number of minutes)},{“-N”:”<int>” (number of requested hosts)},{“-c”:”<int>” (number of requested cores per host)}[,{“-service”:”<string>” (BatchComputeService service name)}] [, {“taskID”:”[node_index:]num_cores”}] [, {“-u”:”<string>” (username)}]}
For a “non-grid universe” job that will be submitted to a child BareMetalComputeService: {}
- Throws
std::invalid_argument –
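The five cases described for the BareMetalComputeService value format "[hostname:][num_cores]" can be made concrete with a small parser. This is only a sketch of how such a string might be interpreted, not the service's actual parsing code: `Placement` and `parsePlacement` are hypothetical names, and the sketch assumes that a value without a colon is a core count when it consists only of digits and a hostname otherwise. It requires C++17.

```cpp
#include <algorithm>
#include <cctype>
#include <optional>
#include <stdexcept>
#include <string>

struct Placement {
    std::optional<std::string> host;     // unset: the service chooses the host
    std::optional<unsigned long> cores;  // unset: use as many cores as possible
};

static bool isAllDigits(const std::string &s) {
    return !s.empty() &&
           std::all_of(s.begin(), s.end(),
                       [](unsigned char c) { return std::isdigit(c) != 0; });
}

// Interpret "", "hostname", "num_cores", or "hostname:num_cores".
Placement parsePlacement(const std::string &value) {
    Placement p;
    if (value.empty()) return p;  // no preference at all
    const auto colon = value.find(':');
    if (colon == std::string::npos) {
        if (isAllDigits(value)) {
            p.cores = std::stoul(value);  // core count only
        } else {
            p.host = value;               // hostname only
        }
        return p;
    }
    const std::string rest = value.substr(colon + 1);
    if (!isAllDigits(rest)) {
        throw std::invalid_argument("expected a core count after ':'");
    }
    p.host = value.substr(0, colon);
    p.cores = std::stoul(rest);
    return p;
}
```

With this reading, `parsePlacement("node1:4")` sets both fields, `parsePlacement("8")` sets only the core count, and `parsePlacement("")` leaves both unset.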
Terminate a compound job that hasn’t completed/expired/failed yet.
- Parameters
job – the job to be terminated
- Throws
std::invalid_argument –
std::runtime_error –
Terminate a pilot job that hasn’t completed/expired/failed yet.
- Parameters
job – the job to be terminated
- Throws
std::invalid_argument –
std::runtime_error –
Terminate a standard job that hasn’t completed/expired/failed yet.
- Parameters
job – the job to be terminated
- Throws
std::invalid_argument –
std::runtime_error –
-
std::shared_ptr<CompoundJob> createCompoundJob(std::string name)