wrench::JobManager Class Reference

A helper daemon (co-located with and explicitly started by a WMS), which is used to handle all job executions.

#include <JobManager.h>

Inherits wrench::Service.

Public Member Functions

PilotJob * createPilotJob (unsigned long num_hosts, unsigned long num_cores_per_host, double ram_per_host, double duration)
 Create a pilot job.
 
StandardJob * createStandardJob (std::vector< WorkflowTask * > tasks, std::map< WorkflowFile *, StorageService * > file_locations, std::set< std::tuple< WorkflowFile *, StorageService *, StorageService * >> pre_file_copies, std::set< std::tuple< WorkflowFile *, StorageService *, StorageService * >> post_file_copies, std::set< std::tuple< WorkflowFile *, StorageService * >> cleanup_file_deletions)
 Create a standard job.
 
StandardJob * createStandardJob (std::vector< WorkflowTask * > tasks, std::map< WorkflowFile *, StorageService * > file_locations)
 Create a standard job.
 
StandardJob * createStandardJob (WorkflowTask *task, std::map< WorkflowFile *, StorageService * > file_locations)
 Create a standard job.
 
void forgetJob (WorkflowJob *job)
 Forget a job (to free memory, only once a job has completed or failed)
 
std::set< PilotJob * > getPendingPilotJobs ()
 Get the list of currently pending pilot jobs.
 
std::set< PilotJob * > getRunningPilotJobs ()
 Get the list of currently running pilot jobs.
 
void kill ()
 Kill the job manager (brutally terminate the daemon, clears all jobs)
 
void stop ()
 Stop the job manager.
 
void submitJob (WorkflowJob *job, ComputeService *compute_service, std::map< std::string, std::string > service_specific_args={})
 Submit a job to a compute service.
 
void terminateJob (WorkflowJob *job)
 Terminate a job (standard or pilot) that hasn't completed/expired/failed yet.
 
Public Member Functions inherited from wrench::Service
std::string getHostname ()
 Get the name of the host on which the service is / will be running.
 
double getNetworkTimeoutValue ()
 Returns the service's network timeout value.
 
bool getPropertyValueAsBoolean (std::string)
 Get a property of the Service as a boolean.
 
double getPropertyValueAsDouble (std::string)
 Get a property of the Service as a double.
 
std::string getPropertyValueAsString (std::string)
 Get a property of the Service as a string.
 
bool isUp ()
 Returns true if the service is UP, false otherwise.
 
void setNetworkTimeoutValue (double value)
 Sets the service's network timeout value.
 
void start (std::shared_ptr< Service > this_service, bool daemonize=false)
 Start the service.
 

Additional Inherited Members

Public Types inherited from wrench::Service
enum  State { UP, DOWN }
 Service states.
 

Detailed Description

A helper daemon (co-located with and explicitly started by a WMS), which is used to handle all job executions.

Member Function Documentation

PilotJob * wrench::JobManager::createPilotJob ( unsigned long  num_hosts,
unsigned long  num_cores_per_host,
double  ram_per_host,
double  duration 
)

Create a pilot job.

Parameters
num_hosts: the number of hosts required by the pilot job
num_cores_per_host: the number of cores per host required by the pilot job
ram_per_host: the number of bytes of RAM required by the pilot job on each host
duration: the pilot job's duration in seconds
Returns
the pilot job
Exceptions
std::invalid_argument
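
Because ram_per_host is expressed in bytes and duration in seconds, a small conversion helper can keep call sites readable. The following is a minimal sketch, not part of WRENCH: PilotJobRequest and makePilotJobRequest are hypothetical names, and the rejection rules are an assumption, since this reference only states that createPilotJob() may throw std::invalid_argument.

```cpp
#include <stdexcept>

// Hypothetical helper type (not part of WRENCH): groups the four arguments
// of createPilotJob() in the units this reference documents.
struct PilotJobRequest {
    unsigned long num_hosts;
    unsigned long num_cores_per_host;
    double ram_per_host;  // bytes
    double duration;      // seconds
};

// Build a request from human-friendly units (GiB and hours).
// Assumption: zero hosts/cores, negative RAM, or a non-positive duration are
// rejected here; the actual conditions checked by WRENCH are not documented above.
inline PilotJobRequest makePilotJobRequest(unsigned long hosts, unsigned long cores,
                                           double ram_gib, double hours) {
    if (hosts == 0 || cores == 0 || ram_gib < 0.0 || hours <= 0.0) {
        throw std::invalid_argument("makePilotJobRequest(): invalid argument");
    }
    const double bytes_per_gib = 1024.0 * 1024.0 * 1024.0;
    return PilotJobRequest{hosts, cores, ram_gib * bytes_per_gib, hours * 3600.0};
}
```

The resulting fields would then be passed, in order, as the num_hosts, num_cores_per_host, ram_per_host, and duration arguments of createPilotJob().
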
StandardJob * wrench::JobManager::createStandardJob ( std::vector< WorkflowTask * >  tasks,
std::map< WorkflowFile *, StorageService * >  file_locations,
std::set< std::tuple< WorkflowFile *, StorageService *, StorageService * >>  pre_file_copies,
std::set< std::tuple< WorkflowFile *, StorageService *, StorageService * >>  post_file_copies,
std::set< std::tuple< WorkflowFile *, StorageService * >>  cleanup_file_deletions 
)

Create a standard job.

Parameters
tasks: a list of tasks (which must be either READY, or children of COMPLETED tasks or of tasks also included in the standard job)
file_locations: a map that specifies on which storage services input/output files should be read/written. When unspecified, it is assumed that the ComputeService's scratch storage space will be used.
pre_file_copies: a set of tuples that specify which file copy operations should be completed before task executions begin. The ComputeService::SCRATCH constant can be used to mean "the scratch storage space of the ComputeService".
post_file_copies: a set of tuples that specify which file copy operations should be completed after task executions end. The ComputeService::SCRATCH constant can be used to mean "the scratch storage space of the ComputeService".
cleanup_file_deletions: a set of file tuples that specify file deletion operations that should be completed at the end of the job. The ComputeService::SCRATCH constant can be used to mean "the scratch storage space of the ComputeService".
Returns
the standard job
Exceptions
std::invalid_argument
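
The container types taken by this overload are verbose, so it may help to see them assembled for a simple stage-in, compute, stage-out pattern. The sketch below uses empty stand-in structs named WorkflowFile and StorageService so that it compiles on its own; in a real simulation these would be the WRENCH classes. StandardJobArgs and stageInAndOut are hypothetical names, and the (file, source, destination) ordering of the copy tuples is an assumption that this reference does not state.

```cpp
#include <map>
#include <set>
#include <string>
#include <tuple>

// Stand-in types so the sketch is self-contained; they are NOT the WRENCH classes.
struct WorkflowFile { std::string id; };
struct StorageService { std::string name; };

using FileLocations = std::map<WorkflowFile *, StorageService *>;
using FileCopies = std::set<std::tuple<WorkflowFile *, StorageService *, StorageService *>>;
using FileDeletions = std::set<std::tuple<WorkflowFile *, StorageService *>>;

// Hypothetical bundle of the four file-related arguments of createStandardJob().
struct StandardJobArgs {
    FileLocations file_locations;
    FileCopies pre_file_copies;
    FileCopies post_file_copies;
    FileDeletions cleanup_file_deletions;
};

// Stage `input` from `remote` to `local` before the tasks run, copy `output`
// back to `remote` afterwards, then delete the staged input from `local`.
// Assumption: copy tuples are ordered (file, source, destination).
inline StandardJobArgs stageInAndOut(WorkflowFile *input, WorkflowFile *output,
                                     StorageService *remote, StorageService *local) {
    StandardJobArgs args;
    args.file_locations[input] = local;
    args.file_locations[output] = local;
    args.pre_file_copies.insert(std::make_tuple(input, remote, local));
    args.post_file_copies.insert(std::make_tuple(output, local, remote));
    args.cleanup_file_deletions.insert(std::make_tuple(input, local));
    return args;
}
```

The four members map one-to-one onto the file_locations, pre_file_copies, post_file_copies, and cleanup_file_deletions parameters.
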
StandardJob * wrench::JobManager::createStandardJob ( std::vector< WorkflowTask * >  tasks,
std::map< WorkflowFile *, StorageService * >  file_locations 
)

Create a standard job.

Parameters
tasks: a list of tasks (which must be either READY, or children of COMPLETED tasks or of tasks also included in the list)
file_locations: a map that specifies on which storage services input/output files should be read/written. When unspecified, it is assumed that the ComputeService's scratch storage space will be used.
Returns
the standard job
Exceptions
std::invalid_argument
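
The constraint on the task list can be checked before calling createStandardJob(). The sketch below is one reading of that rule: a task is accepted if it is READY, or if it has parents and each parent is either COMPLETED or also in the list. Task is a stand-in struct, not the WRENCH WorkflowTask class, and tasksAreEligible is a hypothetical name.

```cpp
#include <algorithm>
#include <vector>

// Stand-in task type (NOT the WRENCH WorkflowTask class): just enough state
// to express the eligibility rule.
struct Task {
    enum State { NOT_READY, READY, COMPLETED };
    State state = NOT_READY;
    std::vector<const Task *> parents;
};

// Returns true if every task in `tasks` is READY, or is a not-yet-ready task
// whose parents are all either COMPLETED or themselves members of `tasks`.
inline bool tasksAreEligible(const std::vector<const Task *> &tasks) {
    auto in_list = [&tasks](const Task *t) {
        return std::find(tasks.begin(), tasks.end(), t) != tasks.end();
    };
    return std::all_of(tasks.begin(), tasks.end(), [&](const Task *task) {
        if (task->state == Task::READY) return true;
        // Already-completed tasks, and parentless tasks that are not READY, are rejected.
        if (task->state != Task::NOT_READY || task->parents.empty()) return false;
        return std::all_of(task->parents.begin(), task->parents.end(),
                           [&](const Task *p) {
                               return p->state == Task::COMPLETED || in_list(p);
                           });
    });
}
```
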
StandardJob * wrench::JobManager::createStandardJob ( WorkflowTask *  task,
std::map< WorkflowFile *, StorageService * >  file_locations 
)

Create a standard job.

Parameters
task: a task (which must be READY)
file_locations: a map that specifies on which storage services input/output files should be read/written. When unspecified, it is assumed that the ComputeService's scratch storage space will be used.
Returns
the standard job
Exceptions
std::invalid_argument
void wrench::JobManager::forgetJob ( WorkflowJob *  job )

Forget a job (to free memory, only once a job has completed or failed)

Parameters
job: a job to forget
Exceptions
std::invalid_argument
WorkflowExecutionException
std::set< PilotJob * > wrench::JobManager::getPendingPilotJobs ( )

Get the list of currently pending pilot jobs.

Returns
a set of pilot jobs
std::set< PilotJob * > wrench::JobManager::getRunningPilotJobs ( )

Get the list of currently running pilot jobs.

Returns
a set of pilot jobs
void wrench::JobManager::stop ( )
virtual

Stop the job manager.

Exceptions
WorkflowExecutionException
std::runtime_error

Reimplemented from wrench::Service.

void wrench::JobManager::submitJob ( WorkflowJob *  job,
ComputeService *  compute_service,
std::map< std::string, std::string >  service_specific_args = {} 
)

Submit a job to a compute service.

Parameters
job: a workflow job
compute_service: a compute service
service_specific_args: arguments specific to the target compute service (optional; defaults to an empty map)
Exceptions
std::invalid_argument
WorkflowExecutionException
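
Since service_specific_args is a map from strings to strings, numeric settings have to be converted to text before submission. The sketch below shows one way to build such a map. The function name makeServiceArgs and the keys "num_nodes" and "walltime_minutes" are placeholders invented for illustration; the keys a given compute service actually accepts are not listed in this reference.

```cpp
#include <map>
#include <string>

// Hypothetical helper: builds a string-to-string argument map of the type
// taken by submitJob(). The key names are placeholders, NOT keys known to be
// recognized by any WRENCH compute service.
inline std::map<std::string, std::string> makeServiceArgs(unsigned long num_nodes,
                                                          unsigned long walltime_minutes) {
    return {
        {"num_nodes", std::to_string(num_nodes)},
        {"walltime_minutes", std::to_string(walltime_minutes)},
    };
}
```

Omitting the third argument of submitJob() is equivalent to passing an empty map, per the default value in the signature.
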
void wrench::JobManager::terminateJob ( WorkflowJob *  job )

Terminate a job (standard or pilot) that hasn't completed/expired/failed yet.

Parameters
job: the job to be terminated
Exceptions
WorkflowExecutionException
std::invalid_argument
std::runtime_error
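
The preconditions of forgetJob() and terminateJob() are complementary, which the following sketch makes explicit. The JobState enum and both predicates are hypothetical; the WRENCH job classes define their own state model, and the predicates follow the wording of this reference literally (so an expired job is treated as neither forgettable nor terminable).

```cpp
// Hypothetical job states for illustration only; NOT the WRENCH state enums.
enum class JobState { PENDING, RUNNING, COMPLETED, EXPIRED, FAILED };

// forgetJob() is documented as valid only once a job has completed or failed.
constexpr bool canForget(JobState s) {
    return s == JobState::COMPLETED || s == JobState::FAILED;
}

// terminateJob() is documented as valid only for a job that has not yet
// completed, expired, or failed.
constexpr bool canTerminate(JobState s) {
    return s != JobState::COMPLETED && s != JobState::EXPIRED && s != JobState::FAILED;
}
```

Checking such a predicate before the call is one way to avoid the std::invalid_argument and WorkflowExecutionException cases listed above, assuming they are raised for jobs in the wrong state.
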

The documentation for this class was generated from the following files:
  • /home/wrench/wrench/include/wrench/managers/JobManager.h
  • /home/wrench/wrench/src/wrench/managers/JobManager.cpp