On-Demand Pilot Job Scheduling for Adaptive Execution of HEP Workflows on HPC Systems (poster)

May 8th, 2025

Categories: Applications, Supercomputing, Data Science, High Performance Computing

Authors

Sharma, S., Wang, X., Lan, Z., Papka, M. E.

About

High-energy physics (HEP) workflows often consist of complex directed acyclic graphs (DAGs) whose jobs arrive as a stream, which challenges traditional static scheduling approaches. We introduce a dynamic pilot job scheduling model that allocates HPC resources on demand once the number of queued DAG jobs reaches a configurable threshold. The simulation framework models the interactions among the trace parser, DAG manager, pilot job manager, and HPC resources. By reserving nodes as needed and scheduling jobs in batches that respect DAG dependencies, the system improves scheduling responsiveness and resource utilization.
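
To illustrate the threshold-triggered idea described above, the following is a minimal Python sketch and not the poster's actual simulator. The function name `simulate`, its parameters, and the simplifying assumption that each pilot's batch finishes before the next job arrives are all hypothetical and chosen only for illustration.

```python
from collections import deque


def simulate(jobs, deps, threshold, nodes_per_pilot):
    """Toy model of on-demand pilot scheduling (illustrative assumptions only).

    jobs: job ids in arrival order (the "stream").
    deps: dict mapping a job id to the set of parent job ids it depends on.
    threshold: number of ready jobs that triggers a pilot request.
    nodes_per_pilot: maximum jobs one pilot runs in a batch (one job per node).
    Returns the list of batches, one list of job ids per pilot.
    """
    if threshold < 1 or nodes_per_pilot < 1:
        raise ValueError("threshold and nodes_per_pilot must be >= 1")

    done = set()      # jobs that have finished
    waiting = []      # arrived jobs whose parents have not all finished
    ready = deque()   # arrived jobs whose dependencies are satisfied
    batches = []

    def promote():
        # Move every waiting job whose parents are all done into the ready queue.
        still_waiting = []
        for job in waiting:
            if deps.get(job, set()) <= done:
                ready.append(job)
            else:
                still_waiting.append(job)
        waiting[:] = still_waiting

    def launch_pilot():
        # Reserve nodes on demand and run up to nodes_per_pilot ready jobs.
        size = min(nodes_per_pilot, len(ready))
        batch = [ready.popleft() for _ in range(size)]
        batches.append(batch)
        # Assumption: the whole batch completes before the next arrival.
        done.update(batch)
        promote()

    for job in jobs:
        waiting.append(job)
        promote()
        # Request pilots only while the ready queue has reached the threshold.
        while len(ready) >= threshold:
            launch_pilot()

    # End of stream: drain any jobs left below the threshold.
    while ready:
        launch_pilot()
    return batches


if __name__ == "__main__":
    arrivals = ["a", "b", "c", "d"]
    parents = {"c": {"a"}, "d": {"a", "b"}}
    print(simulate(arrivals, parents, threshold=2, nodes_per_pilot=2))
```

In this sketch a pilot is requested only when enough dependency-free jobs have accumulated, so no nodes are held while the queue is short. A real simulator would also need to model queue wait time, job runtimes, and pilot walltime limits, none of which are represented here.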

Resources

PDF

URL

Citation

Sharma, S., Wang, X., Lan, Z., Papka, M. E., On-Demand Pilot Job Scheduling for Adaptive Execution of HEP Workflows on HPC Systems (poster), The 12th Greater Chicago Area Systems Research Workshop (GCASR), Chicago, IL, May 8th, 2025. https://gcasr.org/2025/posters