Process Tracking for Parallel Job Control

Job management subsystems in parallel environments have to address two important issues: (i) how to associate the processes present in the system with the tasks of parallel jobs, and (ii) how to control the execution of these tasks. The standard UNIX mechanism for job control, process groups, is not appropriate for this purpose because processes can escape their original groups and start new ones. We introduce the concept of genealogy, in which a process is identified by the genetic footprint it inherits from its parent. With this concept, tasks are defined by sets of processes with a common ancestor. Process tracking is the mechanism by which we implement the genealogy concept in the IBM AIX operating system. No changes to the kernel are necessary, and individual process control is achieved through standard UNIX signaling methods. Performance evaluation, on both uniprocessor and multiprocessor systems, demonstrates the efficacy of job control through process tracking. Process tracking has been incorporated in a research prototype gang-scheduling system for the IBM RS/6000 SP.
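The genealogy idea can be illustrated with a small user-level model. The sketch below is not the AIX process-tracking implementation described in the paper; it is a minimal illustration that assumes some external source reports fork and exit events. The class `GenealogyTracker` and its methods (`start_task`, `on_fork`, `on_exit`, `members`, `signal_task`) are hypothetical names introduced here. Each child inherits its parent's task identifier, so membership does not depend on process groups and survives the exit of an intermediate ancestor.

```python
import os
import signal


class GenealogyTracker:
    """Toy model: every process inherits a task id (its "footprint") from its parent."""

    def __init__(self):
        self.task_of = {}  # pid -> inherited task id

    def start_task(self, task_id, root_pid):
        # The root process of a task is the common ancestor of all its members.
        self.task_of[root_pid] = task_id

    def on_fork(self, parent_pid, child_pid):
        # A child inherits the footprint of its parent, if the parent is tracked.
        task_id = self.task_of.get(parent_pid)
        if task_id is not None:
            self.task_of[child_pid] = task_id

    def on_exit(self, pid):
        # Forget exited processes; their descendants keep the inherited task id.
        self.task_of.pop(pid, None)

    def members(self, task_id):
        return sorted(p for p, t in self.task_of.items() if t == task_id)

    def signal_task(self, task_id, sig, send=os.kill):
        # Control a task by signalling each member individually.
        delivered = []
        for pid in self.members(task_id):
            try:
                send(pid, sig)
                delivered.append(pid)
            except ProcessLookupError:
                pass  # the process exited before the signal was sent
        return delivered


# Hypothetical event stream with made-up pids.
tracker = GenealogyTracker()
tracker.start_task("job1.task0", 100)
tracker.on_fork(100, 101)   # child of the root
tracker.on_fork(101, 102)   # grandchild, which may later change its process group
tracker.on_exit(101)        # the intermediate parent exits
tracker.on_fork(500, 501)   # unrelated, untracked process
```

In this model, `tracker.members("job1.task0")` still contains pid 102 after its parent has exited, which a process-group or parent-pid lookup would not guarantee. Calling `tracker.signal_task("job1.task0", signal.SIGSTOP)` would then suspend the whole task with ordinary signals; the `send` parameter exists only so the sketch can be exercised without real processes.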

By: Hubertus Franke, José Moreira, Pratap Pattnaik

Published in: Lecture Notes in Computer Science, volume 1659, pages 144-161, 1999

Please obtain a copy of this paper from your local library. IBM cannot distribute this paper externally.