SODA: An Optimizing Scheduler for Large-Scale Stream-Based Distributed Computer Systems

This paper describes SODA, a scheduler for System S. System S is a highly scalable distributed computer system designed to handle complex applications processing enormous quantities of streaming data. Unlike traditional batch applications, streaming applications are open-ended. The system cannot typically delay the processing of the data. The scheduler must be able to shift resource allocation dynamically in response to changes to resource availability, job arrivals and departures, incoming data rates and so on. The design assumptions of System S, in particular, pose additional scheduling challenges. SODA must deal with a highly complex optimization problem, which must be solved in real-time while maintaining scalability. SODA relies on a careful problem decomposition, and intelligent use of both heuristic and exact algorithms. This research report is intended to be as complete a description of SODA as is reasonably practical. We describe the design and functionality of SODA, give overviews and extensive details of the mathematical components, and describe experiments to show the performance of the scheduler. We describe three key input data infrastructure components. We candidly describe our experiences and lessons learned during the integration of this complex component within the larger, hugely complex System S. We also describe two major SODA variants as well as several other key subtleties, bells and whistles. Finally, we give a list of future SODA work items. System S is very much an ongoing project.

By: Joel Wolf, Nikhil Bansal, Kirsten Hildrum, Sujay Parekh, Deepak Rajan, Rohit Wagle, Kun-Lung Wu, Lisa Fleischer

Published in: MIDDLEWARE 2008 - ACM/IFIP/USENIX 9th International Middleware ConferenceLeuven, Belgium, p.306-25 in 2008

Please obtain a copy of this paper from your local library. IBM cannot distribute this paper externally.

Questions about this service can be mailed to reports@us.ibm.com .