Low-Synchronization, Mostly Lock-Free, Elastic Scheduling for Streaming Runtimes

We present the scalable, elastic operator scheduler in IBM Streams 4.2. Streams is a distributed stream processing system used in production at many companies in a wide range of industries. The programming language for Streams, SPL, presents operators, tuples, and streams as the primary abstractions. A fundamental SPL optimization is operator fusion, where multiple operators execute together in the same process. Streams 4.2 automatically performs fusion at submission time, because we discovered that, in practice, customers did not have the expertise to do so themselves. However, this presented a new problem: potentially thousands of operators would execute together in the same process, with no user guidance for thread placement. We needed a way to automatically determine how many threads to use, for arbitrarily sized applications on a wide variety of hardware, and without any input from programmers. Our solution has two components. The first is a scalable operator scheduler that minimizes synchronization, locks, and global data, while allowing threads to execute any operator and to dynamically come and go. The second is a set of elastic algorithms that dynamically adjust the number of threads to optimize performance, using the principles of trust and establishing trends. We demonstrate our scheduler’s ability to scale to over a hundred threads, and our elasticity algorithm’s ability to adapt to different workloads, on an Intel Xeon system with 176 logical cores and an IBM Power8 system with 184 logical cores.

By: Scott Schneider, Kun-Lung Wu

Published in: RC25641 in 2016

LIMITED DISTRIBUTION NOTICE:

This Research Report is available. This report has been submitted for publication outside of IBM and will probably be copyrighted if accepted for publication. It has been issued as a Research Report for early dissemination of its contents. In view of the transfer of copyright to the outside publisher, its distribution outside of IBM prior to publication should be limited to peer communications and specific requests. After outside publication, requests should be filled only by reprints or legally obtained copies of the article (e.g., payment of royalties).

Questions about this service can be mailed to reports@us.ibm.com.