A Cloud-Native Monitoring and Analytics Framework

Operational visibility is an important administrative capability and one of the critical factors in deciding the success or failure of a cloud service. Today, it is becoming increasingly complex along many dimensions, which include tracking both persistent and volatile system state and providing higher-level services such as log analytics, software discovery, behavioral anomaly detection, and drift analysis, to name a few. In addition, the target endpoints to monitor are becoming increasingly varied in their heterogeneity, cardinality, and lifecycles, while being hosted across different software stacks. In this paper, we present our unified monitoring and analytics pipeline for operational visibility, which overcomes the limitations of traditional monitoring solutions and provides a uniform platform in place of configuring, installing, and maintaining multiple siloed solutions. Our OpVis framework has been running in our production cloud for over two years, providing a multitude of such operational visibility and analytics functions uniformly across heterogeneous endpoints. We highlight its extensibility model, which enables custom data collection and analytics based on the cloud user's requirements and so allows the framework to adapt to the ever-changing cloud landscape. We describe its monitoring and analytics capabilities, present performance measures, and discuss our experiences supporting operational visibility for our cloud deployment.

By: Fabio A. Oliveira, Sahil Suneja, Shripad Nadgowda, Priya Nagpurkar, Canturk Isci

Published in: RC25669 in 2017

LIMITED DISTRIBUTION NOTICE:

This Research Report is available. This report has been submitted for publication outside of IBM and will probably be copyrighted if accepted for publication. It has been issued as a Research Report for early dissemination of its contents. In view of the transfer of copyright to the outside publisher, its distribution outside of IBM prior to publication should be limited to peer communications and specific requests. After outside publication, requests should be filled only by reprints or legally obtained copies of the article (e.g., payment of royalties).

rc25669.pdf

Questions about this service can be mailed to reports@us.ibm.com.