ConfEx: An Analytics Framework for Text-Based Software Configurations in the Cloud

Modern cloud applications are designed in a highly configurable way to provide increased reusability and portability. With the growing complexity of these applications, configuration errors (i.e., misconfigurations) have become major sources of service outages and disruptions. While prior research has focused on automatically detecting errors in configurations that are represented as well-structured key-value pairs, discovering and extracting configurations remain a challenge for the wide range of cloud applications that store their configurations in loosely structured text files.

This paper proposes ConfEx, a framework that enables discovery and analysis of text-based configurations in multi-tenant cloud platforms and cloud image repositories. Our framework uses a novel vocabulary-based discovery technique to identify text-based configuration files in cloud system instances with unlabeled content. We show that, even for labeled configuration files, widely-used and expert-maintained configuration parsing tools lack the consistency and robustness needed for meaningful statistical analysis of configurations. We introduce a novel disambiguation technique that resolves the inconsistencies in the configuration-related data extracted by existing parsers. When tested on 4581 popular Docker Hub images, ConfEx achieves over 98% precision and recall in identifying configuration files, and consistently improves the efficacy of misconfiguration detection through outlier analysis as well as syntactic configuration validation.
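To make the idea of vocabulary-based discovery concrete, the following is a minimal sketch, not the ConfEx implementation. It assumes that an application's configuration files can be recognized by how many keywords from a known vocabulary they contain. The vocabulary `NGINX_VOCAB`, the functions `vocabulary_score` and `looks_like_config`, and the 0.5 threshold are all hypothetical and chosen only for illustration; the abstract does not specify how ConfEx builds or applies its vocabularies.

```python
import re

# Hypothetical vocabulary of keywords for one application (illustrative only,
# not taken from the paper).
NGINX_VOCAB = {"server", "listen", "location", "worker_processes",
               "http", "root", "index"}

# Identifier-like tokens: letters, digits, and underscores, not starting with a digit.
TOKEN_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def vocabulary_score(text, vocab):
    """Return the fraction of vocabulary keywords that occur as tokens in text."""
    if not vocab:
        return 0.0
    tokens = set(TOKEN_RE.findall(text))
    return len(tokens & vocab) / len(vocab)


def looks_like_config(text, vocab, threshold=0.5):
    """Flag text as a likely configuration file (threshold is an assumption)."""
    return vocabulary_score(text, vocab) >= threshold


sample_conf = """
worker_processes 4;
http {
    server {
        listen 80;
        location / { root /var/www; index index.html; }
    }
}
"""
sample_other = "The quick brown fox jumps over the lazy dog."

print(looks_like_config(sample_conf, NGINX_VOCAB))
print(looks_like_config(sample_other, NGINX_VOCAB))
```

A score of this kind needs no file names or labels, which is why it suits unlabeled content; a practical system would likely also need to handle keywords shared across applications.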

By: Ozan Tuncer, Nilton Bila, Canturk Isci, Ayse K. Coskun

Published in: 2018

rc25675.pdf
