Towards Consensus Labeling of Malware Threats

The unprecedented volume and variety of malware threats (e.g., viruses, Trojan horses, worms) have spurred intensive research on large-scale malware analysis in both the academic and industrial communities; yet the knowledge bases built by these efforts have yet to be leveraged collectively to any large extent. One fundamental barrier to integrating threat intelligence is the lack of malware labeling standards. We show the severity of this problem through an in-depth empirical study of the labeling systems of five popular anti-virus engines, using a large collection of malware instances. Instead of attempting to unify the malware naming conventions, we propose a pragmatic alternative: leveraging correspondence evidence from multiple anti-virus sources to create a virtual, consensus malware categorization, such that different anti-virus vendors can communicate through this consensus scheme without changing their local naming conventions. We present a prototype malware label matching system, LATIN, that makes it possible to tell whether two malware samples named under different conventions belong to the same malware category, simply by their names.

By: Ting Wang, Xin Hu

Published in: RC25288 in 2012


