The Parallel Machine Learning (PML) Framework and the Transform Regression Algorithm

Machine learning techniques are increasingly being used with massive training data sets in application areas such as the internet, retail, insurance, finance, manufacturing and life sciences. The Parallel Machine Learning (PML) toolkit is a software framework for implementing machine learning algorithms on high-performance computing (HPC) platforms such as the IBM Blue Gene/P supercomputer. Several well-known algorithms have been implemented using the PML framework to date. We describe in detail the implementation of the transform regression (TREG) algorithm, chosen for its novelty, parallel scalability and wide applicability.

By: Sitaram Asur, Amol Ghoting, Ramesh Natarajan, Edwin Pednault

Published in: RC24882 in 2009

LIMITED DISTRIBUTION NOTICE:

This Research Report is available. This report has been submitted for publication outside of IBM and will probably be copyrighted if accepted for publication. It has been issued as a Research Report for early dissemination of its contents. In view of the transfer of copyright to the outside publisher, its distribution outside of IBM prior to publication should be limited to peer communications and specific requests. After outside publication, requests should be filled only by reprints or legally obtained copies of the article (e.g., payment of royalties).

rc24882.pdf

Questions about this service can be mailed to reports@us.ibm.com.