Multidimensional Balanced Allocation for Multiple Choice & $(1+\beta)$ Processes

Allocation of balls into bins is a well studied abstraction for load balancing problems. The literature hosts numerous results for sequential (single dimensional) allocation case when $m$ balls are thrown into $n$ bins; such as: for multiple choice paradigm the expected gap between the heaviest bin and the average load is $O(\frac{\log\log(n)}{\log(d)})$~\cite{petra-heavy-case}, $(1+\beta)$ choice paradigm with $O(\frac{\log(n)}{\beta})$ gap~\cite{kunal-beta} as well as for single choice paradigm having $O(\sqrt{\frac{m\log(n)}{n}})$ gap~\cite{mm-thesis}. However, for multidimensional balanced allocations very little is known. Mitzenmacher~\cite{md-mm} proved $O(\log\log(nD))$ gap for the multiple choice strategy and $O(\log(nD))$ gap for single choice paradigm (where $D$ is the total number of dimensions with each ball having exactly $f$ populated dimensions) under the assumption that for each ball $f$ dimensions are uniformly distributed over the $D$ dimensions. In this paper we study the symmetric multiple choice process for both unweighted and weighted balls as well as for both multidimensional and scalar modes. Additionally, we present the results on bounds on gap for the $(1+\beta)$ choice process with multidimensional balls and bins.

In the first part of this paper, we study multidimensional balanced allocations for the symmetric $d$ choice process with $m >> n$ unweighted balls and $n$ bins. We show that for the symmetric $d$ choice process and with $m = O(n)$, the upper bound (assuming uniform distribution of $f$ populated dimensions over $D$ total dimensions) on the gap is $O(\ln\ln(n))$ w.h.p.. This upper bound on the gap is within $D/f$ factor of the lower bound. This is the first such tight result along with detailed analysis for $d$ choice paradigm with multidimensional balls and bins. %Further, we provide upper bounds on the gap when $f$ is a random variable.
This improves upon the best known prior bound of $O(\log\log(nD))$~\cite{md-mm}. For the general case of $m >> n$ the expected gap is bounded by $O(\ln\ln(n))$. For variable $f$ and non-uniform distribution of the populated dimensions (using analysis for weighted balls), we obtain the upper bound on the expected gap as $O(\log(n))$.

Further, for the multiple round parallel balls and bins, using symmetric $d$-choice process in multidimensional mode, we show that the gap is also bounded by $O(\log\log(n))$ for $m = O(n)$. The same bound holds for the expected gap when $m >> n$.
Our analysis also has the following strong implications for the sequential scalar case. For the weighted balls and bins and general case $m >> n$, we show that the upper bound on the expected gap is $O(\log(n))$ (assuming $E[W] = 1$ and second moment of the weight distribution is finite) which improves upon the best prior bound of $n^c$ ($c$ depends on the weight distribution that has finite fourth moment) provided in~\cite{kunal-weighted}. Our analysis also provides a much easier and elegant proof technique (as compared to~\cite{petra-heavy-case}) for the $O(\log\log(n))$ upper bound on the gap for scalar unweighted $m >> n$ balls thrown into $n$ bins using the symmetric multiple choice process.

Moreover, we study multidimensional balanced allocations for the $(1+\beta)$ choice process and the multiple ($d$) choice process. We show that for the $(1+\beta)$ choice process and $m=O(n)$ the upper bound (assuming uniform distribution of $f$ populated dimensions over $D$ total dimensions) on the gap is \textbf{$O(\frac{\log(n)}{\beta})$}, which is within $D/f$ factor of the lower bound. For fixed $f$ with non-uniform distribution and for random $f$ with Binomial distribution the expected gap remains \textbf{$O(\frac{\log(n)}{\beta})$} and is independent of the total number of balls thrown, $m$. This is the first such tight result along with detailed analysis for $(1+\beta)$ paradigm with multidimensional balls and bins.

By: Ankur Narang, Sourav Dutta and Souvik Bhattacherjee

Published in: RI11018 in 2011

LIMITED DISTRIBUTION NOTICE:

This Research Report is available. This report has been submitted for publication outside of IBM and will probably be copyrighted if accepted for publication. It has been issued as a Research Report for early dissemination of its contents. In view of the transfer of copyright to the outside publisher, its distribution outside of IBM prior to publication should be limited to peer communications and specific requests. After outside publication, requests should be filled only by reprints or legally obtained copies of the article (e.g., payment of royalties). I have read and understand this notice and am a member of the scientific community outside or inside of IBM seeking a single copy only.

md-bb.techrep.pdf

Questions about this service can be mailed to reports@us.ibm.com .