Weighted Deterministic Aperture

Nicolae Vartolomei · 2020/08

Weighted Deterministic Aperture is my contribution to Deterministic Aperture algorithm to support work distribution proportional to server capacities (weights).

Servers5

Clients3

Aperture2

Show Verbose State

🧙‍♂️ This is an interactive note. Opening the Verbose State spoiler allows setting custom weights and running simulations. Continue reading for more explanation.

Introduction to load balancing and subsetting

Load balancing is the process of distributing work over a set of resources. Let’s say the work is represented by requests issued by clients and resources are servers.

The problem gets challenging when efficiency and performance are introduced as requirements.

A good introduction to this problem can be found in the Load Balancing in the Datacenter chapter of the SRE Book¹. I’ll just highlight that in order to drive efficiency up we are looking for at least 2 properties in a load balancing algorithm:

Lower the number of servers with which each client interacts.²
Distribute the load evenly between the servers.

The solution described in the book for the first property is Deterministic Subsetting. It doesn’t work perfectly for every combination of (servers, clients, subset size). I.e. it is not possible to split 3 servers between 5 clients (unless the subset size is 3).

An additional complication for the subsetting is server diversity, and the problem of even load distribution to servers with different capacities.

Deterministic Aperture

Engineers at Twitter moved the problem from the discrete domain to the continuous domain by “projecting” servers and clients on a circle circumference. Deterministic Aperture: A distributed, load balancing algorithm blog post goes into great detail about the problem and the solution. Please read it, otherwise what follows will not make a lot of sense. If you prefer video format, watch authors present it at SREcon19³.

Weighted Deterministic Aperture

Deterministic Aperture algorithm spreads the work evenly between all servers. It is possible for clients to send a different amount of work to different servers by choosing appropriate load balancing policies (e.g. least-loaded). However, because this would happen after subsetting using the aperture the utility is limited, if not absent.

Example of why this doesn’t work: Consider 4 servers divided between 2 clients with an aperture of 2. ${Client}_0$ is assigned subset $\{\text{Server}_0, \text{Server}_1\}$ and ${Client}_1$ subset $\{\text{Server}_2, \text{Server}_3\}$ . $\text{Server}_0$ has twice the weight of any other server which means that it should get twice the work.
${Client}_1$ can’t do anything because the weights of his subset are equal. ${Client}_0$ would steer work towards $\text{Server}_0$ . But, that work would be moved away from $\text{Server}_1$ leaving it under utilized.

Using weights to divide the circumference

Authors of Deterministic Aperture did all the hard work by moving from the discrete domain to the continuous domain, and coming up with the idea that circumference should be fully covered by both servers and clients. Thus, the algorithm has all the properties required for this to work already.

Aperture.scala implementation has abstracted the [Ring] and uses 3 operations: Ring.pick, Ring.tryPickSecond and Ring.weight. These needed to be modified to support unevenly split circumference.

Weights are arbitrary numbers and can have different meanings depending on the situation. They can represent CPU resources, memory resources, storage capacity, or a any combination of these and other relevant variables.

finagle-core/src/main/scala/com/twitter/finagle/loadbalancer/aperture/Ring.scala

-private class Ring(size: Int, rng: Rng) {
+private class WeightedRing(weights: Vector[Int], rng: Rng) {
...

-  /**
-   * Returns the uniform width of any given index on the ring. The returned
-   * value is bounded between (0, 1].
-   */
-  val unitWidth: Double = 1.0 / size

+  /**
+   * Returns the width of any given index on the ring. The returned
+   * value is bounded between (0, 1].
+   */
+  def unitWidth(ix: Int): Double = weights(ix) / weights.sum

+  /**
+   * Returns the arc length from 0 to index position. The returned
+   * value is bounded between (0, 1].
+   */
+  def widthUntil(ix: Int): Double = weights.slice(0, ix).sum

...

   /**
    * Returns the (zero-based) index between [0, `size`) which the
    * position `offset` maps to.
    *
    * @param offset A value between [0, 1.0).
    */
   def index(offset: Double): Int = {
     if (offset < 0 && offset >= 1.0)
       throw new IllegalArgumentException(s"offset must be between [0, 1.0): $offset")

-    math.floor(offset * size).toInt % size
+    var length = 0
+    for (ix <- weights.indices) {
+      length += unitWidth(ix)
+
+      if (length == offset)
+        return (ix + 1) % weights.length
+      else if (length >= offset)
+        return ix
+    }
+
+    0
   }

   /**
    * Returns the ratio of the intersection between `index` and [offset, offset + width).
    */
   def weight(index: Int, offset: Double, width: Double): Double = {
-    if (index >= size)
+    if (index >= weights.length)
       throw new IllegalArgumentException(s"index must be < size: $index")
     if (width < 0 || width > 1.0)
       throw new IllegalArgumentException(s"width must be between [0, 1.0]: $width")

     // In cases where `offset + width` wraps around the ring, we need
     // to scale the range by 1.0 where it overlaps.
     val ab: Double = {
-      val ab0 = index * unitWidth
+      val ab0 = widthUntil(index)
       if (ab0 + 1 < offset + width) ab0 + 1 else ab0
     }
-    val ae: Double = ab + unitWidth
-    intersect(ab, ae, offset, offset + width) / unitWidth
+    val ae: Double = ab + unitWidth(index)
+    intersect(ab, ae, offset, offset + width) / unitWidth(index)
   }

   /**
    * Pick a random index between [0, `size`) where the range of the
    * index intersects with [offset, offset + width).
    *
    * @param width The width of the range. We interpret a width of 0 as the range
    * [offset, offset] and as such return a valid index.
    */
   def pick(offset: Double, width: Double): Int = {
     if (width < 0 || width > 1.0)
       throw new IllegalArgumentException(s"width must be between [0, 1.0]: $width")

     index((offset + (rng.nextDouble() * width)) % 1.0)
   }

   /**
    * Picks a random index between [0, `size`) where the positions for the
    * respective index intersect with [offset, offset + width), so long as
    * the index is not `a` (if the range permits it).
    *
    * @note we expose this outside of `pick2` so that we can avoid a tuple
    * allocation on the hot path.
    */
   def tryPickSecond(a: Int, offset: Double, width: Double): Int = {
     // Element `b` is picked from "piecewise" range we get by subtracting
     // the range of a, i.e.: [offset, ab), [ae, offset + width).
     // In cases where `offset + width` wraps around the ring, we need
     // to scale the range by 1.0 where it overlaps.
     val ab: Double = {
-      val ab0 = (a * unitWidth)
+      val ab0 = widthUntil(a)
       if (ab0 + 1 < offset + width) ab0 + 1 else ab0
     }
-    val ae: Double = ab + unitWidth
+    val ae: Double = ab + unitWidth(a)

     val overlap = intersect(ab, ae, offset, offset + width)
     val rem = width - overlap

     if (rem > 0) {
       // Instead of actually splitting the range into two, we offset
       // any pick that takes place in the second range if there is a
       // possibility that our second choice falls within [ab, ae].
       //
       // Note, special care must be taken to not bias towards ae + overlap, so
       // we treat the entire range greater than it uniformly.
       var pos = offset + (rng.nextDouble() * rem)
       if (pos >= ae - overlap) { pos += overlap }
       index(pos % 1.0)
     } else {
       // The range [offset, offset + width) is equivalent to [ab, ae).
       a
     }
   }

...

Interactive simulation

At the top of this page, you should see an interactive simulation of this algorithm. Set an arbitrary number of servers, clients, and an aperture. The ring visualization will display how the circle is divided between servers and clients.

Arcs on the circle are interactive. Selecting a server (green color) highlights all the clients that connect to it. Selecting a client (pink color) shows its computed physical aperture, and, depending on color intensity, server weights from the client point of view (aperture-server intersection relative to $1/|\text{servers}|$ ).

Verbose State section contains 2 tables with clients and servers states. For servers, it is possible to set custom weights and run a simulation where each client, turn by turn, will pick a server and issue a request using Weighted Deterministic Aperture algorithm.

Notes

~~There is a bug in the Ring.weight method in the original implementation.~~ (fixed as of finagle-21.1.0-gc4dc4fd) The version I’m using for this simulation:

weight(index, offset, width) {
  const a = [
    this.widthUntil(index),
    this.widthUntil(index) + this.unitWidth(index),
  ]
  const b = [offset, offset + width]

  const [first, last] = a[0] < b[0] ? [a, b] : [b, a]

  let i = intersect(...first, ...last)
  // If we wraparound check for any intersection after the wraparound,
  //  it can happen at most once.
  if (last[1] > 1) i += intersect(...first, 0, last[1] % 1.0)

  return i / (1 / this.weights.length)
}

B. Beyer, C. Jones, J. Petoﬀ, and N. R. Murphy. Site Reliability Engineering: How Google Runs Production Systems. “O’Reilly Media, Inc.”, 2016↩
This may remind you of hashing (e.g. Consistent hashing, Rendezvous hashing).↩
SREcon19 Americas - Aperture: A Non-Cooperative, Client-Side Load Balancing Algorithm ↩