TL;DR

It’s beneficial, sometimes, to buy in bulk. Or in this case, to rank in bulk. I’ll look at a recent change to our open source leaderboard library that significantly speeds up ranking members in a leaderboard when they are inserted in bulk.

COSTCO AND YOU

Let’s face it, we’ve all walked through Costco or Sam’s Club at some point and thought to ourselves, “I do need a pallet of chocolate-covered pretzels,” or “We could use a drum of grape jam at home.” Regardless of whether or not these are valid units of measurement, when you know you’re going to need the bulk amount, buying it in one purchase takes less time than buying the items one by one. So let’s explore this idea with leaderboards.

Let’s look at the performance of ranking 1 million members in a leaderboard, one member at a time.

insert_time = Benchmark.measure do
  1.upto(1000000) do |index|
    highscore_lb.rank_member("member_#{index}", index)
  end
end
 => 29.340000 15.050000 44.390000 ( 81.673507)

81 seconds isn’t bad, but can we do better? What if a catastrophic failure forces us to rebuild our leaderboards from scratch? Every second counts, right? Let’s rank the same 1 million members in the leaderboard all at once.

member_data = []
 => []
1.upto(1000000) do |index|
  member_data << "member_#{index}"
  member_data << index
end
 => 1
insert_time = Benchmark.measure do
  highscore_lb.rank_members(member_data)
end
 =>  22.390000   6.380000  28.770000 ( 31.144027)
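The bulk call above takes a flat array that alternates member names and scores. As a small aside, here is one way that same array shape could be built in a single expression. This is only a sketch: `build_member_data` is a hypothetical helper name, not part of the leaderboard library, and the count is kept tiny so the result is easy to inspect.

```ruby
# Hypothetical helper (not part of the leaderboard library): builds the flat
# [member, score, member, score, ...] array used in the bulk example.
def build_member_data(count)
  (1..count).flat_map { |index| ["member_#{index}", index] }
end

member_data = build_member_data(3)
p member_data
# prints ["member_1", 1, "member_2", 2, "member_3", 3]
```

For the benchmark in this post you would call it with 1000000 instead of 3 and hand the result to `rank_members` exactly as before.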

31 seconds! That is roughly 2.6 times faster, so “buying in bulk” really paid off. It helps to understand your underlying data store, in this case Redis, and to know when it can handle bulk operations to your advantage. If we were to rank, say, 10 million members in a leaderboard, and the timings scaled linearly, we’d be looking at almost 14 minutes (ranking individually) vs. about 5 minutes (ranking in bulk).
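If you want to check the 10 million member estimate, the arithmetic is short. The sketch below uses the two wall-clock times from the benchmarks above and assumes the time grows linearly with the number of members, which is an assumption for illustration, not something measured here.

```ruby
# Wall-clock seconds taken from the two benchmark runs in this post.
individual_seconds = 81.673507 # 1 million rank_member calls
bulk_seconds = 31.144027       # one rank_members call with 1 million members

speedup = individual_seconds / bulk_seconds

# Assumption: ranking 10 million members takes 10 times as long.
scale = 10
individual_minutes = individual_seconds * scale / 60
bulk_minutes = bulk_seconds * scale / 60

puts format("speedup: %.1fx", speedup)
puts format("individual: %.1f min, bulk: %.1f min", individual_minutes, bulk_minutes)
# prints:
# speedup: 2.6x
# individual: 13.6 min, bulk: 5.2 min
```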

FIN

As shown in this post, bulk operations can significantly improve the performance of your systems if the underlying data store can optimize those bulk operations. So don’t worry about that desk of Cheez-Its you just purchased. You’re worth it!