
How to Fix Lopsided Hash Slots in Redis

In Redis, the primary unit of distribution is a hash slot. Distributed versions of Redis, including the open source Redis Cluster, the commercial Redis Enterprise and even AWS ElastiCache, can only move data around one slot at a time.

This leads to an interesting problem - lopsided slots. What if one slot (or a few slots) ends up holding most of the data?

Is that even possible?

Redis decides the hash slot for a key using a well-documented algorithm: the CRC16 checksum of the key, modulo 16384. This usually ensures that keys are evenly distributed across slots.
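As a rough sketch of the slot calculation: Redis Cluster takes the CRC16 (XMODEM variant) of the key and reduces it modulo 16384. The helper name `key_slot` is made up for illustration, and it leans on Python's `binascii.crc_hqx`, which computes that checksum when seeded with 0. This version deliberately ignores hash tags.

```python
import binascii

NUM_SLOTS = 16384  # Redis Cluster always has 16384 hash slots


def key_slot(key):
    """Hash slot for a key that contains no hash tag (illustrative helper)."""
    if isinstance(key, str):
        key = key.encode("utf-8")
    # crc_hqx with an initial value of 0 is the CRC16/XMODEM checksum
    return binascii.crc_hqx(key, 0) % NUM_SLOTS


if __name__ == "__main__":
    for k in ("users:1234", "users:5432", "comments:18648"):
        print(k, key_slot(k))
```

On a live cluster, the result for a tag-free key should agree with what `CLUSTER KEYSLOT <key>` reports.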

But developers can influence the algorithm by specifying a hash tag. A hash tag is a portion of the key enclosed in curly braces {...}. When a hash tag is specified, only that portion of the key is hashed to decide the hash slot.

The hash tag in Redis is what most databases would call a partition key. If you choose the wrong partition key, you will get lopsided slots.

As an example, if your keys are like {users}:1234 and {users}:5432, Redis will store all users in the same hash slot.
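To see the effect, here is a small sketch that applies the hash tag rule (hash only the text between the first `{` and the next `}`, provided it is non-empty) before computing the slot. The names `hash_tag_slot` and `slot_histogram` are hypothetical helpers, not part of any Redis client.

```python
import binascii
from collections import Counter


def hash_tag_slot(key):
    """Hash slot for a key, honouring a {hash tag} if one is present."""
    data = key.encode("utf-8")
    start = data.find(b"{")
    if start != -1:
        end = data.find(b"}", start + 1)
        # Only a non-empty tag counts; "{}" means the whole key is hashed
        if end > start + 1:
            data = data[start + 1:end]
    return binascii.crc_hqx(data, 0) % 16384


def slot_histogram(keys):
    """Count how many of the given keys land in each slot."""
    return Counter(hash_tag_slot(k) for k in keys)


bad = ["{users}:%d" % i for i in range(1000)]
good = ["users:{%d}" % i for i in range(1000)]
print(len(slot_histogram(bad)))   # a single slot holds every key
print(len(slot_histogram(good)))  # keys spread over many slots
```

Running a histogram like this over a sample of real key names is a cheap way to spot a bad hash tag before the data piles up.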

What’s the fix?

The fix is conceptually simple - you need to rename the key to remove the incorrect hash tag. So renaming {users}:1234 to users:{1234} or even users:1234 should do the trick…
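Computing the new name is plain string manipulation. The sketch below assumes keys of the form `{prefix}:id[:rest]` and moves the braces onto the id, so that everything belonging to one user still shares a slot; `retag_key` is a hypothetical helper, and real key layouts may need a different pattern.

```python
import re

# Matches a leading "{tag}:" followed by the rest of the key
_LEADING_TAG = re.compile(r"^\{([^{}]+)\}:(.+)$")


def retag_key(key):
    """Turn '{users}:1234' into 'users:{1234}'; leave other keys unchanged."""
    match = _LEADING_TAG.match(key)
    if match is None:
        return key
    prefix, rest = match.groups()
    # Tag only the first segment after the prefix (the entity id)
    entity_id, sep, tail = rest.partition(":")
    return "%s:{%s}%s%s" % (prefix, entity_id, sep, tail)


if __name__ == "__main__":
    for k in ("{users}:1234", "{users}:12328:answers_by_score", "comments:18648"):
        print(k, "->", retag_key(k))
```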

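… except that the RENAME command doesn’t work here. In Redis Cluster, RENAME requires the old and the new name to hash to the same slot, and the whole point of this rename is to change the slot.

So the only way out is to first DUMP the key and then RESTORE it under the new name.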

Here is how it looks in code:



from redis import StrictRedis
try:
    # Python 2
    from itertools import izip_longest
except ImportError:
    # Python 3 renamed izip_longest to zip_longest
    from itertools import zip_longest as izip_longest


def get_batches(iterable, batch_size=2, fillvalue=None):
    """
    Chunks a very long iterable into smaller chunks of `batch_size`
    For example, if iterable has 9 elements, and batch_size is 2,
    the output will be 5 iterables - each of length 2. 
    The last iterable will also have 2 elements, 
    but the 2nd element will be `fillvalue`
    """
    args = [iter(iterable)] * batch_size
    return izip_longest(fillvalue=fillvalue, *args)


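def migrate_keys(allkeys, host, port, password=None):
    db = 0  # Redis Cluster only supports database 0
    # Note: StrictRedis talks to a single node and does not follow cluster
    # redirects, so the old and new key names must both map to this node.
    red = StrictRedis(host=host, port=port, db=db, password=password)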

    batches = get_batches(allkeys)
    for batch in batches:
        pipe = red.pipeline()
        keys = list(batch)
        for key in keys:
            if not key:
                continue
            pipe.dump(key)
            
        response = iter(pipe.execute())
        # New pipeline to run the restore command
        pipe = red.pipeline(transaction=False)
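        for key in keys:
            if not key:
                continue
            obj = next(response)
            if obj is None:
                # DUMP returns nil for keys that do not exist; skip them
                continue
            new_key = "restored." + key
            # A ttl of 0 restores the key without any expiry
            pipe.restore(new_key, 0, obj)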

        pipe.execute()


if __name__ == '__main__':
    allkeys = ['users:18245', 'users:12328:answers_by_score', 'comments:18648']
    migrate_keys(allkeys, host="localhost", port=6379)
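
The script above copies each key to a new name but leaves the original in place and drops any expiry. As a hedged sketch of a fuller per-key move, the hypothetical helper `move_key` below carries the remaining TTL across and deletes the old key afterwards. It assumes `client` is a cluster-aware redis-py client (for example `redis.cluster.RedisCluster` in redis-py 4.1 and later), so that each command is routed to the node owning the key's slot.

```python
def move_key(client, old_key, new_key):
    """Copy old_key to new_key via DUMP/RESTORE, then delete old_key.

    Returns True if the key was moved, False if old_key did not exist.
    The steps are not atomic: writes to old_key made between DUMP and
    DEL are lost, so run this while writers are paused or idle.
    """
    payload = client.dump(old_key)
    if payload is None:
        return False
    # PTTL is in milliseconds; a negative value means "no expiry"
    ttl_ms = client.pttl(old_key)
    if ttl_ms is None or ttl_ms < 0:
        ttl_ms = 0  # RESTORE treats a ttl of 0 as "no expiry"
    # RESTORE raises an error if new_key already exists
    client.restore(new_key, ttl_ms, payload)
    client.delete(old_key)
    return True
```

A call would look like `move_key(client, "{users}:1234", "users:{1234}")`. Verifying the restored value before the delete is a sensible extra step for data you cannot afford to lose.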