Redis Change Proposal: Configuration Improvements

Redis already has one of the most extensive sets of configuration options of any data store. But we can do better. I proposed these changes at Redis Developer Day 2015 in London this week; this post is the more detailed version.

Change Summary And Raison d’être

You can configure Redis via the traditional config file method, through the API, and, though less well known, via the command line, where you prepend the directive name with ‘--’ to turn it into a command switch. But with the increasing prevalence of containers, and to meet evolving ideas in operational management, Redis can benefit from some improvements in this space. This proposal is a collection of changes which bring these benefits to Redis users and to those who need to support it in an operational capacity.

Some of these changes are connected with non-configuration changes. As a result there will be some overlap in this proposal with some of the others.

Environment Based Configuration

The “12 Factor App” methodology describes how to design software with operations in mind. Redis, though unintentionally, already incorporates several of its factors. The factor being addressed here is environment-based configuration: software reads its configuration from the shell’s environment variables.

The priority order for values supplied in more than one location would be:

  • Environment Variables -> Command Line -> Config file

Now, the order of the first two is potentially contentious. Normally the command line would override the environment variables. However, I think there is a reason for inverting these two: Docker, or more specifically the Dockerfile. When running Redis via Docker, the Dockerfile has to be set up in a way that allows you to pass command line options. If the Dockerfile is not set up this way you can’t pass them in. Normally this would make the priority order irrelevant. However, these Dockerfiles often set CLI options in the Dockerfile itself.

If we run with the traditional CLI > ENV route, you are locked into what the Dockerfile specifies. By prioritizing the ENV over the CLI we get around this problem, ensuring that all non-file configuration options are available at container run time, regardless of choices made at container creation time.

That is to say, if your config file has “port 6379” and you launch with ‘--port 7000’, that Redis instance will bind to port 7000. If you then set the corresponding environment variable to 8000, it would listen on port 8000. This priority sequence allows you to have sane defaults which can be customized at runtime. Redis already embraces sane defaults by not requiring a config file at all, so this provides full coverage. I’d propose the prefixes be REDIS_ and SENTINEL_. Since hyphens are not allowed in environment variable names, they would need to be converted from underscores. Thus, to set set-max-intset-entries via the environment you would set REDIS_SET_MAX_INTSET_ENTRIES.
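To make the proposed resolution order and name conversion concrete, here is a minimal Python sketch. The function names are purely illustrative, not a proposed implementation:

```python
import os

def env_var_name(directive, prefix="REDIS_"):
    # Hyphens are not legal in environment variable names, so each
    # hyphen becomes an underscore and the whole name is upper-cased.
    return prefix + directive.replace("-", "_").upper()

def resolve(directive, cli_args, file_config, environ=os.environ):
    # Proposed priority: environment > command line > config file.
    env_key = env_var_name(directive)
    if env_key in environ:
        return environ[env_key]
    if directive in cli_args:
        return cli_args[directive]
    return file_config.get(directive)

# The port example from above: file says 6379, CLI says 7000, env says 8000.
port = resolve("port", {"port": "7000"}, {"port": "6379"},
               {"REDIS_PORT": "8000"})
# port is "8000": the environment wins.
```

Dropping the environment variable would fall back to the CLI value (7000), and dropping that falls back to the config file (6379), giving the sane-defaults-plus-runtime-override behavior described above.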

From an operations point of view, environment-based configuration provides flexibility while reducing friction - a rare combination. Because Redis can rewrite its own configuration file, you don’t want a configuration management system stomping over those changes. Rather than maintaining custom config files, you set environment variables at runtime, eliminating this point of operational contention. Containerization is another growing area where environment-based configuration is a significant win.

Redis containers in Docker have a nice mechanism for dynamic configuration available: the environment. You may not want to have everything in a configuration file - or pass it via the command line. Thus, adding environment variables as a route to configuration would be most excellent.

Announce IP, Announce Port

As we have in Sentinel, Redis needs the ability to be told what IP and/or port to use when it connects to a master as a slave, or interacts with Sentinel - and for the same reasons. Behind a NAT your Redis instance sees different connectivity information than it actually has for off-host connectivity. To handle this we need to follow in Sentinel’s footsteps and add ‘announce-ip’ and ‘announce-port’. These need to be settable via all configuration mechanisms (environment, command line, file, and API). Note that changing these at runtime after a slaveof directive or command will mean the slave needs to inform the master of its true connectivity. There is an alternative I’ll discuss later in this proposal.

Instance Name

This is a simple variable useful for identifying a specific instance. It is distinct from a RunId in that it persists across restarts. It will need to be settable via all regular configuration mechanisms. As Sentinel (and Redis itself) is already very discoverable, being able to name instances becomes extremely handy in dynamic and/or large deployments. It will improve the ability of Redis management tools to discover and report on Redis instances, pods, and clusters. Note: there are already some cases of a “name” being associated with an instance, but they are currently IP:PORT combinations.

Another way to handle this could be to consider this information a type of metadata for the instance. I’m of two minds on this subject. Which is better depends on how deeply the instance name gets used inside Redis (including in Sentinel and/or Cluster). If it becomes heavily used by Redis itself, it should be a configuration item. Otherwise it is probably a better fit for the next section: metadata.

Per-Instance Metadata

Another way of accomplishing the instance name is to have a ‘metadata’ command which acts like the config command but sets or gets metadata about the instance instead. This data would only be accessible via the meta command. With this command you could do ‘meta set name roslave-01’ to set the name. You could also do things such as ‘meta set business-group operations’ and ‘meta set zone a’ to further classify the instance.

Why a metadata store in Redis? Because Redis is already highly discoverable, and being able to essentially tag an instance with additional data extends the usefulness of this capability. It becomes highly useful for Redis management systems. Metadata should be handled the same way as configuration data: you should be able to specify it via all normal configuration means - file, CLI options, environment, and the ‘meta’ command in the API.
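To pin down the intended semantics, here is a minimal Python sketch of such a store. The MetaStore class and its method names are hypothetical stand-ins for the proposed ‘meta’ command:

```python
class MetaStore:
    # A map kept apart from the keyspace; fields are only reachable
    # through the (hypothetical) META command, mirroring CONFIG's
    # get/set shape rather than the normal data commands.
    def __init__(self):
        self._meta = {}

    def meta_set(self, field, value):
        self._meta[field] = value
        return "OK"

    def meta_get(self, field):
        return self._meta.get(field)

# The examples from above: name, then further classification.
meta = MetaStore()
meta.meta_set("name", "roslave-01")
meta.meta_set("business-group", "operations")
meta.meta_set("zone", "a")
```

Because the fields live outside the keyspace, commands such as KEYS, SCAN, DBSIZE, and FLUSHALL would never see them, which is the crux of the next point.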

Why not store in Redis?

  • An admin may not want their users to have access to that data via normal commands. By using a dedicated command it could be renamed or could be restricted if/when we get multi-user or multi-role capability.

Real-life example

Say you are doing multi-regional availability. Sure, Redis isn’t built for it, but you need it. How can you ensure your Sentinels pick based on DC first? Sure, you could manually configure DCs two and three with a different slave-priority, but what happens when the “master” DC dies? You have to go reconfigure everything. So perhaps you’d like to store decision-making information in the Redis instances themselves and implement your own version of Sentinel. By being able to assign metadata to each instance you could indeed do this - and without standing up a separate datastore or overloading an existing one.

Anyone who has poked around ElastiCache may have noticed there is a key they don’t control. IMO this should not be somewhere the user has access to - if for no other reason than that it can skew code which calculates or iterates over the keys in a database, or produce unintended results such as a flush not leaving a dbsize of 0. With an inbuilt metadata store, keeping such data out of the keyspace becomes a reality.

Configuration Sync From Master to Slave(s)

Currently, if you connect to a master and change a config variable - such as persistence or memory optimization settings - the change is only made on the master. Redis needs the ability to push certain changes to one or more slaves, rather than requiring client management code to go do it for you.

Option One

There are a few ways to do this, though they are not mutually exclusive. One is to be able to specify a list of directives to always sync to all slaves. For example:

config sync all hash-max-ziplist-entries hash-max-ziplist-values
config sync slave-01 save

The first tells Redis that when a config set is executed for the hash-max-ziplist-* settings, those changes should be pushed to all slaves. The second tells Redis to push any changes made to the save setting to slave-01.

Another option is for the form to be one of exclusions:

config sync all save hash-max-ziplist-entries hash-max-ziplist-values
config nosync slave-01 save
config sync slave-02 set-max-intset-entries

In this form all slaves except slave-01 will have changes to save or the hash-max-ziplist-* settings replicated to them. Slave-01 will get the hash-max-ziplist settings but NOT the save changes. Slave-02 will ALSO get the set-max-intset-entries changes. Not all directives should be replicated, so a blacklist of sorts is needed. Key examples include any announce-* changes, the instance name, slave settings, etc.
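The exclusion-style policy can be sketched as a small decision function. Everything here - the function, the policy structures, and the blacklist contents - is illustrative of the semantics, not a proposed implementation:

```python
# Illustrative blacklist: directives that should never be replicated,
# per the examples above (announce-* settings, the name, slave settings).
BLACKLIST = {"name"}
BLACKLIST_PREFIXES = ("announce-", "slave")

def targets_for(directive, sync_all, per_slave_sync, per_slave_nosync, slaves):
    # Return the set of slaves a changed directive is pushed to.
    if directive in BLACKLIST or directive.startswith(BLACKLIST_PREFIXES):
        return set()
    targets = set()
    for slave in slaves:
        wanted = (directive in sync_all
                  or directive in per_slave_sync.get(slave, set()))
        excluded = directive in per_slave_nosync.get(slave, set())
        if wanted and not excluded:
            targets.add(slave)
    return targets

# The policy from the example above.
sync_all = {"save", "hash-max-ziplist-entries", "hash-max-ziplist-values"}
per_slave_sync = {"slave-02": {"set-max-intset-entries"}}
per_slave_nosync = {"slave-01": {"save"}}
slaves = ["slave-01", "slave-02"]
```

Running this policy through the examples: a change to save reaches only slave-02, a hash-max-ziplist change reaches both slaves, and a set-max-intset-entries change reaches only slave-02 - matching the description above.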

Another Possibility

Another way to go about it is to add a ‘sync’ option to the config set command. For example:

config set save ‘’ sync all


config set save ‘60 100’ sync slave-01 slave-03

This could even be done in tandem with the config sync/config nosync options to allow per-invocation syncs. This option is likely quicker to implement, and lets you determine on a given config set command whether, and where, to sync it. On the other hand, it means your code always has to take this into account, whereas with the first option it is set by policy.
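Parsing the per-invocation form is straightforward. This sketch assumes the value is a single (possibly quoted) token, as in the examples above; the function name is hypothetical:

```python
def parse_config_set(tokens):
    # Split a hypothetical "CONFIG SET <directive> <value> [SYNC <target>...]"
    # argument list into its parts; no SYNC clause means no push.
    if "sync" in tokens:
        i = tokens.index("sync")
        return tokens[0], tokens[1], tokens[i + 1:]
    return tokens[0], tokens[1], []
```

For example, the second command above arrives as the tokens ["save", "60 100", "sync", "slave-01", "slave-03"] and splits into the directive, its value, and the two sync targets.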

The first option also enables configs to be pushed on startup, and handles cases where a management tool already exists and makes changes but shouldn’t be making the sync decisions.

Metadata Sync

While most metadata would be specific to an instance, there are cases where it should be replicated. For those cases I propose we do the same thing for metadata. A real example: in Sentinel we name each pod, yet that information is not available in the instances in said pod. This means you can’t easily reverse-discover your setup.

Additionally, and thanks to Salvatore for spotting it, if Sentinel were to set the metadata key ‘sentinel-name’ (or whatever we decide to call it) on the master when you call sentinel monitor, it could be used by Sentinel for cross-checking configuration. For example, Sentinel could interrogate a new master only to find it already has that field set with a different name. In that case it could refuse to pile on, returning an error.
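The cross-check amounts to a few lines of logic. In this sketch ‘sentinel-name’ is the proposal’s working field name, and the master’s metadata is modeled as a plain dict:

```python
def check_monitor(master_meta, pod_name):
    # Before monitoring a master, compare any existing 'sentinel-name'
    # metadata against the pod name we were asked to monitor. A mismatch
    # means another Sentinel deployment already claims this master, so
    # refuse rather than pile on.
    existing = master_meta.get("sentinel-name")
    if existing is not None and existing != pod_name:
        raise ValueError("master already monitored as %r, refusing %r"
                         % (existing, pod_name))
    master_meta["sentinel-name"] = pod_name
```

A fresh master gets tagged; a master already tagged with the same pod name passes; a master tagged with a different pod name produces the error.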

Post-Start Initialization Phase

This change provides a window of time after startup during which Redis does not serve or accept data. It would be configurable via a config parameter (such as readiness-delay), where 0 disables it (by virtue of the time-to-wait being 0) and a number of (milli?)seconds means the server will wait that long for post-start initialization and configuration commands. A new config sub-command, such as config ready, would end the delay and place the server in serving mode.
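The window’s behavior can be sketched as a small state machine. Here ‘readiness-delay’ is the suggested parameter name from above, and the class is purely illustrative:

```python
import time

class ReadinessGate:
    # 0 disables the window entirely; a positive value (milliseconds
    # here) opens a window during which data commands are refused.
    def __init__(self, readiness_delay_ms):
        if readiness_delay_ms > 0:
            self._deadline = time.monotonic() + readiness_delay_ms / 1000.0
        else:
            self._deadline = None

    def config_ready(self):
        # The proposed CONFIG READY sub-command: end the window early.
        self._deadline = None

    def serving(self):
        # Data commands are accepted only once the window has ended,
        # either by timeout or by an explicit CONFIG READY.
        return self._deadline is None or time.monotonic() >= self._deadline
```

With readiness-delay 0 the server serves immediately, preserving current behavior; with a positive delay it waits for either the timeout or config ready.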

The purpose of this is to enable the administrator to set various things that cannot, or should not, be specified in the config file - or that should be overridden at run-time. We have a few examples of this already: Redis/Sentinel behind a NAT, or Redis being reconfigured as a slave or master.

One of the items Salvatore discussed this year was the concept of a “protected restart” mode, to prevent a scenario where a diskless-replication setup could lose all of its data through a failed restart. That mode is quite similar to the one I propose here, in that it doesn’t serve or accept data changes while active. After some discussion we arrived at the idea that, since this state can be queried, if you are using this mode and (as you should) making use of the config ready command, you should query for that state before issuing it. This ensures both cases are covered without requiring Redis to always do Yet More Checks.

Example Scenario

A Redis server is run behind a NAT-ed IP address, such as in a VM or container. You do not know what IP and port will be assigned to the service when starting it. With the new ‘announce-ip’ and ‘announce-port’ directives and a readiness-delay long enough to discover them, the code launching the instance in a container can discover the client-facing IP:PORT pair, then:

  • launch instance, discover connectivity
  • call on the new instance: config set announce-port 7654
  • call on the new instance: config set announce-ip <discovered IP>
  • (obtain the master IP through some mechanism such as Docker’s API)
  • call on the new instance: slaveof <master IP> 6379
  • call on the new instance: config ready
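The steps above can be sketched as a small driver. Here `send` stands in for whatever client call the launch wrapper uses, and the 203.0.113.x addresses are placeholder documentation IPs:

```python
def configure_new_slave(send, announce_ip, announce_port, master_ip):
    # Drive the post-start window for a NATed slave: announce the
    # client-facing address, attach to the master, then end the window.
    send("config", "set", "announce-port", str(announce_port))
    send("config", "set", "announce-ip", announce_ip)
    send("slaveof", master_ip, "6379")
    send("config", "ready")

sent = []
# Record each command instead of talking to a real server.
configure_new_slave(lambda *args: sent.append(" ".join(args)),
                    "203.0.113.10", 7654, "203.0.113.1")
```

The recorded commands match the bullet list: announce-port, announce-ip, slaveof, then config ready to start serving.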

Additional Thoughts on Config Mode

With some time to think more about this I wonder if the config ready should have a counterpart config notready for cases where you need to go the other way around. Perhaps you have found an issue which requires reconfiguration. Being able to stop data serving could give you time to reconfigure without needing to restart.

An Idea From The Void

To throw out a wild idea, I also brought up the possibility of storing and accessing configuration in a backing store such as Consul. While it isn’t part of the forthcoming RCP, the idea holds a lot of merit: it would allow new capabilities and interaction between Redis nodes in pods and clusters, as well as for proxies and client/server information.