How to Build a Highly Available Redis Architecture?

Wen Guobing, Architect's Secret Circle

Author: Wen Guobing, formerly at Kugou Music, is now a DBA at 37 Interactive Entertainment. His current main areas of interest are database operations automation, high availability architecture design, database security, massive-data solutions, and the use of open source technology in Internet applications.

One, Introduction

Redis is an open source, in-memory key-value database written in ANSI C. It is network-accessible, supports log-based persistence, and provides client APIs for many languages.

Internet business data is growing ever faster and data types are becoming ever richer, which places higher demands on the speed and capacity of data processing. Redis is an open source, non-relational, in-memory database that has changed how developers work. High performance was a design goal from the very beginning, which makes Redis one of the fastest NoSQL databases available today.

Alongside high performance, high availability is an equally important consideration. Internet services run 7×24 without interruption, so failing over as quickly as possible when a fault occurs keeps the loss to the business as small as possible.

So which high availability architectures are used in practice? What are the pros and cons of each? How should we choose among them? And what are the best practices?

Two, Sentinel principle

Before looking at the Redis high availability schemes, let's first see how Redis Sentinel (https://redis.io/topics/sentinel) works.

1. On startup, the Sentinel cluster finds the master through the given configuration file and begins monitoring it. By sending the INFO command to the master, it obtains all the slave servers under that master.
2. Sentinels send hello messages (once every two seconds) to the monitored master and slave servers over the command connection. Each message includes the Sentinel's own IP, port, ID, and so on, and announces its presence to the other Sentinels.
3. Sentinels receive the hello messages sent by other Sentinels over the subscription connection, and in this way discover the other Sentinels monitoring the same master. Sentinels create command connections to each other for communication; because the master and slave servers already act as the intermediary for hello messages, Sentinels do not create subscription connections to each other.
4. Sentinels use the PING command to check the state of each instance. If there is no reply within the specified time (down-after-milliseconds), or an invalid reply is returned, the instance is judged to be subjectively down (SDOWN).
5. Failover does not start the moment a master appears to be down. Once the number of Sentinels that consider the master down reaches the configured quorum, the master enters the objectively down (ODOWN) state, and the Sentinel that will carry out the failover must then be authorized by a majority of the Sentinels. For example, with 5 Sentinels and a quorum of 2, the master is marked ODOWN as soon as 2 Sentinels think it is dead, but the failover itself still needs authorization from at least 3 Sentinels.
6. The Sentinel sends the SLAVEOF NO ONE command to the slave selected as the new master. Slaves are ranked first by priority: the smaller the priority value, the higher the rank. If the priorities are equal, the replication offset is compared, and the slave that has received more replicated data from the master ranks higher. If both the priority and the offset are equal, the slave with the smaller run ID is chosen.
7. Once authorized, the Sentinel obtains a new configuration number (config-epoch) for the failed master. When the failover completes, this number versions the new configuration, which is broadcast to the other Sentinels so that they update their configuration of the corresponding master.
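
The ranking rule in step 6 can be sketched in a few lines of Python. This is only an illustration of the ordering described above, not Sentinel's actual code; the Replica class, its field names, and the sample values are hypothetical, and the exclusion of priority 0 reflects the Redis rule that such a replica is never promoted.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    run_id: str        # unique ID Redis generates for each server process
    priority: int      # replica priority setting; a lower value is preferred
    repl_offset: int   # how much of the master's replication stream was received


def pick_new_master(replicas):
    """Return the best promotion candidate, or None if there is none."""
    # A priority of 0 marks a replica that must never be promoted.
    candidates = [r for r in replicas if r.priority > 0]
    if not candidates:
        return None
    # Order by: lowest priority value, then highest offset, then smallest run ID.
    return min(candidates, key=lambda r: (r.priority, -r.repl_offset, r.run_id))


replicas = [
    Replica(run_id="c3", priority=100, repl_offset=5000),
    Replica(run_id="a1", priority=100, repl_offset=7000),
    Replica(run_id="b2", priority=100, repl_offset=7000),
    Replica(run_id="d4", priority=0, repl_offset=9000),
]
print(pick_new_master(replicas).run_id)  # a1
```

Here d4 is skipped despite having the most data, c3 loses on offset, and a1 beats b2 only on the run ID tie-break.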

Steps 1 to 3 make up the automatic discovery mechanism:

Every 10 seconds, a Sentinel sends the INFO command to the monitored master and reads the master's current information from the reply.
Every second, it sends a PING command to all Redis servers, including the other Sentinels, and uses the reply to decide whether each server is online.
Every 2 seconds, it sends the current Sentinel and master information to all monitored master and slave servers.

Step 4 is the detection mechanism, steps 5 and 6 are the failover mechanism, and step 7 is the configuration update mechanism. [1]
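
The two thresholds involved in detection and failover are easy to confuse, so here is a minimal Python sketch of them. It assumes the rule documented for Redis Sentinel: the quorum decides when a master is objectively down, while the failover needs votes from a majority of Sentinels (or from quorum Sentinels, if the quorum is larger than the majority). The function names are hypothetical.

```python
def is_objectively_down(down_votes, quorum):
    """ODOWN: enough Sentinels agree that the master is unreachable."""
    return down_votes >= quorum


def failover_authorized(auth_votes, total_sentinels, quorum):
    """The Sentinel running the failover needs a majority, and at least quorum votes."""
    majority = total_sentinels // 2 + 1
    return auth_votes >= max(majority, quorum)


# 5 Sentinels with quorum 2, as in the example in step 5.
print(is_objectively_down(2, quorum=2))                     # True
print(failover_authorized(2, total_sentinels=5, quorum=2))  # False
print(failover_authorized(3, total_sentinels=5, quorum=2))  # True
```

A low quorum therefore makes detection more sensitive, but it never lets a minority of Sentinels perform a failover on their own.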

Three, Redis high availability architecture

Having covered how Redis Sentinel works, we now turn to the commonly used Redis high availability architectures.

Redis Sentinel Cluster + intranet DNS + custom script
Redis Sentinel Cluster + VIP + custom script
Client wrapper connecting directly to the Redis Sentinel port

JedisSentinelPool, suitable for Java
For PHP, a custom wrapper based on phpredis

Redis Sentinel Cluster + Keepalived/Haproxy
Redis M/S + Keepalived
Redis Cluster
Twemproxy
Codis

Next, I will go through them one by one.

1、Redis Sentinel Cluster + intranet DNS + custom script

The above is a scheme we run in production. The bottom layer is a Redis Sentinel cluster that manages the Redis master-slave pairs, and the web tier reaches the service through intranet DNS names. The intranet names are allocated by a fixed rule such as xxxx.redis.cache/queue.port.xxxx.xxxx: the first segment is the business abbreviation, the second marks the name as a Redis intranet domain name, the third is the Redis type (cache for a cache, queue for a queue), the fourth is the Redis port, and the fifth and sixth segments are the main intranet domain.

When a master node fails, for example because of a machine failure, a Redis node failure, or an unreachable network, the Sentinel cluster invokes the script configured in client-reconfig-script to update the intranet domain name for the corresponding port, so that the name points to the new Redis master node.
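
As a rough illustration of the naming rule and of what the failover script has to publish, here is a small Python sketch. It assumes the second segment is the literal "redis"; the business name, base domain, IP address, TTL, and function names are all hypothetical, and the call to the actual DNS system is left out because the article does not describe it.

```python
def redis_domain(business, kind, port, base_domain):
    """Build an intranet name following the business.redis.type.port.domain rule."""
    if kind not in ("cache", "queue"):
        raise ValueError("kind must be 'cache' or 'queue'")
    return f"{business}.redis.{kind}.{port}.{base_domain}"


def dns_update(domain, new_master_ip, ttl=5):
    """Describe the A record the failover script should publish."""
    # A short TTL limits how long clients keep resolving to the old master.
    return {"name": domain, "type": "A", "value": new_master_ip, "ttl": ttl}


name = redis_domain("shop", "cache", 6379, "corp.internal")
print(name)  # shop.redis.cache.6379.corp.internal
print(dns_update(name, "10.0.0.12"))
```

Because the port is part of the name, the script can derive the record to change directly from the port Sentinel reports for the failed master.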

Advantages:

Switchover in seconds; the handover completes within 10s.

The scripts are custom, so the architecture stays under our control.

Transparent to applications; the front end need not care what happens in the back end.

Shortcomings:

Maintenance cost is slightly higher; a Redis Sentinel cluster should run on 3 or more machines.

It depends on DNS, so there is a resolution delay.

In Sentinel mode the service is unavailable for a short time during the switchover.

The service cannot be accessed from the external network.

2、Redis Sentinel Cluster + VIP + custom script

This scheme differs only slightly from the previous one: where the first scheme uses intranet DNS, the second replaces it with a virtual IP (VIP). The bottom layer is a Redis Sentinel cluster that manages the Redis master-slave pairs, and the web tier reaches the service through the VIP. When deploying the Redis master and slave, the virtual IP must be bound to the current Redis master node. When a master node fails, for example because of a machine failure, a Redis node failure, or an unreachable network, the Sentinel cluster invokes the script configured in client-reconfig-script to drift the VIP to the new master node.
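
One possible shape for the VIP drift is sketched below in Python. It only builds the commands and does not run them. The assumptions are that the hosts are Linux machines with the iproute2 `ip` tool and the iputils `arping` tool; the VIP, the interface name, and the function name are hypothetical.

```python
def vip_drift_commands(vip_cidr, interface):
    """Return (release, acquire, announce) commands for moving a VIP."""
    vip = vip_cidr.split("/")[0]
    # Run on the old master (if it is still reachable) to drop the VIP.
    release = ["ip", "addr", "del", vip_cidr, "dev", interface]
    # Run on the new master to bind the VIP.
    acquire = ["ip", "addr", "add", vip_cidr, "dev", interface]
    # Run on the new master: unsolicited ARP so neighbours refresh their caches.
    announce = ["arping", "-c", "3", "-U", "-I", interface, vip]
    return release, acquire, announce


for command in vip_drift_commands("10.0.0.100/24", "eth0"):
    print(" ".join(command))
```

Releasing the VIP on the old master before binding it on the new one is what keeps two machines from answering for the same address.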

Advantages:

Switchover in seconds; the handover completes within 5s.
The scripts are custom, so the architecture stays under our control.
Transparent to applications; the front end need not care what happens in the back end.

Shortcomings:

Maintenance cost is slightly higher; a Redis Sentinel cluster should run on 3 or more machines.
Using a VIP adds maintenance cost and carries a risk of IP conflicts.
In Sentinel mode the service is unavailable for a short time during the switchover.
3、Client wrapper connecting directly to the Redis Sentinel port

Some services can only reach Redis over the external network, so neither of the two schemes above can be used, and this scheme was derived. The web tier uses a client to connect to the Sentinel port on one of the machines in the Redis Sentinel cluster, obtains the current master node from it, and then connects to the real Redis master node to perform its business operations. Note that both the Redis Sentinel ports and the Redis master node must be opened for access. If the front-end business uses Java, JedisSentinelPool can be reused; if it uses PHP, a wrapper can be built on top of phpredis.
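
The first thing such a client wrapper does is ask a Sentinel for the current master address. Below is a minimal sketch of that lookup, written in Python with only the standard library rather than in Java or PHP. The `SENTINEL get-master-addr-by-name` command is part of Redis Sentinel; the function names, the timeout, and the single `recv` call are simplifying assumptions, and a real client would also handle partial reads and reconnect after a failover.

```python
import socket


def encode_command(*parts):
    """Encode a command as a RESP array of bulk strings."""
    out = [f"*{len(parts)}\r\n".encode()]
    for part in parts:
        data = part.encode()
        out.append(f"${len(data)}\r\n".encode() + data + b"\r\n")
    return b"".join(out)


def parse_master_addr(reply):
    """Parse the two-element (ip, port) array reply; None for anything else."""
    lines = reply.split(b"\r\n")
    if not lines or lines[0] != b"*2":
        return None  # null reply (unknown master name) or an error reply
    # Layout: *2, $len, ip, $len, port
    return lines[2].decode(), int(lines[4])


def get_master_addr(sentinels, master_name, timeout=0.5):
    """Ask each Sentinel in turn and return the first usable answer."""
    request = encode_command("SENTINEL", "get-master-addr-by-name", master_name)
    for host, port in sentinels:
        try:
            with socket.create_connection((host, port), timeout=timeout) as conn:
                conn.sendall(request)
                addr = parse_master_addr(conn.recv(4096))
                if addr:
                    return addr
        except OSError:
            continue  # this Sentinel is unreachable, try the next one
    return None
```

A caller would pass its list of Sentinel addresses, for example `get_master_addr([("sentinel-1", 26379), ("sentinel-2", 26379)], "mymaster")` with hypothetical host names, and then open its normal Redis connection to the returned address.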

Advantages:

The service detects failures promptly.
Low maintenance cost for the DBA.

Shortcomings:

It relies on the client supporting Sentinel.
The Sentinel servers and the Redis nodes must be opened for access.
It is intrusive to applications.

4、Redis Sentinel Cluster + Keepalived/Haproxy

The bottom layer is a Redis Sentinel cluster that manages the Redis master-slave pairs, and the web tier reaches the service through a VIP. When a master node fails, for example because of a machine failure, a Redis node failure, or an unreachable network, the switchover between Redis nodes is handled by Redis Sentinel's internal mechanism, and the VIP switchover is handled by Keepalived.

Advantages:

Switchover in seconds
Transparent to applications

Shortcomings:

High maintenance cost
Risk of split-brain
In Sentinel mode the service is unavailable for a short time during the switchover.

5、Redis M/S + Keepalived

This scheme does not use Redis Sentinel. It relies on native master-slave replication plus Keepalived: Keepalived handles the VIP switchover, while the switchover between the Redis master and slave has to be implemented with a custom script.
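
What that custom script has to decide can be sketched as follows. The sketch assumes Keepalived is configured to call a notify script with the node's new VRRP state, and that the script then reconfigures the local Redis through redis-cli. The function names, the peer address, and the two-node layout are hypothetical, and nothing here addresses split-brain.

```python
def parse_info(info_text):
    """Turn the text of a Redis INFO reply into a dict of fields."""
    fields = {}
    for line in info_text.splitlines():
        if line and not line.startswith("#") and ":" in line:
            key, value = line.split(":", 1)
            fields[key] = value.strip()
    return fields


def is_master(info_text):
    """True if the INFO replication section reports the master role."""
    return parse_info(info_text).get("role") == "master"


def redis_command_for_state(vrrp_state, peer_ip, peer_port):
    """Pick the redis-cli arguments matching the new Keepalived state."""
    if vrrp_state == "MASTER":
        # This node now holds the VIP: stop replicating and accept writes.
        return ["redis-cli", "SLAVEOF", "NO", "ONE"]
    # BACKUP or FAULT: follow the peer that holds the VIP.
    return ["redis-cli", "SLAVEOF", peer_ip, str(peer_port)]


sample = "# Replication\r\nrole:slave\r\nmaster_host:10.0.0.11\r\nmaster_port:6379\r\n"
print(is_master(sample))  # False
print(redis_command_for_state("MASTER", "10.0.0.11", 6379))
```

Checking the role before and after the command gives the script a simple way to confirm that the promotion actually took effect.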

Advantages:

Switchover in seconds
Transparent to applications
Simple to deploy, with low maintenance cost

Shortcomings:

The switchover logic has to be implemented in a script.
Risk of split-brain

6、Redis Cluster

From: http://intro2libsys.com/focused-redis-topics/day-one/intro-redis-cluster

Redis 3.0.0 was officially released on April 2, 2015, more than two years ago at the time of writing. Redis Cluster uses a decentralized P2P model. The key space is divided into 16384 slots, and each instance is responsible for a subset of them. When a client requests data from an instance that does not own the corresponding slot, that instance redirects the client to the instance that does. In addition, Redis Cluster nodes synchronize node information through a Gossip protocol.
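
How a key maps to one of the 16384 slots can be shown in a few lines of Python. The sketch follows the rule in the Redis Cluster specification, CRC16 (the XMODEM variant) of the key modulo 16384, with hash tags in braces restricting which part of the key is hashed. The function name is hypothetical, and `binascii.crc_hqx` is used here because it computes the same CRC16 variant.

```python
import binascii


def key_slot(key):
    """Return the Redis Cluster slot (0..16383) for a key given as bytes."""
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        # Only a non-empty {...} section counts as a hash tag.
        if end != -1 and end != start + 1:
            key = key[start + 1:end]
    return binascii.crc_hqx(key, 0) % 16384


print(key_slot(b"foo"))  # 12182
# Keys sharing a hash tag land in the same slot, so multi-key commands work on them.
print(key_slot(b"{user1000}.following") == key_slot(b"{user1000}.followers"))  # True
```

This is also why multi-key operations are limited: keys without a common hash tag usually hash to different slots, which may live on different instances.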

Advantages:

All components are built in, so it is easy to deploy and saves machine resources.
Performance is better than in proxy mode.
Automatic failover, and data remains available during slot migration.
It is the official native cluster solution, so updates and support are assured.

Shortcomings:

The architecture is relatively new, and there are few best practices.
Multi-key operations have limited support (client drivers can work around this).
To improve performance, the client needs to cache the routing table.
Node discovery and resharding are not automated enough.

7、Twemproxy

From: http://engineering.bloomreach.com/the-evolution-of-fault-tolerant-redis-cluster

Multiple identical Twemproxy instances (with the same configuration) run at the same time, accepting client requests and forwarding each one to the corresponding Redis instance according to a hash algorithm.

The Twemproxy scheme is relatively mature. Our team used it for a long time, but the results were not very satisfactory: on the one hand, troubleshooting is difficult; on the other, its support for automatically ejecting failed nodes is not very friendly.
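
The idea of hash-based forwarding, and one reason capacity changes are painful with it, can be illustrated with a deliberately simplified Python sketch. It uses plain "hash modulo number of servers" with MD5, which is an assumption made for illustration and not Twemproxy's actual implementation; the server names and key names are hypothetical.

```python
import hashlib


def pick_server(key, servers):
    """Map a key to a server by hashing it and taking the remainder."""
    digest = hashlib.md5(key.encode()).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]


def moved_fraction(keys, before, after):
    """Fraction of keys that map to a different server after a topology change."""
    moved = sum(1 for k in keys if pick_server(k, before) != pick_server(k, after))
    return moved / len(keys)


three = ["redis-a:6379", "redis-b:6379", "redis-c:6379"]
four = three + ["redis-d:6379"]
keys = [f"user:{i}" for i in range(1000)]

print(pick_server("user:42", three))
print(moved_fraction(keys, three, four))  # most keys change servers
```

Every proxy computes the same mapping from the same configuration, which is what lets several proxies run side by side without coordinating. With simple modulo placement, however, adding one server remaps most keys.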

Advantages:

Development is simple, and it is almost transparent to applications.
It has a long history and is a mature solution.

Shortcomings:

The proxy layer affects performance.
LVS and Twemproxy nodes can become performance bottlenecks.
Expanding Redis capacity is very troublesome.
Twitter has abandoned this solution internally, and its new architecture is not open source.

8、Codis

From: https://github.com/CodisLabs/codis

Codis is an open source product made up of many components. ZooKeeper stores the routing table and the proxy node metadata, and distributes the commands issued by Codis-Config; Codis-Config is an integrated management tool with a web interface; Codis-Proxy is a stateless proxy compatible with the Redis protocol; Codis-Redis is a fork developed from Redis 2.8, with slot support added to make data migration easier.

Advantages:

Development is simple, and it is almost transparent to applications.
Performance is better than Twemproxy.
It has a graphical interface, which makes expansion easy and operations convenient.

Shortcomings:

The proxy layer still affects performance.
There are many components, which consume a lot of machine resources.
The Redis code has been modified, so it cannot stay in sync with official Redis.
The development team is preparing to promote reborndb, which is also a rework based on Redis.

Four, Best practices

So-called best practices are simply the practices that best fit a specific scenario.

We mainly recommend the following schemes:

Redis Sentinel Cluster + intranet DNS + custom script
Redis Sentinel Cluster + VIP + custom script

The following best practices are distilled from production experience.

Run the Redis Sentinel cluster on 5 or more machines.
Each major business line can use its own Redis Sentinel cluster, which manages all the Redis ports under that business.
Allocate Redis port ranges by business line.
Implement the custom script in Python, which makes it easy to extend.
The custom script must check the current Sentinel's role.
Parameters passed to the custom script: <service_name> <role> <comment> <from_ip> <from_port> <to_ip> <to_port>
The custom script needs to operate on machines remotely over SSH. Use the paramiko library and reuse connections, because repeatedly establishing SSH connections costs time.
To speed up SSH connections, turn off the following two sshd parameters:
UseDNS no
GSSAPIAuthentication no
For WeChat or email alerts, fork a separate process so that the main process is not blocked.
For automatic switchover and failover, aim to complete all operations within 15s.
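
Several of these points fit together in the skeleton of a client-reconfig-script. The Python sketch below is a starting point under stated assumptions rather than our production script: the argument order follows the parameter list above, the role values "leader" and "observer" are the ones documented in sentinel.conf, and the function names and the alert body are hypothetical placeholders. The SSH and DNS or VIP steps are left as a comment.

```python
from collections import namedtuple
from multiprocessing import Process

FailoverEvent = namedtuple(
    "FailoverEvent",
    "service_name role comment from_ip from_port to_ip to_port",
)


def parse_event(argv):
    """Parse the seven arguments Sentinel passes to client-reconfig-script."""
    if len(argv) != 7:
        raise ValueError("expected 7 arguments, got %d" % len(argv))
    service_name, role, comment, from_ip, from_port, to_ip, to_port = argv
    return FailoverEvent(service_name, role, comment,
                         from_ip, int(from_port), to_ip, int(to_port))


def should_act(event):
    """Only the Sentinel leading the failover should change DNS or the VIP."""
    return event.role == "leader"


def send_alert(message):
    """Placeholder for a WeChat or mail notification."""
    print("ALERT:", message)


def main(argv):
    event = parse_event(argv)
    if not should_act(event):
        return 0  # observers see the same event but must not repeat the switch
    # Send the alert from a separate process so a slow gateway cannot delay the switch.
    Process(target=send_alert, args=(
        "%s: %s:%d -> %s:%d" % (event.service_name, event.from_ip,
                                event.from_port, event.to_ip, event.to_port),
    )).start()
    # Switch the DNS record or drift the VIP here.
    return 0
```

When saved as a script, the entry point would be `sys.exit(main(sys.argv[1:]))`, and the script path is what goes into the client-reconfig-script setting for each monitored master.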
