Micro-level application experience with caching in distributed systems (1) [basic details]
Preface
In recent months I have been busy with all sorts of small matters and have had little free time. Autumn 2018 has already arrived, and I cannot help sighing that time slips by like a white colt passing a crevice, leaving me unsure of what has been gained and what has been lost. Recently I took a break and bought two books unrelated to technology. One of them is Yann Martel's The High Mountains of Portugal; reading it requires some patience, as it is a deep metaphor for life with plenty of deliberate blank space, and I recommend it to interested friends. Now, back to the topic: I will try to write down some practical experience and thinking about caching technology from my own work.
Main text
In distributed Web programming, the key technologies for handling high concurrency and internal decoupling are hard to separate from caches and queues; the cache plays a role similar to the multi-level CPU caches in computer hardware. Today, any reasonably large Internet project reserves room for caching in its design, even during initial beta development. But in many application scenarios caching also brings some costly technical problems that require careful trade-offs. This series focuses on server-side caching technologies in distributed systems, and also touches on details from my own thinking and from discussions with friends. If anything in this article is wrong, please point it out.
In this first article, I try to discuss in as much detail as possible the basic design and application of the cache itself, along with related operational details (the concrete examples mainly use Redis).
One, the classification and basic characteristics of caching
1.1 Classification
Caches can be divided into many categories according to different criteria. Local caches and distributed caches are a common division, and each of them contains many finer subtypes.
Local here does not (strictly speaking) mean the local server where the program runs, but the storage space inside the program's own process. Distributed puts more emphasis on storage on one or more servers outside the process, with communication between them. In the design and application of concrete software projects, the two are mixed most of the time.
(Of course, I personally think that understanding the nature of caching matters most; conceptual classification is merely a division under different perspectives.)
1.2 Some technical costs
When designing a concrete project architecture, the development cost of using the former (a local cache) alone is undoubtedly very low; you mainly consider the local memory load, or at most a minimal disk I/O impact. The latter is designed for efficient sharing and management of cached data between distributed programs, so the design must fully account for the memory load of the servers hosting the cache, network I/O, CPU load, and in some scenarios disk I/O costs, while avoiding and balancing risks to overall stability and efficiency as much as possible. None of this is only about the cache servers' hardware; there is also technical maintenance, and underlying issues such as inter-cache communication, network load, and latency need careful consideration.
In fact, once you understand the nature of caching, you realize that any storage medium can play the role of an efficient cache in appropriate scenarios and be integrated into a project or formed into a cache cluster. The common mainstream Memcached and Redis belong to the latter (distributed) category, and you can even include document databases designed as NoSQL, such as MongoDB (but that is from a role perspective; narrowly divided it is a disk-based repository, and it is important to note that each has its own specialty). These third-party components require solving a number of problems for project integration and cache clustering. As a project iterates into later stages, operations and maintenance demanding higher expertise often get involved, and the logical design and code implementation during development also add a certain amount of work. So in concrete project design, on the one hand reserve as much room as possible; on the other hand, streamline as much as possible according to the actual situation.
Additionally, in my limited technical learning and practice, there is no perfect closed loop for data interaction between nodes, especially for inter-service communication; in theory, even "strong consistency" is a balance reached at "the current stage" (probably just like life, but I digress).
Two, some design details of the cache "database" structure
(Since Redis 3.x is what I currently use in most work scenarios, the following takes its features as the reference where relevant.)
2.1 Instances
Depending on the business scenario, first decide whether common (shared) data and business-coupled data need to be stored separately. Within a single instance, separate DBs can be considered: in Redis, DBs give data isolation but carry no strict permission restrictions, so dividing by library is only one option. In Redis Cluster, only the single default library is kept; in practice I weigh this choice against the size of the project and the development phase in which room should be reserved in the design.
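As a minimal sketch of the logical-DB isolation just mentioned (the host, port, DB indexes, and keys here are assumptions for illustration), separate redis-py connections can target different Redis DBs:

```python
import redis

# Hypothetical layout: db 0 for common/shared data, db 1 for one business module.
# Redis DBs isolate key spaces but share the same memory and persistence settings,
# and Redis Cluster exposes only db 0, so treat this as an option, not a guarantee.
common_cache = redis.Redis(host="localhost", port=6379, db=0)
order_cache = redis.Redis(host="localhost", port=6379, db=1)

common_cache.set("region:list", "...")  # visible only in db 0
order_cache.set("order:1001", "...")    # visible only in db 1
```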
Additionally, for a caching product that relies heavily on server memory, enabling persistence (mentioned later) while also serving concurrent traffic creates heavy contention for the server's hardware resources, so consider whether to split the instance across separate storage in combination with the persistence policy configuration. Persistence essentially writes memory data synchronously to the hard disk (flushing to disk), and disk I/O is limited: forced write blocking not only causes thread blocking and service timeouts, but can also trigger additional exceptions and even spill over into other underlying dependent services. My suggestion is that, where conditions permit, this is best planned and confirmed in the early stage of the project.
2.2 Cache "tables"
Generally there is no intuitive table concept as in a traditional RDBMS (data usually exists in key-value pairs, "KV"), but structurally, key-value pairs themselves can be assembled into all sorts of table structures. My usual practice is to draw a database table diagram first, then analyze when to store strings and when to store objects, and then use the cache key (Key) to delimit tables and fields (columns).
Assuming you need to store a login-server table whose fields (columns) are name, sign, and addr, you can consider splitting the data structure into the following form:
{ key: "server:name", value: "xxxx" }
{ key: "server:sign", value: "yyyy" }
{ key: "server:addr", value: "zzzz" }
Note that distributed caching products such as Redis provide many data structures (String, Hash, and so on), and you must select the appropriate storage structure according to how the data is related and how many columns it has: the resulting storage-space and time-complexity characteristics are completely different, and this is hard to feel in the early stage. The sketch below contrasts the two common choices.
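As a rough comparison (the record follows the example above; the row key "server:1" and the redis-py hset mapping form are assumptions), the same row can be stored either as separate String keys or as a single Hash:

```python
import redis

r = redis.Redis(host="localhost", port=6379, db=0)

# Option 1: one String key per column. Simple, and each field can carry its
# own TTL, but N fields cost N keys and N round trips unless pipelined.
r.set("server:name", "xxxx")
r.set("server:sign", "yyyy")
r.set("server:addr", "zzzz")

# Option 2: one Hash per row. Fields share a single key and TTL, and small
# hashes are stored compactly by Redis, usually saving memory.
r.hset("server:1", mapping={"name": "xxxx", "sign": "yyyy", "addr": "zzzz"})
row = r.hgetall("server:1")  # fetch all columns in one call
```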
At the same time, even if the cache's memory limit is set generously and there is plenty of headroom, you still need to consider capacity issues similar to a single table in an RDBMS: the number of entries must not be allowed to grow without bound (for example, when you can predict that stored entries will easily reach the millions). The "split table" design idea is common here, as in the sketch below.
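A minimal sketch of that "split table" idea (the bucket count and key format are assumptions): hash the business id into a fixed number of buckets and bake the bucket into the key, so one logical table becomes several smaller physical ones:

```python
import zlib

BUCKETS = 16  # assumed shard count; fix it early, resharding live keys is costly

def sharded_key(table: str, entity_id: str) -> str:
    # A stable hash of the id decides which "sub-table" the entry lives in,
    # producing keys like "user:profile:<bucket>" with bucket in 0..15.
    bucket = zlib.crc32(entity_id.encode("utf-8")) % BUCKETS
    return f"{table}:{bucket}"
```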
2.3 Cache keys
Section 2.2 above based the "table" design on cache keys; here are my personal conventions for the keys themselves. Keep keys as short as possible, and if keys belong to the same business module, make sure they begin with the same identifier (code name), which makes lookup and statistical management easy.
For example, a user-login server list:
{ key: "ul:server:a", value: "xxxx" }
{ key: "ul:server:b", value: "yyyy" }
In addition, each independent business system can consider configuring its own unique common prefix identifier. This is not mandatory, and it can be skipped if different libraries are already used in practice, but when used it is worth centralizing, as in the helper below.
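A small helper along those lines (the "myapp" prefix, the "ul" module code, and the ":" separator are assumptions) keeps key construction in one place so prefixes stay consistent and searchable:

```python
APP_PREFIX = "myapp"  # hypothetical system-wide prefix; drop it if DBs already isolate systems

def make_key(module: str, *parts: str) -> str:
    # e.g. make_key("ul", "server", "a") -> "myapp:ul:server:a"
    return ":".join((APP_PREFIX, module) + parts)
```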
2.4 Cache values
There is no standard size for a value in the cache (meaning a single entry here), but naturally the smaller the size the better (with Redis, the value involved in a single operation has a large impact on overall Redis response time, beyond just the network I/O). If the occupied storage space runs straight to 10 MB or more, it is recommended to consider whether the associated business scenario can be split into hot and non-hot data, as in the guard sketched below.
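A quick client-side guard in that spirit (the 10 MB threshold follows the rule of thumb above; the JSON serialization, key, and TTL are assumptions):

```python
import json
import redis

r = redis.Redis(host="localhost", port=6379, db=0)
MAX_VALUE_BYTES = 10 * 1024 * 1024  # ~10 MB budget per entry

def cache_object(key: str, obj) -> None:
    payload = json.dumps(obj)
    if len(payload.encode("utf-8")) > MAX_VALUE_BYTES:
        # Oversized value: a signal to split hot fields from cold ones
        # instead of pushing one huge blob through the cache.
        raise ValueError(f"value for {key} exceeds cache size budget")
    r.set(key, payload, ex=3600)
```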
2.5 Persistence
Generally, persistence has no direct relationship to the cache itself; you can roughly picture it as a disk-oriented layer beneath a memory-oriented one. But in today's Web projects, some business scenarios depend heavily on the cache, and persistence helps on one hand with fast recovery after the cache service restarts, and on the other hand provides storage capabilities for specific scenarios. Of course, persistence inevitably sacrifices some performance, including CPU contention and hard-disk I/O impact; most of the time, though, the advantages outweigh the disadvantages. My recommendation is that wherever caching is applied, persistence should be configured as far as possible, whether the mechanism is implemented by the product itself or by a third party.
With Redis, the product ships its own persistence strategies, AOF and RDB, and most of the time I configure both (the latest official versions also provide a hybrid mode). In some non-high-concurrency scenarios, or in management modules of small and medium projects where the cache is only an optimization and durability is definitely not needed, persistence can simply be turned off to save the performance overhead; in that case it is recommended to annotate the instance clearly in the program so that shared use of the instance stays safe. A sketch of these settings follows.
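A minimal sketch of adjusting those settings at runtime via redis-py (the thresholds are illustrative assumptions; in production these normally live in redis.conf):

```python
import redis

r = redis.Redis(host="localhost", port=6379, db=0)

# RDB: snapshot if at least 1000 keys changed within 900 seconds (assumed values).
r.config_set("save", "900 1000")

# AOF: append-only log, fsync once per second as a common latency/durability balance.
r.config_set("appendonly", "yes")
r.config_set("appendfsync", "everysec")

# For a pure-optimization cache instance, persistence can instead be disabled:
# r.config_set("save", "") and r.config_set("appendonly", "no")
```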
2.6 Eviction
If the cache grows without bound, then even with short expirations set, at some point highly concurrent writes will push data to the ceiling of available memory within a fairly short time. The program's interactions with the cache server will then suffer large numbers of delays and errors, and the cache server itself may even become seriously unstable. So in production environments, try to configure a maximum memory limit for the cache together with an appropriate eviction policy.
With Redis, choosing the policy yourself is quite flexible. My personal approach: when the data follows a power-law distribution there is always a large amount of rarely accessed data, so I configure allkeys-lru or volatile-lru to evict the least-accessed entries. If the cache serves a logging-style application, I usually configure noeviction early in the project and volatile-ttl later. Of course, I have also seen a special business design that used the cache directly as a lightweight persistent database, and as the final store at that; it felt strange at first, but it turned out to match the business very well (for example, there was almost no complex logic and no strong transactions). So it makes sense not to be constrained by traditional designs, since architecture is always composed and adjusted in real time around the business. A configuration sketch follows.
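A sketch of wiring that up (the 2 GB limit is an assumption; the policy names are the real Redis ones):

```python
import redis

r = redis.Redis(host="localhost", port=6379, db=0)

# Cap memory so the cache degrades by evicting instead of destabilizing the server.
r.config_set("maxmemory", "2gb")

# Power-law access pattern: evict the least recently used keys across the board.
r.config_set("maxmemory-policy", "allkeys-lru")
# Alternatives mentioned above: "volatile-lru", "volatile-ttl", "noeviction".
```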
Three, the basic CRUD of the cache and related details (here I mainly discuss the first-level cache).
3.1 Create
Unless there is a special business requirement (as mentioned above), always set an expiration time on insert, and try to ensure some randomness across expirations. For batch caching, my personal approach is to at least spread the expiration times out, in order to reduce the risk and impact of a cache avalanche (I will expand on this in a later article).
For example, suppose the batch to cache is a result set of 100,000 entries with a base cache time of 60 * 60 * 2 (seconds), all to be cached at once. My approach is to generate a random number per entry, such as random (range 0 to 1000), and set the expiration to (60 * 60 * 2 + random), as sketched below.
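A minimal sketch of that jittered batch write (the key/value layout and pipeline batching are assumptions):

```python
import random
import redis

r = redis.Redis(host="localhost", port=6379, db=0)
BASE_TTL = 60 * 60 * 2  # 2-hour base, as in the example above

def cache_batch(rows: dict) -> None:
    # Pipeline the writes; each key gets the base TTL plus 0-1000s of jitter
    # so 100,000 entries do not all expire in the same instant.
    pipe = r.pipeline(transaction=False)
    for key, value in rows.items():
        pipe.set(key, value, ex=BASE_TTL + random.randint(0, 1000))
    pipe.execute()
```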
3.2 Update
When updating a piece of cached data, consider whether the expiration needs to be readjusted. In many cases, such as synchronization between the cache and other stores, it is recommended to delete the cache entry directly rather than update it in place (a sketch follows). Modification operations are often tied to synchronization with the DB, which is more delicate and requires weighing distributed-transaction issues; those will be written up in subsequent articles.
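A sketch of the delete-instead-of-update pattern (the function names, key format, and DB stub are hypothetical):

```python
import redis

r = redis.Redis(host="localhost", port=6379, db=0)

def db_update_user(user_id: str, fields: dict) -> None:
    ...  # placeholder for the real database write

def update_user(user_id: str, fields: dict) -> None:
    db_update_user(user_id, fields)
    # Invalidate rather than rewrite: the next read repopulates the cache,
    # avoiding stale data from a racing in-place update.
    r.delete(f"user:profile:{user_id}")
```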
3.3 Read
If multiple cache entries are involved and the amount of data is known to be small, be sure to use strict key matching, and try not to use wildcards: although the key data sent with the command becomes longer, it avoids unnecessary lookup cost inside the cache.
For example, naively trusting Redis's internal storage optimizations and issuing KEYS pattern commands without considering their time complexity can block threads en masse (this is independent of master-slave replication). Compromising by switching to paginated SCAN is not a worry-free fix either: first, set a reasonably small page size (COUNT) in the program's encapsulation; second, guard in program logic against potential problems such as duplicate (phantom-like) results, as in the sketch below.
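A sketch of the SCAN-based alternative with redis-py (the pattern and count are assumptions; note that SCAN may return a key more than once, so the consumer deduplicates):

```python
import redis

r = redis.Redis(host="localhost", port=6379, db=0)

# KEYS "ul:server:*" is O(N) over the whole keyspace and blocks the server;
# scan_iter walks it incrementally, with `count` bounding work per round trip.
seen = set()
for key in r.scan_iter(match="ul:server:*", count=100):
    if key in seen:
        continue  # SCAN can yield duplicates across iterations; guard against it
    seen.add(key)
    # ... process key ...
```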
In addition, an analogy can be drawn with operating on a large table in a DB: aim queries at the hot data directly. (The splitting and distribution of hot data will be covered later.)
3.4 Delete / Clear
For deleting cached data there are usually two ways, removing the key directly or setting a time expiration (a fixed expiration here, not a sliding one that keeps extending); there are not many details to add. (I have heard of a special business case where the same kind of data was requested in batches and immediacy was not critical: expiration times were set and slightly spread out.)
As for emptying the cache, I currently do not use it in projects and do not even advocate using it directly. If it is applied, two things need careful thought: one is the timing of the cleanup, the other is how long it takes (in Redis, FLUSHDB or FLUSHALL blocks for a certain amount of time).
3.5 Locks / signals
These are not themselves related to caching; they belong to concurrency-feature implementations and have their own applicable scenarios. Redis offers some atomicity-based building blocks for them, but they are not the topic of this series. Last year I wrote a related article, Shopping Mall System under the single inventory control series miscellaneous notes (2) (https://www.cnblogs.com/bsfz/p/7824428.html), so I will not expand here.
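For completeness, a common Redis lock sketch built on those atomic primitives (the key format, token handling, and TTL are assumptions, and this is a generic pattern, not the approach from the linked article):

```python
import uuid
import redis

r = redis.Redis(host="localhost", port=6379, db=0)

# Release only if we still own the lock: atomic check-and-delete in Lua.
RELEASE = (
    "if redis.call('get', KEYS[1]) == ARGV[1] "
    "then return redis.call('del', KEYS[1]) else return 0 end"
)

def acquire_lock(name: str, ttl: int = 10):
    token = str(uuid.uuid4())
    # SET key token NX EX ttl: acquire only if the key does not exist yet.
    if not r.set(f"lock:{name}", token, nx=True, ex=ttl):
        return None  # someone else holds the lock
    return token

def release_lock(name: str, token: str) -> None:
    r.eval(RELEASE, 1, f"lock:{name}", token)
```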
3.6 Publish / Subscribe
Why mention this action, which really concerns producing and consuming (Produce-Consume)? The mechanism does not belong to the cache itself and is more related to message queues; the reason is that today's mainstream cache products ship with the feature, it is handy in many scenarios, simple to configure, and efficient. However, it often gets abused: the crux is that the unnecessary strong coupling reduces overall flexibility and performance, and scalability is limited as well. At least that is my current view. For reference, the minimal usage is sketched below.
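What the feature looks like in Redis (the channel name and message are assumptions), bearing in mind the caution above about relying on it:

```python
import redis

r = redis.Redis(host="localhost", port=6379, db=0)

# Subscriber side: delivery is fire-and-forget; anything published while no
# subscriber is listening is simply lost (one reason dedicated MQs are preferred).
p = r.pubsub()
p.subscribe("orders")

# Publisher side (typically another process):
r.publish("orders", "order:1001:created")

msg = p.get_message(ignore_subscribe_messages=True, timeout=1.0)
```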
My advice is: if there is no special scenario, try not to use it; at the very least I would not recommend publish/subscribe on the cache itself as the first choice. Even within a cache cluster, more details would need handling. The recommended way is to use dedicated middleware instead, that is, MQ-based products. Concrete candidates include excellent open-source works such as RabbitMQ and Kafka, as well as RocketMQ, developed domestically by Alibaba in the last couple of years; personally I still use RabbitMQ the most. Of course, scenario selection is not discussed much here: choose the most suitable technology for the scenario at hand.
Epilogue
This article ends here; the next one will try to expand on the related topics.
PS: Since my personal ability and experience are limited and I am still continuously learning and practicing, please point out any inadequacies in the article.
[Reserved placement: Micro-level application experience with caching in distributed systems (2) [interactive scenarios] https://www.cnblogs.com/bsfz/p/9568951.html]
End.