MySQL performance optimization

MySQL database optimization can be approached from the following aspects:

  • SQL statement optimization
  • Database parameter configuration
  • Database architecture
  • Hardware upgrades

One, SQL statement optimization

1、Enable the MySQL slow query log

Modify the MySQL configuration file my.cnf to add the following:
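For example, add these lines (the path and threshold follow the reference configuration in section three):

  slow-query-log = on
  slow_query_log_file = /usr/local/mysql/data/slow-query.log
  long_query_time = 1
  log-queries-not-using-indexes = on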

After the modification is complete, restart MySQL and execute the following statements to verify that the change took effect:
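  show variables like 'slow_query%';
  show variables like 'long_query_time';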

2、Use the mysqldumpslow command to narrow down candidates

  • The 5 most frequently executed SQL statements:
  •   mysqldumpslow -s c -t 5 /usr/local/mysql/data/slow-query.log
  • The 5 SQL statements returning the most rows:
  •   mysqldumpslow -s r -t 5 /usr/local/mysql/data/slow-query.log
  • The 5 slowest SQL statements containing "left join", sorted by time:
  •   mysqldumpslow -t 5 -s t -g "left join" /usr/local/mysql/data/slow-query.log

Explanation of the options:

  • -s: specifies the sort order. c, t, l, r sort by query count, query time, lock time, and rows returned respectively; at, al, ar sort by the corresponding average values.
  • -t: means "top n", i.e. how many entries to return.
  • -g: may be followed by a regular-expression pattern; matching is case-insensitive.

3、How to analyze SQL query statements

(1) First, use the mysqldumpslow command to find the SQL statements that need to be optimized.

(2) Then replace the placeholder N values in the SQL statement with concrete field values and analyze it with EXPLAIN, as in the sketch below.
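For instance, with a hypothetical article table matching the example analyzed below (table and column names are illustrative):

  EXPLAIN SELECT id, author_id FROM article
  WHERE category_id = 1 AND comments > 1
  ORDER BY views DESC
  LIMIT 1;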

The meaning of each column is:

  • table: the table name
  • type: how MySQL finds the required rows in the table, also known as the "access type." Common values, from best to worst:
  •     - NULL: the best; MySQL resolves the query without accessing the table
  •     - const: lookup via primary key or unique index, at most one matching row
  •     - eq_ref: join lookup via primary key or unique index
  •     - ref: join lookup via a non-unique index
  •     - range: index range scan
  •     - index: full index scan
  •     - ALL: full table scan
  • possible_keys: the indexes MySQL might use
  • key: the index MySQL actually used in the query; NULL means no index was used
  • key_len: the length of the index used; the shorter the better, all else being equal
  • ref: which columns or constants are compared against the index; a constant is best
  • rows: MySQL's estimate, based on table statistics and the chosen index, of how many rows must be read to find the required records
  • Extra: additional information; Using filesort and Using temporary (often seen with ORDER BY) usually call for optimization:
  •     - Using index: data is retrieved from the index alone (a covering index), which is faster than scanning the whole table
  •     - Using where: a WHERE clause restricts the rows
  •     - Impossible where: the WHERE clause is always false, so no rows can match
  •     - Using filesort: an extra sorting pass is required
  •     - Using temporary: a temporary table is used
  •          – Using filesort and Using temporary together mean the query is very laborious and needs optimization. An index often cannot serve both WHERE and ORDER BY: if the index is chosen for the WHERE clause, the ORDER BY inevitably falls back to filesort, so you must weigh whether filtering first or sorting first is the more cost-effective plan

(3) The EXPLAIN result shows type = ALL, i.e. a full table scan, and Extra shows Using where and Using filesort: the query is laborious, so this SQL statement needs optimization.

Optimization scheme: add a composite index on category_id, comments, and views, then analyze again with EXPLAIN, as follows:
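A sketch of the index creation, assuming the article table from the example above (the index name is illustrative):

  ALTER TABLE article ADD INDEX idx_article_ccv (category_id, comments, views);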

type is now "range", which is more acceptable than before, and rows dropped from 3 to 1, but the Using filesort in Extra is still unacceptable. The range condition on comments prevents MySQL from using the index for the subsequent views column; in other words, index columns after a range-scanned field become ineffective. So we drop the index and recreate it without comments to see the effect, as shown below:
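A sketch, again assuming the hypothetical article table and index names:

  DROP INDEX idx_article_ccv ON article;
  ALTER TABLE article ADD INDEX idx_article_cv (category_id, views);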

This time the type was promoted to ref and the Using filesort in Extra disappeared, indicating a significant improvement in performance. Although rows changed from 1 to 2, getting rid of Using filesort is well worth it.

Two, parameter configuration

1、max_connections

max_connections is the maximum number of client connections MySQL allows. It can be raised when the server handles many concurrent requests, but only as far as the machine can support: MySQL allocates buffers for each connection, so more connections mean more memory overhead. Adjust the value appropriately rather than raising it arbitrarily.

View the current maximum number of connections:
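  show variables like 'max_connections';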

To see the maximum number of connections MySQL has actually reached in the past:
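  show global status like 'Max_used_connections';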

The ideal setting satisfies: max_used_connections / max_connections * 100% ≈ 85%. If max_used_connections equals max_connections, then max_connections is set too low or the server is beyond its load limit; below 10% means it is set too high.

To set max_connections, modify the MySQL configuration file, add the following, and then restart MySQL:
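For example (the value is illustrative; size it to your hardware):

  max_connections = 1024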

2、back_log

back_log is the number of pending connection requests MySQL can queue. When the number of connections reaches max_connections, new requests are placed in a queue (of size back_log) to wait for a connection to release resources. If the number of waiting connections exceeds back_log, further requests are refused and errors such as "unauthenticated user | xxx.xxx.xxx.xxx | NULL | Connect | NULL | login | NULL" (a pending connection timing out) are reported. back_log matters when the main MySQL thread receives very many connection requests in a very short time.

The back_log value cannot exceed the size of the operating system's TCP/IP listen queue; anything beyond that is ineffective. To view the listen queue size on the current system: cat /proc/sys/net/ipv4/tcp_max_syn_backlog. On Linux, an integer below 512 is recommended.

When the host process list shows a large number of processes waiting to connect, you need to increase back_log or max_connections.

You can view the back_log setting with the following command:
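  show variables like 'back_log';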

Configure back_log in the my.cnf file, as shown below:
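For example (an illustrative value below the recommended Linux ceiling of 512):

  back_log = 500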

3、wait_timeout and interactive_timeout

wait_timeout: the number of seconds MySQL waits before forcibly closing a non-interactive connection that has been idle. The default is 8 hours. If wait_timeout is set too small, connections are closed quickly and long-lived connections stop working; if set too large, idle connections stay open too long.

Setting this value makes sense when, for example, your site generates a large number of MySQL connections (each connection costs memory). If many connections sit idle because of how your program works, they waste memory and may push MySQL past its maximum connection count, so that new connections fail with "Too many connections". Check MySQL's state before changing the value (use SHOW PROCESSLIST); if you often see many SLEEP processes, the wait_timeout value needs adjusting.

interactive_timeout: the number of seconds MySQL waits before closing an idle interactive connection, such as when we manage MySQL from a terminal. If no action occurs within interactive_timeout, the connection is closed automatically. The default is 28800 seconds (8 hours); it can be reduced to, say, 7200.

What is an interactive connection, and what is a non-interactive one? Put simply, connecting to the database through the MySQL command-line client is an interactive connection, while connecting through JDBC is a non-interactive connection.

View the settings of wait_timeout and interactive_timeout:
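  show variables like '%timeout%';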

Configure the wait_timeout and interactive_timeout values in the my.cnf file:
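For example (illustrative values; the interactive_timeout of 3600 matches the session discussed below):

  wait_timeout = 120
  interactive_timeout = 3600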

As the two outputs above show, the session value of wait_timeout can differ from the configured value, because:

  • For non-interactive connections, similar to JDBC connections, the value of wait_timeout inherits from the server-side global variable wait_timeout
  • For interactive connections, similar to MySQL client single connections, the value of wait_timeout inherits from the server-side global variable interactive_timeout

Since I connect to MySQL from a Linux server over SecureCRT, this is an interactive connection, so the session's wait_timeout inherits the value of interactive_timeout, here 3600 seconds.

For applications: if you use a connection pool or long-lived connections, and there is no F5 in front (an F5 load balancer may kill sessions that live too long), set both timeouts as long as possible so the connection pool does not keep reconnecting to the database. With short connections, shorten both timeouts instead: the database's maximum connection count is limited, and connections that are never released waste database resources and may trigger "MySQL: ERROR 1040: Too many connections".

4、key_buffer_size

key_buffer_size specifies the size of the index buffer and determines the speed of index handling, especially index reads. key_buffer_size affects only MyISAM tables; but even if you do not use MyISAM tables, internal on-disk temporary tables are MyISAM tables, so the value still matters. You can check key_read_requests and key_reads to judge whether key_buffer_size is set reasonably, as follows:
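  show global status like 'key_read%';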

In the sample output there were six index read requests, three of which were not found in memory and had to read the index directly from disk.

key_cache_miss_rate (the index cache miss rate) = key_reads / key_read_requests * 100%

A key_cache_miss_rate below 0.1% is good; below 0.01% means key_buffer_size is over-allocated and can be reduced appropriately.

View the key_buffer_size setting as follows:
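  show variables like 'key_buffer_size';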

The key_buffer_size value is reported in bytes (B).
key_buffer_size is the parameter with the greatest influence on MyISAM table performance.

Configure key_buffer_size in the my.cnf file, as shown below:
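For example (8M, as in the InnoDB-centric reference configuration in section three):

  key_buffer_size = 8M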

5、innodb_buffer_pool_size

For InnoDB tables, innodb_buffer_pool_size serves the same purpose as key_buffer_size does for MyISAM tables: InnoDB uses this parameter to specify how much memory to use for caching data and indexes. Its size is directly related to the performance of the InnoDB storage engine, so if you have enough memory, set this parameter as large as possible. Put simply, whenever we operate on an InnoDB table, all returned data and every index block touched along the way pass through this memory area. For a dedicated MySQL database server, it can be set as high as 80% of physical memory.

For innodb_buffer_pool_size, bigger is better, since it ensures that most read operations are served from memory rather than disk. Typical values are 5-6GB (8GB of memory), 20-25GB (32GB of memory), and 100-120GB (128GB of memory).

We can compute the cache hit ratio as (Innodb_buffer_pool_read_requests - Innodb_buffer_pool_reads) / Innodb_buffer_pool_read_requests * 100% and tune the innodb_buffer_pool_size parameter according to the hit ratio, as shown below:
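  show global status like 'Innodb_buffer_pool_read%';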

Hit ratio = (1945 - 315) / 1945 * 100% = 83.8%. Clearly we should increase innodb_buffer_pool_size, as follows:
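For example (the value is illustrative; a dedicated server can go up to 80% of physical memory):

  innodb_buffer_pool_size = 4G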

6、query_cache_size

The execution flow of a MySQL SELECT statement:

  • First perform syntax and permission checks.
  • Then check the database cache for an execution plan for this SQL statement; if one exists, return the results directly to the application.
  • If not, generate an execution plan for this SQL statement.
  • Store the execution plan in the database cache.
  • Execute the statement according to the execution plan.
  • Fetch the data from disk, put it in the cache, and return it to the application.

With the query cache enabled, MySQL stores query results in the cache; for an identical SELECT statement later (the comparison is case-sensitive and byte-for-byte), the results are read directly from the buffer, skipping all subsequent steps and greatly improving performance. Only queries starting with SELECT are considered for the query cache. Two SQL statements that differ by even one character (different case, an extra space, etc.) use different cache entries.

Of course, the Query Cache also has a fatal flaw: any change to a table's data invalidates all cached results in the Query Cache that reference that table. So when data changes very frequently, using the Query Cache may not be worth the candle.

First, we get information about the query cache with the following command:
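  show global status like 'Qcache%';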

  • Qcache_free_blocks: the number of adjacent free memory blocks in the query cache; the larger it is, the more fragmented the cache. FLUSH QUERY CACHE defragments the cache into a single free block.
  • Qcache_free_memory: the remaining memory in the query cache. This parameter lets us observe fairly accurately whether the current query cache memory is sufficient, too small, or excessive.
  • Qcache_hits: the number of cache hits, which verifies how effective the query cache is. The larger the number, the better the cache is working.
  • Qcache_inserts: the number of cache misses, meaning a new SQL request was not found in the cache and had to be executed, with the result then inserted into the query cache. The more often this happens, the less effectively the query cache is being applied. Of course, right after the system starts the query cache is empty, and this is normal.
  • Qcache_lowmem_prunes: how many queries were evicted from the Query Cache due to insufficient memory. Combining Qcache_lowmem_prunes with Qcache_free_memory gives a clearer picture of whether the query cache memory in our system is really large enough and whether queries are being swapped out for lack of memory very frequently. Watch this number over a long period; if it keeps growing, the cache is probably badly fragmented or memory is scarce (Qcache_free_blocks and Qcache_free_memory, above, tell you which it is).
  • Qcache_not_cached: the number of queries unsuitable for caching, usually because they are not SELECT statements or because they use functions such as NOW().
  • Qcache_queries_in_cache: the number of queries currently cached
  • Qcache_total_blocks: the number of cache blocks; if this value keeps increasing, the cache is being used.
  • For write operations, enabling qcache has no benefit.
  • If reads dominate, enabling qcache improves performance.

Next, let's check the related query_cache configuration:
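  show variables like 'query_cache%';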

  • query_cache_limit: queries with results beyond this size will not be cached.
  • query_cache_min_res_unit: the minimum size of a cache block. query_cache_min_res_unit is a "double-edged sword": the default is 4KB; a large value helps queries returning large results, but if most of your queries return small results it easily causes memory fragmentation and waste.
  • query_cache_size: the query cache size.
  • query_cache_type: the cache type, which determines which queries to cache. Note that this value cannot be set arbitrarily; the options and their meanings are as follows:
  •     - Set to 0 (OFF): disable the Query Cache; it is never used.
  •     - Set to 1 (ON): enable the Query Cache, except for SELECT statements carrying the SQL_NO_CACHE hint.
  •     - Set to 2 (DEMAND): enable the Query Cache, but only for SELECT statements carrying the SQL_CACHE hint.

Using the Query Cache involves several parameters, the most critical being query_cache_size and query_cache_type: how much memory to set aside for cached result sets, and in which scenarios to use the Query Cache. Unless the database mainly serves essentially unchanging data, query_cache_size is generally set to around 256MB.

If Qcache_lowmem_prunes is very large, cached queries are frequently being evicted from the Query Cache; if Qcache_hits is also very large, the query cache is used very frequently and the buffer size should be increased. Judge by the hit ratio: Qcache_hits / (Qcache_hits + Qcache_inserts) * 100%. It is generally not recommended to tune query_cache_size aggressively; 256MB is usually about right. Databases serving mostly static data can set it somewhat larger.

Modify the my.cnf file to set the cache size and cache type:
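For example (values as in the reference configuration in section three):

  query_cache_size = 256M
  query_cache_type = 1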

 

7、thread_cache_size

thread_cache_size is the server's thread cache: the number of threads kept cached for reuse. When a client disconnects, its thread is put into the cache if there is room; when a new connection arrives, a thread is taken from the cache if one is available, and a new thread is created only when the cache is empty. If many new threads are being created, raising this value improves system performance. You can see the effect by comparing the Connections and Threads_created status variables. Rules of thumb: with 1GB of memory configure 8, with 2GB configure 16, with 3GB configure 32, with 4GB or more configure it larger.

Connections counts attempts to connect to MySQL, whether or not the connection succeeded. The related thread status variables are shown below:
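  show global status like 'Thread%';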

  • Threads_cached:Represents the number of idle threads in the thread cache at this moment.
  • Threads_connected:Represents the number of currently established connections, because a connection requires a thread, it can also be considered as the number of threads currently used.
  • Threads_created: the number of threads created since the service last started. If Threads_created is too large, the MySQL server has been constantly creating threads, which is also resource-intensive; you can appropriately increase the thread_cache_size value in the configuration file.
  • Threads_running:Represents the number of threads currently active (non sleep state). It doesn’t mean the number of threads in use. Sometimes the connection is established, but the connection is in the sleep state.

See how many connection attempts there have been since the server started:
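  show global status like 'Connections';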

  • (Connections - Threads_created) / Connections * 100%
  • This is the connection thread cache hit rate, used to judge whether the thread_cache_size setting is appropriate; a hit rate above 90% means the setting is reasonable.

Modify the my.cnf file to configure the server thread cache:
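For example (16 is illustrative, following the 2GB-of-memory rule above):

  thread_cache_size = 16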

8、thread_concurrency

Getting the value of thread_concurrency right has a great influence on MySQL performance. On multi-CPU (or multi-core) machines, a wrong thread_concurrency value can prevent MySQL from making full use of all CPUs (or cores), leaving only one CPU (or core) working at a time.

thread_concurrency should be set to twice the number of CPU cores. For example, with one dual-core CPU, thread_concurrency should be 4; with two dual-core CPUs, thread_concurrency should be 8.

By this rule, a machine with four CPUs of 8 cores each should set thread_concurrency = 4 * 8 * 2 = 64.

Modify the thread_concurrency configuration in my.cnf:
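For example, for the 4-CPU, 8-core machine above:

  thread_concurrency = 64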

Note that this parameter was deprecated in MySQL 5.6.1 and removed in MySQL 5.7.2.

Three, MySQL database reference configuration

1、InnoDB configuration

#InnoDB buffer for the data dictionary and internal data structures; 16MB is already large enough.
innodb_additional_mem_pool_size = 16M

#InnoDB buffer pool, used to cache data, indexes, locks, the insert buffer, the data dictionary, and so on.
#In the case of a dedicated DB server with an InnoDB engine dominated scenario, 50% - 70% of the physical memory can usually be set.
#If it is a non dedicated DB server, you can first try to set 1/4 into memory. If there is any problem, then adjust it.
#The default value is 8M, which is very inappropriate, which is why many people think InnoDB is not as good as MyISAM.
innodb_buffer_pool_size = 4G

#Initial size of the InnoDB shared tablespace; the default of 10MB is far too small. Change it to 1GB and make it auto-extending.
innodb_data_file_path = ibdata1:1G:autoextend

#If you are unsure about this option, set it to 1: it best protects data reliability, with some but controllable performance impact.
innodb_flush_log_at_trx_commit = 1

#InnoDBLog buffer, which is usually set to 64MB, is enough.
innodb_log_buffer_size = 64M

#InnoDB redo logSize, usually set 256MB is enough.
innodb_log_file_size = 256M

#InnoDB redo logFile group, usually set to 2 is enough.
innodb_log_files_in_group = 2

#Enable InnoDB file-per-table tablespaces for easier management.
innodb_file_per_table = 1

#The InnoDB status file is enabled for administrators to view and monitor.
innodb_status_file = 1

#Setting the transaction isolation level to READ-COMMITED improves transaction efficiency and generally meets transaction consistency requirements
transaction_isolation = READ-COMMITTED

2、Other configuration

#Set the maximum number of concurrent connections. If the front-end program is PHP, it can be increased appropriately, but not too large.
#If the front-end program uses the connection pool, it can be adjusted appropriately to avoid the large number of connections.
max_connections = 100

#Maximum number of connection errors; it can be raised so that front-end hosts with frequent connection errors are not rejected by MySQL.
max_connect_errors = 100000

# Enable the slow query log
slow-query-log = on
# Slow query log path and file name
slow_query_log_file = /usr/local/mysql/data/slow-query.log
# Slow query threshold; a minimum of one second is recommended
long_query_time = 1
# Also log queries that do not use indexes
log-queries-not-using-indexes = on

#Maximum in-memory temporary table size, allocated per connection; it is appropriate to keep max_heap_table_size the same size as tmp_table_size.
max_heap_table_size = 96M
tmp_table_size = 96M

#Each connection is allocated its own buffers, e.g. for sorting and joins; 2MB each is generally enough.
sort_buffer_size = 2M
join_buffer_size = 2M
read_buffer_size = 2M
read_rnd_buffer_size = 2M

#Open query cache
query_cache_size = 256M
query_cache_type=1
#If your DB mainly uses the InnoDB engine, key_buffer_size (for the MyISAM engine) can be set small; 8MB is enough.
#If MyISAM is the main engine, it can be set larger, but not more than 4GB.
#It is strongly recommended not to use the MyISAM engine; use the default InnoDB engine.
key_buffer_size = 8M

#Set the connection timeout thresholds. If the front-end program uses short connections, it is advisable to shorten these 2 values.
#If the front-end program uses long connections, you can comment out both options and use the default configuration (8 hours).
interactive_timeout = 120
wait_timeout = 120

Four, architecture optimization

With growing data volume, load, and traffic, a simple architecture can no longer guarantee database performance. We need to improve it through master-slave replication, read-write separation, and splitting databases and tables.

1、Master-slave replication

  • The master writes changes to its binary log (binlog).
  • The slave copies the master's binary log into its own relay log:
  •   The slave starts a worker thread, the I/O thread. The I/O thread opens an ordinary connection to the master and starts a binlog dump process. The binlog dump process reads events from the master's binary log; if it has caught up with the master, it sleeps and waits for the master to generate new events. The I/O thread writes these events to the relay log.
  • The slave replays the events in the relay log:
  •   The SQL thread reads events from the relay log and replays them, updating the slave's data to match the master's. As long as the SQL thread keeps up with the I/O thread, the relay log usually stays in the OS cache, so its overhead is small.
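A minimal configuration sketch of this setup (server IDs, host address, and the repl account are placeholders):

  # master my.cnf
  server-id = 1
  log-bin = mysql-bin

  # slave my.cnf
  server-id = 2
  relay-log = relay-bin

Then point the slave at the master and start replication (take the log file and position from SHOW MASTER STATUS on the master):

  CHANGE MASTER TO MASTER_HOST='192.168.0.1', MASTER_USER='repl',
    MASTER_PASSWORD='...', MASTER_LOG_FILE='mysql-bin.000001', MASTER_LOG_POS=4;
  START SLAVE;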

2、Read-write separation

Looking at the introduction of master-slave replication above, we have the following questions:

Question 1: the master handles writes, and the slaves passively perform the same operations to keep the data consistent. Can a slave actively perform writes?

  • Suppose a slave could write on its own: it has no way to notify the master, so master and slave data would become inconsistent. Therefore slaves should not perform write operations, at least not on the replicated databases. In fact, the concept of read-write separation is already visible here.

Question 2: in master-slave replication there can be N slaves, but these slaves cannot perform writes. What are they for?

  • Data backup and high availability: once the master goes down, a slave can be promoted to master.
  • Remote disaster recovery: for example, if the master in Beijing is taken out by an earthquake, a slave in Shanghai can carry on. This is mainly used to implement scale-out.
  • Load sharing: read tasks can be spread across the slaves. In most systems reads far outnumber writes, so writes go to the master and reads go to the slaves.

Question 3: master-slave replication gives us master, slave1, slave2, ... and so on. Which database should, say, a Java web application connect to?

  • We can of course decide this in the application: statements that update the database (insert/delete/update) use a connection to the master, while selects use a connection to a slave. The application then also has to choose which slave runs each select, for example with a simple round-robin algorithm.
  • In this scheme the application itself routes the SQL statements and is tightly coupled to MySQL's master-slave replication architecture: once the master dies, or some slaves die, the application must change. Can we keep the application free of any deep ties to the replication architecture? See the following picture:
  • Introduce a component that the application talks to exclusively; it acts as a MySQL proxy and routes the SQL statements. The proxy need not itself pick one slave out of many; that can be handed to yet another component (for example, HAProxy). This is MySQL read-write splitting, i.e. read-write separation.

3、Splitting databases and tables

Table splitting strategy:

We know that no matter how well configured, each machine has its own physical limits, so when our application load reaches or far exceeds a single machine's limits, we must either seek help from other machines or keep upgrading the hardware; the common solution is to add more machines to share the pressure. We must also consider whether, as the business logic grows, our machines can meet demand through linear growth. Splitting databases and tables can therefore immediately improve system performance.

The design and operation of most databases revolve around the user ID, so using the user ID is the most common routing strategy for splitting. The user ID runs through the whole system as an important field; using it not only makes queries convenient but also distributes the data evenly across databases. (Of course, there are other approaches, such as splitting tables by category, and so on.)

When the data volume is large, to split a table we first determine how many tables the data must be spread evenly across, i.e. the table capacity. Suppose 100 tables are used for storage. When storing data we first take the user ID modulo 100: user_id % 100 identifies the table used for the store and query operations. The schematic is as follows:

Query before splitting: Select * from order where user_id = XXX

Query after splitting: Select * from order_(user_id % 100) where user_id = XXX

That is, the result of user_id % 100 determines which table to query, as in the sketch below.
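A worked example (the order_0 ... order_99 table names are hypothetical): for user_id = 12345, 12345 % 100 = 45, so the row lives in order_45:

  SELECT * FROM order_45 WHERE user_id = 12345;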

Database splitting strategy:

Table splitting can solve the query-efficiency problem when a single table holds a huge amount of data, but it does not improve concurrent access, because the split tables still live in one database and are easily limited by that database's I/O performance. So how do we share out the database I/O? Obviously, distributing the data across different databases solves the single-database performance problem well.

The implementation is very similar to the table splitting strategy; the simplest approach is again modulo routing. As in the earlier example, taking the user ID modulo yields a specific database, as follows:

We put all order information for users with user_id % 100 = 0 (0, 100, 200, ...) into the first database, all order information for users with user_id % 100 = 1 (1, 101, 201, ...) into the second database, and so on; order information for users with user_id % 100 = 99 (99, 199, 299, ...) goes into the 100th database. This completes the database split.

Combined database and table splitting strategy:

As introduced above, table splitting solves the query performance problem of massive data in a single table, and database splitting solves the concurrent access pressure on a single database. Sometimes we need to address both issues at once, so we split a single table across tables and across databases at the same time, expanding the system's concurrent processing capacity while improving single-table query performance. This is the combined database and table splitting we use.

The routing strategy commonly used for combined splitting is calculated as follows:

intermediate variable = user_id % (number of databases * number of tables per database)   — using mod
database index = floor(intermediate variable / number of tables per database)   — using the floor() function
table index = intermediate variable % number of tables per database   — using mod

Suppose there are 256 databases, each with 1024 tables, and a user has user_id = 100. By the routing strategy above we get:

intermediate variable = 100 % (256 * 1024) = 100
database index = floor(100 / 1024) = 0
table index = 100 % 1024 = 100

So user_id = 100 is routed to the 100th table of database 0, as shown below:
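The arithmetic can be checked directly in MySQL:

  SELECT 100 % (256 * 1024) AS intermediate,
         FLOOR(100 / 1024)  AS db_index,
         100 % 1024         AS table_index;
  -- returns 100, 0, 100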

The operations above improve query performance and concurrency, but some issues still deserve attention. For example, transactions that used to span tables become distributed transactions; because records are split across different databases and tables, multi-table join queries become difficult, and you cannot query data while ignoring the routing field. Moreover, after splitting, if the system needs further expansion (a routing-policy change), it becomes very inconvenient and the data must be re-migrated.

Test strategy:

In performance testing we cannot requisition as many machines as production uses. So how do we test a database structure that uses the split-database, split-table pattern? We only need one database server, with data imported according to the same splitting strategy as one of the online databases (it must be consistent with the online environment). Then run the business using only user_id values that actually appear in that database, with the concurrency level converted from the maximum online concurrency. That makes the test results reliable: if this environment passes the test, going online should be fine.

Database migration considerations:

How do we verify a migration of a database with this split structure from one database system to another, say from Oracle to MySQL? The following aspects can be considered:

  • Splitting-rule validation: is the data allocated to each database correctly?
  • Is any data missing, and is the total row count consistent?
  • Is the data in each sub-database correct?
  • Is any data corrupted?
  • Is the character set correct; does Chinese display as garbled text?

Five, hardware optimization

1、Disk

MySQL executes large numbers of complex queries every second, and the disk read/write volume that implies is easy to imagine. Disk I/O is therefore generally considered one of the biggest constraints on MySQL performance; for systems averaging more than 1 million PV per day, disk I/O leaves MySQL performance very low! To address this constraint, consider using a RAID-0+1 array, and be careful not to try RAID-5: MySQL on a RAID-5 array will not be as efficient as you might expect.

  • Commonly used RAID levels:
  •   RAID0: also known as striping; joins multiple disks into one logical disk. This level has the best I/O.
  •   RAID1: also known as mirroring; requires at least two disks, each storing the same data.
  •   RAID5: combines multiple (at least three) disks into one logical disk. On writes, parity information is computed, and the parity and the corresponding data are stored on different disks; when one disk of a RAID5 array fails, the remaining data and the parity information are used to recover the damaged data.
  •   RAID-0+1: the combination of RAID0 and RAID1, with the advantages and disadvantages of both levels; this level is generally recommended for databases.

2、CPU

For MySQL applications, a multi-way symmetric multiprocessing (SMP) architecture is recommended, e.g. two Intel Xeon 3.6GHz CPUs. Nowadays a 4U or larger server dedicated to the database role is recommended, and not only for MySQL.

3、Memory

For a database server running MySQL, physical memory should be no less than 2GB, and more than 4GB is recommended. That said, memory is a negligible concern for today's servers; high-end servers ship with more than 16GB as a matter of course.
