[51CTO.com original article] What data requirements lie behind remote control technology? How can caching be used to store real-time data? What do you do when the new cache system turns out to be unstable? The MySQL InnoDB memcached plugin has been available for a long time, yet there are no published examples of it being used in production in China. We invited Mr. Zhang Xiaofeng, technical director of Sunflower, to share his experience live.

Live broadcast room: QQ group 370892523, group ② 312724475, group ③ 542270018, group ④ 627843829

Content introduction:
Application background
Data requirements of Sunflower remote control technology
History of cache optimization
From memcached to ttserver (memcached, ttserver)
MySQL InnoDB memcached plugin, etc.

Yesterday I saw someone in the group say that operation and maintenance just means copying the programmers' code to the server. I don't think that is true at all, at least not here. An operations engineer who can only install programs is not qualified. He must be able to optimize the system structure around the existing programs, point out current deficiencies and propose optimization plans to the developers, and state the development requirements that operations depends on.

1. Application Background

As a company that provides a variety of Internet services to a huge number of users, Oray has always been putting new technologies and new architectures into practice. For caching, we run many applications on memcached, ttserver, redis and others; redis in particular is deeply integrated into our DNS system. The MySQL InnoDB memcached plugin has been out for a long time, but I have not found any example online of it being used in production in China, so today I will share my experience with it.

A large number of our users run Peanut Shell simply to control their computers remotely, so in 2010, after more than a year of hard work, we launched the Sunflower remote control product.
The basic function of this product is to let users remotely manage and control all of their computers without having to deal with technical details such as IP addresses and ports. The product relies mainly on the following technologies:

Managing each user's host list in a relational database;
Using persistent (long) connections to keep the controlled host online;
Transmitting control signals and image signals over P2P communication;
Optimized algorithms that reduce the user's bandwidth usage while keeping image quality as high as possible;
Other peripheral technologies, such as plug-in-free HTML5 remote control, remote power-on, etc.

We will not discuss the client, the operating system, or the remote control technology itself today.

Sunflower is not a simple C/S application. Like a chat server, we have to interact with the clients in real time. The number of online clients has grown rapidly, and our systems, operations and development teams have kept iterating and growing with it.

2. Data requirements for Sunflower remote control technology

As mentioned above, Sunflower uses a relational database to record which hosts each user owns, together with the details of those hosts. In addition, we need to store some key real-time data temporarily:

Host authentication information
Host online status
How to connect to the host

In the first few months after Sunflower was released, we kept all of this in the relational database. At that time the main concern was not server performance but getting the whole system running. Before long, however, the database could no longer cope. That period was short, and frankly there is not much to say about it.

3. History of Cache Optimization

Since a relational database is not a good place for this data, we started to use various caching technologies to store it.

(III.1) From memcached to ttserver

(III.1.1) memcached

The first generation cached the host status data.
We put it in memcached. The entire client login process is as follows (error handling, exceptions, and auxiliary components such as load balancing and backup are omitted):

After moving the status data and other frequently accessed data into the cache, the overall framework stayed basically the same. The API is responsible for all interaction with the persistent DB, and the long-connection servers communicate only with memcached. This also keeps the number of roles that read and write our DB small. At this stage we had only one memcached server, because we calculated that 16 GB of memory could hold the information of roughly hundreds of millions of hosts.

After two memcached crashes, we were close to breaking down ourselves. Memcached keeps its data entirely in memory. After a crash every host appears offline, and the only fix is to restart all the servers. Restarting all the servers means that every client that was online has to log in again, a process that is extremely long and takes hours.

(III.1.2) ttserver

We had to improve this, and naturally we thought of ttserver, which can recover its data after a crash and restart and which supports master-slave replication. Any data that is still lost can be restored automatically from the DB when the client logs in. Because ttserver is fully compatible with the memcached communication protocol, and because we wanted to avoid another global disaster, we brought the new system online quickly once the multi-cache-server optimization was finished.

The structure of the new cache system looks like this:

This fully stacked design can in theory scale out dramatically, but we had not anticipated several major problems with ttserver:

ttserver does not support key expiration. To get it, you have to enable table database mode and implement expiration with a Lua script.
However, ttserver performs quite poorly in this mode, and it becomes unstable when the data set is large.
We ran into this instability by chance: because ttserver automatically swaps infrequently read and written data out to disk, it does not crash as easily as memcached, but it occasionally freezes. How bad is the freeze? A get typed by hand takes hundreds of milliseconds to return.

We applied many optimizations to ttserver, but none of them helped. The first two freezes were cleared by a restart, but later we had to delete the files it had saved completely before performance came back. Isn't that the memcached era all over again? What were we to do?

(III.2) MySQL InnoDB memcached plugin

The ttserver crisis had me racking my brains like nothing before. I searched through various communities every day, and by chance I found that MySQL actually supports a memcached plugin. It is a remarkable combination:

In the big data era, the performance and scalability of traditional relational databases revolve around two major themes: memory and distribution.
Among traditional relational databases, Oracle's TimesTen and SQL Server's Hekaton both combine the database with an in-memory engine, but in practice they have few outstanding application scenarios.
Embedding a NoSQL interface in MySQL lets the two complement each other in performance, management and analysis, which is a more meaningful combination.

MySQL has embedded memcached support since version 5.6.6. Performance improved greatly in the newer MySQL 5.7: tests have shown that QPS can exceed one million in a 48-core, read-only environment.
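Because ttserver and the MySQL plugin both speak the memcached text protocol, the same client code can talk to any of them without going through SQL. The following is only a minimal sketch of that wire format, not the code used in the Sunflower system: the function names and the key `host:42` are made up for illustration, and it handles only a single-key `get`.

```python
from typing import Optional


def build_set(key: str, value: bytes, flags: int = 0, exptime: int = 0) -> bytes:
    """Encode a memcached text-protocol `set` request."""
    header = f"set {key} {flags} {exptime} {len(value)}\r\n".encode("ascii")
    return header + value + b"\r\n"


def build_get(key: str) -> bytes:
    """Encode a memcached text-protocol `get` request for one key."""
    return f"get {key}\r\n".encode("ascii")


def parse_get_response(data: bytes) -> Optional[bytes]:
    """Decode the reply to a single-key `get`; return None on a cache miss."""
    if data == b"END\r\n":
        return None  # key not found
    header, _, rest = data.partition(b"\r\n")
    parts = header.split()
    # A hit looks like: VALUE <key> <flags> <bytes>\r\n<data>\r\nEND\r\n
    if len(parts) < 4 or parts[0] != b"VALUE":
        raise ValueError("unexpected response header")
    length = int(parts[3])
    value = rest[:length]
    if rest[length:] != b"\r\nEND\r\n":
        raise ValueError("malformed response body")
    return value


# Hypothetical key and value, for illustration only.
request = build_set("host:42", b"online", exptime=60)
reply = parse_get_response(b"VALUE host:42 0 6\r\nonline\r\nEND\r\n")
```

In a real deployment these bytes would be written to a TCP socket connected to whichever port the cache listens on, or an existing memcached client library would be used instead of hand-built requests.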
To borrow the official MySQL architecture diagram:

Let's first look at how the memcached plugin is installed.

Download the MySQL 5.7.17 source package:

    wget http://cdn.mysql.com//Downloads/MySQL-5.7/mysql-community-5.7.17-1.el6.src.rpm

Because it is a plugin, memcached support has to be enabled when MySQL is compiled and installed:

    -DWITH_INNODB_MEMCACHED=ON

After installing and starting MySQL, configure the plugin:

Memcached plugin related configuration table:

The following parameters usually need to be set in the MySQL configuration file my.cnf (note that they should be set after the plugin has been started, otherwise mysqld will report an error on startup). They mainly configure the amount of memory allocated to memcached and enable binlog writing. To use a different port, for example 11212, add '-p11212' to the loose-daemon_memcached_option parameter.

daemon_memcached_r_batch_size and daemon_memcached_w_batch_size default to 1, and we recommend keeping them at 1:

MySQL memcached plugin features:

1. Data is read from and written to the InnoDB storage engine directly, without passing through the SQL layer and without parsing or compiling.
2. The in-memory cached data is managed by the MySQL buffer pool.
3. Data can be stored in multiple tables, and several columns can be merged into a single value.
4. Data can be queried, analyzed and maintained through SQL. (Add an index on the expiration-time column and delete expired rows with a MySQL job.)
5. MySQL's flexible master-slave architecture can be used.

Performance test comparison:

We tested and compared on 4 cores. The MySQL memcached plugin performs on par with ttserver in hash mode, reaching 70,000 QPS, which is more than 3 times that of ttserver in btree mode.
The memaslap test has some flaws, so these figures are for general reference only:

In practice, network latency reduces QPS considerably, so the application should be deployed on the same intranet as the cache to keep latency from becoming the bottleneck. Deployed that way, the system easily handles hundreds of millions of requests per day and remains stable under nearly 10,000 concurrent requests per second.

Here is one of our MySQL Memcached instances in operation:

Under the design and supervision of one of our very capable DBAs, the new architecture went online smoothly:

As you can see, the picture contains one additional role, magent. In this round of architecture optimization we added a magent layer between the application and MySQL memcached. Introductions to magent are everywhere on Baidu, but the original version actually has many bugs, such as a buffer size problem.

The benefits of adding it are:

1. HA is achieved by modifying magent. When one of the cache backends behind magent goes down, magent switches over automatically. System administrators can finally sleep peacefully at night.
2. The MySQL memcached plugin does not support multi get/set (it will be supported in future versions). magent solves this.
3. The MySQL memcached plugin has some bugs. Version 5.7.18 fixed some of them, but not completely. With magent in front, the plugin is easier to control and maintain.

Advantages of this architecture:

Strong scalability. The capacity for high concurrency can be raised step by step, using a multi-master architecture together with magent to achieve high availability.

The optimizations we made:

For versions before MySQL 5.7.17, setting daemon_memcached_r_batch_size greater than 1 is not recommended, because it easily triggers bugs that crash MySQL.
We also recommend setting innodb_api_bk_commit_interval to a slightly larger value. The default is 5.
If a get session is open, restarting the daemon_memcached plugin will also cause MySQL to crash.
Make sure there are no other sessions when restarting the plugin.
Disable the flush_all operation:

    update cache_policies set flush_policy='disabled';

Conclusion

Everything described above, including the parts that were not described in enough detail, comes down to optimizing the cache architecture. Over many years of software and hardware development and architecture design, I have come to appreciate deeply how important architecture is. No matter how impressive the code and algorithms are, or how impressive the hardware is, if the system architecture is poor, the results in operation will be just as poor.

Architecture here does not only mean the server level and system operations. Architecture is everywhere: from software architecture to hardware architecture, from website architecture to database architecture, from communication architecture to the overall service architecture, including the cache architecture discussed in this article, and all the way down to class encapsulation and mutual calls in the C++ development layer. Each of these has an architecture, and each can be optimized.

Before any deployment or actual development, we must instinctively aim for a good architecture and think it through before building. Of course we will inevitably fall into pits along the way, but there is nothing to be afraid of; just keep improving.

With this attitude we have launched innovative products such as the no-Internet remote control hardware "Sunflower Control". With the experience gained along the way, as our products extend from software to hardware, we have also built up our own set of optimization approaches, which we will continue to share when we have the opportunity.

Q&A

1. Q: PHP-watson-Guangzhou:
A: Sunflower-Technical Director-Shanghai: Redis's single-threaded design can use only one core, so its CPU utilization is limited.
I think it is not as good as memcached under high concurrency, and it does not perform very well when the data set is relatively large. Its memory management is not as good as memcached's, and its performance drops significantly when values are larger than 1 KB.

Q: Python operation and maintenance development_howhy:
A: Sunflower-Technical Director-Shanghai:

Q: Python operation and maintenance development_howhy:
A: Sunflower-Technical Director-Shanghai:

Q: Database Management-Ya Shen-Guangzhou:
A: Sunflower-Technical Director-Shanghai:

[51CTO original article. When reprinting on partner sites, please credit the original author and cite 51CTO.com as the source.]