In recent years, the CQRS architecture has come up often in the DDD community, and I wrote the ENode framework specifically to implement it. The idea behind CQRS is very simple: separate reads from writes. It is like a MySQL master-slave setup, where data is written to the master and queries are served from the slave, with MySQL itself responsible for keeping the two in sync. That is read-write separation at the database level. There are already plenty of introductions to CQRS that you can find on Baidu or Google. Today I mainly want to summarize how this architecture compares with traditional architectures (three-tier architecture, the classic DDD four-layer architecture) in terms of data consistency, extensibility, availability, scalability, and performance. I hope the resulting list of advantages and disadvantages will serve as a reference when you choose an architecture.

Preface

Because CQRS is only an idea, read-write separation, it can be implemented in many ways. For example, reads and writes can be separated only at the code level while sharing the same data storage; that is still CQRS. Alternatively, the data storage itself can be separated: the C (command) end is responsible for storing data, the Q (query) end is responsible for queries, and the Q-end data is synchronized through events produced by the C end. This is also an implementation of CQRS, and it is the one I discuss in this article. Another important point is that I will also bring in two architectural ideas for the C end, Event Sourcing and In-Memory. I think combining them with CQRS brings out the most value from the architecture.
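To make the Event Sourcing and In-Memory ideas mentioned above a little more concrete, here is a minimal Python sketch. It is only an illustration under my own assumptions, not ENode's API: the `Account` aggregate, the `Deposited` event, and the `from_history` helper are hypothetical names. The aggregate lives in memory, every change is recorded as an event, and the current state can be rebuilt by replaying the stored events.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Deposited:
    """Hypothetical domain event: money was deposited into an account."""
    account_id: str
    amount: int
    version: int


@dataclass
class Account:
    """Hypothetical aggregate root kept in memory on the C end."""
    account_id: str
    balance: int = 0
    version: int = 0
    uncommitted: list = field(default_factory=list)

    def deposit(self, amount: int) -> None:
        # The business rule is checked against the in-memory (latest) state.
        if amount <= 0:
            raise ValueError("amount must be positive")
        event = Deposited(self.account_id, amount, self.version + 1)
        self._apply(event)
        self.uncommitted.append(event)  # to be persisted and published later

    def _apply(self, event: Deposited) -> None:
        # State changes only as a result of applying an event.
        self.balance += event.amount
        self.version = event.version

    @classmethod
    def from_history(cls, account_id: str, events) -> "Account":
        # Event Sourcing: rebuild the aggregate by replaying its events.
        account = cls(account_id)
        for event in events:
            account._apply(event)
        return account


account = Account("acc-1")
account.deposit(100)
account.deposit(50)

stored_events = list(account.uncommitted)  # stand-in for an event store
rebuilt = Account.from_history("acc-1", stored_events)
```

In this sketch `rebuilt` ends up with the same balance and version as `account`, which is the property that lets the C end keep aggregates in memory and still recover them from the event log.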
Data consistency

In traditional architecture, data is generally strongly consistent. We usually put all data modifications of one operation into a single database transaction to guarantee that. In distributed scenarios we would also like strong consistency, which means distributed transactions. However, as we all know, distributed transactions are difficult and costly, and systems that use them tend to have lower throughput and lower availability. So in many cases we give up strong consistency and settle for eventual consistency; in terms of the CAP theorem, we give up consistency and choose availability.

The CQRS architecture fully embraces eventual consistency. It rests on one very important assumption: the data users see is always stale. In a multi-user system this is very common. In a flash sale, for example, the page may show that an item is in stock before you place your order, yet when you submit the order the system tells you the item is sold out. If we think about it carefully, this is bound to happen: the data on the page was read from the database, and once displayed it no longer changes, while other users may well have modified the database in the meantime. This happens in most systems, especially high-concurrency web systems.

Given this assumption, even if our system achieves strong data consistency, users are still likely to see old data. This gives us a new way of thinking about the architecture.
Can we do this: guarantee only that the data on which every create, update, and delete operation is based is up to date, while letting the data returned by queries lag behind? This naturally leads to CQRS. The C-end data is always up to date and strongly consistent; the Q-end data does not have to be up to date and can be updated asynchronously through events from the C end. With this idea in place, we can start to think about how to implement each of the two ends.

At this point you may still wonder why the C-end data must be up to date. The reason is simple: a modification usually involves business-rule checks. If the data those checks are based on is stale, the checks are inaccurate or meaningless, and so is any modification made on top of them.

Extensibility

In traditional architecture, components depend strongly on one another and objects call each other's methods directly, whereas the CQRS architecture is event-driven. At the micro level of aggregate roots, in traditional architecture the application layer coordinates several aggregate roots with procedural code and completes the whole business operation in one transaction. The CQRS architecture instead follows the Saga idea and lets multiple aggregate roots interact, eventually, through events. The C and Q ends also synchronize data asynchronously through events, which is another form of event-driven design. At the architectural level, the former is SOA thinking and the latter is EDA thinking.
In SOA, one service calls another to complete an interaction, so the services are tightly coupled. In EDA, a component subscribes to another component's event messages and updates its own state from the event data. Components therefore do not depend on each other directly; they are related only through topics, and the coupling is very low.

That covers coupling. Clearly, the architecture with lower coupling is easier to extend. With SOA, adding a new feature means modifying existing code: if service A originally called services B and C and we later want it to call service D as well, we have to change the logic of A. With EDA, existing code stays untouched: subscribers B and C already consume the messages produced by A, and we simply add a new subscriber D.

From the CQRS perspective there is another very clear example, the extensibility of the Q end. Suppose the Q end was originally implemented with only a database, but as traffic grew the database became too slow to update or could not keep up with high-concurrency queries, so we want to add a cache. With CQRS this is easy: we only need to add a new event subscriber that updates the cache. In fact, we can add a new type of Q-end storage at any time: database, cache, search engine, NoSQL, logs, and so on, choosing whichever fits the business scenario to make queries fast. All of this is possible because the C end records every model change event. When we add a new view store, we can bring it up to the latest state from those events.
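The cache example above can be sketched roughly as follows. This is a toy in-process model, not a real message queue: `EventBus`, `make_view`, and the event fields `product_id` and `stock` are names I made up for illustration. The point is that the publisher and the existing database subscriber are not touched when the cache subscriber is added, and the recorded events are replayed so that the new view catches up.

```python
class EventBus:
    """Toy stand-in for a topic that also keeps every published event."""

    def __init__(self):
        self._log = []          # all events, in publication order
        self._subscribers = []  # handlers called for each new event

    def publish(self, event):
        self._log.append(event)
        for handler in self._subscribers:
            handler(event)

    def subscribe(self, handler, replay=False):
        # A late subscriber can first replay the recorded events
        # to bring its own view up to the latest state.
        if replay:
            for event in self._log:
                handler(event)
        self._subscribers.append(handler)


def make_view():
    """Create a key-value view plus the handler that keeps it updated."""
    view = {}

    def handler(event):
        view[event["product_id"]] = event["stock"]

    return view, handler


bus = EventBus()

# The original Q end: a single "database" view.
db_view, update_db = make_view()
bus.subscribe(update_db)
bus.publish({"product_id": "p1", "stock": 10})
bus.publish({"product_id": "p2", "stock": 5})

# Later: add a cache view without changing any existing code.
cache_view, update_cache = make_view()
bus.subscribe(update_cache, replay=True)

bus.publish({"product_id": "p1", "stock": 9})
```

After the last event, both views hold the same data even though the cache was added after two events had already been published.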
This kind of extensibility is hard to achieve in traditional architecture.

Availability

Both traditional architecture and CQRS can achieve high availability, as long as no node in the system is a single point of failure. By comparison, though, I think CQRS gives us more room to maneuver and more options.

In traditional architecture, reads and writes are not separated, so their availability has to be considered together, which is harder. Suppose that at peak time the system has a large number of concurrent writes, say 20,000, and a large number of concurrent reads, say 100,000. The system must then be optimized to support both at the same time, otherwise it will go down during the peak. This is the weakness of systems built on synchronous calls: nothing smooths the peaks and fills the valleys by holding the excess requests for a while. The system has to process every request promptly no matter how many arrive, otherwise an avalanche effect sets in and the system is paralyzed. Yet a system is not always at its peak. The peak may last only half an hour or an hour, but to survive it we must provision enough hardware for it, hardware that sits idle most of the time. That is why we say systems built on synchronous calls and SOA thinking are expensive to run.

In the CQRS architecture, reads and writes are separated, so availability can be considered in two isolated parts: how the C end keeps writes available, and how the Q end keeps reads available.
I think availability is easier to solve on the C end, because the C end is message-driven. For any data modification, we send a command to a distributed message queue, and then a back-end consumer processes the command -> generates domain events -> persists the events -> publishes the events to the distributed message queue -> finally, the events are consumed by the Q end. The whole chain is message-driven, which makes it much more available than the direct service calls of traditional architecture. Even if the back-end consumer that processes commands is temporarily down, the front-end Controller can still send commands; the Controller remains available. From this perspective, CQRS offers higher availability for data modification.

But you may ask: what if the distributed message queue goes down? Haha, yes, that can indeed happen. But a distributed message queue is middleware, and middleware is generally built for high availability (clustering, master-slave failover), so it is far more available than our applications. In addition, because commands are first sent to the distributed message queue, we get all of its benefits: asynchrony, pull-mode consumption, peak shaving and valley filling, and queue-based horizontal scaling. Even if the front-end Controller sends a burst of commands during peak hours, the back-end application that processes them will not be overwhelmed, because it pulls commands at the rate it can handle. This is the C end's availability advantage, and in essence it is an advantage brought by the distributed message queue. From this we can see how valuable EDA (event-driven architecture) is; it also reflects the currently popular idea of Reactive Programming.
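The peak-shaving effect of pull-mode consumption can be illustrated with a small simulation. It is a sketch under simplifying assumptions: `CommandQueue` is an in-memory stand-in for a distributed message queue, and `capacity_per_tick` is a made-up parameter for how many commands the consumer can handle per unit of time. The Controller side sends a burst of 1,000 commands at once, while the consumer never takes on more than it can process.

```python
from collections import deque


class CommandQueue:
    """In-memory stand-in for a distributed message queue."""

    def __init__(self):
        self._items = deque()

    def send(self, command):
        # Sending never blocks on the consumer; the queue absorbs the burst.
        self._items.append(command)

    def pull(self, max_items):
        # The consumer decides how many commands it takes at a time.
        batch = []
        while self._items and len(batch) < max_items:
            batch.append(self._items.popleft())
        return batch

    def __len__(self):
        return len(self._items)


def run_consumer(queue, capacity_per_tick):
    """Drain the queue and return how many commands were handled per tick."""
    processed_per_tick = []
    while len(queue) > 0:
        batch = queue.pull(capacity_per_tick)
        processed_per_tick.append(len(batch))
    return processed_per_tick


queue = CommandQueue()

# Peak hour: the front-end Controller sends a burst of commands.
for i in range(1000):
    queue.send({"command_id": i})

# The back end pulls at its own pace, 300 commands per tick.
ticks = run_consumer(queue, capacity_per_tick=300)
```

The burst is spread over several ticks instead of hitting the consumer all at once; the cost is that commands sent late in the burst wait longer before they are processed.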
As for the Q end, there is really no difference from traditional architecture, since both have to handle high-concurrency queries. Whatever optimizations worked before still apply. But as I stressed earlier, CQRS makes it easier to offer several view stores (database, cache, search engine, NoSQL), and these stores can be updated in parallel without slowing one another down. Ideally, if your application needs complex queries such as full-text search, use a search engine such as ElasticSearch on the Q end; if your queries can be served by a key-value data structure, use a distributed NoSQL cache such as Redis on the Q end. In short, I think CQRS makes query problems easier to solve than traditional architecture does, because we have more choices. You may say that your scenario can only be handled by a relational database and that query concurrency is also very high. In that case the only option is to spread out the query IO: shard the data across databases and tables, and run one master with multiple replicas so that queries go to the replicas. Here the solution is the same as in traditional architecture.

Performance and scalability

I originally wanted to write about performance and scalability separately, but since the two are closely related I decided to cover them together. Scalability means that a system which performs well (throughput, response time) with 100 visitors also performs well with 1 million visitors. The load on the system is obviously very different in the two cases.
If an architecture lets us increase the system's capacity simply by adding machines, we can say it is highly scalable. Let us now look at the performance and scalability of traditional architecture and of CQRS.

When it comes to performance, we usually ask where the system's bottleneck is. Once the bottleneck is removed, the system can scale by horizontal expansion (horizontal expansion of the data storage itself is not considered here). So we only need to analyze where the bottlenecks of the two architectures lie.

In traditional architecture, the bottleneck is usually the underlying database. For reads, a cache solves most query problems. For writes, there are many approaches, such as sharding or NoSQL. Alibaba, for example, uses sharding extensively, and in the future it will probably replace sharding with the high-end OceanBase. Without sharding, a single database server might have to sustain 100,000 concurrent writes at peak; if the data is spread over ten database servers, each machine only has to handle 10,000 writes, which is much easier. So data storage is no longer really a bottleneck for traditional architecture.

A data modification in traditional architecture takes three steps: 1) load the data from the DB into memory; 2) modify the data in memory; 3) write the data back to the DB. That is 2 database IOs in total.

In the CQRS architecture, the total time across the C and Q ends is certainly longer than in traditional architecture, because CQRS needs at most 3 database IOs: 1) persist the command; 2) persist the events; 3) update the read database from the events. Why "at most"?
Because persisting the command is not always necessary. The purpose of persisting commands in CQRS is idempotency: we must prevent the same command from being processed twice. When can we skip it? When the command creates an aggregate root. The event produced by creating an aggregate root always has version number 1, so a duplicate can be detected from the event version number when the event is persisted.

This is why we say that to use CQRS you must accept eventual consistency between the C-end and Q-end data: if an operation counts as finished only once the read database has been updated, a business operation is likely to take longer than in traditional architecture. But if we count it as finished once the C end has processed it, CQRS may well be faster, because the C end may need only one database IO. I think this is an important point. With CQRS we care most about how long the C end takes; it does not matter if the Q end is a little slower, because the Q end is only for viewing data (eventual consistency). If we choose CQRS, we must accept a small delay in Q-end updates; otherwise we should not use this architecture. Please keep this firmly in mind when you choose an architecture for your business scenario.

In addition, as mentioned under data consistency, traditional architecture relies on transactions to guarantee strong consistency. The more complex the transaction, the more tables it locks, and locks are the enemy of scalability.
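The version-based duplicate detection described above can be sketched like this. It is an assumption-laden toy, not how any particular framework does it: `EventStore`, `DuplicateEventError`, and `handle_create_command` are hypothetical names, and the dictionary key plays the role that a unique index on (aggregate id, version) would play in a real database. Note that the check touches a single key and holds no lock across several tables.

```python
class DuplicateEventError(Exception):
    """Raised when an event with the same aggregate id and version exists."""


class EventStore:
    """Toy append-only event store keyed by (aggregate_id, version)."""

    def __init__(self):
        self._events = {}

    def append(self, aggregate_id, version, payload):
        key = (aggregate_id, version)
        # Stand-in for a unique-index violation in a real database.
        if key in self._events:
            raise DuplicateEventError(f"{aggregate_id} v{version} already stored")
        self._events[key] = payload

    def count(self):
        return len(self._events)


def handle_create_command(store, aggregate_id, payload):
    """Handle a 'create aggregate root' command without persisting the command.

    The creation event always has version 1, so a repeated command collides
    on (aggregate_id, 1) and is recognized as a duplicate.
    Returns True if the command was applied, False if it was a duplicate.
    """
    try:
        store.append(aggregate_id, 1, payload)
        return True
    except DuplicateEventError:
        return False


store = EventStore()
first = handle_create_command(store, "order-42", {"type": "OrderCreated"})
second = handle_create_command(store, "order-42", {"type": "OrderCreated"})
```

Here the first call applies the command and the second is rejected as a duplicate, so only one event is stored.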
In the CQRS architecture, one command modifies only one aggregate root. If several aggregate roots need to be modified, a Saga is used. This sidesteps complex transactions and, through eventual consistency, achieves maximum parallelism with minimal contention, which raises overall throughput.

So in general, both architectures can overcome their performance bottlenecks, and once the bottleneck is overcome, scalability is not a problem. (I am not considering system unavailability caused by data loss here. No architecture can avoid that problem, and the only remedy is data redundancy, which I will not go into.) For both, the bottleneck is data persistence. Because most traditional systems store their data in relational databases, their only option is sharding. For CQRS, if we look only at the C-end bottleneck, what the C end has to store is very simple: commands and events. If you trust a mature NoSQL store (I think a document database such as MongoDB suits commands and events well) and have the skills and experience to operate it, you can consider NoSQL for persistence. If you consider NoSQL unreliable or not fully under your control, you can use a relational database, but then you have to do some work yourself, such as sharding the command and event tables, because the volume of commands and events is very large. That said, some cloud services, such as Alibaba Cloud's DRDS, already offer database storage with sharding built in, which greatly reduces the cost of storing commands and events.
Personally, I would still go with sharding, for a simple reason: a relational database persists data reliably, it is mature and controllable, and since this data is read-only once written, building sharding support into the framework is not hard. So the comparison shows that traditional architecture must use sharding (unless, like Alibaba, you can use OceanBase), while CQRS gives us more choices. Persisted commands and events are very simple: they are immutable, read-only data that fits key-value storage well, and document-oriented NoSQL is also an option. The C end only ever appends new data; it never modifies or deletes data.

Finally, there is the bottleneck of the Q end. If your Q end also uses a relational database, optimize it the same way as in traditional architecture. But CQRS lets you implement the Q end with other technologies as well, so there are comparatively more ways to optimize.

Conclusion

I think traditional architecture and CQRS are both good architectures. Traditional architecture has a low barrier to entry, many people understand it, and most projects do not have heavy concurrent writes or large data volumes, so for most projects traditional architecture is fine. However, as this article has shown, traditional architecture does have weaknesses: in extensibility, availability, and ways of resolving performance bottlenecks, it is weaker than CQRS. If you see it differently, criticism is welcome; discussion is how we make progress, hehe. So if your application involves high-concurrency writes, high-concurrency reads, and large data volumes, and you want better extensibility, availability, performance, and scalability, I think CQRS is worth trying.

But there is another problem: the barrier to entry for CQRS is very high. I think it is hard to use without the support of a mature framework.
As far as I know, there are not many mature CQRS frameworks in the industry. On the Java platform there are axon framework and jdon framework; on the .NET platform, the ENode framework is working in this direction. I think this is one reason why there are almost no mature cases of CQRS in use. A second reason is that CQRS requires developers to have some understanding of DDD, otherwise it is hard to put into practice, and DDD itself is hard to apply without a few years of study. A third reason is that the core of CQRS depends heavily on high-performance distributed messaging middleware, so choosing such middleware is itself a hurdle (the Java platform has RocketMQ; for the .NET platform I developed a distributed message queue myself, EQueue, haha).

In addition, without a mature CQRS framework the coding becomes very complex: Event Sourcing, message retry, idempotent message handling, ordered event processing, concurrency control. None of these is easy to solve. With framework support, these purely technical problems are handled by the framework, and developers only need to focus on modeling, implementing the domain model, updating the read database, and implementing queries. Only then does CQRS become practical, because development may then be simpler than with traditional architecture while still gaining the many benefits CQRS brings.