1. Background

The vivo official mall has gone through seven years of iteration, gradually evolving from a monolithic architecture to a microservice architecture. Along the way, our development team has accumulated a great deal of technical experience and a deep understanding of the e-commerce business. At the beginning of last year, the team took on the task of building the O2O mall. Together with the gift platform that was about to be set up and the official mall's requirement to support online purchase with delivery from offline stores, this called for underlying product, transaction, and inventory capabilities.

To save R&D and operations costs and to avoid reinventing the wheel, we decided to build the underlying systems as platforms, using general capabilities to flexibly support the individual needs of upper-level businesses. A complete set of e-commerce platform systems came into being, including the trading platform, product platform, inventory platform, and marketing platform.

This article introduces the architectural design concepts and practices of the trading platform, as well as the challenges and reflections from continuous iteration after launch.

2. Overall Architecture

2.1 Architecture Goals

In addition to high concurrency, high performance, and high availability, we also hope to achieve the following:
2.2 System Architecture

(1) Trading platform in the overall architecture of the e-commerce platform

(2) Trading platform system architecture

2.3 Data Model

3. Key Solution Design

3.1 Multi-tenant design

(1) Background and objectives
(2) Design plan
Through the above mapping relationship, storage resources can be flexibly allocated to each tenant, and tenants with small data volumes can reuse existing databases and tables.

Example 1: Before the new tenant joins, there are 4 databases * 16 tables. The new tenant has a small order volume and low concurrency, so it directly reuses the existing database 0 and table 0. The mapping relationship (number of databases, number of tables, starting database number, starting table number) is: tenant code -> 1,1,0,0

Example 2: Before the new tenant joins, there are 4 databases * 16 tables. The new tenant has a large number of orders but low concurrency. Eight new tables are created in the existing database 0 to store its orders. The mapping relationship is: tenant code -> 1,8,0,16

Example 3: Before the new tenant joins, there are 4 databases * 16 tables. The new tenant has a large number of orders and high concurrency. Four new databases * 8 tables are used to store its orders. The mapping relationship is: tenant code -> 4,8,4,0

Formula for the database and table to which a user's orders belong:

Database number = Hash(userId) / number of tables % number of databases + starting database number
Table number = Hash(userId) % number of tables + starting table number

You may ask why the hash is divided by the number of tables when calculating the database number. What is wrong with the following formula?

Database number = Hash(userId) % number of databases + starting database number
Table number = Hash(userId) % number of tables + starting table number

The answer is that when the number of databases and the number of tables share a common factor, the data is skewed. Dividing by the number of tables first removes the dependence on that common factor. Take 2 databases and 4 tables as an example: a number that equals 1 modulo 4 must also equal 1 modulo 2, so table 1 of database 0 never receives data. Similarly, table 3 of database 0, table 0 of database 1, and table 2 of database 1 never receive data.
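The two formulas can be compared directly in code. The following is a minimal Java sketch, not the platform's actual implementation: the class names `TenantMapping` and `ShardRouter`, the method names, and the use of a plain `long` as the hash value are all assumptions made for illustration.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical holder for one tenant's mapping: dbCount, tableCount, startDb, startTable. */
final class TenantMapping {
    final int dbCount;
    final int tableCount;
    final int startDb;
    final int startTable;

    TenantMapping(int dbCount, int tableCount, int startDb, int startTable) {
        this.dbCount = dbCount;
        this.tableCount = tableCount;
        this.startDb = startDb;
        this.startTable = startTable;
    }
}

final class ShardRouter {
    private ShardRouter() {}

    /** Returns {databaseNumber, tableNumber} using the divide-first formula. */
    static int[] route(long userHash, TenantMapping m) {
        long h = userHash & Long.MAX_VALUE; // clear the sign bit so the hash is non-negative
        int db = (int) (h / m.tableCount % m.dbCount) + m.startDb;
        int table = (int) (h % m.tableCount) + m.startTable;
        return new int[] {db, table};
    }

    /** The skewed variant: both numbers are taken directly modulo the counts. */
    static int[] routeNaive(long userHash, TenantMapping m) {
        long h = userHash & Long.MAX_VALUE;
        int db = (int) (h % m.dbCount) + m.startDb;
        int table = (int) (h % m.tableCount) + m.startTable;
        return new int[] {db, table};
    }

    /** Counts the distinct (database, table) pairs hit by hash values 0..samples-1. */
    static int distinctShards(TenantMapping m, int samples, boolean naive) {
        Set<String> seen = new HashSet<>();
        for (long h = 0; h < samples; h++) {
            int[] r = naive ? routeNaive(h, m) : route(h, m);
            seen.add(r[0] + ":" + r[1]);
        }
        return seen.size();
    }
}
```

For the 2 databases * 4 tables case, `distinctShards(new TenantMapping(2, 4, 0, 0), 1000, false)` reaches all 8 database/table pairs, while the same call with `naive` set to `true` reaches only 4, which matches the empty tables listed above.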
The routing process is shown in the following figure:

(3) Limitations and countermeasures
Problem: After sharding across databases and tables, the database auto-increment primary key is no longer globally unique and cannot be used as an order number. In addition, many interfaces between internal systems carry only the order number, not the user identifier that serves as the shard key.

Solution: As shown in the figure below, we generate a globally unique order number with reference to the snowflake algorithm and implicitly embed the database and table numbers in it (two 5-bit fields store the database number and the table number respectively). This way, the database and table numbers can be obtained from the order number when no user ID is available.
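As a rough illustration of embedding the shard location in the order number, the Java sketch below packs a timestamp, the two 5-bit fields, and a sequence into one `long`. Only the two 5-bit fields come from the description above. The field order, the 41-bit timestamp, the 12-bit sequence, and the name `OrderIdCodec` are assumptions. The sketch also leaves out the worker identifier and sequence management that a real snowflake-style generator needs to guarantee uniqueness.

```java
final class OrderIdCodec {
    // Assumed layout, from low bits to high bits:
    // 12-bit sequence | 5-bit table number | 5-bit database number | 41-bit timestamp.
    private static final int SEQ_BITS = 12;
    private static final int SHARD_BITS = 5; // from the article: two 5-bit fields
    private static final int TIME_BITS = 41;
    private static final int TABLE_SHIFT = SEQ_BITS;
    private static final int DB_SHIFT = TABLE_SHIFT + SHARD_BITS;
    private static final int TIME_SHIFT = DB_SHIFT + SHARD_BITS;
    private static final long SHARD_MASK = (1L << SHARD_BITS) - 1;
    private static final long SEQ_MASK = (1L << SEQ_BITS) - 1;

    private OrderIdCodec() {}

    /** Packs the fields into one positive 63-bit order number. */
    static long encode(long timestamp, int db, int table, int sequence) {
        if (timestamp < 0 || timestamp >= (1L << TIME_BITS)) {
            throw new IllegalArgumentException("timestamp out of range");
        }
        if (db < 0 || db > SHARD_MASK || table < 0 || table > SHARD_MASK) {
            throw new IllegalArgumentException("database/table number must fit in 5 bits");
        }
        if (sequence < 0 || sequence > SEQ_MASK) {
            throw new IllegalArgumentException("sequence out of range");
        }
        return (timestamp << TIME_SHIFT)
                | ((long) db << DB_SHIFT)
                | ((long) table << TABLE_SHIFT)
                | sequence;
    }

    /** Recovers the database number without needing the user ID. */
    static int databaseOf(long orderId) {
        return (int) ((orderId >> DB_SHIFT) & SHARD_MASK);
    }

    /** Recovers the table number without needing the user ID. */
    static int tableOf(long orderId) {
        return (int) ((orderId >> TABLE_SHIFT) & SHARD_MASK);
    }
}
```

Given an order number built with `encode(timestamp, 3, 21, 7)`, `databaseOf` returns 3 and `tableOf` returns 21, so an interface that receives only the order number can still locate the right shard.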
Problem: The management backend needs to run paginated queries over all orders that match various filter conditions.

Solution: Store a redundant copy of the order data in the search engine Elasticsearch to support fast and flexible queries in various scenarios.

3.2 State Machine Design

(1) Background
(2) Objectives
(3) Plan
/**
3.3 General Operation Triggers

(1) Background

Requirements for delayed processing like these are common in the business, and we used to handle them by scanning with scheduled tasks.
(2) Objectives
(3) Plan

Design a general operation trigger. The specific steps are as follows:
The configuration of the trigger includes:
3.4 Distributed Transactions

Distributed transactions are a classic problem for trading platforms, for example:
How do we ensure data consistency in a microservice architecture? First, we need to distinguish the consistency requirements of different business scenarios.

(1) Strong consistency scenario

For example, order creation and cancellation call the inventory and coupon systems. If strong consistency cannot be guaranteed, inventory may be oversold or coupons may be used more than once.

For strong consistency scenarios, we use Seata's AT mode. The following diagram is taken from Seata's official documentation.

(2) Eventual consistency scenario

For example, after a payment succeeds, the delivery system is notified to ship the goods, and after receipt is confirmed, the points system is notified to issue points. As long as the notification eventually succeeds, the operations do not need to succeed or fail together.

For eventual consistency scenarios, we use the local message table solution: the asynchronous operations to be executed are recorded in a message table within the local transaction. If execution fails, a scheduled task can compensate for it.

3.5 High Availability and Security Design
We use the Hystrix component to add circuit breaker protection around the external systems we depend on, so that a failure in one system does not spread to the entire distributed system.
Through performance testing, we identify and resolve performance bottlenecks, measure the system's throughput, and obtain reference data for configuring rate limiting and circuit breaking.
Every order update first acquires a database row-level lock, which prevents concurrent updates.
All interfaces are idempotent. If an exception such as a timeout occurs when an upstream system calls our interface, the caller can safely retry.
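One common way to make a retried call safe is to key each request by a caller-supplied identifier and return the stored result on repeats. The Java sketch below illustrates that idea only. The name `IdempotentExecutor` and the request identifier are hypothetical, and the in-memory map stands in for whatever persistent record (for example, a unique key in the database) a real service would use.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Supplier;

/** Runs an action at most once per request ID and replays the stored result on retries. */
final class IdempotentExecutor<R> {
    // Assumption: results are kept in memory; a real service would persist them.
    private final ConcurrentMap<String, R> results = new ConcurrentHashMap<>();

    /**
     * Executes the action for a new request ID, or returns the earlier result
     * when the same request ID is seen again. The action must not return null,
     * because ConcurrentHashMap does not record null results.
     */
    R execute(String requestId, Supplier<R> action) {
        // computeIfAbsent applies the function atomically, at most once per key.
        return results.computeIfAbsent(requestId, id -> action.get());
    }
}
```

A caller that times out can repeat `execute("req-1", ...)` with the same request ID: the second call does not run the action again and receives the result of the first.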
Only a very small number of third-party interfaces are accessible from the external network, and all of them are protected by whitelists, data encryption, signature verification, and similar measures. Internal systems interact through intranet domain names and RPC interfaces.
By configuring error log alarms on the log platform and service analysis alarms on the call chain, combined with the monitoring and alarm functions of the company's middleware and basic components, we can detect system anomalies as soon as they occur.

3.6 Other considerations
Considering the team's non-agile organizational structure and lack of domain experts, we did not adopt
During big sales and promotions, especially when hot items go on sale, traffic may trigger rate limiting, causing some users to be turned away. Because traffic cannot be estimated accurately, it is difficult to expand capacity in advance. Concurrency capacity can be increased through active degradation, for example switching from synchronous to asynchronous database writes, switching from DB queries to cache queries, and only allowing queries for orders from the last six months.

Considering that the business complexity and data volume are still at an early stage and that the team size makes them hard to support, these designs are part of the long-term plan but have not yet been implemented. (This follows the principle of appropriate architecture: there is no need to use a sledgehammer to crack a nut.)

4. Summary and Outlook

When designing the system, we did not blindly pursue cutting-edge technologies and ideas, nor did we simply adopt the industry's mainstream solutions when facing problems. Instead, we selected the most appropriate approach based on the actual situation of the team and the system. A good system is not designed perfectly by a master at the outset, but is iterated gradually as the business develops and evolves.

The trading platform has been online for more than a year, has been connected to three business parties, and runs smoothly. New businesses within the company that need trading, product, or inventory capabilities, as well as existing businesses that hit system bottlenecks and need upgrading, can reuse these capabilities.

As the number of upstream business parties grows and versions iterate, demands on the platform systems keep coming. The platform's functions have gradually improved and its architecture continues to evolve. We are currently separating the fulfillment module from the trading platform and decoupling it further, to prepare for the sustainable development of the business.