Design and Implementation of Reward Crediting and Display in the Spring Festival Wallet's High-Traffic Reward System

Author: Zhao Jingmu

The ByteDance Open Platform wallet team was responsible for crediting, displaying, and using rewards on the reward link of ByteDance's 2022 Spring Festival event across eight terminals. This article introduces and summarizes that work. It first describes the business background and the overall technical architecture, then explains the solution to each difficulty, and closes with an abstract summary that we hope will guide future events.

1. Background, Challenges and Goals

1.1 Business Background

(1) Support for eight terminals: The 2022 ByteDance product Spring Festival activities need to support the interoperability of rewards for eight terminal APP products (including Douyin/Douyin Volcano/Douyin Express/Xigua/Toutiao/Toutiao Express/Tomato Novel/Tomato Free Listening). Users can participate in the activities on any of the above terminals, and the rewards obtained can be withdrawn and used on other terminals.

(2) Varied gameplay: mainly card collection, friend-page red envelope rain, red envelope rain, the card collection prize draw, and the fireworks display.

(3) Multiple rewards: Reward types include cash red envelopes, subsidized video red envelopes, commercial advertising coupons, e-commerce coupons, payment coupons, consumer finance coupons, insurance coupons, credit card coupons, HEYTEA coupons, movie tickets, dou+ coupons, TikTok cultural and creative coupons, avatar pendants, etc.

1.2 Core Challenges

(1) Design and implement a high-traffic solution for reward entry and display interconnection on eight terminals, with a maximum estimated reward issuance rate of 3.6 million QPS.

(2) Many reward scenarios and varied gameplay, with more than 10 reward types in total, each of which connects to a downstream system.

(3) Provide all-round support in terms of reward system stability, user experience, fund security, and basic operational capabilities to ensure the smooth progress of the event.

1.3 Final Goal

(1) Reward crediting: Design and implement a reward crediting system that works across the eight terminals and connects to multiple downstream reward systems. It smooths out the differences between downstream reward systems, hides the underlying crediting details from upstream callers, and exposes a unified interface protocol to upstream businesses. It also provides unified error handling, idempotent crediting, and reward budget control.

(2) Reward display/use: Design and implement the activity wallet page, support the display of user rewards on eight terminals, and support users to view, withdraw (cash), use cards/pendants, etc.

(3) Basic abilities:

[Basic SDK] Provides basic SDKs such as querying red envelope balance, accumulated income, whether users have received rewards in Spring Festival activities, etc. for business parties to query and use.

[Budget Control] Connect with the upstream reward issuance algorithm strategy to achieve inventory control capabilities for large-volume card and voucher entries and prevent over-issuance.

[Withdrawal Control] After several rounds of rewards have been distributed on New Year's Eve, open withdrawals to users gradually (grayscale release), and handle funds that have not yet been credited to the account at the time of withdrawal.

[Operation Intervention] Flexible operational configuration of the activity page supports publishing announcements quickly so that they reach users in time. To handle black swan events, batch reissue of cards, coupons, and red envelopes is supported.

(4) Stability assurance: Under high crediting traffic, keep the wallet's core paths stable and intact. Common stability measures such as resource expansion, rate limiting, circuit breaking, degradation, fallbacks, and resource isolation protect the core user experience of rewards.

(5) Fund security: Under high crediting traffic, use idempotence, reconciliation, monitoring, and alerting to keep funds safe and to ensure that all user assets are credited in full.

(6) Activity isolation: The data of reward collection and display in the three stages of internal testing activities, grayscale release activities and official Spring Festival activities are isolated and do not affect each other.

2. Product requirements introduction

Users can participate in ByteDance's Spring Festival activities on any of the terminals to earn rewards. Taking the crediting of a cash red envelope from the Douyin red envelope rain as an example, the business flow is as follows:

Log in to Douyin → Participate in the event → Event wallet page → Click the withdraw button → Enter the withdrawal page → Make a withdrawal → Withdrawal result page. The event wallet page can also be reached from the wallet page.

Core scenarios for reward distribution:

  • Card collection: various card coupons are issued when users collect and draw cards. The card collection koi (lucky prize) also pays out large cash red envelopes, and the card collection prize draw issues bonuses and coupons;
  • Red envelope rain: issues red envelopes, card coupons, and video subsidy red envelopes, with a peak of 1.8 million QPS each for red envelopes and card coupons;
  • Fireworks display: issues red envelopes, coupons, and avatar pendants.

3. Design and implementation of wallet asset middle platform

In the 2022 Spring Festival event, UG was mainly responsible for implementing the gameplay, including the business logic and stability assurance for card collection, red envelope rain, and the fireworks display. The wallet team's role was to handle reward crediting, reward display, reward use, and fund security under high traffic. Within that scope, the asset middle platform was responsible for reward distribution and reward display.

3.1 The overall architecture diagram of the Spring Festival Asset Center is as follows:

The core system of the wallet asset platform is divided as follows:

Asset order layer: converge the eight-end reward entry links, provide a unified interface protocol to connect with the reward distribution functions of upstream activity business parties such as UG, incentive middle platform, video red envelopes, etc., and at the same time shield the upstream from the logic processing of the downstream reward business, and support budget control, compensation, and order number idempotence.

Activity wallet API layer: converge eight-terminal reward display links and support large traffic scenarios

3.2 Design of Asset Order Center

Core distribution model:

Notes:

  • The activity ID uniquely identifies an activity. This Spring Festival event was assigned its own parent activity ID.
  • Each scenario ID corresponds one-to-one to a specific reward type and defines the configuration unique to rewards issued under that scenario. The capabilities configurable per scenario ID include: the statement text shown for the issued reward; whether compensation is required; the rate limit; whether inventory control is performed; and whether reconciliation is required. These capabilities are pluggable, and each business opts in to the ones it needs.

Effect:

  • Implementing configuration isolation between different activities
  • The configuration of each activity is in a tree structure, so that one activity can issue multiple rewards, and one reward can issue multiple reward IDs.
  • One reward ID can have multiple distribution scenarios, supporting personalized configuration for different scenarios
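The tree described above can be sketched as nested Go types. This is only an illustration under assumptions: every type name, field name, and sample value below is hypothetical, and the article does not show the real configuration schema.

```go
package main

import "fmt"

// SceneConfig holds the per-scenario switches listed in the notes above.
// All field names are hypothetical.
type SceneConfig struct {
	BillText       string // statement text shown for the issued reward
	NeedCompensate bool   // whether failed issuance is compensated
	RateLimitQPS   int    // rate limit for this scenario
	InventoryCheck bool   // whether inventory control is performed
	NeedReconcile  bool   // whether reconciliation is required
}

// RewardID is one issuable reward with its distribution scenarios.
type RewardID struct {
	ID     string
	Scenes map[string]SceneConfig // keyed by scenario ID
}

// Reward groups the reward IDs of one reward type.
type Reward struct {
	Type string
	IDs  map[string]RewardID // keyed by reward ID
}

// Activity is the root of the tree; configurations of different
// activities never share nodes, which gives configuration isolation.
type Activity struct {
	ID      string
	Rewards map[string]Reward // keyed by reward type
}

// Lookup walks activity -> reward type -> reward ID -> scenario.
// Indexing a nil map yields the zero value, so missing levels are safe.
func (a Activity) Lookup(rewardType, rewardID, sceneID string) (SceneConfig, bool) {
	cfg, ok := a.Rewards[rewardType].IDs[rewardID].Scenes[sceneID]
	return cfg, ok
}

// sampleActivity builds a tiny tree with made-up identifiers.
func sampleActivity() Activity {
	return Activity{ID: "spring_festival_2022", Rewards: map[string]Reward{
		"cash": {Type: "cash", IDs: map[string]RewardID{
			"cash_01": {ID: "cash_01", Scenes: map[string]SceneConfig{
				"rain":      {BillText: "Red envelope rain", NeedCompensate: true, RateLimitQPS: 1000},
				"fireworks": {BillText: "Fireworks display", NeedReconcile: true},
			}},
		}},
	}}
}

func main() {
	cfg, ok := sampleActivity().Lookup("cash", "cash_01", "rain")
	fmt.Println(cfg.BillText, cfg.NeedCompensate, ok)
}
```

Because each scenario carries its own switches, one reward ID can be rate limited or reconciled differently depending on where it is issued.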

Order number design:

The asset order layer makes reward issuance idempotent at the order number level. The order number is built as ${actID}_${scene_id}_${rain_id}_${award_type}_${statge}. This design guarantees at the order number level that rewards are not over-issued: a user can receive the reward for each scene at most once.
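As a minimal sketch of how a deterministic order number yields idempotence, the Go code below joins the template fields with underscores and uses an in-memory map as a stand-in for a store with a unique constraint on the order number. The field types, the `Ledger` type, and the sample values are assumptions; the article only gives the template.

```go
package main

import (
	"fmt"
	"strings"
)

// OrderKey carries the fields named in the order number template.
// Field types are assumptions.
type OrderKey struct {
	ActID     int64
	SceneID   string
	RainID    int32
	AwardType string
	Stage     string
}

// OrderNo joins the fields in the documented order with underscores.
func (k OrderKey) OrderNo() string {
	return strings.Join([]string{
		fmt.Sprint(k.ActID), k.SceneID, fmt.Sprint(k.RainID), k.AwardType, k.Stage,
	}, "_")
}

// Ledger is a hypothetical in-memory stand-in for a table with a
// unique index on the order number. It is not safe for concurrent use.
type Ledger struct {
	credited map[string]int64
}

func NewLedger() *Ledger { return &Ledger{credited: make(map[string]int64)} }

// Credit records the amount once per order number and reports whether
// this call actually credited; a repeated order number is ignored.
func (l *Ledger) Credit(orderNo string, amount int64) bool {
	if _, dup := l.credited[orderNo]; dup {
		return false
	}
	l.credited[orderNo] = amount
	return true
}

func main() {
	k := OrderKey{ActID: 1001, SceneID: "scene7", RainID: 3, AwardType: "cash", Stage: "formal"}
	l := NewLedger()
	fmt.Println(k.OrderNo())
	fmt.Println(l.Credit(k.OrderNo(), 88), l.Credit(k.OrderNo(), 88))
}
```

Since the same user, scene, rain round, award type, and stage always map to the same string, an upstream retry reproduces the same order number and the second credit is rejected.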

4. Solving the core difficult problems

4.1 Difficulty 1: Supporting intercommunication of reward data between eight terminals

As mentioned above, eight product terminals took part in the 2022 Spring Festival event. Douyin and Toutiao, among others, use different account systems, so rewards cannot be shared across terminals through the user ID. The solution is that the Byte Account Center links the account systems of the eight terminals and generates a unique actID for each user. The mobile phone number has the highest priority: if a user logs in with the same mobile phone number on different terminals, the actID is the same on all of them. Based on this actID, the wallet side designed and implemented a general solution for crediting, viewing, and using rewards across the eight terminals. Each user's reward data is bound to the actID, and both crediting and queries are keyed by actID, which makes rewards interoperable across the eight terminals.

The schematic diagram is as follows:

4.2 Difficulty 2: Reward crediting in high-traffic scenarios

Every year, the distribution of cash red envelopes is the most important part of the Spring Festival activities, and this year is no exception. There are several reasons for this:

  • The estimated maximum flow of cash red envelopes is 1.8 million TPS.
  • Cash red envelopes themselves are of high value, and the funds need to be kept safe.
  • Users are very sensitive to cash, and cost issues must be considered while ensuring user experience and functional integrity.

To sum up, cash red envelopes face relatively large technical challenges.

Sending a red envelope is in fact a transaction: funds flow from the company's cost account to the user's personal account.

(1) From a technical perspective, the idempotence of the order number dimension must be supported. Multiple requests for the same order number will only be recorded once. The order number generation logic is ${actID}_${scene_id}_${rain_id}_${award_type}_${statge}, which ensures that there is no over-issuance from the order number design level.

(2) To support high concurrency, there are two traditional solutions:

The above two traditional technical solutions have obvious shortcomings. So what is the solution that can save resources relatively and ensure user experience?

The red envelope rain token solution was finally adopted. It combines asynchronous crediting with a small amount of distributed storage, at the price of a more complex design. The details follow.

4.2.1 Red Envelope Token Solution:

This Spring Festival event sends red envelopes at extremely high traffic in the red envelope rain, the card collection prize draw, and the fireworks display. As mentioned above, the peak estimated issuance rate is 1.8 million QPS. Supporting that rate with the existing crediting design would require a large amount of storage and computing resources. Dividing the estimated number of red envelopes to issue by the longest issuance time the product can accept gives 300,000 as the minimum TPS the wallet must actually support for crediting, so issued red envelopes queue up before they are credited (order suppression).

Design goals:

Even with a large gap between the estimated issuance rate (1.8 million TPS) and the actual crediting rate (300,000 TPS), the core user experience must be guaranteed. Users must not perceive order suppression when viewing or using rewards on the front-end page. The displayed data includes the balance, accumulated income, and red envelope flow; use includes withdrawal and so on.

Specific design plan:

In the high-traffic scenario, each time a red envelope is sent to a user, an encrypted token is generated (asymmetrically encrypted, containing the red envelope metadata: amount, actID, issuance time, and so on). The token is stored on both the client and the server, which back each other up for disaster recovery, and each user has a token list. The crediting status of each token is recorded in Redis. The cash red envelope flow, balance, and other data on the activity wallet page are then the merged result of: credited red envelope list + token list - tokens already credited or being credited. To keep order suppression invisible during withdrawal, all uncredited tokens are force-credited when the user enters the withdrawal page or clicks withdraw, so that the account balance equals the full amount owed at withdrawal time and the withdrawal is not blocked.
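The merge rule can be illustrated with a short Go sketch. It simplifies the formula by deduplicating on the order number: a token whose order number already appears in the credited list is not counted again. The `Entry` type, the cent unit, and this deduplication strategy are assumptions, not the production logic.

```go
package main

import "fmt"

// Entry is one red envelope, either credited in the ledger or still
// represented only by a token. Amount is assumed to be in cents.
type Entry struct {
	OutTradeNo string
	Amount     int64
}

// DisplayBalance returns the amount shown to the user: everything in
// the credited list plus every token not yet reflected there.
func DisplayBalance(credited, tokens []Entry) int64 {
	seen := make(map[string]bool, len(credited)+len(tokens))
	var total int64
	for _, e := range credited {
		if seen[e.OutTradeNo] {
			continue
		}
		seen[e.OutTradeNo] = true
		total += e.Amount
	}
	for _, t := range tokens {
		if seen[t.OutTradeNo] {
			continue // already credited, or a duplicate token
		}
		seen[t.OutTradeNo] = true
		total += t.Amount
	}
	return total
}

func main() {
	credited := []Entry{{OutTradeNo: "a", Amount: 100}}
	tokens := []Entry{{OutTradeNo: "a", Amount: 100}, {OutTradeNo: "b", Amount: 50}}
	fmt.Println(DisplayBalance(credited, tokens))
}
```

In the example, token "a" has already been credited, so only token "b" is added on top of the ledger amount and the user sees one consistent balance while "b" is still queued.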

The schematic diagram is as follows:

Token data structure:

The token uses the protobuf (pb) format. A test showed that it takes about half the storage of JSON, which saves network bandwidth and storage cost, and that serialization and deserialization also consume less CPU.

// Red envelope rain token structure
type RedPacketToken struct {
	AppID      int64  `protobuf:"varint,1,opt" json:"AppID,omitempty"`                // Terminal ID
	ActID      int64  `protobuf:"varint,2,opt" json:"UserID,omitempty"`               // ActID
	ActivityID string `protobuf:"bytes,3,opt" json:"ActivityID,omitempty"`            // Activity ID
	SceneID    string `protobuf:"bytes,4,opt" json:"SceneID,omitempty"`               // Scene ID
	Amount     int64  `protobuf:"varint,5,opt" json:"Amount,omitempty"`               // Red envelope amount
	OutTradeNo string `protobuf:"bytes,6,opt" json:"OutTradeNo,omitempty"`            // Order number
	OpenTime   int64  `protobuf:"varint,7,opt" json:"OpenTime,omitempty"`             // Prize opening time
	RainID     int32  `protobuf:"varint,8,opt,name=rainID" json:"rainID,omitempty"`   // Red envelope rain ID
	Status     int64  `protobuf:"varint,9,opt,name=status" json:"status,omitempty"`   // Crediting status
}

Token state machine flow:

Before a token is actually credited, its status is processing (2). Once the crediting call succeeds, the status becomes successful (8). Sending a red envelope has no terminal failure state: a failed attempt is retried, and the retries eventually succeed.
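A minimal Go sketch of this two-state flow follows. Only the status values 2 and 8 come from the text; the `Settle` function, the bounded retry loop, and the `flakyCredit` helper are hypothetical and exist purely for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// TokenStatus mirrors the two states named in the text.
type TokenStatus int64

const (
	StatusProcessing TokenStatus = 2 // not yet credited
	StatusSuccess    TokenStatus = 8 // crediting call succeeded
)

// Settle calls credit until it succeeds or maxAttempts is reached.
// It returns the resulting status and the number of attempts made.
// A token that is already successful is left untouched.
func Settle(status TokenStatus, credit func() error, maxAttempts int) (TokenStatus, int) {
	if status == StatusSuccess {
		return status, 0
	}
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		if err := credit(); err == nil {
			return StatusSuccess, attempt
		}
	}
	// No failed state exists: the token stays in processing and a
	// later retry round picks it up again.
	return StatusProcessing, maxAttempts
}

// flakyCredit returns a hypothetical crediting call that fails the
// given number of times before succeeding.
func flakyCredit(failures int) func() error {
	return func() error {
		if failures > 0 {
			failures--
			return errors.New("credit timed out")
		}
		return nil
	}
}

func main() {
	status, attempts := Settle(StatusProcessing, flakyCredit(2), 5)
	fmt.Println(status, attempts)
}
```

Because crediting is idempotent on the order number, repeating the call after a timeout is safe even if the earlier attempt did in fact succeed.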

Token security guarantee:

Asymmetric encryption is used so that the token stored on the client is as hard to crack as possible. The encryption algorithm is kept in a private code repository with restricted access. If the token encryption is nevertheless cracked by black-market actors in an extreme case, monitoring and alerting can detect it, and the token path can be downgraded.

4.2.2 The activity wallet page displays the red envelope flow

Demand background:

The red envelope flow displayed on the activity wallet page merges three data sources: the cash red envelope crediting flow, the withdrawal flow, and the c2c red envelope flow. It is sorted by creation time in descending order, must support paging, and must be able to degrade, all while keeping the cash red envelope order suppression process invisible to the user.
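The merge, ordering, and paging requirement can be sketched in Go as follows. The `FlowRecord` type and the in-memory sort are assumptions; a real implementation would more likely page each source separately, and the article does not describe how it does so. A degraded source is modeled by simply passing an empty slice.

```go
package main

import (
	"fmt"
	"sort"
)

// FlowRecord is one line in the displayed red envelope flow.
type FlowRecord struct {
	Source    string // "deposit", "withdrawal", or "c2c"
	ID        string
	CreatedAt int64 // Unix seconds
	Amount    int64
}

// MergeFlows combines all sources, sorts newest first, and returns the
// requested 1-based page. A source that has been degraded can be passed
// as nil and is skipped.
func MergeFlows(page, pageSize int, sources ...[]FlowRecord) []FlowRecord {
	var all []FlowRecord
	for _, s := range sources {
		all = append(all, s...)
	}
	sort.SliceStable(all, func(i, j int) bool { return all[i].CreatedAt > all[j].CreatedAt })

	start := (page - 1) * pageSize
	if page < 1 || pageSize < 1 || start >= len(all) {
		return nil
	}
	end := start + pageSize
	if end > len(all) {
		end = len(all)
	}
	return all[start:end]
}

func main() {
	deposits := []FlowRecord{{Source: "deposit", ID: "d1", CreatedAt: 10}, {Source: "deposit", ID: "d2", CreatedAt: 30}}
	withdrawals := []FlowRecord{{Source: "withdrawal", ID: "w1", CreatedAt: 20}}
	var c2c []FlowRecord // degraded or empty source
	for _, r := range MergeFlows(1, 2, deposits, withdrawals, c2c) {
		fmt.Println(r.Source, r.ID, r.CreatedAt)
	}
}
```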

4.3 Difficulty 3: Stability assurance for a reward chain with many dependencies

The downgrade diagram of the red envelope sending process is as follows:

According to historical experience, the more complex the functions implemented, the more dependencies there will be, and the higher the corresponding stability risk will be. So how to ensure the stability of a highly dependent system?

Solution:

The most basic function of cash red envelopes is to record the red envelopes a user has received while supporting idempotency and budget control (to avoid over-issuance). The idempotency of red envelope crediting relies strongly on the database to keep transactions consistent. In extreme situations, intermediate links may fail; any weak dependency must be downgradable without affecting the main issuance flow. The shortest path for sending a cash red envelope on the wallet side relies only on the computing resources of the service instances and on MySQL storage.
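The distinction between strong and weak dependencies can be sketched in Go as below. The step names, the `downgraded` switch map, and the rule that a failing weak step is skipped rather than returned as an error are assumptions chosen to illustrate the idea, not the production flow.

```go
package main

import (
	"errors"
	"fmt"
)

// Step is one dependency call in the sending flow.
type Step struct {
	Name string
	Run  func() error
}

// SendRedPacket runs the strong dependency first and fails if it fails.
// Weak steps are skipped when their downgrade switch is on, and a weak
// step that returns an error is recorded instead of failing the flow.
func SendRedPacket(downgraded map[string]bool, strong Step, weak []Step) ([]string, error) {
	if err := strong.Run(); err != nil {
		return nil, fmt.Errorf("%s: %w", strong.Name, err)
	}
	var skipped []string
	for _, s := range weak {
		if downgraded[s.Name] {
			skipped = append(skipped, s.Name)
			continue
		}
		if err := s.Run(); err != nil {
			skipped = append(skipped, s.Name)
		}
	}
	return skipped, nil
}

// succeed and fail are hypothetical dependency calls for the example.
func succeed() error { return nil }
func fail() error    { return errors.New("dependency unavailable") }

func main() {
	switches := map[string]bool{"redis_token_status": true} // manually downgraded
	skipped, err := SendRedPacket(switches,
		Step{Name: "mysql_ledger", Run: succeed},
		[]Step{
			{Name: "redis_token_status", Run: succeed},
			{Name: "budget_counter", Run: fail},
		})
	fmt.Println(skipped, err)
}
```

The red envelope is still recorded as long as the strong step succeeds; the list of skipped weak steps is what monitoring or later compensation would act on.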

Illustration of the strong and weak dependence of sending red envelopes:

4.4 Difficulty 4: Budget control for large-volume card issuance

Demand background:

The fireworks display will start at 7:30 pm on New Year’s Eve during the Spring Festival. This is a scenario where coupons are issued in a concentrated manner with a large flow of traffic. The wallet side cooperates with the algorithm strategy to control the inventory of card and coupon issuance to prevent over-issuance.

Specific implementation:

(1) The wallet asset center maintains the consumption and issuance amount of each card or coupon template ID.

(2) Before each card or voucher is issued, the algorithm strategy will read the wallet SDK to obtain the consumption and total inventory of the card or voucher template ID. At the same time, a threshold will be set. If the remaining amount of the card or voucher is less than 10%, the voucher will not be issued (using a backup voucher or blessing message as a backup).

(3) In addition, the wallet asset center accumulates the consumption of each voucher template ID during issuance (using the Redis incr command for atomic accumulation) and compares it with the total activity inventory. If consumption exceeds the total inventory, the issuance is rejected to prevent over-issuance. This check acts as the final backstop.
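The two layers of control can be sketched in Go as follows. An atomic in-process counter stands in for the Redis incr command, and the `Stock` type and its method names are hypothetical. Note that, as with incr, a rejected attempt still increments the counter.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Stock tracks one card or voucher template ID.
type Stock struct {
	total int64
	used  int64 // incremented atomically, standing in for Redis INCR
}

func NewStock(total int64) *Stock { return &Stock{total: total} }

// ShouldOffer is the strategy-side pre-check: it returns false once the
// remaining inventory drops below 10% of the total, in which case a
// backup voucher or blessing message would be shown instead.
func (s *Stock) ShouldOffer() bool {
	remaining := s.total - atomic.LoadInt64(&s.used)
	return remaining*10 >= s.total
}

// TryIssue is the wallet-side backstop: it increments consumption first
// and rejects the issuance if the new value exceeds the total.
func (s *Stock) TryIssue() bool {
	return atomic.AddInt64(&s.used, 1) <= s.total
}

func main() {
	s := NewStock(10)
	for i := 0; i < 10; i++ {
		s.TryIssue()
	}
	fmt.Println(s.ShouldOffer(), s.TryIssue())
}
```

The pre-check keeps most traffic away from the limit, while the increment-then-compare step guarantees that concurrent requests can never issue more than the total.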

Specific flow chart:

Optimization direction:

(1) When using Redis for counting under high traffic conditions, a single key may have a hot key problem, which needs to be solved by splitting the key.

(2) Redis operations can time out under high traffic. When the timeout is returned to the upstream, the upstream retries issuing the coupon, which consumes extra inventory and reduces the number of coupons actually issued. For this Spring Festival event, the actual inventory was set 5% above the estimated inventory to offset the under-issuance caused by timeouts.

4.5 Difficulty 5: Ensuring the read and write stability of hot keys in high QPS scenarios

Demand background:

The fireworks display begins at 7:30 pm on New Year's Eve and shows, in real time, the cumulative total amount of all red envelopes distributed and of the fireworks display red envelopes. Peak traffic is estimated at 1.8 million QPS for reads and 300,000 QPS for writes.

This is a typical scenario: extremely high traffic, hot keys, tolerance of update delay, and no need for strong consistency (the number only ever increases). Disaster recovery and degradation must also be handled well. In the end, the amount displayed during the event differed from the amount the product expected to distribute by less than 1%.

4.5.1 Solution 1

Access is provided through an SDK, reusing the machine instances of the main venue. The most common way to read and write a single key under high QPS is a Redis distributed cache, but all reads and writes of one key hit a single instance, and stress testing showed that a single instance tops out at 30,000 QPS. One optimization is therefore to split the value across multiple keys and use a local cache as a fallback.

Specific writing process:

The value is split across 100 keys. Each time a red envelope is sent, the amount is accumulated with the incr command on the key selected by the request's actID % 100. Because idempotence cannot be guaranteed, a write that times out is not retried.

Reading process:

Similar to the writing process, the local cache is read first. If the local cache value is 0, all the Redis keys are read and their values are summed and returned.
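The key-splitting idea can be sketched in Go with an in-memory array standing in for the 100 Redis keys. The `ShardedCounter` type is hypothetical, and the local cache layer is left out to keep the example short.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

const shardCount = 100 // number of split keys, as in the text

// ShardedCounter spreads one hot counter over shardCount slots, each
// slot standing in for one Redis key.
type ShardedCounter struct {
	shards [shardCount]int64
}

// Add accumulates amount on the shard selected by actID % shardCount.
func (c *ShardedCounter) Add(actID, amount int64) {
	idx := actID % shardCount
	if idx < 0 {
		idx += shardCount // Go's % keeps the sign of the dividend
	}
	atomic.AddInt64(&c.shards[idx], amount)
}

// Total reads every shard and sums the values; this is the read
// amplification: one logical read costs shardCount physical reads.
func (c *ShardedCounter) Total() int64 {
	var sum int64
	for i := range c.shards {
		sum += atomic.LoadInt64(&c.shards[i])
	}
	return sum
}

func main() {
	var c ShardedCounter
	c.Add(1, 10)
	c.Add(101, 5) // same shard as actID 1
	c.Add(250, 7)
	fmt.Println(c.Total())
}
```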

Problems:

(1) Splitting the value across 100 keys causes read amplification, requires more Redis resources, and has a relatively high storage cost. Reads may also time out, and there is no guarantee that all keys are read successfully in one pass, so a returned total may be lower than the previous one.

(2) For disaster recovery, a backup Redis would require still more storage resources and additional storage cost.

4.5.2 Solution 2

Design ideas:

Optimize based on the implementation of solution 1, and consider the continuous accumulation of numbers, cost savings and disaster recovery solutions. In the write scenario, merge write requests through the local cache for atomic accumulation, and return the value of the local cache in the read scenario to reduce the use of additional storage resources. Use Redis to implement centralized storage, so that everyone reads the same value in the end.

Specific design plan:

Each Docker instance will execute scheduled tasks when it starts, which are divided into Redis reading tasks and Redis writing tasks.

Reading process:

  • The local scheduled task is executed once a second to read the value of a single key in Redis. If the obtained value is greater than the local cache, the local cache value is updated.
  • The SDK exposed to the outside world can directly return the value of the local cache.
  • There is one problem that needs attention. There is no data in the first second of each instance startup, so the read will be blocked and will not return until there is data.

Writing process:

  • Because all reads are from the local cache (the local cache does not expire), you only need to handle concurrent writes.
  • The local cache write variable uses go's atomic.AddInt64 to support atomic accumulation of local write cache values.
  • Each time a scheduled task to update Redis is executed, the local write cache is first copied to the amount variable, then the value of amount is atomically subtracted from the local write cache, and finally the value of amount is added to the Redis single key, so that the value of the Redis single key is accumulated.
  • For disaster recovery, a backup Redis cluster is used and every write goes to both clusters. If the primary cluster fails, a configuration switch redirects reads to the backup Redis. Scheduled tasks keep the data in the two Redis clusters consistent.
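The read and write tasks can be sketched in Go as below. `CentralKey` is an in-process stand-in for the single Redis key, and the type and method names are hypothetical. The sketch calls `Flush` and `Refresh` by hand where the real service would run them from its scheduled tasks, and it omits the blocking first read and the backup cluster.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// CentralKey stands in for the single Redis key.
type CentralKey struct{ value int64 }

func (k *CentralKey) IncrBy(delta int64) int64 { return atomic.AddInt64(&k.value, delta) }
func (k *CentralKey) Get() int64               { return atomic.LoadInt64(&k.value) }

// LocalCounter is the per-instance state.
type LocalCounter struct {
	pending int64 // local write cache: amounts not yet flushed
	cached  int64 // local read cache: last value seen centrally
}

// Add merges a write into the local write cache.
func (c *LocalCounter) Add(amount int64) { atomic.AddInt64(&c.pending, amount) }

// Flush copies the pending value, subtracts exactly that value from the
// write cache, then adds it to the central key. Writes that arrive in
// between stay in pending for the next flush.
func (c *LocalCounter) Flush(k *CentralKey) {
	amount := atomic.LoadInt64(&c.pending)
	if amount == 0 {
		return
	}
	atomic.AddInt64(&c.pending, -amount)
	k.IncrBy(amount)
}

// Refresh reads the central key and updates the local read cache only
// if the new value is greater, so the displayed number never decreases.
func (c *LocalCounter) Refresh(k *CentralKey) {
	v := k.Get()
	for {
		cur := atomic.LoadInt64(&c.cached)
		if v <= cur || atomic.CompareAndSwapInt64(&c.cached, cur, v) {
			return
		}
	}
}

// Read serves callers from the local read cache only.
func (c *LocalCounter) Read() int64 { return atomic.LoadInt64(&c.cached) }

func main() {
	central := &CentralKey{}
	a, b := &LocalCounter{}, &LocalCounter{}
	a.Add(5)
	a.Add(7)
	b.Add(3)
	a.Flush(central)
	b.Flush(central)
	a.Refresh(central)
	b.Refresh(central)
	fmt.Println(a.Read(), b.Read())
}
```

Subtracting the copied amount instead of resetting the write cache to zero is what keeps concurrent `Add` calls from being lost. With a real Redis, a failed write would presumably need to add the amount back to the write cache; the article does not say how that case is handled.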

The Redis traffic of this solution is proportional to the number of instances. The read side (the main venue service) has 20,000 instances and the write side (the asset middle platform) has 8,000 instances, so the QPS that Redis must support is 28,000 / (scheduled task interval in seconds). Stress testing verified that a single Redis instance can sustain 20,000 get and 8,000 incr operations per second on a single key, so the scheduled task interval was set to 1 second. With more instances, the interval can be lengthened.

The specific writing flow chart is as follows:

4.5.3 Solution Comparison

Conclusion:

Considering implementation effect, resource cost, and disaster recovery, we chose Solution 2 for the launch.

4.6 Difficulty 6: Smooth switching between parent activities and child activities

Demand background:

To ensure the online effect and delivery quality of this Spring Festival event, it was run in three stages.

(1) The first stage is internal testing by employees.

(2) The second stage is the external rehearsal, in which a selected group of external users verifies the Spring Festival activity functions (a grayscale release). This is the most effective way to expose problems and verify the corresponding mitigation mechanisms while keeping the impact controllable.

(3) The third stage is the official Spring Festival activity.

The product requirement is that the three stages are independent of one another: obtaining, displaying, and using rewards must all be isolated between stages.

Technical Challenges:

Several upstream systems call the wallet to issue rewards, and the wallet has several reward businesses downstream. Having all of them change configuration together carries a high communication cost and a relatively high chance of misconfiguration, and the changes cannot take effect at the same moment, which is a major technical safety risk.

Design ideas:

As the only entry point for reward crediting, the wallet asset middle platform takes over the switching of the whole activity configuration. It uses a hierarchical configuration of parent and child activities. Upstream requests always pass the parent activity ID, which represents the Spring Festival event, and the wallet asset middle platform decides from the request time which child activity configuration to use for issuing rewards. This meets the product requirement of different activities in different time periods, lowers the communication cost, reduces the chance of misconfiguration, lets all parties switch at the same moment, and greatly improves R&D and testing efficiency.
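Resolving the child activity from the request time can be sketched in Go as follows. The activity IDs, the dates, and the half-open time windows are made up for illustration; the article does not give the real schedule or configuration format.

```go
package main

import (
	"fmt"
	"time"
)

// ChildActivity is one stage with its own isolated configuration.
type ChildActivity struct {
	ID    string
	Start time.Time // inclusive
	End   time.Time // exclusive
}

// ParentActivity is the single ID that upstream callers pass.
type ParentActivity struct {
	ID       string
	Children []ChildActivity
}

// Resolve picks the child activity whose window contains the request
// time. It reports false when no stage is active.
func (p ParentActivity) Resolve(at time.Time) (ChildActivity, bool) {
	for _, c := range p.Children {
		if !at.Before(c.Start) && at.Before(c.End) {
			return c, true
		}
	}
	return ChildActivity{}, false
}

// day returns midnight UTC on the given day of January 2022.
func day(d int) time.Time {
	return time.Date(2022, time.January, d, 0, 0, 0, 0, time.UTC)
}

// sampleParent builds a hypothetical three-stage schedule.
func sampleParent() ParentActivity {
	return ParentActivity{ID: "spring_festival_2022", Children: []ChildActivity{
		{ID: "internal_test", Start: day(10), End: day(17)},
		{ID: "grayscale", Start: day(17), End: day(24)},
		{ID: "official", Start: day(24), End: day(32)},
	}}
}

func main() {
	child, ok := sampleParent().Resolve(day(20))
	fmt.Println(child.ID, ok)
}
```

Because the windows are half-open and adjacent, every instant belongs to at most one stage, so all callers switch at the same boundary without changing their request parameters.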

Schematic diagram:

4.7 Difficulty 7: Fund security in large-volume scenarios

The wallet team did three things during this Spring Festival event to secure funds for the high-volume, large-budget cash red envelopes:

1. Interception based on the overall budget for cash red envelope distribution
2. Interception based on the upper limit for the amount of a single cash red envelope
3. Fund reconciliation for high-traffic red envelope distribution
  • Hourly reconciliation: supports h+1 hourly reconciliation for red envelope rain, card collection, and fireworks red envelope distribution, with a fallback h+2 reconciliation for some scenarios.
  • Quasi-real-time reconciliation: red envelopes already received in the red envelope rain are reconciled in quasi-real time between the wallet asset middle platform and the activity side.
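A reconciliation pass between the activity side and the wallet side can be sketched in Go as below. The `Record` and `Diff` types are hypothetical, and matching on order number and amount is an assumption about what is compared. During order suppression, a record missing on the wallet side may simply not have been credited yet, so a real check would presumably apply a time threshold before alerting.

```go
package main

import "fmt"

// Record is one issued red envelope as seen by one side.
type Record struct {
	OrderNo string
	Amount  int64
}

// Diff lists the order numbers on which the two sides disagree.
type Diff struct {
	MissingInWallet   []string // issued by the activity, not in the wallet
	MissingInActivity []string // in the wallet, unknown to the activity
	AmountMismatch    []string // present on both sides, amounts differ
}

// Clean reports whether the two sides agree completely.
func (d Diff) Clean() bool {
	return len(d.MissingInWallet)+len(d.MissingInActivity)+len(d.AmountMismatch) == 0
}

// Reconcile compares the two record sets by order number.
func Reconcile(activity, wallet []Record) Diff {
	walletByOrder := make(map[string]int64, len(wallet))
	for _, w := range wallet {
		walletByOrder[w.OrderNo] = w.Amount
	}
	activityOrders := make(map[string]bool, len(activity))
	var d Diff
	for _, a := range activity {
		activityOrders[a.OrderNo] = true
		amount, ok := walletByOrder[a.OrderNo]
		switch {
		case !ok:
			d.MissingInWallet = append(d.MissingInWallet, a.OrderNo)
		case amount != a.Amount:
			d.AmountMismatch = append(d.AmountMismatch, a.OrderNo)
		}
	}
	for _, w := range wallet {
		if !activityOrders[w.OrderNo] {
			d.MissingInActivity = append(d.MissingInActivity, w.OrderNo)
		}
	}
	return d
}

func main() {
	activity := []Record{{"a", 100}, {"b", 50}, {"c", 20}}
	wallet := []Record{{"a", 100}, {"b", 60}, {"d", 10}}
	d := Reconcile(activity, wallet)
	fmt.Println(d.MissingInWallet, d.AmountMismatch, d.MissingInActivity, d.Clean())
}
```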

Multi-dimensional verification diagram:

Quasi-real-time reconciliation flow chart:

Notes:

Monitoring and alerting on the quasi-real-time reconciliation detects abnormal crediting promptly. When an alert fires, the emergency plan is carried out.

5. Common Pattern Abstraction

Having designed and implemented the high-traffic Spring Festival event, I have some conclusions and experience to share.

5.1 Disaster Recovery and Degradation

In high-traffic scenarios, disaster recovery must be done well to ensure the event's online effect. We followed common industry practices such as degradation, rate limiting, circuit breaking, and resource isolation, and we estimated storage usage from the expected number of participants and the expected effect of the event.

5.1.1 Rate Limiting

(1) For rate limiting, we applied nginx inbound rate limiting at the API layer, distributed inbound rate limiting, and distributed outbound rate limiting. These rate limiting components are all company-level public middleware at ByteDance and have been proven under heavy traffic.

(2) We first ran a real single-instance stress test. We then scaled out the service based on the traffic a single instance could handle and the estimated traffic for the Spring Festival event. We also configured TLB inbound limits, inbound rate limits, and outbound rate limits in detail according to downstream capacity.

Rate limiting goal:

Keep our own services stable, prevent unexpected external traffic from bringing them down, prevent avalanche effects, and protect the core business and the core user experience.

Simple cluster rate limiting works at the instance level: the QPS limit of each instance = total configured QPS limit / number of instances. With many instances and a low QPS limit it can be inaccurate, so real stress testing is required and the configured value must be adjusted promptly.
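The per-instance formula and its inaccuracy at low QPS can be shown with a small Go sketch. The `WindowLimiter` is a deliberately simple fixed-window counter written for this illustration; it is not the ByteDance middleware, whose API the article does not show, and it is not safe for concurrent use.

```go
package main

import "fmt"

// PerInstanceQPS applies the formula from the text with integer division.
func PerInstanceQPS(totalQPS, instances int) int {
	if instances <= 0 {
		return 0
	}
	return totalQPS / instances
}

// WindowLimiter allows at most limit requests per one-second window.
type WindowLimiter struct {
	limit  int
	window int64 // Unix second of the current window
	count  int
}

func NewWindowLimiter(limit int) *WindowLimiter { return &WindowLimiter{limit: limit} }

// Allow reports whether a request arriving at nowUnix may pass.
func (l *WindowLimiter) Allow(nowUnix int64) bool {
	if nowUnix != l.window {
		l.window = nowUnix // a new second starts a new window
		l.count = 0
	}
	if l.count >= l.limit {
		return false
	}
	l.count++
	return true
}

func main() {
	fmt.Println(PerInstanceQPS(30000, 100)) // high QPS divides cleanly
	fmt.Println(PerInstanceQPS(100, 300))   // low QPS over many instances rounds down to 0

	l := NewWindowLimiter(2)
	fmt.Println(l.Allow(1), l.Allow(1), l.Allow(1), l.Allow(2))
}
```

The second call shows why the simple scheme breaks down at low limits: a total of 100 QPS spread over 300 instances leaves each instance with a limit of zero, which is the case where counting in a shared store is more accurate.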

Distributed inbound and outbound rate limiting can each be used in two ways. Both support high and low QPS and differ only in SDK usage and features. In general, low-QPS limits need high precision and use Redis counting, with the user providing the Redis cluster. High-QPS limits tolerate lower precision and degrade to single-instance rate limiting at total QPS / number of TCE instances.

5.1.2 Degradation

In high-traffic scenarios, every core function must have a matching degradation plan so that the core link stays stable in an emergency.

(1) We prepared thorough operational plans for Spring Festival reward crediting and the activity wallet page. There are 26 downgrade switches in total, which act as safeguards at critical moments and keep a single point of failure from affecting the core link.

(2) Taking the cash red envelope link as an example, the fully degraded solution on the wallet side relies only on Docker and MySQL; every other dependency can be downgraded. If the MySQL primary fails, an emergency primary switchover can be requested. Although this last resort was never used, it had to be designed in advance so that the event could not go wrong.

5.1.3 Resource Isolation

(1) Improve development efficiency and avoid reinventing the wheel. Because the wallet asset middle platform also supports everyday asset distribution for Douyin, this Spring Festival event reused the existing interfaces and code paths for reward distribution.

(2) At the same time, the event was isolated at the service level: a dedicated event cluster was created and the underlying storage resources were isolated, so that event traffic and regular traffic did not affect each other.

5.1.4 Storage Estimation

(1) It is not enough to verify that Redis or MySQL can handle the traffic. We must also estimate, from the data actually credited and distributed, whether storage capacity is sufficient.

(2) ByteDance's Redis component can be scaled vertically (more storage per instance, up to 10 GB) or horizontally (up to 500 instances in a single data center). Because Redis is synchronized across three data centers, only the storage limit of one data center counts when calculating capacity. Leave a generous buffer, because horizontal scaling is a very slow process. If storage runs short, the only option is to switch off the dependency on that storage through a configuration switch, and that switch must be designed in advance.

    5.1.5 Stress Testing

    For this Spring Festival event, we conducted a full-link stress test on the wallet reward deposit and the event wallet page. Here are some experience summaries.

    Before stress testing, you need to build a monitoring dashboard for the entire stress testing link to detect problems in a timely and convenient manner during the stress testing process.

    For MySQL database, before the start of high-traffic official activities such as red envelope rain, a small-traffic stress test is performed to warm up the database, and the link is established in advance before the peak traffic to reduce the time spent on large-scale link establishment during the official activities and ensure the stability of the red envelope link database level.

    During the stress test, a stress-test marker must be propagated so that stress-test traffic can be identified along the full link and given special handling where needed (for example, writing to isolated storage), without interfering with normal online business.

    Apart from that, stress-test traffic receives no special treatment: it follows the same processing flow as online traffic.
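    One way to carry such a marker through a single service is sketched below, assuming the marker arrives as a request header and must be forwarded on every downstream call. The header name `x-stress-test`, the context variable, and both helper functions are assumptions made for illustration; the article does not describe the actual mechanism.

```python
import contextvars

# Hypothetical header name and per-request context variable.
STRESS_HEADER = "x-stress-test"
STRESS_FLAG = contextvars.ContextVar("stress_flag", default=False)


def handle_request(headers, handler):
    """Record the stress-test marker for the duration of one request."""
    token = STRESS_FLAG.set(headers.get(STRESS_HEADER) == "1")
    try:
        return handler()
    finally:
        STRESS_FLAG.reset(token)  # never leak the marker into the next request


def outgoing_headers():
    """Headers to attach to downstream calls so the marker keeps propagating."""
    return {STRESS_HEADER: "1"} if STRESS_FLAG.get() else {}


print(handle_request({STRESS_HEADER: "1"}, outgoing_headers))
print(handle_request({}, outgoing_headers))
```

    The business handler itself does not branch on the marker, which keeps the processing flow identical for stress-test and online traffic. Only the edges (downstream calls and storage access) consult it.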

    During stress testing, verify that both computing resources and storage resources can withstand the estimated traffic.

    • Draw up a stress testing plan: set a reasonable initial traffic level based on historical experience, increase the traffic gradually, and watch the stress testing metrics in real time.
    • Stress-test data in storage must be isolated from online data. For MySQL and Bytekv, separate stress-test tables are created. For Redis and Abase, the stress-test key is the online key with a stress-test prefix added.
    • Stress-test data should be cleaned up promptly. For Redis and Abase, setting a short expiration time lets the expiry mechanism handle cleanup, which is the more convenient approach. If the expiration time was forgotten, the data can be deleted with a script that identifies keys by the stress-test prefix.
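    The isolation and cleanup rules in the list above can be sketched as follows. The prefix `stress:`, the `_stress` table suffix, and the function names are assumptions, and a plain dictionary stands in for Redis or Abase so the example runs on its own.

```python
STRESS_PREFIX = "stress:"      # assumed prefix for stress-test keys
STRESS_TABLE_SUFFIX = "_stress"  # assumed suffix for stress-test tables


def storage_key(online_key, is_stress):
    """Key-value stores: reuse the online key, prefixed for stress-test traffic."""
    return STRESS_PREFIX + online_key if is_stress else online_key


def table_name(online_table, is_stress):
    """Relational stores: route stress-test traffic to a separate table."""
    return online_table + STRESS_TABLE_SUFFIX if is_stress else online_table


def cleanup_stress_keys(store):
    """Delete every stress-test key from a dict-like store; return the count."""
    stale = [key for key in store if key.startswith(STRESS_PREFIX)]
    for key in stale:
        del store[key]
    return len(stale)


store = {}
store[storage_key("reward:1001", is_stress=False)] = "online"
store[storage_key("reward:1001", is_stress=True)] = "stress"
removed = cleanup_stress_keys(store)
print(removed, store)
```

    Against a real Redis deployment, a cleanup script of this kind would normally walk the keyspace incrementally with `SCAN` and a match pattern rather than list all keys at once, so that cleanup does not block online traffic.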

    After the stress test, you should also pay attention to whether the various indicators of storage resources meet expectations.

    5.2 Microservices Thinking

    In day-to-day technical design, everyone follows microservice design principles and conventions: modules are split by system responsibility and core data model, which improves iteration efficiency and keeps modules from affecting one another. Microservices also have drawbacks, however. In scenarios with extremely high traffic, the functionality is relatively complex and a request passes through many hops, which consumes a lot of computing resources. For this Spring Festival event, the asset middle platform provided an SDK package instead of RPC, collapsing the microservice call chain and exposing basic capabilities such as querying balances, checking whether a user has received a reward, and forced account entry. With peak access traffic in the tens of millions, this saved tens of thousands of CPU cores compared with a microservice architecture.

    6. Future evolution of the system

    • Sort out upstream and downstream demands and pain points, optimize the design and implementation of the asset middle platform, improve basic capabilities, optimize the service architecture, and provide one-stop services so that the parties involved in the activities can focus more on the research and development of the event business logic.
    • Strengthen the construction of real-time and offline data dashboard capabilities to make the reward distribution data display clearer and more accurate.
    • Strengthen configuration and documentation, to reduce the internal cost of integrating with each activity and to make onboarding faster for the business teams running the activities.
