As companies pay more and more attention to data, A/B testing has come to be widely used across scenarios and functions to find breakthroughs and growth opportunities and to reduce investment risk.

What is A/B testing?

A/B testing is an evaluation method that uses objective indicators to compare different solutions and measure which one performs best. Its advantage is that it runs in the real environment: it verifies different design solutions using the behavior data and business data generated by a subset of users, and the best solution is identified through analysis before it is formally rolled out.

There are countless scenarios for A/B testing, so how do we conduct A/B testing scientifically? It comes down to two key points: grouping and evaluation. The 7-step method below locks in an A/B test evaluation strategy.

Step 1: A/B testing strategy development

A/B testing is always based on a strategy. Only after a clear strategy is established can we find a user group to verify whether the strategy is effective and evaluate it with reasonable indicators.

This step usually has three parts: strategy proposal, strategy scoring, and strategy determination.

Regardless of the application scenario, everyone has their own ideas when strategies are being formulated, so many strategies are generated. It is not necessary to put every one of them online as an experimental group; doing so would be very costly in early-stage material preparation, solution implementation, and so on.

When deciding within the team, the ICE model (Impact, Confidence, Ease) can be used to score each strategy. The scores of the three factors are then summed for each strategy, and the highest-scoring strategies are selected for A/B testing.

Step 2: Test target (evaluation indicator) selection

Evaluation indicators are central to strategy evaluation. How should they be chosen?
The choice is based on the OSM model: start from the big goal (O), find the strategy (S) that can achieve the goal, and use reasonable indicators (M) to track whether the strategy achieves the business goal.

The estimated sample size of the experiment also needs to be controlled:
① If the sample is too small, the results are easily distorted by abnormal samples and cannot be generalized.
② If the sample is too large and too much traffic goes into the test, the cost of trial and error rises and subsequent judgment is affected.

The estimated duration of the experiment also needs to be controlled:
① If the test is too short, the experimental group will not collect enough samples, making it difficult to draw valid conclusions.
② If the test is too long, the cost of maintaining multiple versions online accumulates and the situation becomes harder to control.

A recommended tool here is an A/B test sample size calculator. By entering the relevant parameters, you can estimate the data the experiment will need, and adjust the number of samples to suit your own pace.

Factors that affect the number of samples required for an experiment:
▲ Conversion rate of the original version: a lower conversion rate in the original version means a weaker signal, so more samples are required.
▲ Conversion rate of the new version: the smaller the difference between the expected and original conversion rates, the more sensitive the test has to be, so more samples are needed.
▲ Statistical significance requirement: at least 95% is generally recommended (that is, a significance level of 0.05). The stricter the requirement, the more certainty is needed in the results, so more samples are needed.
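The three factors above can be turned into a rough estimate of the kind a sample size calculator produces. The sketch below is only an illustration, not the calculator the article recommends: it assumes a two-sided two-proportion z-test, and the function name `sample_size_per_group` and the `power` parameter (defaulting to 80%) are assumptions introduced here, since the article only discusses the significance requirement.

```python
import math
from statistics import NormalDist


def sample_size_per_group(p_base, p_new, confidence=0.95, power=0.80):
    """Approximate number of users needed in EACH group.

    p_base:     conversion rate of the original version (0..1)
    p_new:      expected conversion rate of the new version (0..1)
    confidence: statistical significance requirement, e.g. 0.95
    power:      chance of detecting the difference if it is real
                (an assumption of this sketch, not from the article)
    """
    if not (0 < p_base < 1 and 0 < p_new < 1):
        raise ValueError("conversion rates must be strictly between 0 and 1")
    if p_base == p_new:
        raise ValueError("the two conversion rates must differ")

    alpha = 1 - confidence
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)

    p_bar = (p_base + p_new) / 2
    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * math.sqrt(p_base * (1 - p_base) + p_new * (1 - p_new))
    ) ** 2
    return math.ceil(numerator / (p_new - p_base) ** 2)


if __name__ == "__main__":
    # Hypothetical inputs: a 5% baseline and two possible target rates.
    print(sample_size_per_group(0.05, 0.06))
    print(sample_size_per_group(0.05, 0.08))
    print(sample_size_per_group(0.05, 0.06, confidence=0.99))
```

Running it with different inputs shows the same tendencies the list describes: a smaller expected difference, a lower baseline rate at the same relative lift, or a stricter significance requirement each push the required sample size up.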
(Statistical significance indicates whether the observed difference between the conversion rate of the optimized version and that of the original is larger than random fluctuation alone would plausibly produce. In other words, it helps answer whether the changes in the optimized version really have an impact on the conversion rate.)

Step 3: Scientific diversion

Whether the traffic of an A/B test is distributed evenly is an important factor affecting the experimental results. The usual approach is to take a unique code that identifies the user, such as a user ID or device ID, and use an algorithm to divide users randomly into different "buckets".

⚪ For example, with 60 users, take the IDs of these 60 users and use them to distribute the users randomly and evenly into 6 "buckets" (10 users each).

Once the "bucket" diversion is done, all that remains is to select the required traffic from these "buckets" according to the needs of the experiment and assign it to the test groups.

The basic principle of A/B testing is to control variables. The diversion must ensure that the samples are evenly distributed, that is, that the population characteristics of the different "buckets" are comparable. The groups must not be skewed, for example with experiment A made up entirely of elderly users or experiment B made up entirely of girls. Conclusions and data measured on such groups will mislead marketing decisions, and the diversion becomes meaningless.

Step 4: A/A Testing

To ensure that samples are evenly distributed and to eliminate the impact of sample differences, an A/A test is usually performed before an A/B test. You can also set aside a portion of traffic during an A/B test and run an A/A test at the same time.

In an A/A test, as the name implies, every group in the experiment receives the same strategy. Under this premise, compare whether there is a significant difference between the groups.
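One common way to make that comparison for conversion rates is a two-proportion z-test. The sketch below is an illustration under that assumption and is not necessarily the method used by the author's tooling; the function name `two_proportion_p_value` and the counts in the usage example are hypothetical.

```python
import math
from statistics import NormalDist


def two_proportion_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value for the hypothesis that both groups
    share the same underlying conversion rate.

    conv_a, conv_b: number of converted users in each group
    n_a, n_b:       total number of users in each group
    """
    if n_a <= 0 or n_b <= 0:
        raise ValueError("group sizes must be positive")

    # Pooled rate: under the hypothesis, both groups are one population.
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 1.0  # no variation at all, so nothing to distinguish

    z = (conv_b / n_b - conv_a / n_a) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


if __name__ == "__main__":
    # Hypothetical A/A counts: both groups received the same strategy.
    p = two_proportion_p_value(500, 10_000, 512, 10_000)
    verdict = "significant" if p < 0.05 else "not significant"
    print(f"p-value = {p:.3f} ({verdict} at the 0.05 level)")
```

Keep in mind that at a 0.05 threshold roughly one in twenty healthy A/A comparisons will still appear significant by chance, so a single significant result is a reason to investigate rather than proof of a fault.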
If there is a significant difference, it usually points to a problem in at least one of the experiment's traffic diversion, tracking points, or data statistics.

The purpose of running an A/A test is therefore to increase the credibility of the conclusions drawn from the A/B test. On the one hand, user identification and user diversion problems can be caught and fixed in time, ensuring the accuracy of the data. On the other hand, interference from the attributes of the sampled users can be ruled out, ensuring that user characteristics are distributed consistently and that experimental differences are caused only by the variable under test.

Let's use the conversion rate of a movie membership product to explain A/A testing. The following figure shows the A/A test results of the paid conversion rate project on the payment page of a movie membership product. During the investigation, a problem was found with the unique identification of users. After the correction, an A/A test was conducted. In the end, the differences between the user groups were not significant, so the distribution of user characteristics can be considered basically the same.

Step 5: Strategic Delivery

In a real enterprise environment, many A/B test experiments run at once, so strategic delivery first needs to determine the relationship between the different experiments.

① Orthogonal experiments: the experiments do not affect each other. For example, Experimental Group 1 tests different button colors, and Experimental Group 2 tests different advertising algorithms. The button color in Experimental Group 1 does not affect the performance of the advertising algorithm in Experimental Group 2, so the two are orthogonal experiments.
② Mutually exclusive experiments: the experiments influence each other.
For example, Experiment 1 tests the effect of a temperature-control frequency-limiting strategy on temperature, and Experiment 2 tests the effect of temperature-control brightness reduction on temperature. Both experiments affect temperature, so Experiment 1 and Experiment 2 are mutually exclusive.

Global traffic is essentially fixed in size, so it is not feasible to let each group of traffic carry only one experiment at a time; otherwise, experiments are likely to starve for traffic. Strategic delivery therefore needs to control variables sensibly: select a fixed North Star indicator, break the goals down into reasonable sub-goals, run the delivery tests, and select the best-performing strategy as the final solution of the A/B test.

Step 6: Data Monitoring

I won't go into detail here, because every company has different data monitoring tools: some have their own data testing dashboards, and others use monitoring tools provided by third-party service providers. The data dashboard for A/B testing does not need to be complicated. Its purpose is to show quickly how the key indicators of each group are trending and whether they reach statistical significance.

Step 7: Strategy Results Analysis and Implementation

After the A/B test is completed, the data dashboard can be used to determine whether the result is significant, that is, whether the strategy has an impact. The impact is not necessarily positive.

Generally speaking, experimental results rank as follows: significantly positive > significantly negative > not statistically significant. Don't be afraid of significantly negative results; at the very least, they tell us what we shouldn't do.

After completing an A/B test, the business side needs to scale up the significantly effective strategy, apply it to more people, find points where the strategy can be optimized, and conduct iterative A/B testing.
For strategies that are significantly ineffective, it is necessary to analyze the reasons for the ineffectiveness and iterate on and optimize the strategy.

In this way, A/B testing continues on the basis of earlier conclusions. Each test is a "step up". As the number of tests grows, the benefits keep increasing, the team's confidence gradually builds, the investment cost falls, and iterative growth is achieved.
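The three outcomes named in Step 7 can be read off a confidence interval for the difference in conversion rates. The sketch below is one possible illustration, assuming a simple normal-approximation (Wald) interval; the function name `classify_result`, the group names, and the counts are hypothetical, and a real dashboard may use a different method.

```python
import math
from statistics import NormalDist


def classify_result(conv_ctrl, n_ctrl, conv_test, n_test, confidence=0.95):
    """Label an A/B result and return the interval for the lift.

    Returns (label, (low, high)), where the interval covers the
    difference test_rate - control_rate at the given confidence.
    """
    if n_ctrl <= 0 or n_test <= 0:
        raise ValueError("group sizes must be positive")

    p_ctrl = conv_ctrl / n_ctrl
    p_test = conv_test / n_test
    diff = p_test - p_ctrl

    # Unpooled standard error of the difference between two rates.
    se = math.sqrt(p_ctrl * (1 - p_ctrl) / n_ctrl
                   + p_test * (1 - p_test) / n_test)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    low, high = diff - z * se, diff + z * se

    if low > 0:
        label = "significantly positive"
    elif high < 0:
        label = "significantly negative"
    else:
        label = "not statistically significant"
    return label, (low, high)


if __name__ == "__main__":
    # Hypothetical counts for a control group and an experimental group.
    label, (low, high) = classify_result(500, 10_000, 700, 10_000)
    print(f"{label}: lift between {low:.4f} and {high:.4f}")
```

The interval also shows how large the effect plausibly is, which helps when deciding whether a significantly positive strategy is worth scaling up.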