Data analysis for event promotion, learn it now

PSM is not suitable for all marketing scenarios. Generally speaking, it works best when the sample size is sufficient, the effect of the activity is pronounced, and the propensity model is reliable. In some scenarios the scope of the control group is hard to define. If all users outside the experimental group are then used as the candidate pool for the control group, the final error may be large.

01 Introduction

In actual evaluation work, not every marketing activity is run as an AB experiment, and not every AB experiment can accurately measure the effectiveness of the activity. Two typical situations are as follows:

Scenario 1: A big promotion is going to run in the banner position, but there is little time before launch. An AB experiment would need customized development, and when no R&D capacity can be scheduled, operations will often roll the activity out to all users. When the input-output of the promotion is evaluated afterwards, there is no strict experimental group and control group to compare. If users who directly participated in the activity are treated as the experimental group and the majority who did not participate as the control group, the two groups differ substantially to begin with (participants are generally more active and more sensitive to subsidies). How should the evaluation be conducted to reach a more reasonable conclusion?
Scenario 2: A live streaming platform launched a new live room feature and ran an AB experiment in which the experimental group could see the feature and the control group could not. The subsequent evaluation found that experimental-group users who clicked and used the new feature performed well, but the penetration rate of the feature (the share of users who actually clicked to use it) was very low. Comparing the experimental and control groups of the AB experiment directly therefore yields no significant conclusion. Does this new live room feature really have no effect?

02 Introduction to PSM Method

To evaluate the efficiency of marketing activities in the two scenarios above, a control group is often matched according to the actual situation. Propensity Score Matching (PSM) is one such matching method, and it enables a more reasonable comparison between the experimental group and the control group.

The PSM method is widely used in fields such as medicine, public health, and economics. Suppose the research question is the impact of smoking on public health. A randomized controlled experiment would require recruiting a large number of subjects and randomly assigning them to a smoking group and a non-smoking group. Such a design is hard to implement and violates research ethics.

In this case, an observational study is the most appropriate research method. However, observational data, although the easiest to obtain, readily leads to wrong conclusions if no adjustment is made. For example, one might compare the healthiest people in the smoking group with the least healthy people in the non-smoking group and conclude that smoking has no negative impact on health.

From a statistical perspective, this happens because an observational study does not assign groups at random, so it cannot rely on the law of large numbers to even out confounding variables between the experimental and control groups, and systematic bias arises easily. PSM addresses this problem by reducing the influence of observed confounding factors between the groups.

The propensity score is intuitive: it is the "propensity" of a user to belong to the experimental group, that is, the probability of being intervened given the user's characteristics. Users with the same score have the same estimated probability of being intervened, even if their individual characteristics differ. In theory, if each experimental-group user is matched with a control-group user who has the same score, the resulting experimental and control groups are homogeneous on the modeled characteristics, and they can be compared much as if an A/B experiment had been run.

In actual work, if multiple periods of observation confirm that the PSM method suits certain marketing scenarios, the PSM model can be productized. Operations personnel then no longer need to submit a request to the algorithm team every time; they can get the final result from a few simple inputs.

1) Input

Determining the sample set is the most important step in PSM. It covers both the users in the experimental group and the users in the control group. The experimental group is generally drawn from users who were reached by the strategy or who truly experienced the core strategy, and its exact definition depends on the characteristics of the marketing campaign. For the control group, only a candidate range is given; modeling then selects from that range the users whose characteristics are similar to those of the experimental group, and these users form the actual control group.

Generally, the candidate range for the control group should consist of users who had a tendency to participate in the activity but did not. For example, users who were exposed to an activity page and experienced the activity form the experimental group, and users who were exposed to the same page but did not experience the activity form the candidate range for the control group.

2) PSM modeling

First, estimate the propensity score: this step is a straightforward modeling problem. The independent variables are the user features and the target is whether the user belongs to the experimental group. Features are preprocessed as needed, and logistic regression (LR) or a more complex model, such as LR + LightGBM, is applied to estimate the propensity score.
Second, perform propensity score matching: based on each user's propensity score, a nearly homogeneous control group is matched to the current experimental-group users. When there are enough users, a simple approach is one-to-one matching without replacement: for each user in the experimental group, find the control-group user with the closest propensity score and pair them up. During matching, the score difference within a pair can be capped at a threshold. A pair that exceeds the threshold is discarded, which prevents users who are "too dissimilar" from being matched together.
Third, output and evaluate the model: the output includes the experimental users, the matched control-group users, and the propensity scores. The evaluation indicators include the AUC of the model on the training set (the higher the value, the more accurate the propensity modeling; generally, AUC ≥ 0.85 is considered effective) and the matching value of each feature dimension (the higher the value, the better the match on that feature dimension).
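The three steps above can be sketched in a few dozen lines of plain Python. This is a minimal illustration under stated assumptions, not a production implementation: the data is synthetic, the two features (activity level and subsidy sensitivity) are hypothetical, plain LR stands in for whatever model is actually used, and the threshold of 0.05 on the score difference is an arbitrary choice.

```python
import math
import random


def propensity(x, w, b):
    """Estimated probability that a user with features x is in the experimental group."""
    z = b + sum(wj * xj for wj, xj in zip(w, x))
    z = max(-30.0, min(30.0, z))  # clamp so math.exp cannot overflow
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, t, lr=0.5, epochs=300):
    """Fit plain logistic regression (LR) by batch gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        errs = [propensity(x, w, b) - ti for x, ti in zip(X, t)]
        w = [wj - lr * sum(e * x[j] for e, x in zip(errs, X)) / n
             for j, wj in enumerate(w)]
        b -= lr * sum(errs) / n
    return w, b


def auc(scores, labels):
    """Share of (treated, untreated) pairs in which the treated user scores higher."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def match_one_to_one(treated, pool, caliper=0.05):
    """Greedy 1:1 nearest-score matching without replacement.

    treated, pool: dicts mapping user_id -> propensity score.
    Returns {treated_id: control_id}; a treated user with no remaining
    control within the caliper is dropped.
    """
    available = dict(pool)
    pairs = {}
    for uid, score in treated.items():
        if not available:
            break
        best = min(available, key=lambda c: abs(available[c] - score))
        if abs(available[best] - score) <= caliper:
            pairs[uid] = best
            del available[best]
    return pairs


# Synthetic data, for illustration only: [activity level, subsidy sensitivity]
rng = random.Random(0)
X = [[rng.random(), rng.random()] for _ in range(400)]
# More active, more subsidy-sensitive users are more likely to participate
t = [1 if rng.random() < propensity(x, [3.0, 2.0], -3.5) else 0 for x in X]

w, b = fit_logistic(X, t)
scores = [propensity(x, w, b) for x in X]
treated = {i: s for i, s in enumerate(scores) if t[i] == 1}
pool = {i: s for i, s in enumerate(scores) if t[i] == 0}
pairs = match_one_to_one(treated, pool)
print(f"AUC={auc(scores, t):.2f}, matched {len(pairs)} of {len(treated)} treated users")
```

Greedy matching depends on the order in which experimental-group users are processed, so a different order can produce slightly different pairs. Experimental-group users who find no control within the threshold are dropped, which means the matched experimental group can be smaller than the original one.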

3) Effect calculation

Once PSM has constructed a control group whose user characteristics are similar to those of the experimental group, the effect is calculated in much the same way as for an AB experiment.
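As a rough illustration of an AB-style comparison on the matched groups, the sketch below computes the difference in per-user means together with a Welch t-statistic. The GMV values are hypothetical and far fewer than a real analysis would use, and `uplift` is a helper name introduced here for illustration.

```python
import math
import statistics


def uplift(treated_values, control_values):
    """Return (difference in per-user means, Welch t-statistic)."""
    diff = statistics.mean(treated_values) - statistics.mean(control_values)
    se = math.sqrt(
        statistics.variance(treated_values) / len(treated_values)
        + statistics.variance(control_values) / len(control_values)
    )
    return diff, diff / se


# Hypothetical per-user GMV (yuan) for a handful of matched users
gmv_treated = [60.0, 45.0, 52.0, 48.0, 55.0]
gmv_control = [30.0, 35.0, 28.0, 33.0, 24.0]

diff, t_stat = uplift(gmv_treated, gmv_control)
print(f"uplift per user = {diff:.1f} yuan, t = {t_stat:.2f}")
```

A large absolute t-statistic suggests the difference is unlikely to be noise, but it says nothing about confounders that the propensity model did not capture.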

03 PSM Method Practice

Taking Scenario 1 above as a case, we analyze the effectiveness of a marketing activity that was run without an AB experiment.

Determine the scope of the sample set

Experimental group: users (A1) who clicked into the promotion page and received the red envelope. Export their user_id details as the experimental group input for PSM; assume there are 10,000 such users.
Control group range: users (B1) who were exposed to the promotion page but did not receive the red envelope. Export their user_id details as the control group selection range input for PSM; assume there are 50,000 such users.
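The two inputs can be derived from an event log with simple set operations. The sketch below assumes a hypothetical log of `(user_id, event)` tuples with the event names `"exposed"` and `"claimed"`; a real log will have its own schema.

```python
# Hypothetical event log: (user_id, event)
events = [
    (101, "exposed"), (101, "claimed"),
    (102, "exposed"),
    (103, "exposed"), (103, "claimed"),
    (104, "exposed"),
]

exposed = {user_id for user_id, event in events if event == "exposed"}
claimed = {user_id for user_id, event in events if event == "claimed"}

experimental_group = exposed & claimed  # A1: saw the page and received the red envelope
control_range = exposed - claimed       # B1: saw the page but did not receive it

print(sorted(experimental_group), sorted(control_range))
```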

PSM Modeling

From the control group range B1, PSM modeling constructs a group B2 whose user characteristics are similar to those of A1. Each experimental-group user is matched with one control-group user with similar characteristics, so B2 also contains 10,000 users. The model's AUC is 0.89, and the feature matching values are good.

Result calculation

Reliability: the AUC of 0.89 exceeds 0.85, so the model performs well and the matching results can be used as a reference.
Calculation of activity subsidy efficiency: the experimental group A1 has 10,000 users, contributes a total GMV of 500,000 yuan, and has a total investment cost of 50,000 yuan. The control group B2 constructed by PSM has 10,000 users, contributes a total GMV of 300,000 yuan, and has a total investment cost of 25,000 yuan. The input-output ratio of the promotion is ΔGMV/Δcost = (500,000-300,000)/(50,000-25,000) = 8, that is, each additional yuan of investment brings 8 yuan of additional GMV.
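The same arithmetic as a small function, using the figures from the case above. The function name `incremental_roi` is introduced here for illustration. Comparing totals is only valid because both groups contain 10,000 users; with unequal group sizes, per-user averages would be needed instead.

```python
def incremental_roi(gmv_treated, cost_treated, gmv_control, cost_control):
    """Incremental GMV per incremental yuan of cost (ΔGMV / Δcost)."""
    delta_cost = cost_treated - cost_control
    if delta_cost == 0:
        raise ValueError("no incremental cost, so the ratio is undefined")
    return (gmv_treated - gmv_control) / delta_cost


roi = incremental_roi(
    gmv_treated=500_000, cost_treated=50_000,   # experimental group A1
    gmv_control=300_000, cost_control=25_000,   # matched control group B2
)
print(roi)  # 8.0
```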

04 Postscript

It is recommended to run AB experiments whenever possible and to consider PSM only when an AB experiment is truly impossible. PSM can also be combined with DID and user segmentation to improve accuracy.
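Assuming DID here means difference-in-differences, the combination can be sketched as follows: after matching, compare the change in the experimental group with the change in the matched control group over the same period, which removes any gap between the groups that already existed before the activity. The figures are hypothetical.

```python
def did(treated_pre, treated_post, control_pre, control_post):
    """Difference-in-differences of per-user averages."""
    return (treated_post - treated_pre) - (control_post - control_pre)


# Hypothetical average GMV per user (yuan) before and during the activity
effect = did(treated_pre=40.0, treated_post=50.0,
             control_pre=38.0, control_post=41.0)
print(effect)  # 7.0
```

Running the same calculation separately for each user segment (for example, new versus returning users) shows whether the effect is concentrated in particular groups.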

Author: A data person’s private land

Source: A data person’s private land
