APP new users and retention analysis!

01 New User

If you have read the first article in our series, "Data Literacy (1): What are the DAU and MAU we often talk about?", you should know this sentence:

A solid understanding of data basics is the bridge of communication!

This sentence runs through our data literacy series. Let's look at a scenario involving new users to see once more why it matters.

To promote the app, our operations colleagues sought cooperation with channel partners. When it came to settlement, both sides agreed to settle on the number of new users, but they disagreed over what exactly counts as a new user:

  • Channel partners: As long as a user clicks the product download button on our channel's promotion page, that is recorded as a new user.
  • Operations Meow: That won't work. A click without a download doesn't mean much. We should record one new user per successful download, and count multiple clicks as one.
  • Product Dog: Wrong, wrong. Our app is so awesome that users should at least launch it once and experience it before we count them. Otherwise the data quality will be low and it will have no reference value.
  • Engineering Lion: Stop it, you are all just fantasizing. Without registration we have no data in the backend. A user must be registered to count as a new user.

Faced with such a scenario, it is hard to say who is right or wrong. What matters more is how to reach a consensus on the definition of the metric!

Let's break down the term "newly added user" the way we would explain the meaning of a word: newly added = new + added. That leaves two questions to clarify:

Q1: What counts as "added"? At which node is a user counted as added?

A1: Generally speaking, before a user forms a relationship with a product, they go through the path shown in the following figure:

  1. The user reaches a channel page (such as a Baidu advertising page or a Penguin advertising page) through one of several channel links.
  2. The user clicks download on the channel page, or goes from the channel page to an app store to download.
  3. The user installs and launches the app and arrives at the app homepage.
  4. The user triggers the activation action (this differs by business: successful registration, purchasing goods, watching a video, and so on).

In theory, any of these nodes can serve as the point at which a user is counted as added. Here, I will summarize the advantages and disadvantages of each node as the definition of "added", as well as the scenarios each one suits.

You can choose the node that suits you based on the summary in the table and your company's business.
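To make the effect of the node choice concrete, here is a minimal sketch that counts new users from the same event log under each candidate node. The node names, the event format, and the sample data are all assumptions made for illustration; `user_id` stands for whatever identifier is available at that node (for example a device ID before registration).

```python
# Funnel nodes in path order. These names are illustrative only.
NODES = ["click_download", "download_success", "first_launch", "activation"]


def count_new_users(events, node):
    """Count distinct users who reached `node` at least once.

    `events` is an iterable of (user_id, node_name) pairs. Repeated events
    from the same user are counted once, so multiple clicks count as one.
    """
    return len({user_id for user_id, name in events if name == node})


# Hypothetical event log: four users who drop off at different nodes.
events = [
    ("u1", "click_download"), ("u1", "click_download"),
    ("u1", "download_success"), ("u1", "first_launch"), ("u1", "activation"),
    ("u2", "click_download"), ("u2", "download_success"), ("u2", "first_launch"),
    ("u3", "click_download"), ("u3", "download_success"),
    ("u4", "click_download"),
]

new_users_by_node = {node: count_new_users(events, node) for node in NODES}
for node, count in new_users_by_node.items():
    print(f"{node}: {count}")
```

The same traffic yields four different "new user" figures, which is why the definition has to be agreed before settlement.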

Q2: How do we determine whether a user is "new"?

A2: Let's approach this question through an example. Suppose we use the install-and-launch node as the point of addition. A user downloads an app, installs and launches it, uninstalls it two days later, and then reinstalls and launches it again. Does this user count as a new user the second time? There are generally two ways to judge:

  1. Device-based: when a user first installs and launches the app, the device is recorded. If the app is installed again on the same device, no new record is created. For details on how devices are identified on different systems (iOS, Android, web), please see the introduction to the user part in the previous article "I'm no longer afraid of people asking me about DAU and MAU".
  2. Account-based: use the account as the basis for judgment and compare it with the accounts already in the backend to see whether this account has existed before.
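The two judgment methods can be sketched as follows. This is only an illustration under simple assumptions: identifiers are plain strings, the history of seen devices and accounts lives in memory, and the class and method names are hypothetical.

```python
class NewUserTracker:
    """Decide whether an install-and-launch event counts as a new user."""

    def __init__(self):
        self.seen_devices = set()   # devices that have launched the app before
        self.seen_accounts = set()  # accounts already known to the backend

    def is_new_by_device(self, device_id):
        """Return True only the first time this device is seen."""
        if device_id in self.seen_devices:
            return False
        self.seen_devices.add(device_id)
        return True

    def is_new_by_account(self, account_id):
        """Return True only the first time this account is seen."""
        if account_id in self.seen_accounts:
            return False
        self.seen_accounts.add(account_id)
        return True


tracker = NewUserTracker()

# First install and launch: new under both methods.
first_device = tracker.is_new_by_device("device-1")
first_account = tracker.is_new_by_account("account-1")

# Uninstall and reinstall on the same device: not new by device.
reinstall_device = tracker.is_new_by_device("device-1")

# Same account logging in on a second phone: new by device, not new by account.
second_phone_device = tracker.is_new_by_device("device-2")
second_phone_account = tracker.is_new_by_account("account-1")
```

The last case shows where the two methods disagree: a returning user on a new phone is a new device but an existing account.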

02 Retention

In this article, "retention" always means the retention of new users; this will not be repeated below.

First, let’s take a look at how the Umeng platform defines retention.

Mr. Song will introduce a case to help everyone understand the definition.

The case is again our ill-fated app. It gained 100 new users on its launch day and none afterwards. The following table shows its daily active users for the first seven days after launch:

We can conclude from the table that MAU=100. If you have any questions about this, please refer to the first article in the Data Literacy Series, "No More Worrying About People Asking Me About DAU and MAU~ Data Literacy Series (1)".

Question: How do we calculate the seven-day retention rate of new users?

Two algorithms are given here.

Algorithm 1: (number of retained users on the seventh day / number of new users on the first day) * 100%

Algorithm 2: (number of retained users after deduplication from the second day to the seventh day / number of new users on the first day) * 100%
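As a rough sketch, the two algorithms can be written over sets of user IDs. The function names, the data layout (a dict from day number to the set of users active that day), and the five sample users are assumptions made for illustration.

```python
def algorithm_1(new_day1, active_by_day):
    """Percentage of day-1 new users who are active on day 7."""
    retained = new_day1 & active_by_day[7]
    return 100 * len(retained) / len(new_day1)


def algorithm_2(new_day1, active_by_day):
    """Percentage of day-1 new users active on at least one of days 2 to 7."""
    returned = set()
    for day in range(2, 8):
        returned |= active_by_day[day]  # union removes duplicates across days
    return 100 * len(new_day1 & returned) / len(new_day1)


# Hypothetical data: five new users on day 1, then who was active each day.
new_day1 = {"a", "b", "c", "d", "e"}
active_by_day = {
    2: {"a", "b", "c"},
    3: {"a", "b"},
    4: {"a", "d"},
    5: {"a"},
    6: {"a"},
    7: {"a", "b"},
}

rate_1 = algorithm_1(new_day1, active_by_day)  # a and b: 2 of 5
rate_2 = algorithm_2(new_day1, active_by_day)  # a, b, c, d: 4 of 5
print(rate_1, rate_2)
```

User `e` never returns and is excluded by both algorithms, while `c` and `d` return only mid-week and are counted by Algorithm 2 alone.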

According to the definition of retention, "new users within a certain period of time who continue to use the application after a period of time are retained users." From this, we can conclude that retained users are a subset of the new users from that period.

In this question, 100 new users were added on the first day of launch and none afterwards, so the active users on every later day are a subset of the first day's new users. That is, the number of retained users on day X = the number of active users on day X, and the number of new users on the first day = the number of active users on the first day.

However, without the premise of "100 new users on the first day and none afterwards", the number of retained users on day X ≠ the number of active users on day X. The accurate statement is: the number of retained users on day X = the number of first-day new users who are active on day X.

This is a little confusing, so here is a small example to help.

(Suppose there are 200 new users in May. If 100 of these 200 users launched the app in June and 80 of the 200 launched it in July, then the number of retained users is 100 in June and 80 in July.)
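The May example can be checked with a short sketch. The 200, 100, and 80 figures come from the example above; the 50 older users who are also active in June and July are an assumption added only to show why retained users and active users are not the same number.

```python
# 200 users who were new in May (figure taken from the example above).
may_new = {f"new_{i}" for i in range(200)}

# Hypothetical: 50 users who joined before May and are still active.
older_users = {f"old_{i}" for i in range(50)}

# Everyone who launched the app in each month, regardless of when they joined.
june_active = {f"new_{i}" for i in range(100)} | older_users
july_active = {f"new_{i}" for i in range(80)} | older_users

# Retained users are the active users who belong to the May cohort.
june_retained = len(june_active & may_new)
july_retained = len(july_active & may_new)

print(len(june_active), june_retained)  # active users vs. retained users
print(len(july_active), july_retained)
```

Intersecting with the cohort is what keeps the older users out of the retention figure.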

So which algorithm should be used?

If you remember, Mr. Song has repeatedly emphasized that data analysis must be grounded in the business and serve a purpose (here, that means asking what the retention metric is for).

Generally speaking, retention calculations and analyses serve the following purposes:

  1. Observe the quality of users brought by different channels;
  2. Measure the effect of new features launched in a version update. (This involves the triggering of key user behaviors. It is a matter of precise retention, which we will explain in later articles.)

Here we will use channel quality as the example:

Suppose an app has two customer acquisition channels, A and B, and both launch on January 1. Each channel brings in 100 new users that day and none afterwards. Given the daily active users of the two channels from January 1 to January 7, use Algorithm 1 [(number of retained users on the seventh day / number of new users on the first day) * 100%] to calculate the seven-day retention rate of each channel.

Some of you may think that Algorithm 1 ignores the user data from the 2nd to the 6th, so its result is inaccurate. In fact, this is not the case. We compute the two figures in order to compare them and find business growth points from the comparison. For both channel A and channel B we use only the data from the first and seventh days and ignore the data from the 2nd to the 6th, so the information ignored is the same on both sides.

Because both channels are measured in exactly the same way, using Algorithm 1 for the comparison is relatively fair and reasonable.

Of course, even so, some friends may still ask, is there any way not to ignore the data from the 2nd to the 6th?

[(The number of retained users after deduplication from the second day to the seventh day / the number of new users on the first day) * 100%] This calculation method does take the active users between the 2nd and the 6th into account, but is it suitable for evaluating channel quality?

The following figure is a line graph of the daily active users of channels A and B over the seven days. If we strictly follow Algorithm 2, we will find that the retention rate of channel A is higher than that of channel B. Yet the figure shows that channel B's active user curve is closer to a natural, gentle decline, and that its active users on the seventh day are also higher than channel A's. Generally speaking, the user quality of channel B is higher than that of channel A.

Therefore, Algorithm 2 is not suitable for evaluating channel quality through retention. The reason is that bringing in the data from the second through sixth days distorts the judgment.
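The reversal can be reproduced with made-up numbers. The daily counts below are not the data behind the original figure; they are hypothetical values chosen so that channel A drops off sharply after day 2 while channel B declines gently. The sketch also assumes that each day's active users are a subset of the previous day's, which keeps the sets easy to build.

```python
def retention_rates(cohort, active_by_day):
    """Return (Algorithm 1, Algorithm 2) retention percentages for a cohort."""
    day7 = cohort & active_by_day[7]
    days_2_to_7 = set().union(*(active_by_day[d] for d in range(2, 8))) & cohort
    return 100 * len(day7) / len(cohort), 100 * len(days_2_to_7) / len(cohort)


def nested_activity(daily_counts):
    """Build active sets where the users active on a day are users 0..count-1."""
    return {day: set(range(count)) for day, count in daily_counts.items()}


cohort = set(range(100))  # 100 new users on day 1 for each channel

# Hypothetical daily active users for days 2 to 7.
channel_a = nested_activity({2: 80, 3: 30, 4: 20, 5: 15, 6: 12, 7: 10})
channel_b = nested_activity({2: 60, 3: 55, 4: 50, 5: 46, 6: 43, 7: 40})

a_algo1, a_algo2 = retention_rates(cohort, channel_a)
b_algo1, b_algo2 = retention_rates(cohort, channel_b)

print("Algorithm 1:", a_algo1, b_algo1)  # B ranks above A
print("Algorithm 2:", a_algo2, b_algo2)  # A ranks above B
```

With these numbers, one large burst of returning users on day 2 is enough for channel A to win under Algorithm 2, even though far fewer of its users are still around on day 7.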

Through the above cases, everyone should understand the difference between the two.

We can call Algorithm 1 the seven-day daily retention (retention on the seventh day), and Algorithm 2 the retention within seven days.

Of course, Algorithm 2 is not without applicable scenarios. It is better suited to apps with a specific usage cycle. For example, suppose an app focuses on weekend parties, so most of its active users are concentrated on Saturdays and Sundays. If we calculate the seven-day retention rate of users who joined on any weekday (Monday to Friday), we will find that it is significantly lower than that of users who joined at the weekend.

In this case, looking only at the daily retention on the seventh day obviously cannot reflect the real situation. Paying attention to the retention within seven days is more realistic and reliable.

Now, here is a seven-day retention table from the Umeng data platform. Try to work out whether Umeng uses Algorithm 1 or Algorithm 2.

Some of you may be confused, and some may intuitively think that Umeng uses Algorithm 1. In fact, the Umeng platform's calculation is very similar to Algorithm 1, but with one difference. Let's call it Algorithm 3 for now.

Algorithm 3: (number of retained users on the seventh day / number of new users on day 0) * 100%

Day 0 is the day whose new users we are tracking, which is the same day as the first day in Algorithm 1. As shown in the figure above, if we calculate the seven-day daily retention for 2018-08-01, the first day in Algorithm 1 and day 0 in Algorithm 3 both refer to the new users on 08-01, which is 339 people. If you look at the figure carefully, you will find that Umeng labels its columns "1 day later", "2 days later", and so on, which correspond to the second day, the third day, and so on in Algorithm 1.
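Here is a minimal sketch of a day-0-based calculation in the spirit of Algorithm 3. It is not Umeng's actual implementation; the function name, the data layout, and the four sample users are assumptions made for illustration.

```python
from datetime import date, timedelta


def retention_n_days_later(cohort_date, n, new_by_date, active_by_date):
    """Percentage of users new on `cohort_date` (day 0) who are active n days later."""
    cohort = new_by_date[cohort_date]
    target_date = cohort_date + timedelta(days=n)
    retained = cohort & active_by_date.get(target_date, set())
    return 100 * len(retained) / len(cohort)


# Hypothetical data: four new users on 2018-08-01.
day0 = date(2018, 8, 1)
new_by_date = {day0: {"u1", "u2", "u3", "u4"}}
active_by_date = {
    date(2018, 8, 2): {"u1", "u2", "u9"},  # u9 is not part of the cohort
    date(2018, 8, 8): {"u1", "u9"},
}

one_day_later = retention_n_days_later(day0, 1, new_by_date, active_by_date)
seven_days_later = retention_n_days_later(day0, 7, new_by_date, active_by_date)
print(one_day_later, seven_days_later)
```

Because the offset is counted from day 0, "1 day later" lands on 08-02, which is the second day in Algorithm 1's numbering.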

So why does Umeng use Algorithm 3? What are the benefits of this calculation method? I hope everyone will think about it.

(Here’s a hint: it’s related to the seven days of the week).

The answer: by using Algorithm 3 we can avoid the interference of the day of the week on the data.

For example, 2018-08-01 is a Wednesday. The seventh day under Algorithm 1 is Tuesday 08-07, while under Algorithm 3 it is Wednesday 08-08, seven days later. By comparing Wednesday with Wednesday, we avoid day-of-week effects in the data.
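The date arithmetic is easy to check with Python's standard `datetime` module. The variable names are illustrative; the only assumption is the indexing convention described above (Algorithm 1 counts the cohort day as day 1, Algorithm 3 counts it as day 0).

```python
from datetime import date, timedelta

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

cohort_day = date(2018, 8, 1)

# Algorithm 1: the cohort day is day 1, so day 7 is 6 days later.
algo1_day7 = cohort_day + timedelta(days=6)

# Algorithm 3: the cohort day is day 0, so day 7 is 7 days later.
algo3_day7 = cohort_day + timedelta(days=7)

for label, day in [("cohort day", cohort_day),
                   ("Algorithm 1, day 7", algo1_day7),
                   ("Algorithm 3, day 7", algo3_day7)]:
    # date.weekday() returns 0 for Monday through 6 for Sunday.
    print(label, day.isoformat(), WEEKDAYS[day.weekday()])
```

Only the Algorithm 3 date falls on the same weekday as the cohort day, which is what removes the day-of-week effect from the comparison.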

So we have covered three algorithms in total. Each has its own significance. Choose one according to your company's business, and make sure the same standard is used throughout the company.

Here, Mr. Song has made a table to summarize them, and you can save the picture for future use.

With that, we have more or less finished talking about new users and retention. Hopefully everything now feels much clearer.

Author: Song Laoshi

Source: Product Manager Tucao
