App Promotion, How to Identify Channel Fraud—Data Analysis

Before the holiday I wrote an article on evaluating the effectiveness of channel delivery through statistical indicators. Today I want to talk about how to identify channel cheating, how to analyze channel effectiveness, and how to counter cheating. Corrections and criticism are welcome.

Some operators buy delivery on every channel they can find. The click-through rate is very high, but the activation rate is only in the single digits. Or clicks and activations are both high, but the retention rate is low. All the money is spent with no effect, and their own data analysis leads to no conclusion.

The prerequisite for data analysis is reliable data. If the data is inaccurate, any conclusion drawn from it is meaningless.

To obtain accurate data, we first need to choose a reliable statistical analysis platform. For platform selection, please refer to my previous article. Even on a reliable platform, the data can still be unreliable. As the saying goes, where there are rankings, there is ranking manipulation, and where there are statistics platforms, there are data-cheating workshops.

In the mobile Internet ecosystem, there are many obscure volume-inflating studios that supply user data at very low prices, with quality to match.

The SDKs of early statistical analysis platforms sent plaintext JSON data packets. Studios could easily forge these packets with a program to simulate user data such as new users, active users, retained users, and usage duration. As the platforms developed, many of them launched SDKs based on binary protocols, and developers can also turn on encryption themselves. These advances have improved the security and data accuracy of the statistics platforms. Once an APP upgrades to an SDK version that uses a secure protocol, it becomes difficult for a volume-inflating studio to fake volume by directly simulating data packets.

As the saying goes, the devil is always one step ahead of the saint. The platforms have their methods, and the volume-inflating studios have theirs. They may inflate volume manually in a distributed way (similar in form to a task-based points wall), or, more cleverly, write scripts that modify the parameters of real devices and drive real devices to operate (interested readers can look up igrimace, an iOS volume-inflating tool). This behavior is almost indistinguishable from that of real users, and it is hard to separate the data by technical means alone.

Even so, experienced operators can still tell real users from fake ones through certain data indicators.

Channel effect evaluation: retention rate

Sometimes a channel will import user data at important time points such as the next day, the 7th day, and the 30th day. We then see that the APP's retention at these key time points is significantly higher than at other time points. The retention curve of real users is a smooth, exponentially decaying curve. If your retention curve fluctuates abnormally, with sharp rises and falls, it basically means the channel has interfered with the data. Such users are of very poor quality and have no commercial value.
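As a rough illustration of this check, the sketch below flags days on which retention rises compared with the previous day, which a smoothly decaying curve should not do. The function name `retention_spikes`, the `tolerance` value, and the sample curve are all made up for the example, not taken from any statistics platform.

```python
def retention_spikes(curve, tolerance=0.02):
    """Return the days whose retention exceeds the previous day's by more than tolerance.

    curve maps day number n to the n-day retention rate (a value between 0 and 1).
    """
    days = sorted(curve)
    return [
        day
        for prev, day in zip(days, days[1:])
        if curve[day] - curve[prev] > tolerance
    ]


# Hypothetical curve with a suspicious bump on day 7.
curve = {1: 0.40, 2: 0.31, 3: 0.26, 4: 0.22, 5: 0.20, 6: 0.18, 7: 0.30, 8: 0.16}
print(retention_spikes(curve))  # prints [7]
```

The tolerance leaves room for ordinary day-to-day noise; what counts as a real spike depends on the size of the cohort.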

The retention curve can not only help us judge the quality of the channel, but also provide a lot of reference suggestions on operation promotion and product optimization. Retention rate is so important, so how is it calculated?

The proportion of new users on a certain day who return n days later is the n-day retention rate for that day. For example, if we acquire 1,000 new users on February 1, 400 of these users return on February 2, and 200 return on February 8, then the next-day retention rate of the new users on February 1 is 40%, and the 7-day retention rate is 20%.
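The calculation can be sketched in a few lines of Python. The data layout (sets of user IDs keyed by date), the function name `n_day_retention`, and the year 2015 are assumptions made only for this example; the numbers mirror the February 1 example above.

```python
from datetime import date, timedelta


def n_day_retention(new_users, active_by_day, cohort_day, n):
    """Share of users first seen on cohort_day who are active again n days later."""
    cohort = new_users.get(cohort_day, set())
    if not cohort:
        return 0.0
    returned = cohort & active_by_day.get(cohort_day + timedelta(days=n), set())
    return len(returned) / len(cohort)


# 1,000 new users on February 1 (the year is arbitrary).
feb1 = date(2015, 2, 1)
new_users = {feb1: {f"u{i}" for i in range(1000)}}
active_by_day = {
    date(2015, 2, 2): {f"u{i}" for i in range(400)},  # 400 return the next day
    date(2015, 2, 8): {f"u{i}" for i in range(200)},  # 200 return 7 days later
}

print(n_day_retention(new_users, active_by_day, feb1, 1))  # prints 0.4
print(n_day_retention(new_users, active_by_day, feb1, 7))  # prints 0.2
```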

Retention rate is a common indicator in the industry for judging user quality. In the mobile Internet industry, if an APP’s next-day retention rate reaches 40%, a 7-day retention rate reaches 20%, and a 30-day retention rate reaches 10%, the retention rate of this APP is higher than the industry standard. Generally speaking, the retention rate of tool apps is higher than that of game apps, and the retention rate of high-frequency apps is higher than that of low-frequency apps. In addition to the application type, retention rate is also related to factors such as the APP's user experience and promotion methods.

User Terminal

Each channel has its own user base, and their user terminals will be different. For example, the top 10 models of users of Xiaomi App Store may all be Xiaomi phones, while the majority of users of Mobile MM may be users of mobile operators. Excluding those app stores with special channels, the user terminals of most channels are similar to the distribution of the entire mobile Internet terminals. We can understand these data by looking at mobile Internet data reports or data index products, and use these data as benchmarks to compare and analyze APP data.

We can focus on device properties such as device model, operating system, network type, operator, and geographic location. I have listed some tips below; discussion and corrections are welcome.

Method 1: Focus on the ranking of low-priced devices

You can focus on analyzing the new users in the channel or the device rankings of the activated users. If you find that a low-priced device is ranked unusually high, this situation is worth our special attention. These data can be found in the terminal attribute distribution of the statistical platform.

This is especially relevant on iOS. Since studios have no emulator to fall back on there, all user data has to be triggered by real devices. Many volume-inflating studios buy second-hand iPhone 5c units as their real devices. A friend who does channel promotion fell into this trap: 75% of the devices in one channel were iPhone 5c, more than the share of the top 5 iOS devices. The retention rate and other indicators of this channel were also unsatisfactory, and it finally turned out that the channel was using a large number of iPhone 5c units to inflate volume.
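One way to automate this comparison is sketched below: compute each model's share of a channel's devices and flag models whose share exceeds a benchmark share by a wide margin. The function name `overrepresented`, the `max_gap` threshold, and the benchmark shares are invented for illustration; real benchmark figures would come from an industry report or a data index product.

```python
from collections import Counter


def overrepresented(devices, benchmark, max_gap=0.20):
    """Return models whose channel share exceeds the benchmark share by more than max_gap."""
    counts = Counter(devices)
    total = sum(counts.values())
    return {
        model: round(count / total, 4)
        for model, count in counts.items()
        if count / total - benchmark.get(model, 0.0) > max_gap
    }


# Hypothetical channel in which 75% of the devices are iPhone 5c.
channel_devices = ["iPhone 5c"] * 75 + ["iPhone 6"] * 15 + ["iPhone 5s"] * 10
benchmark = {"iPhone 5c": 0.05, "iPhone 6": 0.30, "iPhone 5s": 0.25}  # made-up shares

print(overrepresented(channel_devices, benchmark))  # prints {'iPhone 5c': 0.75}
```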

Method 2: Pay attention to the proportion of new versions of operating systems

Through my years of work experience, I have found that many channel-boosting studios will have delays in adapting to the operating system version. Therefore, it is recommended that channel personnel compare the operating system distribution of channel users with that of all mobile Internet users when checking the operating system of channel users. If you find that there is no new version of the operating system (such as iOS 8.x) under a certain channel, one possibility is that the technology of the studio cooperating with this channel has not yet been adapted to the latest operating system.
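A minimal sketch of this check, assuming you have per-version user counts for the channel and benchmark shares for the wider market. The name `missing_versions`, the `min_share` cutoff, and all of the figures are hypothetical.

```python
def missing_versions(channel_counts, benchmark, min_share=0.10):
    """OS versions that are common in the benchmark but absent from the channel."""
    return sorted(
        version
        for version, share in benchmark.items()
        if share >= min_share and channel_counts.get(version, 0) == 0
    )


# Hypothetical channel with no users at all on the newest OS version.
channel_counts = {"iOS 7.1": 620, "iOS 7.0": 380}
benchmark = {"iOS 8.1": 0.45, "iOS 7.1": 0.35, "iOS 7.0": 0.15, "iOS 6.1": 0.05}

print(missing_versions(channel_counts, benchmark))  # prints ['iOS 8.1']
```

The cutoff keeps rare versions from being reported: a version with only a small share of the market can be missing from a small channel by chance.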

Method 3: Pay attention to Wi-Fi usage

Some friends have asked me: the proportion of our users on Wi-Fi has reached 90%. Is that normal?

To answer this question, we first need some understanding of the current situation. We now live in a high-speed network environment, and for both new users and active users the proportion on Wi-Fi is relatively large.

From the perspective of user behavior, if you watch the people around you, you will find that they tend to download apps over Wi-Fi (mobile data is expensive). When launching apps, by contrast, they are less sensitive to the current network. In other words, the Wi-Fi share of new users will be higher than the Wi-Fi share of users launching the app.

In addition, the Wi-Fi share also depends on the type of application. For an online video application, it may be above 90%.

If your APP consumes little data and comparing the Wi-Fi share of new users with that of active users turns up something odd, the channel may be playing tricks.

Method 4: Targeted delivery is also important

A friend who has been working in the industry for a long time shared with me an experience, saying that there is a lot of cheating in Fujian. When formulating the delivery strategy, we can focus on blocking areas with more cheating. This blacklist can also be customized based on the actual regional delivery effect of the APP.

In addition, we can also focus on certain areas when placing ads based on needs. For example, high-consumption areas such as Beijing, Shanghai and Guangzhou, and relatively blue ocean areas such as third- and fourth-tier cities. When reviewing the data, we need to verify whether the user is in line with our delivery strategy.

User Behavior

Method 1: Compare user behavior data

Once an APP has been running for a long time, behavioral data such as pages visited, usage duration, visit intervals, and usage frequency tends to stabilize. Different APPs have different behavioral data. A volume-inflating studio may be able to simulate seemingly real user behavior, but it is hard to make it fully consistent with your app's everyday data.

If the usage duration or usage frequency of a channel's users is too high or too low, it deserves suspicion. When we analyze channel data, we can compare these figures with those of the whole APP, or use data from large application stores such as Android Market and App Store as the benchmark.
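The comparison against a baseline might look like the sketch below, which reports every metric whose channel value deviates from the app-wide value by more than a relative tolerance. The metric names, the 50% tolerance, and the numbers are assumptions for illustration only.

```python
def off_baseline(channel_metrics, app_metrics, tolerance=0.5):
    """Metrics whose channel value differs from the baseline by more than tolerance (relative)."""
    flagged = {}
    for name, baseline in app_metrics.items():
        value = channel_metrics.get(name)
        if value is None or baseline == 0:
            continue  # nothing to compare against
        deviation = (value - baseline) / baseline
        if abs(deviation) > tolerance:
            flagged[name] = deviation
    return flagged


# Hypothetical figures: this channel's sessions are far shorter than the app-wide average.
app = {"avg_session_seconds": 180, "sessions_per_day": 4.0, "pages_per_session": 6.0}
channel = {"avg_session_seconds": 20, "sessions_per_day": 4.0, "pages_per_session": 5.0}

print(sorted(off_baseline(channel, app)))  # prints ['avg_session_seconds']
```

Because the deviation is signed, the same check catches values that are too high as well as too low.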

Method 2: Understand the hourly data curves of new users and active users

Many volume-inflating studios fake data by importing device data in batches or by launching at scheduled times. In this case, the new-user and launch curves show steep rises and falls. For real users, new-user and launch counts follow smooth curves. Generally speaking, both peak after 6pm, and the trend is more pronounced for new users than for launches.

We can compare hourly data across channels to find anomalies. Note that comparing behavioral data like this must follow the single-variable principle: apart from the channel, all other factors must be exactly the same. If we compare channel A's active users on Wednesday with channel B's on Saturday, the two figures will certainly differ, and they are not comparable.
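A crude way to spot a batch import in hourly data is to flag any hour that holds an outsized share of the day's total. The sketch below does this for a single channel on a single day, so that channels are compared over the same period. The function name `steep_hours`, the 30% cutoff, and both sample days are hypothetical.

```python
def steep_hours(hourly_counts, max_share=0.30):
    """Return the hours (0-23) that hold more than max_share of the day's total."""
    total = sum(hourly_counts)
    if total == 0:
        return []
    return [
        hour
        for hour, count in enumerate(hourly_counts)
        if count / total > max_share
    ]


# Hypothetical new-user counts per hour: a smooth evening peak versus a batch import at 03:00.
organic = [5, 3, 2, 2, 3, 6, 10, 15, 18, 20, 22, 25,
           28, 26, 24, 25, 28, 32, 40, 45, 42, 35, 20, 10]
batched = organic[:3] + [400] + organic[4:]

print(steep_hours(organic))  # prints []
print(steep_hours(batched))  # prints [3]
```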

Method 3: View the details of the page names visited by the user

Some studios put your App key into other high-frequency apps. The channel's user data then looks excellent, but on closer inspection many of the page names are not ones we defined ourselves. Comparing page names exposes this form of channel cheating.

If it is an Android app, this name is activity or fragment; if it is an iOS app, this name is a custom view. It doesn’t matter if you can’t remember this part. Remember to ask the developer for a list of specific page names, and compare it with the details of the pages visited by users in the statistics background to see the difference.
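Once you have the developer's list, the comparison itself is a simple set difference. In the sketch below, the page names, the view counts, and the function name `unknown_pages` are made up for illustration; the reported pages would come from the visited-page details in your statistics backend.

```python
def unknown_pages(reported_pages, defined_pages):
    """Reported page names that the app does not define, with their share of all views."""
    total = sum(reported_pages.values())
    if total == 0:
        return {}
    known = set(defined_pages)
    return {
        name: views / total
        for name, views in reported_pages.items()
        if name not in known
    }


# Hypothetical page list from the developer and page views reported for one channel.
defined = ["MainActivity", "DetailActivity", "SettingsActivity"]
reported = {"MainActivity": 500, "DetailActivity": 250, "VideoPlayerActivity": 250}

print(unknown_pages(reported, defined))  # prints {'VideoPlayerActivity': 0.25}
```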

Conversion rate analysis

The analysis of conversion rate data can not only help us deal with channel cheating, but also help us judge the user quality of different channels and improve delivery efficiency.

Every APP has its own target behavior. For example, the target behavior of e-commerce applications is the user's purchase of goods. Game apps need to examine in-app payments. Social applications focus on user-generated content. Operations personnel need to define and design the target behavior of the application.

A real user goes through the steps of clicking, downloading, activating, registering, and triggering the target behavior. We can build these steps into a funnel model and observe the conversion rate of each step. The further down the funnel a step is, the harder it is to fake, the more valuable the acquired users are, and the higher the cost per user we pay. During channel promotion, operations personnel should monitor target behaviors and examine their conversion rates, which raises the marginal cost of channel cheating.
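The funnel can be sketched as an ordered list of steps with user counts, from which the step-to-step conversion rates follow directly. The step names match the ones above, with "purchase" standing in for the target behavior; the counts are invented for the example.

```python
def funnel_rates(step_counts):
    """Step-to-step conversion rates for an ordered list of (step, user_count) pairs."""
    rates = {}
    for (prev_name, prev_count), (name, count) in zip(step_counts, step_counts[1:]):
        rates[f"{prev_name} -> {name}"] = count / prev_count if prev_count else 0.0
    return rates


# Hypothetical channel funnel.
funnel = [
    ("click", 10000),
    ("download", 4000),
    ("activate", 2000),
    ("register", 500),
    ("purchase", 50),
]

for step, rate in funnel_rates(funnel).items():
    print(f"{step}: {rate:.1%}")
# click -> download: 40.0%
# download -> activate: 50.0%
# activate -> register: 25.0%
# register -> purchase: 10.0%
```

Computing the same rates per channel makes it easy to see which channels deliver users who stop at activation and never reach the target behavior.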

Anti-cheat module

In addition to using ready-made statistical analysis tools, you can ask your R&D team to develop your own anti-cheat module. In principle the anti-cheat module is similar to anti-virus software. We define behavior patterns and add them to the module's blacklist library. If a newly added device matches a defined pattern, it is judged to be a cheating device. Each operator can define patterns to suit their own APP. Some common ones:

(1) Device number abnormality: frequent reset of IDFA

(2) IP anomaly: Frequent changes in geographic location

(3) Abnormal behavior: purchasing large quantities of discounted items, etc.

(4) Incomplete data package: only startup information is included, but no other user behavior information such as pages and events.
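A blacklist of this kind could be expressed as a list of named rules that are checked against each device, as in the sketch below. The `DeviceProfile` fields, the rule names, and every threshold are hypothetical; a real module would define them from the APP's own data.

```python
from dataclasses import dataclass


@dataclass
class DeviceProfile:
    idfa_resets: int        # times the IDFA changed during the observation window
    distinct_regions: int   # distinct geographic regions seen for the device's IPs
    discounted_orders: int  # orders made up of discounted items
    launches: int           # startup records received
    page_views: int         # page records received
    events: int             # custom event records received


# Each rule pairs a label with a predicate. Thresholds are illustrative only.
RULES = [
    ("idfa_reset", lambda d: d.idfa_resets >= 3),
    ("ip_hopping", lambda d: d.distinct_regions >= 4),
    ("bulk_discount", lambda d: d.discounted_orders >= 10),
    ("launch_only", lambda d: d.launches > 0 and d.page_views == 0 and d.events == 0),
]


def matched_rules(device):
    """Return the names of all blacklist rules the device matches."""
    return [name for name, rule in RULES if rule(device)]


suspect = DeviceProfile(idfa_resets=5, distinct_regions=1, discounted_orders=0,
                        launches=12, page_views=0, events=0)
print(matched_rules(suspect))  # prints ['idfa_reset', 'launch_only']
```

Keeping the rules in a plain list means operators can add or tune patterns without touching the matching logic.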

In conclusion:

As an operator, you need to be mentally prepared for long-term cooperation with channels. Making good use of data is the first step in a long journey. I hope every operator can use data to select appropriate channels and increase the return on channel investment.
