Data operation: customer portrait data analysis!

How can we use data to clearly profile existing users, find the core concerns of users in various industries, and conduct refined operations to increase user repurchases? How can we sort out the data clearly and compile indicators that can actually guide the business?

Business data is complex and numerous. How can we define core indicators of concern through massive data to guide user growth and conversion?

How can we use data to identify the core path users take through the product? How should we design the product’s onboarding guide to improve the user experience, lead more users to the core of the product, and turn them into users with “high conversion potential”?

When operating users, how can we use data to clearly profile existing users, find the core concerns of users in various industries, and conduct refined operations to increase user repurchases?

These are questions many operators ask when faced with massive amounts of data. We all know that data is powerful, and that cleaned data can point out a clear path forward. As the saying goes, an operator who can't read data is not a good product manager. I work in user growth operations: I analyze users qualitatively and quantitatively through data and user interviews, and then produce strategies to guide growth. Today I will talk about several hard-core skills for using data to improve operations and formulate operational strategies.

There are several stages in the process of data analysis, including tracking data, obtaining data, analyzing data, and producing feasible operational strategies. Each stage is difficult.

The following is a realistic scene of operations asking development for data:

Operations: "I want to see how users are using the new features we recently released. Can you give me some data?"
Development: "What data do you want to look at?"
Operations: "I just want to see who has viewed the feature, who has purchased it, what the purchasers have in common, and which of them are target users we can promote it to again."
Development: "What fields do you need?"
Operations: "What fields? Can you export which functions each customer has used, how they used them, whether they are deep users or lost users, and what industries they are in?"
Development: "Which functions they used is easy, but you need to define 'usage'. Is it the number of uses, the money earned through the function, or the duration of use?"
Operations: "Any of them is fine."
Development: "Can you work out what you are going to use it for first? 'Anything is fine'? Am I the operator, or are you?"

This situation is very common and understandable: operations looks at things from a business perspective, while development looks at them from a data perspective, and the data has no field that says whether a user is "active" in the sense operations means. At moments like this, I always wish I had a set of data that could clearly tell me what industry a user is in, which functions they use, what their business model is, and what state they are in!

This brings up a question: how can we clearly sort out the data and compile these indicators that can actually guide the business?

How to define user portraits through data?

1. Clearly define the types of indicators you want, such as user life cycle indicators, product usage behavior indicators, user purchase behavior indicators, user capability indicators, and user personal (natural person) attribute indicators.
2. Communicate with the data team as clearly as possible and obtain data that is as detailed as possible. Note that it is best not to have multidimensional data when extracting data. No multidimensional data!
3. Process the data and try not to overlook indicators that may affect key behaviors. Use models, or other "advanced" (just kidding) tools such as Excel, to analyze both macro data (user data for an entire industry or region) and micro data (data as detailed as one record per user).
4. Based on the analysis results, derive an applicable indicator system and apply the indicators to every user automatically.
5. The user portrait is now initially complete and can be optimized later.

Indicator definition: scenario-based definition facilitates the identification of indicators that need to be extracted

Before asking the data team or developers to extract data, first think about what kind of portrait you want to end up with. You can boldly start from an assumption, for example:

"I hope to see user A, a K12 institution in Beijing that came to us through Baidu Search. He has renewed the product with us for three years, but his operational capabilities are relatively weak. He has always used the same few functions and has not used any of our new ones. He mainly uses the live broadcast and exam functions. The number of users in his institution has remained stable at around 100,000, of which three are still under maintenance. Usage is frequent during student holidays and exam periods."

A description like this makes it clear which indicators are needed. I generally divide the data into two types and then refine the relevant indicators within each type.

Each type of data here can be further subdivided into detailed data indicators. For example, user basic data can be refined in this way, and other indicator types can also be refined in this way. Indicators can be selected based on product attributes and the content that needs to be understood.
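As a rough illustration, the hypothetical portrait of user A described earlier could be written down as a simple record before talking to development. This is only a sketch: the class name, every field name, and the `is_narrow_feature_user` rule are assumptions made for illustration, not fields from any real system.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class UserPortrait:
    """Hypothetical portrait record; all field names are illustrative."""
    user_id: str
    region: str                      # basic attribute
    industry: str                    # basic attribute
    acquisition_channel: str         # where the user came from
    years_renewed: int               # purchase / life cycle behavior
    features_used: List[str] = field(default_factory=list)
    uses_new_features: bool = False
    end_user_count: int = 0          # size of the institution's own user base
    peak_periods: List[str] = field(default_factory=list)

    def is_narrow_feature_user(self, max_features: int = 3) -> bool:
        """Assumed rule: few features in use and no new features tried."""
        return len(self.features_used) <= max_features and not self.uses_new_features


# The scenario from the text, expressed as data.
user_a = UserPortrait(
    user_id="A",
    region="Beijing",
    industry="K12",
    acquisition_channel="Baidu Search",
    years_renewed=3,
    features_used=["live broadcast", "exam"],
    uses_new_features=False,
    end_user_count=100_000,
    peak_periods=["student holidays", "exam periods"],
)

print(user_a.is_narrow_feature_user())
```

Writing the portrait down this way makes each sentence of the scenario map to a concrete field that development can be asked to extract.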

Data extraction - dimensionality reduction of multidimensional data

After clarifying the definition of the indicators, we will find that some indicators involve multiple dimensions and cannot be compared and analyzed directly.

For example, a user has successfully created several products of a certain type, and the sales volume and sales revenue of each product are different. How do we summarize that user's usage of the product function in a single figure? Here we need to reduce the dimensionality of the data, for instance by taking a weighted average, or by using the mode or median as a representative value, so that comparative evaluation no longer has to compare across multiple dimensions.
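A minimal sketch of this kind of reduction, using only the Python standard library. The per-product numbers and the key names in `summary` are made up for illustration; which summary (weighted average, median, or mode) fits best depends on the business question.

```python
from statistics import median, mode


def weighted_average(values, weights):
    """Weighted mean of `values`, using `weights` as the weights."""
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Hypothetical records for ONE user: (units_sold, revenue) per product created.
products = [(10, 200.0), (30, 300.0), (10, 500.0)]

units = [u for u, _ in products]
unit_prices = [r / u for u, r in products]

# Collapse many per-product rows into one row per user.
summary = {
    "avg_price_weighted_by_units": weighted_average(unit_prices, units),
    "median_units_per_product": median(units),
    "mode_units_per_product": mode(units),
}
print(summary)
```

After this step each user is described by a single row, so users can be compared with one another directly.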

Data Analysis - Discovering What the “Most Important Indicator” Is

A user record has many associated data fields. What is the core difference between a paying user and a non-paying user? What is the key to getting users to pay? What do users care about?

Answering this requires analysis to see which independent variables are related to the dependent variable (user payment). Here I recommend an algorithm: the CHAID decision tree. This type of decision tree is designed to find the core variables that affect the final result. In other words, among so many functions, user behaviors, and attributes, it shows which attributes and which behaviors make a user more likely to convert.

How is the decision tree algorithm calculated?

Suppose we want to understand what makes users pay. Then whether a user pays is the dependent variable under study, and it is also the value the decision tree must predict from the other variables.

We divide the entire data set into a training set and a validation set, typically 80% and 20%. The training set is used to let the model find characteristic factors in the data, and the validation set is used for verification and prediction, to determine whether the model and the selected characteristic variables are effective and how well the model fits.
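A split like this can be sketched in a few lines of standard-library Python. The function name, the `train_fraction` default of 0.8, and the toy `users` records are assumptions for illustration; the fraction can be set to whatever proportion you choose.

```python
import random


def train_validation_split(rows, train_fraction=0.8, seed=42):
    """Shuffle rows reproducibly, then cut them into (train, validation)."""
    if not 0 < train_fraction < 1:
        raise ValueError("train_fraction must be between 0 and 1")
    shuffled = list(rows)                   # copy, so the input is not modified
    random.Random(seed).shuffle(shuffled)   # fixed seed makes the split repeatable
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical user records with a yes/no payment flag.
users = [{"user_id": i, "paid": i % 4 == 0} for i in range(100)]

train, validation = train_validation_split(users)
print(len(train), len(validation))
```

Shuffling before cutting matters: if the export is sorted by sign-up date or by payment status, an unshuffled cut would give the two sets very different users.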

Take two values (categories) of an independent variable and run a chi-square test against the dependent variable. If the test shows that the two values do not differ significantly in their relationship to the dependent variable, merge them into one. Keep reducing the number of values in this way until all remaining values of the independent variable differ significantly.
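The merging step can be sketched as follows. This is a simplified illustration, not a full CHAID implementation: it assumes a yes/no dependent variable (paid or not), uses a plain Pearson chi-square test with one degree of freedom, an assumed significance level of 0.05, and skips the adjustments a real CHAID tool applies. The activity buckets and counts are made up.

```python
import math
from itertools import combinations


def chi_square_2x2(a_paid, a_unpaid, b_paid, b_unpaid):
    """Pearson chi-square statistic and p-value (1 degree of freedom)."""
    table = [[a_paid, a_unpaid], [b_paid, b_unpaid]]
    total = a_paid + a_unpaid + b_paid + b_unpaid
    row_sums = [sum(row) for row in table]
    col_sums = [a_paid + b_paid, a_unpaid + b_unpaid]
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_sums[i] * col_sums[j] / total
            if expected > 0:
                stat += (table[i][j] - expected) ** 2 / expected
    # Survival function of the chi-square distribution with df = 1.
    return stat, math.erfc(math.sqrt(stat / 2))


def merge_categories(counts, alpha=0.05):
    """counts maps category -> (paid, unpaid). Merge the most similar pair
    repeatedly until every remaining pair differs significantly."""
    groups = {(name,): tuple(c) for name, c in counts.items()}
    while len(groups) > 1:
        best_pair, best_p = None, -1.0
        for g1, g2 in combinations(groups, 2):
            _, p = chi_square_2x2(*groups[g1], *groups[g2])
            if p > best_p:
                best_pair, best_p = (g1, g2), p
        if best_p <= alpha:
            break  # all remaining groups are significantly different
        g1, g2 = best_pair
        combined = (groups[g1][0] + groups[g2][0], groups[g1][1] + groups[g2][1])
        del groups[g1], groups[g2]
        groups[g1 + g2] = combined
    return groups


# Hypothetical counts of (paid, unpaid) users per weekly-activity bucket.
weekly_active = {"0-1": (5, 95), "2-3": (8, 92), "4-6": (30, 70), "7+": (35, 65)}
merged = merge_categories(weekly_active)
print(merged)
```

With these made-up counts, the two low-activity buckets behave alike and the two high-activity buckets behave alike, so four values collapse into two.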

For example, suppose our data has 130 independent variables, and we do not know which of them are related to whether users pay. Is the number of times a user is active per week related to payment? Is trying a certain feature related to payment? In this case, we can use the decision tree's chi-square test to determine whether each independent variable is significantly related to the dependent variable.

Compare the independent variables to find the most significant one, and split the sample according to its final (merged) values, forming multiple branches of the tree. CHAID can generate two or more child nodes at each split.
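Choosing the split variable can be sketched like this. To keep the example short it assumes every independent variable is a yes/no flag, so each chi-square test has one degree of freedom and the p-values can be compared directly. The feature names and the synthetic rows are hypothetical.

```python
import math


def chi_square_p_value(rows, feature, target="paid"):
    """p-value of a chi-square test (df = 1) between two yes/no columns."""
    table = [[0, 0], [0, 0]]
    for row in rows:
        table[int(bool(row[feature]))][int(bool(row[target]))] += 1
    total = len(rows)
    row_sums = [sum(r) for r in table]
    col_sums = [table[0][j] + table[1][j] for j in range(2)]
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_sums[i] * col_sums[j] / total
            if expected > 0:
                stat += (table[i][j] - expected) ** 2 / expected
    return math.erfc(math.sqrt(stat / 2))


def most_significant_feature(rows, features, target="paid"):
    """Return the feature with the smallest p-value, plus all p-values."""
    p_values = {f: chi_square_p_value(rows, f, target) for f in features}
    return min(p_values, key=p_values.get), p_values


def split_on(rows, feature):
    """Split the sample into the two child nodes of a yes/no feature."""
    yes = [r for r in rows if r[feature]]
    no = [r for r in rows if not r[feature]]
    return yes, no


# Synthetic data: payment depends strongly on one flag and barely on the other.
rows = []
for i in range(200):
    used_live = i < 100
    rows.append({
        "created_live_broadcast": used_live,
        "changed_avatar": i % 2 == 0,
        "paid": (i % 5 != 0) if used_live else (i % 10 == 0),
    })

best, p_values = most_significant_feature(rows, ["created_live_broadcast", "changed_avatar"])
yes_node, no_node = split_on(rows, best)
print(best, len(yes_node), len(no_node))
```

The same selection is then repeated inside each child node, which is how the tree grows level by level.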

Finally, the tree displays all the decision points related to whether users pay. For example, if a user creates more than three live broadcasts, the probability of payment is as high as 80%. The decision tree helps us eliminate independent variables that are irrelevant or only weakly correlated, and tells us what leads users to convert and pay.
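Once a decision point like this is known, it can be applied back to every user as a tag. The sketch below uses made-up data chosen so that the rates mirror the 80% example; the threshold, the segment names, and the field names are assumptions for illustration.

```python
def payment_rate(users):
    """Share of users in the list who paid (0.0 for an empty list)."""
    return sum(1 for u in users if u["paid"]) / len(users) if users else 0.0


def apply_rule(users, threshold=3):
    """Tag each user by an assumed tree rule: more than `threshold`
    live broadcasts created. Returns new records; the input is untouched."""
    tagged = []
    for u in users:
        segment = "high_potential" if u["live_broadcasts_created"] > threshold else "needs_guidance"
        tagged.append({**u, "segment": segment})
    return tagged


# Hypothetical users: (live broadcasts created, paid or not).
raw = [(5, True), (4, True), (7, True), (6, True), (9, False),
       (0, False), (1, False), (2, True), (3, False), (1, False)]
users = [{"user_id": i, "live_broadcasts_created": n, "paid": p}
         for i, (n, p) in enumerate(raw)]

tagged = apply_rule(users)
high = [u for u in tagged if u["segment"] == "high_potential"]
low = [u for u in tagged if u["segment"] == "needs_guidance"]
print(payment_rate(high), payment_rate(low))
```

Users in the lower segment are the natural targets for onboarding guidance that leads them toward the behavior associated with payment.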
