In addition to PV and UV, product operations also need to pay attention to these indicators!

Over the past six or seven years of building companies in the enterprise-services space, I have used many analytics tools: GA, Mixpanel, Heap, and so on. They are powerful, but something always felt missing. Overview metrics like PV and UV show us roughly what users do, yet they cannot guide us to do better. Beyond that rough picture of what users do, we also need to see how they do it and understand why. That requires real-time, comprehensive user behavior data. By analyzing the full flow of user behavior, we can find the key conversion nodes and the core reasons for user churn, prescribe the right remedy, identify actionable metrics, and turn them into concrete optimizations.

Today I would like to share some of our explorations and solutions in this area.

1. The Huge Demand for User Behavior Analysis

From the perspective of data composition, a complete closed-loop data source has three major parts: user behavior data, server log data, and transaction data. Transaction data is usually stored in offline databases and extracted for analysis through ETL; behavior data and log data, by contrast, overlap heavily. Complete user behavior data covers most of what server logs contain and also carries a great deal of information that the logs lack.

From the perspective of technology, the front end has arguably evolved faster than anything else in recent years, with new tools appearing every month. The overall trend is toward single-page applications and better user experience. Mobile applications, meanwhile, generate large volumes of behavioral data with little interaction with the server. As application providers, we therefore need to know how the people in front of the screen actually use our products and to understand the value behind their behavior.

GrowingIO is currently used by nearly a thousand customers. The analysis needs they raise with us fall roughly into three scenarios.

The first scenario: "I ran a campaign and published an article — how effective was it, and did it bring me enough traffic?" This is the measurement of marketing effectiveness. Some of our clients spend millions of dollars a year on SEM without any idea of the return they get on that money.

The second scenario is whether the user activation process is reasonable. After working hard to bring in traffic, how much of it converted into users? How many users completed each step of the registration flow, how many dropped off, and where did the ones who never converted go? On that basis, how should we optimize, what effect did the optimization have, did this week's conversion rate improve over last week's, and what caused the difference?

The third scenario is whether registered users stay and become loyal, or even paying, users. The users who stay, stay for a reason. Is there a "magic number" that strongly predicts retention? For example: LinkedIn found that users who added 5 social connections in their first week retained well; Facebook found the same for users who added 10 friends in their first week; Twitter, for users who gained 30 followers in their first week; and Dropbox, for users who installed the client on more than two operating systems in their first week. These are the magic numbers uncovered by retention analysis.

2. Traditional Analysis Methods: Complex and Error-Prone

Ultimately, all analysis serves the business, and the business serves people. User behavior analysis therefore means building an analysis system around user behavior. Beyond knowing who did what and how, we can go further and understand why they did it, prescribe the right remedy, and turn that insight into optimization actions. Analysis is a long-term optimization process that requires continuous monitoring of changes in the data.

Alongside behavioral metrics there is another class of metrics we call vanity metrics — traffic overviews such as PV and UV. They look good on a dashboard but cannot guide us to do better.
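The second scenario above — step-by-step conversion through a registration flow — boils down to per-step user counts and step-to-step conversion rates. A minimal sketch, with made-up step names and numbers purely for illustration:

```javascript
// Toy funnel calculation: given per-step user counts (in flow order),
// compute each step's conversion rate relative to the previous step.
function funnel(steps) {
  return steps.map((step, i) => ({
    step: step.name,
    users: step.users,
    // First step has no predecessor, so its conversion is defined as 1.
    conversion: i === 0 ? 1 : step.users / steps[i - 1].users,
  }));
}

// Hypothetical registration flow: 1000 visitors, 600 fill in an email,
// 300 complete signup.
const result = funnel([
  { name: 'visit signup page', users: 1000 },
  { name: 'fill in email',     users: 600 },
  { name: 'complete signup',   users: 300 },
]);
// result[1].conversion === 0.6  (600 of 1000 continued)
// result[2].conversion === 0.5  (300 of 600 continued)
```

The biggest drop-off (here, the 40% lost between visiting and entering an email) is where optimization effort should go first.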
User behavior metrics are the other category — the user acquisition, activation, and retention metrics introduced above. Because understanding these behaviors maps directly to an optimization step, they are also called actionable metrics; that is the appeal of user behavior data.

Tracking user behavior generally breaks down into seven steps:

1. Determine the analysis scenario or goal. For example, we notice that many users visit the registration page but few complete registration. Our goal is then to raise the registration conversion rate by understanding why users drop out and which step blocks them.
2. Work out what data we need in order to reach that goal. For the goal above, we need a breakdown of every step from landing on the registration page to completing it, the data entered at each input, and the characteristics of the users who did or did not finish each step.
3. Decide who is responsible for collecting the data — usually our engineers.
4. Decide when and how to evaluate and analyze the collected data.
5. Produce an optimization plan once a problem is found — is it a design improvement or an engineering bug?
6. Identify who is responsible for implementing the plan.
7. Evaluate the plan's effect, then return to step 1 and continue iterating with the next round of data collection and analysis.

Knowing is easier than doing. In this whole process, steps 2 through 4 are the crux. I call the approach currently taken by traditional providers such as GA, Mixpanel, and Umeng the Capture model.
By embedding tracking points in the client, relevant data is collected to the cloud and finally presented there. I believe many people have written code similar to the example in the picture.
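In case the picture does not render here, the snippet below sketches what capture-mode instrumentation typically looks like: a hypothetical trackEvent helper that the team must call by hand at every tracking point. The function name and payload fields are illustrative, not any specific vendor's API:

```javascript
// Hypothetical capture-mode tracking point: one hand-written call
// per event the team decided, in advance, that it cares about.
function trackEvent(category, action, label) {
  const payload = {
    category,                // e.g. which feature area
    action,                  // e.g. "click", "submit"
    label,                   // e.g. which element
    timestamp: Date.now(),
  };
  // A real SDK would queue this and send it to a collection server;
  // here we just return the payload for illustration.
  return payload;
}

// Every button, link, or form of interest needs its own call like this:
const evt = trackEvent('signup', 'click', 'submit-button');
```

Anything not instrumented this way is simply never recorded — which is exactly where the disadvantages below come from.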

The Capture model is a very effective method for non-exploratory analysis, but it places very high demands on everyone involved in the process.

Disadvantage 1: Experience-driven. The Capture model depends heavily on human experience and intuition. Experience and intuition are not bad in themselves, but sometimes we do not actually know what "good" looks like; experience becomes a preconceived burden, and we need data to test and prove it.

Disadvantage 2: High communication costs. An effective analysis result depends on the integrity and completeness of the data. After talking with many companies, the most common complaint was that "we can't even unify the log format," let alone do any downstream analysis. This is not a problem with specific people; it is a problem of collaboration. The more people involved — product managers, analysts, engineers, operations staff — each with a different specialty, the more natural it is for misunderstandings to arise. I once discussed this with our CEO Simon: when he led the data analysis department at LinkedIn, the company formed a dedicated tracking-point team of up to 27 people who met every day to standardize the format and placement of tracking points, and those meetings often dragged on for weeks.

Disadvantage 3: Heavy data cleaning, and analysis code invading business code. Because requirements keep changing, tracking points are added piecemeal, without overall design or unified management, and the resulting data is inevitably dirty. A large part of our data engineers' work is therefore data cleaning — manually running ETL to generate reports. By common estimates, 70 to 80 percent of most analysis work goes into data cleaning and manual ETL, leaving only about 20 percent for work of real business value. On top of that, as an engineer with a touch of obsessive-compulsiveness, what I hate most is a mass of analysis code invading my business code — code I dare not delete or modify. Over time the whole codebase becomes a mess.

Disadvantage 4: Omissions and errors in data collection. All of the above is still tolerable. The most frustrating case is discovering, after the product has shipped, that data was collected incorrectly or not at all. After the fix, the whole process has to run again, and a week or two can pass. This is why data analysis is so time-consuming — it usually takes months — and so inefficient.

3. The Principle of Data Analysis Without Tracking Points

After countless painful nights, we decided to change our thinking and minimize human error with what we call the Record model. Unlike the Capture model, the Record model uses machines instead of human experience to automatically collect the full volume of user behavior data on a website or application. Because collection is automated, we control the data format from the very source of the analysis pipeline.
From a business perspective, every piece of data breaks down into five dimensions: Who — the person behind the behavior and their attributes; When — when the behavior was triggered; Where — city, browser, even GPS position; What — the content acted on; and How — how the action was performed. This deconstruction keeps the data clean at the source. On top of it, we can fully automate ETL and trace back any data we need at any time.

Returning to steps 2 through 4 of the earlier process, we have reduced the participants from multiple parties to essentially one. Whether product manager, analyst, or operator, anyone can query and analyze the data with a visualization tool — truly what you see is what you get. The system supports not only the PC web but also iOS, Android, and hybrid apps, enabling cross-screen user analysis.

As a provider of user behavior analysis tools, GrowingIO must not only use this internally but also adapt to thousands of external websites and applications, so we explored a great deal during implementation.

Automatic user behavior collection. The GUI programs we encounter today — web apps, iOS apps, Android apps — all rest on two principles: a tree structure and an event-driven model. Whether it is the DOM node structure on the web or the UI control hierarchy in an app, a complete tree is constructed and rendered to the page or screen. By monitoring that tree we can easily know which nodes changed, when they changed, and how. Likewise, when the user performs an operation such as clicking the mouse or touching the screen, an event fires and the callback bound to it executes. With these two points understood, it becomes much clearer how an SDK can work without manual tracking points.
As long as our own function is triggered whenever a node changes or an event fires, we can capture the full information about that event.
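The five dimensions and the tree-walking idea can be sketched together. The toy example below uses simulated node objects rather than a live DOM or view hierarchy, and its field names are invented for illustration; it only shows how an auto-collected event record might be assembled:

```javascript
// Walk up a node tree to build a selector-like path such as
// "div#app > form > button" — the hierarchical position used later
// for element matching.
function nodePath(node) {
  const parts = [];
  for (let n = node; n; n = n.parent) {
    parts.unshift(n.tag + (n.id ? '#' + n.id : ''));
  }
  return parts.join(' > ');
}

// Assemble one event record along the five dimensions described above.
function buildEvent(user, node, action) {
  return {
    who:   user.id,                            // the person behind the behavior
    when:  Date.now(),                         // when the behavior was triggered
    where: { page: '/signup' },                // page / browser / region (simplified)
    what:  node.text,                          // the content acted on
    how:   { action, path: nodePath(node) },   // how, and where in the tree
  };
}

// Simulated tree: div#app > form > button.
const app    = { tag: 'div', id: 'app', parent: null };
const form   = { tag: 'form', parent: app };
const button = { tag: 'button', text: 'Sign up', parent: form };

const record = buildEvent({ id: 'u42' }, button, 'click');
// record.how.path === 'div#app > form > button'
```

In a real web SDK, the node information would come from the live DOM (e.g. via event delegation and change observation) rather than from hand-built objects.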

Data visualization. The next question is how to match the collected data to business goals; our answer is our visualization tool. As mentioned earlier, every atomic data point is broken down into the five dimensions, so matching in the visualization tool means matching information across those dimensions. A click on a link, for example, matches on the content or target address (the What) and on the click behavior itself (the How). Its position on the page also matters: its level in the tree structure and any id, class, or tag it carries are all used for matching. We developed an intelligent matching system — a rule engine for element matching that learns from real user behavior. Precisely because the full volume of data is collected, the matching system works like genetic evolution: it keeps the memory of past history while adapting to new structures as they appear.
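A minimal sketch of dimension-based matching, assuming a "rule" is simply a set of constraints on an event's dimensions. The rule format here is invented for illustration; the real engine described above is far more sophisticated:

```javascript
// A rule constrains some of an event's five dimensions. Scalar values
// must match exactly; nested objects constrain fields within a dimension.
function matches(rule, event) {
  return Object.entries(rule).every(([dim, expected]) =>
    typeof expected === 'object' && expected !== null
      ? Object.entries(expected).every(
          ([key, value]) => event[dim] && event[dim][key] === value
        )
      : event[dim] === expected
  );
}

// Hypothetical metric definition: "clicks on the Sign up element".
const rule = { what: 'Sign up', how: { action: 'click' } };

const evt1 = { what: 'Sign up', how: { action: 'click', path: 'div#app > a' } };
const evt2 = { what: 'Log in',  how: { action: 'click', path: 'div#app > a' } };
// matches(rule, evt1) === true
// matches(rule, evt2) === false
```

Because the full stream of events is retained, a rule defined today can be replayed over yesterday's raw data — this is what makes metrics retroactively traceable.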

BI business analysis. In the system design, and throughout the data pipeline, incoming data is first processed in real time with Spark Streaming according to priority; the matched data is then pre-aggregated offline at regular intervals, which makes multi-dimensional analysis very flexible.
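The pre-aggregation idea can be illustrated with a toy roll-up that counts events per day and per action, so that later queries read a small aggregate instead of scanning the raw stream. This shows only the concept; the production pipeline described above uses Spark Streaming and offline batch jobs:

```javascript
// Toy offline pre-aggregation: roll raw events up into
// (day, action) -> count buckets for fast multi-dimensional queries.
function preAggregate(events) {
  const cube = {};
  for (const e of events) {
    const day = new Date(e.when).toISOString().slice(0, 10); // "YYYY-MM-DD"
    const key = `${day}|${e.how.action}`;
    cube[key] = (cube[key] || 0) + 1;
  }
  return cube;
}

// Timestamps are milliseconds since the epoch; 86400000 ms = one day.
const cube = preAggregate([
  { when: 0,        how: { action: 'click' } },
  { when: 1000,     how: { action: 'click' } },
  { when: 2000,     how: { action: 'view'  } },
  { when: 86400000, how: { action: 'click' } },
]);
// cube: { '1970-01-01|click': 2, '1970-01-01|view': 1, '1970-01-02|click': 1 }
```

A real system would bucket along all five dimensions and merge increments from the streaming layer, but the principle — trade storage for query speed by aggregating ahead of time — is the same.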

The purpose of collecting user behavior data is to understand users' past behavior and use it to predict future events. With no tracking points to embed and the ability to trace data back at any time, a product manager can handle the entire user behavior analysis process alone. GrowingIO hopes to provide a simple, fast, and scalable data analysis product that greatly simplifies the analysis process, improves efficiency, and connects directly to the business. The foundation of all this is the codeless, intelligent full-volume data collection we have been building since day one. On that foundation, we optimize the product experience, achieve refined operations, and use data to drive user and revenue growth.
