Some time ago I took a course on user portraits and read a number of articles on the subject. Drawing on that material and on my own work practice, this article shares how I understand user portraits, how they are built, how they are turned into a product, and how they are applied.

1. Getting to know user portraits

1.1 User profiles

Now that enterprises can track all kinds of user behavior data, they are paying more and more attention to using big data for business analysis and precision marketing. Refined operations start with building the enterprise's user portraits.

When talking about user portraits, we distinguish between the user role (Persona) and the user portrait (Profile):

1) User role (Persona)

A persona is essentially a communication tool. It exists so that, when we discuss products, requirements, scenarios, and user experience, everyone has the same target user in mind. Personas are built on a deep understanding of real users and a summary of accurate, relevant data; they are fictional characters that carry the characteristics of typical users. The following is a typical user role:

2) User portrait (Profile)

User portraits are used more by operations staff and data analysts. Precision marketing, business analysis, and personalized recommendation are all built on them. A user portrait is a collection of variables that describe a user's data, and it can accurately describe any real user. The following is a simplified user portrait (the two date fields are Unix timestamps in seconds):

["ID": 123456, "Name": "Zhang Jianguo", "Gender": "Male", "Date of Birth": 631123200, "Place of Origin": "Beijing", "Place of Residence": "Beijing"]
["Educational background": {"School": "Peking University", "Major": "CS", "Year of admission": 1220198400}]

1.2 User tags and user portraits

1) User tags

A user tag describes one dimension of a user's attributes; tags are mutually independent and their values can be exhaustively enumerated. After business, log, and tracking data are collected, tags for dimensions such as user attributes, user behavior, user consumption, risk control, and social activity are computed with different statistical methods. Examples: gender, age, number of visits in the past 30 days, purchase level, and most active time periods. For details on building a tag system, see section 2, "Building tags and tag systems".

2) User portraits

Building a user portrait means tagging users along many dimensions. In terms of business value, tags and portraits act like a middle-layer system module that lays the foundation for data-driven operations: they help big data "walk out" of the data warehouse and deliver personalized recommendation, precision marketing, and other services to users. For the portrait system and its practical applications, see sections 3 "Productization of user portraits", 4 "User portrait application", and 5 "User portrait practice cases".

1.3 User groups and user tags

User groups and user tags are easily confused. Let's try to tell them apart:

1) User groups

Selecting a well-rounded target group requires combining user attributes with behavior. With behavioral data alone you can see what a person has done, but not whether they are male or female, how old they are, how long they have been registered, or what their purchasing power is. A group selected that way is incomplete and generally cannot be used directly for precision marketing.

2) User tags

Building a user tag does not require combining user attributes with behavioral events: attributes alone, or behavioral events alone, are enough. A tag computed from attributes and behavioral events is itself essentially a user attribute; put differently, user attributes are themselves tags.

3) Groups are a way of applying tags

Because tags are a middle-layer module, precision marketing rarely pushes to a single tag. In most cases several tags must be combined to match the business definition of an audience, as shown in the following figure:

Here is a scenario that illustrates selecting a user group from user tags. During a women's clothing promotion, channel operators need to pick out high-quality users on the platform and market to them through SMS, email, and push.

Therefore, once user attributes and behavioral events have been abstracted into tags, the target audience can be found by combining tags. From this perspective, a user group is one way of applying user tags.

2. Building tags and tag systems

2.1 Classification of tags

There are many ways to classify tags. From the perspective of how they are implemented, they fall roughly into three types: statistical tags, rule-based tags, and machine-learning mining tags.

1) Statistical tags

This is the most basic and most common type. For a given user, fields such as gender, age, city, zodiac sign, active time in the past 7 days, active days in the past 7 days, and number of active sessions in the past 7 days can be derived statistically from registration, access, and consumption data. These tags form the foundation of the user portrait.

2) Rule-based tags

These tags are generated from user behavior, user attributes, and agreed rules. For example, the platform may define an "active consumer" as a user whose "number of transactions in the past 30 days ≥ 2". In practice, operations staff know the business better while data staff know the structure, distribution, and characteristics of the data better, so the rules are settled jointly by the two.

3) Machine-learning mining tags

These tags are produced by machine-learning models and are used to predict or infer certain attributes or behaviors of users, for example inferring whether a user is male or female from behavioral habits, or inferring a user's preference for a product from consumption habits. They must be generated through algorithmic mining.
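To make the first two tag types concrete, here is a minimal sketch in Python. Only the two tag definitions ("active days in the past 7 days" and "active consumer: number of transactions in the past 30 days ≥ 2") come from the text above; the event layout, the helper names, and the sample data are assumptions made for illustration, and a real system would compute these in the data warehouse rather than in memory.

```python
from datetime import date, timedelta

# Hypothetical event log: (user_id, event_date, kind). The layout is assumed.
EVENTS = [
    ("u1", date(2020, 3, 1), "visit"),
    ("u1", date(2020, 3, 1), "visit"),
    ("u1", date(2020, 3, 5), "order"),
    ("u1", date(2020, 2, 20), "order"),
    ("u2", date(2020, 3, 6), "visit"),
    ("u2", date(2020, 1, 15), "order"),
]


def in_window(day, today, days):
    """True if `day` falls within the last `days` days, counting today."""
    return today - timedelta(days=days) < day <= today


def active_days_7d(user_id, events, today):
    """Statistical tag: number of distinct days with any event in the past 7 days."""
    return len({d for uid, d, _ in events
                if uid == user_id and in_window(d, today, 7)})


def is_active_consumer(user_id, events, today, min_orders=2):
    """Rule-based tag: number of transactions in the past 30 days >= 2."""
    orders = sum(1 for uid, d, kind in events
                 if uid == user_id and kind == "order" and in_window(d, today, 30))
    return orders >= min_orders


today = date(2020, 3, 7)
tags = {
    uid: {
        "active_days_7d": active_days_7d(uid, EVENTS, today),
        "active_consumer": is_active_consumer(uid, EVENTS, today),
    }
    for uid in ("u1", "u2")
}
print(tags)
```

The statistical tag is a plain aggregation, while the rule-based tag is the same kind of aggregation followed by a threshold that operations and data staff agree on; keeping the threshold as a parameter (`min_orders`) makes that agreement easy to revise.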
In engineering practice, statistical and rule-based tags meet most application needs and make up the bulk of development. Machine-learning mining tags are mostly used for prediction, such as inferring user gender, purchase preference, or churn intention; their development cycle is long and their cost is high, so they account for a small share of the work.

Ultimately the tag system is defined from the user's perspective and has to fit the specific business. For example, the tag classification of an e-commerce business includes user attribute tags, user behavior tags, user consumption tags, risk control tags, and social attribute tags.

2.2 Tag construction process

The following figure shows a tag construction process. It takes the product manager's perspective, mainly describing the analysis steps and output documents for requirements and briefly summarizing the principles of tag development.

1) Requirements collection and analysis

Requirements collection and analysis can follow these steps: reconstruct the business process, clarify the business purpose, derive tags from the strategy, and organize the tags.

The case: a clothing retailer expanded its business by opening an online mall alongside its offline stores. Online, traffic is directed mainly from WeChat official accounts to a mini program, where the transaction is completed.
The following walks through collecting and analyzing tag requirements using the clothing retail case:

Identify and analyze business processes and business scenario touchpoints: User portraits are built on the business, so the first step in building tags is to map out users' decision-making process and the business scenarios, which is also a quick way to get familiar with the business. The business process in this case can be reconstructed as follows: First, WeChat users attracted through various scenarios follow the official account and become fans. The official account operator then pushes image-and-text messages to engage them and, at the same time, directs them to the mini program mall, where they eventually convert. Throughout the process, the operator keeps maintaining fans and winning back lost ones. Recommended here: the detailed requirements chapter in "Effective Requirements Analysis", which covers the main-line requirements analysis method for business functions.

Clarify the business purpose of each business scenario touchpoint: Building on the process analysis, this step identifies business problems, makes clear which business objectives are to be achieved, and breaks those objectives down. The case below goes from the overall business objective to its decomposition and quantification:

O: Assume the retailer's online presence is already fairly complete, so the primary business goal at this stage is to increase sales. "Increasing sales" is therefore the North Star metric of this retail e-commerce business, and increasing traffic, conversion rate, average order value, and repurchase rate are the core metrics after decomposition.
S: Suppose the goal is to increase traffic into the mini program mall. Many strategies are possible. For example, pushing a coupon after a user scans a QR code can attract more WeChat users to follow and become fans; another is to produce higher-quality WeChat articles to better operate private-domain traffic.

M: For the strategy of pushing coupons to attract users to follow the official account, the metrics to watch are the follow rate after scanning the code, the unfollow rate, and the ratio of new to existing fans.

Recommended here:
Design operational strategies and tag requirements guided by the business purpose: Different business purposes call for different tag systems, so tags should be derived from operational strategies. If the business department wants personalized recommendation, tags about items or about people's interests and preferences are more valuable; if it wants refined operations, tags about user retention and activity are more valuable. An example of tag selection: with "increase the follow rate after scanning the code" as the quantified goal, the chosen strategy is to attract WeChat users to scan by pushing coupons. A new fan who scans and follows receives a 100-yuan coupon, and an existing fan who scans receives a 50-yuan coupon. Executing this strategy therefore needs the tag "Is a new fan?".

At this stage you can prepare a simple Excel template to record what is agreed with the business side. Its column headers should include tag name, tag rule, usage scenario, and so on.

Organize the tags: Tags need to be classified and managed from the user's perspective, based on an understanding of the business and the strategy. Here is a frame of reference:
2) Output the tag requirement documents

After requirements collection and analysis, the business side's tag requirements are clear. To hand them over to R&D smoothly, three steps follow: write the tag system document, determine tracking points from the tag rules, and write the data requirements document.

Write the tag system document: In this phase the data product manager produces a concrete tag system document based on the earlier communication with the business side. Its fields include:

Tag ID: for example, ATTRIBUTE_U_01_001, where "ATTRIBUTE" is the demographic attribute theme, "U" is the userid dimension, "01" is the first-level classification, and "001" is the specific tag under that first-level classification.
Tag name: the name in English, for example, female.
Tag Chinese name: the name in Chinese, for example, the Chinese word for "female".
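The four-part tag ID layout described above lends itself to mechanical validation. The sketch below assumes the layout is strictly THEME_DIMENSION_LEVEL1_DETAIL with a two-digit first-level classification and a three-digit detail number, as in the single example given; the pattern, the `TagId` type, and `parse_tag_id` are hypothetical names, not part of any described system.

```python
import re
from typing import NamedTuple

# Assumed layout: THEME_DIMENSION_LEVEL1_DETAIL, e.g. ATTRIBUTE_U_01_001.
TAG_ID_PATTERN = re.compile(r"^([A-Z]+)_([A-Z]+)_(\d{2})_(\d{3})$")


class TagId(NamedTuple):
    theme: str      # e.g. the demographic attribute theme
    dimension: str  # e.g. "U" for the userid dimension
    level1: str     # first-level classification
    detail: str     # specific tag under the first-level classification


def parse_tag_id(tag_id: str) -> TagId:
    """Split a tag ID into its four parts, rejecting IDs that do not fit."""
    match = TAG_ID_PATTERN.match(tag_id)
    if match is None:
        raise ValueError(f"unexpected tag ID format: {tag_id!r}")
    return TagId(*match.groups())


parsed = parse_tag_id("ATTRIBUTE_U_01_001")
print(parsed.theme, parsed.dimension, parsed.level1, parsed.detail)
```

Validating IDs when a tag is registered keeps the tag system document and the tables behind it consistent, since a malformed ID is rejected before it reaches development.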
Determine tracking points from the tag rules: With the calculation rules of each tag settled, the next step is to decide which tracking points (instrumented events) are needed to collect the required data. For example, the tag "Preference for Purchased Product Categories" uses the event of clicking the order button together with event attributes such as product name and product category, so the order-button click must be tracked.

3) Write a data requirements document

Once the data to be collected by each tracking point is determined, produce a concrete data requirements document and hand it to the developers responsible for instrumentation. The data requirements document should state the following clearly:
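As an illustration of the tracking example above (the order-button click used by the tag "Preference for Purchased Product Categories"), the sketch below derives the tag from such events. The event field names and the rule "the category with the most order clicks wins" are assumptions; the text only says which event and attributes the tag uses, not how the preference is scored.

```python
from collections import Counter

# Hypothetical tracked events for "click the order button".
# Field names (user_id, event, product_name, product_category) are assumed.
ORDER_CLICKS = [
    {"user_id": "u1", "event": "click_order", "product_name": "Floral dress", "product_category": "Dresses"},
    {"user_id": "u1", "event": "click_order", "product_name": "Wool coat", "product_category": "Coats"},
    {"user_id": "u1", "event": "click_order", "product_name": "Linen dress", "product_category": "Dresses"},
    {"user_id": "u2", "event": "click_order", "product_name": "Canvas sneakers", "product_category": "Shoes"},
]


def preferred_category(user_id, events):
    """Return the category the user clicked 'order' on most often, or None."""
    counts = Counter(
        e["product_category"]
        for e in events
        if e["user_id"] == user_id and e["event"] == "click_order"
    )
    if not counts:
        return None  # no order clicks tracked, so the tag stays empty
    return counts.most_common(1)[0][0]


print(preferred_category("u1", ORDER_CLICKS))
print(preferred_category("u3", ORDER_CLICKS))
```

The sketch also shows why the tracking point has to carry the product category as an event attribute: without it, the tag could not be computed from the click event alone.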
In practice, writing the tag system document, determining tracking points from tag rules, and writing the data requirements document feed into and refine one another.

4) Tag development

In the overall engineering solution, the system relies on infrastructure including Spark, Hive, HBase, Airflow, MySQL, Redis, and Elasticsearch. Beyond the infrastructure, the system has three main components: ETL jobs, user portrait topic modeling, and storage of tag results on the application side. The figure below shows the architecture of the user portrait data warehouse. A brief introduction follows:
Different databases suit different scenarios, as described below:

MySQL: As a relational database, it can be used in the portrait system for metadata management, monitoring and alerting data, and result set storage. These three scenarios are described in detail below:
HBase: Unlike Hive, HBase serves reads and writes against the database in real time instead of running MapReduce jobs, which makes it suitable for real-time queries over big data. The following example shows an application scenario and its engineering implementation in the portrait system: To encourage new, unregistered users to register and place orders, a channel operator plans to offer red envelopes or coupons through a pop-up on the App home page. Each day, after the portrait system's ETL scheduling finishes, the corresponding audience data is pushed to the advertising system and stored in HBase. When a qualifying new user visits the App, the online interface looks the user up in HBase and, if the user is found, shows the pop-up.

Elasticsearch: An open source distributed full-text search engine that stores and retrieves data in near real time. For scenarios with strict response-time requirements, such as user tag queries, audience size calculation, and multi-dimensional analysis of user groups, Elasticsearch is also worth considering for storage.

5) Tag release and effect tracking

After development, testing, and launch, keep tracking how tags perform in use and what feedback the business gives, and adjust and optimize the models and the related weight configuration.

3. Productization of user portraits

In terms of business value, tags and portraits resemble an intermediate system module that provides data support for front-end services. Portrait tag data that merely "lies" in the data warehouse after development cannot deliver much business value. Only once the portrait data is productized can each link in the data processing chain be made more efficient in a standardized way, and the data becomes easier for business teams to use.
The following summarizes productization from two angles: the tag production architecture and the functional modules it covers.

3.1 User portrait product system architecture

The figure below is a structural diagram of a user portrait product system. Data flows from left to right through four layers: data collection, data access, data integration/tag calculation, and tag application. A brief description of each follows:

1) Data collection

The data collection module gathers log data, business data, and third-party data in three main ways: client/server SDKs, import, and integration with third-party applications.

SDK:
Importer: Depending on factors such as the operating environment, the source data format, and the volume of data, different import methods can be chosen to load historical file data into the user portrait product system.

Link: Based on the characteristics of each third-party product's OpenAPI, users' personal attributes and behavioral event data in those systems are collected either by receiving event message pushes or by active polling.

2) Data access

Tracking data first flows into Kafka in bulk and is then consumed gradually into the downstream data integration and storage systems.

3) Data integration/tag calculation

In the user portrait system, Hive serves as the main data warehouse for ETL processing, for building the user attribute and user behavior tables, and for tag calculation.

Data integration: Data received from different channels has quality problems such as isolation, null values, format mismatches, and out-of-range values, so integration work such as dirty data cleaning, format conversion, and user identification and merging is required.

Clean/Transform:
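As a rough illustration of the cleaning and conversion work named above (null values, format mismatches, out-of-range values), here is a minimal sketch for a single raw user record. The field names, the accepted date formats, the gender spellings, and the 0 to 120 age range are all assumptions chosen for the example; a production pipeline would express similar rules in Hive or Spark.

```python
from datetime import datetime

GENDER_MAP = {"m": "male", "male": "male", "f": "female", "female": "female"}
DATE_FORMATS = ("%Y-%m-%d", "%Y/%m/%d")  # assumed source formats


def parse_date(value):
    """Try each known format; return a date, or None if none matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(str(value), fmt).date()
        except ValueError:
            continue
    return None


def clean_record(raw):
    """Return a cleaned copy of one raw user record, or None if unusable."""
    user_id = raw.get("user_id")
    if not user_id:  # null or empty key: the record cannot be attributed
        return None
    gender = GENDER_MAP.get(str(raw.get("gender", "")).strip().lower(), "unknown")
    try:
        age = int(raw.get("age"))
    except (TypeError, ValueError):
        age = None  # missing or non-numeric value
    if age is not None and not 0 <= age <= 120:
        age = None  # outside the assumed valid range
    return {
        "user_id": user_id,
        "gender": gender,
        "age": age,
        "registered_at": parse_date(raw.get("registered_at")),
    }


cleaned = clean_record(
    {"user_id": "u1", "gender": " F ", "age": "230", "registered_at": "2020/03/08"}
)
print(cleaned)
```

Invalid values are set to None rather than dropped with the whole record, so one bad field does not remove a user from every downstream tag.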
Id Mapping: The user attribute data and behavioral event data received from different channels are isolated from one another. To compute a user's combined tags, users must be identified and merged; for example, through unionID, the same user's information across the official accounts, mini programs, and websites bound to one WeChat open platform account can be identified and merged.

After data integration, the data enters the following data model:

In the user portrait system, tag calculation is done by a batch, offline tag processing engine that relies on a relatively stable underlying data structure. The engine reads event data and user attribute data together, applies the specific tag rules, and generates user tags in one batch calculation.

4) Tag application

Tags are applied in two main ways: portrait display in the front end, and access from other systems through APIs. Both are described in detail in section 3.2, "User portrait product function modules".

3.2 User portrait product function modules

1) System dashboard

The data dashboard of a user portrait system usually presents the enterprise's core user data assets, or data about key audiences, in visual form. It aims to give users a shared, basic understanding of the enterprise's data assets or core audience data, and is mainly divided into the following categories:
2) Tag management

Tag management lets business staff add, delete, modify, and query tags. It covers tag classification, tag creation, tag review, putting tags online and taking them offline, and monitoring the number of users each tag covers. Tags are created by setting tag rules on top of user behavior data and user attribute data:

3) Single user portrait

The main capability is viewing the detailed portrait of a single user, such as attribute information and behavior data, by entering the user ID.

4) User grouping and user group portraits

1. User grouping

The user grouping function is used mainly by business staff. When applying tags, product managers, operations, customer service, and other business staff often need more than the audience behind a single tag; they need to combine several tags to match their business definition of the audience. For example, combining the three tags "number of coupons received in the past 7 days is greater than 1", "activity level is high or extremely high", and "female" defines the target audience and shows how many users it covers.

2. User group portraits

The user group portrait function is similar to user grouping in that it also combines tags to define a group. The difference is that the group portrait function supports analyzing the group's characteristics along multiple dimensions, while user grouping focuses on pushing the selected groups to business systems to support their services.

5) BI analysis

Once the BI platform is connected to this data, the dimensions available for analysis are enriched, supporting richer and deeper analysis and comparison through a variety of analysis models.
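Returning to the user grouping example in 4) above, combining tags amounts to taking the users who satisfy every condition at once. The sketch below uses the three conditions from the text; the tag names, the in-memory tag table, and the sample users are hypothetical, and a real portrait system would run the same logic against its tag store (for example Elasticsearch).

```python
# Hypothetical tag table: one dict of tag values per user. Tag names are assumed.
USER_TAGS = {
    "u1": {"gender": "female", "activity_level": "high", "coupons_received_7d": 3},
    "u2": {"gender": "female", "activity_level": "low", "coupons_received_7d": 5},
    "u3": {"gender": "male", "activity_level": "extremely high", "coupons_received_7d": 2},
    "u4": {"gender": "female", "activity_level": "extremely high", "coupons_received_7d": 2},
    "u5": {"gender": "female", "activity_level": "high", "coupons_received_7d": 1},
}

# Each condition is a predicate over one user's tags; a group is their logical AND.
CONDITIONS = [
    lambda t: t["coupons_received_7d"] > 1,
    lambda t: t["activity_level"] in {"high", "extremely high"},
    lambda t: t["gender"] == "female",
]


def select_group(user_tags, conditions):
    """Return the sorted IDs of users whose tags satisfy every condition."""
    return sorted(
        uid for uid, tags in user_tags.items()
        if all(condition(tags) for condition in conditions)
    )


group = select_group(USER_TAGS, CONDITIONS)
covered = len(group)  # number of users covered by this group
print(group, covered)
```

Keeping each condition as an independent predicate mirrors how business staff assemble a group in the product: conditions can be added or removed without touching the selection logic.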
6) OpenAPI

OpenAPI connects the portrait system's data with other systems, such as push, marketing, advertising, and recommendation systems and BI platforms, and keeps the data in each system updated in real time, avoiding the problem of different figures derived from the same source.

4. User portrait application

As mentioned earlier, user portraits have three main applications: business analysis, precision marketing, and personalized recommendation and services. In detail:

4.1 Business analysis

Once tag data from the portrait system enters the analysis system through the API, it enriches the dimensions of analysis and supports operational analysis of various business objects. Below are some of the metrics that marketing, operations, and product staff pay attention to during analysis:

1) Traffic analysis
2) User analysis
3) Product analysis
4) Order analysis
5) Channel analysis

User activity:

User quality:
Retention: next-day/7-day/30-day retention rate

Channel revenue:
6) Product analysis
4.2 Precision marketing

1) SMS/email/push marketing

In daily life we often receive marketing messages through many channels: an SMS announcing a red envelope may prompt a user to open an app they have not visited for a long time, and an email about a price drop on a book in their wish list may lead them to follow the link and order directly. What types of marketing are there? They fall roughly into the following 4 categories:
2) Customer service scripts

When we complain to, consult, or give feedback to a platform's customer service, the agent can accurately describe our purchases on the platform, the outcome of our last inquiry, and similar information, propose targeted solutions, and offer special services such as a VIP customer service channel for high-value users.

4.3 Personalized recommendations and services

App operators can recommend different content to different users based on portrait tags such as gender, age group, interests, and browsing and purchasing behavior. Examples are personalized article recommendations on Toutiao, personalized video recommendations based on user portraits on TikTok, and personalized product recommendations based on browsing behavior and other portrait data on Taobao.

5. User portrait practice cases

With the portrait system we can analyze data from many angles and plan how to reach users, quickly apply tag data at the service layer (T+1 or real-time), and collect user feedback through effect analysis to help iterate on marketing strategies or product design. The following cases illustrate, scenario by scenario, where and how user portraits are applied:

5.1 A/B audience effect test

1) Case background

To sell well during a major promotion, a fast-moving consumer goods company selling snacks planned to push a series of articles on new products, the health benefits of its products, and so on, to build momentum for the promotion and stimulate sales conversion. To target the right audience precisely, channel operators plan two A/B audience effect tests:
2) Where user portraits come in

The project needs to settle how traffic is divided into groups A and B, and how the audience rules and effect monitoring for each group are designed. The steps for using the portrait system in A/B audience testing are as follows:

Divide users into groups A and B: A/B testing starts with splitting traffic. Users can be assigned to group A or group B by random A/B allocation.

Test plan for the effect of copy titles on traffic: To attract more users to the App during a major promotion, a platform's channel operator plans to select a small number of users during the warm-up period and run an A/B test on the copy title. In this plan, group A consists of users who were assigned to path A, visited the App in the past x days, and browsed, favorited, or added the snack to their cart in the past x days; they receive copy A. Group B consists of users who were assigned to path B and meet the same conditions; they receive copy B. The two groups have the same number of users but different copy. Their click-through rates are then monitored to analyze how the copy affects clicks. For example, group A is selected with the user grouping function, as shown in the figure below:

Test plan for the traffic lift of precise push over general push: Before using the portrait system to refine its push audiences, a platform pushed messages to users indiscriminately.
To test how much traffic refined audience operation adds compared with indiscriminate operation, channel operations decided to run an A/B test on the snack marketing venue that has been the recent operational focus. In this plan, group A consists of users who were assigned to path A, visited the App in the past x days, and browsed, favorited, or added the snack to their cart in the past x days; group B consists of users who were assigned to path B, visited the App in the past x days, and have no category preference. The same copy is pushed to both groups, and their click-through rates are then monitored to analyze the lift brought by precision push.

3) Effect analysis

After the A/B push goes live, build a monitoring report to track traffic and conversion for the control and test groups, focusing on the metrics in the following list:

For example, a GMV comparison report for audiences A and B, built with the event analysis model, is shown in the figure below:

5.2 Women's Day targeted marketing

1) Case background

A brand focused on women's products plans to run targeted marketing on Women's Day to women with different category preferences. The marketing message is pushed twice, first at 10:00 a.m. and again at 10:00 p.m. on the day. The effect is then evaluated by tracking the target audience's paid-order completion rate that day.

2) Implementation logic

First, female users aged 18 to 40 are selected using the gender and age tags. The first wave is then pushed at 10:00 a.m. on March 8, 2020, with different marketing content depending on each user's category preference tag.
For example, information about the Spring Beauty Festival is pushed to users whose category preference is makeup and skin care. The second wave is pushed at 10:00 p.m. on March 8, 2020, with a single, unified promotion reminder.

5.3 Real-time marketing for newly installed, unregistered users

1) Case background

To get newly installed but unregistered users to register and order, the operator of a snack mall App set the following rule: when such a user opens the App, a coupon is pushed through an in-App pop-up. For example, if a user installs the App without registering, the coupon pop-up is shown as soon as the user opens the App the next day, to guide the user toward registration and a first order.

2) Where user portraits come in

Channel operators select the relevant users by combining tags (such as "unregistered user" and "installed less than ×× days ago") and then push that group to the "advertising system". Each day, after the portrait system's ETL scheduling finishes, the corresponding audience data is pushed to HBase for storage. When a qualifying new user visits the App, the online interface queries HBase and, if the user is found, shows the pop-up.

5.4 Remarketing advertising for an e-commerce company

1) Case background

The product operations team of an e-commerce app wanted to raise the repurchase rate of existing customers and the order rate of new customers for electronics, so it partnered with Toutiao to run remarketing ads. For example, a user views a vivo phone in the e-commerce app and, while browsing Toutiao the next day, sees an ad for that phone.
2) Implementation logic

First, the APIs of the e-commerce app and Toutiao must be connected. Then algorithmic mining on the user's in-app behavior (browsing, favoriting, adding to cart, searching, and so on) generates product preference tags for the user. When Toutiao captures the user's device information, it sends a request to the e-commerce company asking whether an ad should be shown to this user. The e-commerce platform checks whether the user is one of its own; if so, it returns a recommendation result to Toutiao. The user then sees, on Toutiao, the product they browsed earlier, and clicking it jumps to the product detail page in the e-commerce app.

6. Conclusion
Author: Dapeng
Source: A data person's private land