Overview

1. Advertising styles and scenarios

The figure above shows the current commercial scenarios of Weibo advertising: "one screen and four major flows". "One screen" refers to the opening screen shown when Weibo is launched, and the "four major flows" are the main body of Weibo's commercialization: the relationship information flow, the hot flow, the comment flow, and the hot search flow. The picture on the right shows the advertising delivery back end.

2. Advertising participants

As shown in the figure above, these are the first concepts one meets in computational advertising. Advertisers are divided by size and by importance to the company into the KA category and the small and medium-sized category. KA advertisers tend to make large-volume purchases, while regular small and medium-sized customers bid. Common billing methods include CPE, CPM, and CPD. OCPX is currently being promoted on a large scale in the Internet market. It is a sales method that demands considerable technical sophistication, and it is also a good way to reduce advertisers' risk.

3. Core issues of computational advertising

Three parties are involved in advertising: the platform (site), users, and advertisers. The core issue in designing a computational advertising system is how to balance the three parties and maximize their overall interest.

4. Advertising process

The figure above shows the general advertising process. It is a three-way interaction seen from the perspectives of advertisers, the platform, and users.

Advertising and marketing planning: create a promotion plan -> select audience targeting -> set an advertising budget -> set advertising creatives -> start delivery -> check advertising results -> make the next marketing decision.
Precision advertising delivery: in response to an ad inventory request, the system profiles the user, recalls ads, performs coarse and fine ranking, selects ads, renders the creatives for the target platform, and finally displays them to the user.

User content consumption is relatively simple; see the process in the figure.

The Evolution of Weibo Advertising Strategy Engineering Architecture

1. Development history of Weibo advertising engineering architecture

As Weibo's commercialization has developed, the engineering architecture supporting it has changed with specific business needs. At the beginning we did non-information-flow advertising, placing banners on Weibo in the traditional way. Those banners are no longer present in the mobile version of Weibo. From the simple pop-up advertising system of version 1.0 to the product lines represented by Fanstong in version 2.0, Weibo began to develop information flow advertising. Weibo was the first company in China to do information flow advertising, and during this exploration it built up a matrix of advertising products. To launch products quickly, the advertising system was copied many times over. To change this situation, the advertising system was restructured as a whole with the launch of Super Fans Pass at the end of 2017. Since then, Weibo's advertising engineering architecture has been in the 4.0 era.

2. Delivery system architecture 4.0

The figure above is the 2017 engineering architecture diagram. It is version 4.0 of the architecture, which evolved along with the exploration of Weibo's advertising product lines.
At that time traffic was still a blue ocean. As the number of advertisers and the size of advertising budgets kept increasing, the work focused on continuously improving the stability, high availability, and high concurrency of the advertising system so that advertising revenue could keep growing. We therefore organized the system into layers.

The blue area in the figure is the core link of the online advertising system. Ad requests enter through unified traffic access, which covers Weibo's multiple product matrices. Requests are distributed to those product matrices, and user requests are answered after a unified evaluation of traffic value. Overall, this engine structure was designed to keep meeting customers' product needs.

The basic flow is: ad request, ad inventory access, overall traffic distribution, user portrait request, bidding service, ad triggering, and online index service request. Together these form a fairly complete targeting system, which includes:

1. User behavior targeting. Targeting by behavior on Weibo, such as interaction with topics, with popular Weibo posts, with hot search groups, and so on.
2. Social relationship targeting. For example, the followers of a big V (a verified account with a large following) can be seen as a naturally formed social group, so the fan group of that big V can be selected for delivery.
3. Precise audience targeting. A data package aggregated at user granularity, produced by data processing on the platform or by a third party, recalled by advertisers based on the results of a single campaign, or built from existing customer information. It is a collection of precisely identified users.
4. User attribute targeting. Includes user portrait, age, region, and so on.

The above is the overall online delivery flow, but delivery requires more than these steps.
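To make the four targeting types above concrete, here is a minimal sketch of how a request's user could be checked against a plan's targeting conditions. The `User` and `Targeting` classes, their field names, and the rule that an empty condition means "no restriction" are all assumptions made for illustration; the article does not describe how Weibo's index service actually evaluates targeting.

```python
from dataclasses import dataclass, field
from typing import Set, Tuple


@dataclass
class User:
    user_id: str
    age: int
    region: str
    followed: Set[str] = field(default_factory=set)  # accounts the user follows
    topics: Set[str] = field(default_factory=set)    # topics the user interacted with


@dataclass
class Targeting:
    topics: Set[str] = field(default_factory=set)    # 1. user behavior targeting
    fans_of: Set[str] = field(default_factory=set)   # 2. social relationship targeting
    audience: Set[str] = field(default_factory=set)  # 3. precise audience package (user ids)
    regions: Set[str] = field(default_factory=set)   # 4. user attribute targeting
    age_range: Tuple[int, int] = (0, 200)            # 4. user attribute targeting


def matches(user: User, t: Targeting) -> bool:
    """An empty condition means 'no restriction'; every non-empty one must hold."""
    if t.topics and not (t.topics & user.topics):
        return False
    if t.fans_of and not (t.fans_of & user.followed):
        return False
    if t.audience and user.user_id not in t.audience:
        return False
    if t.regions and user.region not in t.regions:
        return False
    low, high = t.age_range
    return low <= user.age <= high


user = User("u1", 28, "Beijing", followed={"big_v_42"}, topics={"marathon"})
plan = Targeting(fans_of={"big_v_42"}, regions={"Beijing", "Shanghai"}, age_range=(18, 35))
print(matches(user, plan))  # True
```

A production system would evaluate these conditions through an inverted index rather than plan by plan; the loop-free predicate here only shows how the four targeting dimensions combine.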
The delivery process also includes personalized inventory strategies, negative-feedback strategies for ads, intelligent frequency control strategies, and so on, together with a supporting A/B test system. These form the online service group for ad delivery.

Since the traffic comes from the Weibo site itself, Weibo ad requests do not require traffic anti-cheating. The existing anti-fraud measures mainly target interaction feedback: large-scale anti-fraud strategies are applied to the post-delivery feedback data, including social interactions. Next comes a real-time settlement center (the settlement system), which provides advertisers with the reports they need, along with an account system closely tied to advertisers. Together these form the online service group for the post-delivery link.

Below that is near-line data access. Data is classified into basic user data, ad targeting data, real-time ad streaming data, algorithm model training data, and ad creative library data, in order to serve the needs of online real-time access.

At the bottom is the offline data warehouse. Data produced by online delivery is stored in the data warehouse through the advertising data bus. The data flow is generally implemented with Kafka, and the data is then gathered into the warehouse and classified.

The ad monitoring system on the far left of the figure monitors business operation status, service stability, and system availability at every level. Together these make up a complete tool chain at the business level. The formerly separate product lines were gradually consolidated into this one system. Taken as a whole, the architecture of the 4.0 era is an engineering architecture designed for "extensive growth".
The objective reality of this extensive growth is that growth in ad volume came from a continuously increasing supply of advertising budget, Weibo's continuously increasing scale of monetized traffic, and continuous growth in the number, budget, and scale of advertisers. In that period the biggest tests for the system were high availability and keeping R&D efficient while meeting business needs. Such an architecture is the "product of extensive growth."

This architecture has some problems (those in the red box in the figure). It downplays the strategy model, which means that at the functional architecture level the iteration of algorithm models is supported only in a simple way. For example, the A/B test in use was a very primitive one. It could quickly support the growth of the advertising business under the demographic dividend, but once that dividend disappeared it could no longer support growth well. At that point the system's support for the strategy model becomes extremely important.

3. How the system supports the transformation of advertising growth

Supporting the transformation of advertising growth means moving from extensive growth (expanding traffic, expanding advertisers, expanding budget) to refined growth, and continuously improving delivery effectiveness to drive revenue growth. The system must be transformed accordingly, so that continuous system improvement can drive the strategy model well. As the algorithms keep introducing new deep learning models, the overall engineering architecture is also being refined, moving from the original business-oriented division (Target, Filter, Rank) to a division oriented to algorithm strategies (recall, model, mechanism, ranking).

4.
Traffic Funnel Model

A model commonly used in advertising systems is the traffic funnel model. We rethought and redefined it: start with ad recall, which aims for the highest-probability display based on precision, then select for relevance, and then apply a bidding and ranking mechanism with the model at its core. This article does not explain recall and relevance from an algorithmic perspective; it mainly introduces how engineering supports algorithm model iteration.

5. Next-generation strategy-oriented delivery architecture

Building on architecture 4.0, the online delivery engine is divided by business category to fit the new traffic funnel model. The key points are as follows.

① Trigger, model, and strategy mechanisms develop independently and in depth

The system supports independent, in-depth development of triggering, modeling, and strategy iteration, so that each can iterate quickly without affecting the others.

② Introducing lean-driven thinking and a dual-core drive system to release algorithm iteration efficiency

A refined transformation requires the system to keep experimenting, and experiments require a good trial platform, so lean-driven thinking was introduced. The online lean platform includes the Faraday Experiment Platform and Faraday Lean Insight. It is a better tool chain for promoting business model iteration, and it pays more attention to the real-time nature and density of data. The overall system architecture is divided into: the online lean tool platform, the online delivery system, near-line data access, data model processing (exposure machine learning platform and online real-time streaming mechanism), and the offline data platform.

③ Real-time and dense feature data, and independent model development

Let us focus on the online delivery service. Within the service there are traffic access, traffic triggering, and triggering mechanisms, including multi-channel triggering.
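As a rough illustration of multi-channel triggering, the sketch below runs several independent recall channels for one request and merges their candidates while remembering which channels produced each ad. The function name `multi_channel_trigger`, the channel names, and the request shape are hypothetical, not taken from Weibo's system.

```python
from typing import Callable, Dict, List

# A channel takes a request and returns candidate ad ids (hypothetical interface).
Channel = Callable[[dict], List[str]]


def multi_channel_trigger(request: dict, channels: Dict[str, Channel]) -> Dict[str, List[str]]:
    """Run every channel and merge results, de-duplicating by ad id."""
    hits: Dict[str, List[str]] = {}
    for name, recall in channels.items():
        for ad_id in recall(request):
            hits.setdefault(ad_id, []).append(name)
    return hits


# Toy channels returning fixed candidates; real channels would query indexes.
channels = {
    "user_tag": lambda req: ["ad1", "ad2"],
    "social": lambda req: ["ad2", "ad3"],
    "vector": lambda req: ["ad3", "ad4"],
}
candidates = multi_channel_trigger({"uid": "u1"}, channels)
print(candidates)
# {'ad1': ['user_tag'], 'ad2': ['user_tag', 'social'], 'ad3': ['social', 'vector'], 'ad4': ['vector']}
```

Keeping the list of source channels per ad is a design choice made here so that later stages or experiment analysis can attribute a candidate to the channel that recalled it.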
After multi-channel triggering come the mechanism strategies, including the model estimation services. These are aggregation services that perform coarse ranking and trim the candidate set, and they run in a large-scale distributed estimation service. The Ranker then performs fine ranking based on the estimates. It is worth explaining why both coarse and fine ranking are needed. My understanding is that coarse ranking exists to protect the performance of fine ranking: fine ranking involves large-scale, detailed computation and may not be able to keep up on its own, so coarse ranking is needed first. Moreover, as long as the effect is preserved, there can be several levels of coarse ranking to guarantee performance.

At the top are the Faraday Experiment Platform and Faraday Lean Insight. Overall this forms a dual-core engine: a good engineering architecture and a lean-driven tool platform.

6. How to support accurate recall of advertising materials

The recall mechanism of Weibo ads includes user tag triggering, social relationship triggering, precise audience triggering, content triggering, and DNN vector triggering. After this five-way triggering, the results are merged at the MIXER ad recall level, and after merging come the coarse ranking and fine ranking strategies. The information used here comes from the traffic side and the advertising side.

Traffic side: user profile, request context, historical interaction behavior
Advertising side: advertiser information, plan information, creative information

7. DNN Vector Trigger Model

Here we introduce the deep vector trigger model. It uses a dual-tower model with a user side and an advertising side. The user side is trained on user information with a three-layer neural network to produce user-side vectors. The same is true for the advertising side.
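The dual-tower structure can be sketched in a few lines. In this sketch each tower is a three-layer fully connected network that maps features to a vector, and, as the article puts it, the two vectors meet in a cosine similarity followed by a sigmoid. The layer sizes, the tanh activation, the feature values, and the random weights (a stand-in for weights trained offline) are all assumptions for illustration.

```python
import math
import random


def make_tower(sizes, rng):
    """Random weight matrices for a small dense tower (stand-in for trained weights)."""
    return [[[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
            for n_in, n_out in zip(sizes, sizes[1:])]


def embed(tower, features):
    """Forward pass: each layer is a matrix multiply followed by tanh."""
    x = list(features)
    for layer in tower:
        x = [math.tanh(sum(w * v for w, v in zip(row, x))) for row in layer]
    return x


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def relevance(user_vec, ad_vec):
    """Junction of the two towers: cosine similarity passed through a sigmoid."""
    return 1.0 / (1.0 + math.exp(-cosine(user_vec, ad_vec)))


rng = random.Random(0)
user_tower = make_tower([4, 8, 8, 3], rng)  # 3 weight layers, 4 user features in
ad_tower = make_tower([5, 8, 8, 3], rng)    # 3 weight layers, 5 ad features in
user_vec = embed(user_tower, [0.2, 1.0, 0.0, 0.5])
ad_vec = embed(ad_tower, [1.0, 0.0, 0.3, 0.7, 0.1])
score = relevance(user_vec, ad_vec)
print(round(score, 4))
```

Because cosine similarity lies in [-1, 1], a plain sigmoid over it stays between roughly 0.27 and 0.73. The practical benefit of the two-tower layout is that ad vectors can be computed ahead of time and only the user vector needs to be estimated at request time.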
Overall, training is completed offline, and vectors are then estimated in real time. Relevance is determined at the junction of the two towers, using a simple cosine and sigmoid.

8. Triggering Engineering Architecture

The trigger engineering system grew a corresponding set of services around the question of how to support triggering better. When a recall request arrives, the trigger Agent performs five-way recall: dual-tower recall, content-targeted recall, user portrait recall, precise audience recall, and Weibo social relationship recall. After recall, the Mixer merges the results and trims them in combination with the quality estimation service. The online real-time advertising plan library data and the offline data are fed into the five online triggers as needed. Together this forms the overall ad trigger engineering architecture.

Lean Channel Thinking Tool: "Two-Wing Plan"

With the launch of Super Fans Pass in 2017, and with Weibo advertising moving from extensive growth to refined growth, there was a need for a better experimental platform and for a way of gaining insight into online operation. This is the origin of lean-driven thinking. It has two parts: one is experimenting with and regulating online strategies, and the other is obtaining lean insights into how online strategies operate. Together with the online delivery system, they form an integrated "two-wing" strategy engineering architecture.

1. Faraday Experimental Platform

① Faraday layered experiment model

Faraday adopts an orthogonal layered model, which is widely used in the Internet industry, including Baidu's Edison and Alibaba's Tesla. The idea of the Weibo Faraday model comes from Google's classic paper on orthogonal traffic decomposition, which provides the theoretical basis for running independent experiments on multiple layers.
Of course, the platform also reflects the actual situation of Weibo advertising, including having to solve the problem from scratch at the start, so the model in the paper was simplified. At the beginning there was no concept of a domain, and a hash function was used at each layer.

② Experiment bucketing

Although all experiment layers share the same hash function, the parameters passed to it differ: they include the traffic identifier and the experiment id. This makes the traffic bucketing of different layers orthogonal.

In addition, the allocation conditions described in the Google paper were introduced. Their application is a classic one. When running an experiment, you consider traffic reuse or certain characteristics of the experiment, such as the region or gender it applies to. The experiment then uses the selected or restricted traffic instead of all traffic. If all traffic were used, the experimental effect would be diluted and not significant, which can easily lead to mistrust of the results. Allocation conditions allow traffic to be reused and the effect of a strategy to be observed clearly. These are the traffic bucket types and the allocation conditions for selecting traffic.

③ Faraday experimental platform

This is the overall architecture diagram of the Faraday experimental platform. It uses a fully automated mechanism. Experiment information is entered through Faraday's web portal, experiments are distributed and parsed at the online traffic entrance, and online strategies are adjusted according to the experimental data. Experiment hits are logged through tracking points and then enter the real-time analysis engine, which computes the experimental results.
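The bucketing described in ② can be sketched as follows: one shared hash function is keyed by both the traffic identifier and a per-layer id, so the same user lands in unrelated buckets on different layers, and an optional allocation condition restricts which requests an experiment applies to. The function names, the use of MD5, the 100-bucket split, and the layer and experiment definitions are assumptions for illustration, not Faraday's actual implementation.

```python
import hashlib


def bucket(traffic_id: str, layer_id: str, num_buckets: int = 100) -> int:
    """Shared hash function; the layer id in the key makes layers orthogonal."""
    key = f"{layer_id}:{traffic_id}".encode("utf-8")
    return int(hashlib.md5(key).hexdigest(), 16) % num_buckets


def assign(request: dict, layer: dict):
    """Return the name of the experiment hit on this layer, or None."""
    b = bucket(request["uid"], layer["layer_id"])
    for exp in layer["experiments"]:
        low, high = exp["buckets"]
        condition = exp.get("condition")  # optional allocation condition
        if low <= b < high and (condition is None or condition(request)):
            return exp["name"]
    return None


recall_layer = {
    "layer_id": "recall",
    "experiments": [
        {"name": "new_vector_recall", "buckets": (0, 50),
         "condition": lambda req: req.get("region") == "Beijing"},
    ],
}
rank_layer = {
    "layer_id": "rank",
    "experiments": [{"name": "new_ctr_model", "buckets": (0, 50)}],
}

request = {"uid": "user_123", "region": "Beijing"}
print(assign(request, recall_layer), assign(request, rank_layer))
```

Because the layer id is part of the hash key, knowing a user's bucket on the recall layer says nothing about their bucket on the rank layer, which is what lets experiments on different layers share the same traffic.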
There are two ways to distribute and parse experiments.

The first is to distribute and parse all experiments in one step at the main traffic entrance, attach the result to the request, pass it along the request link, and finally return it. The strategies that are hit carry a corresponding identifier. This is appropriate when the number of experiments is small, parsing is cheap, there is little correlation analysis between experiments, and the experiment data consumes little bandwidth. As the scale of experimentation grows, however, it breaks down. The advertising system is distributed, and if the complete experiment information always travels with the request, bandwidth consumption becomes severe: responses time out, availability drops, and experiments take much longer.

Hence the second way: each service parses only the experiment conditions of its own strategies and ignores those of other strategies. Each service obtains only the information it cares about, which avoids redundancy.

Why was the system not designed this way from the start? Because the initial goal was to create something from nothing, not to go from one to many. The first step was simply to have an experiment platform. The first method is relatively simple and let experiments be launched quickly.

④ A/B experiment on advertising side

The commonly used A/B experiment is a bucketing experiment on the traffic side, which compares experiments across different shares of traffic. However, traffic-side bucketing cannot meet some advertising needs.
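The difference between the two distribution approaches described above can be shown with a small sketch. In the first, the entrance resolves every experiment and the whole result travels with the request; in the second, a service resolves only the experiments registered for it. The `REGISTRY` contents, the service names, and the CRC32-based bucketing are hypothetical.

```python
import zlib

# Hypothetical experiment registry; each experiment belongs to one service.
REGISTRY = [
    {"id": "exp_recall_7", "service": "recall", "buckets": range(0, 50), "params": {"channels": 5}},
    {"id": "exp_rank_3", "service": "rank", "buckets": range(0, 30), "params": {"model": "dnn_v2"}},
    {"id": "exp_bid_1", "service": "bidding", "buckets": range(0, 100), "params": {"floor": 0.8}},
]


def bucket(uid: str, exp_id: str) -> int:
    return zlib.crc32(f"{exp_id}:{uid}".encode("utf-8")) % 100


def parse_at_entrance(uid: str) -> dict:
    """Approach 1: resolve every experiment once; the result rides along with the request."""
    return {e["id"]: e["params"] for e in REGISTRY if bucket(uid, e["id"]) in e["buckets"]}


def parse_in_service(uid: str, service: str) -> dict:
    """Approach 2: a service resolves only the experiments registered for it."""
    return {e["id"]: e["params"] for e in REGISTRY
            if e["service"] == service and bucket(uid, e["id"]) in e["buckets"]}


print(parse_at_entrance("user_123"))
print(parse_in_service("user_123", "bidding"))  # {'exp_bid_1': {'floor': 0.8}}
```

With three experiments the payload difference is trivial, but the first approach grows with the total number of experiments in the system while the second grows only with the number relevant to one service.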
In a real advertising system, some experimental strategies target particular advertising industries or advertisers, and the results of interest are the results for those industries or advertisers. Traffic-side experiments are not impossible here, but the analysis must then go down to the advertiser level, which is a great challenge for the statistical analysis engine. If that is not done and only the overall effect is considered, the measured effect will inevitably be diluted and will not accurately reflect the significance of the strategy. This calls for a mechanism for advertising-side experiments.

First, the experimental subjects are identified: advertisers, advertising categories, advertising plans, advertising creatives, and so on.

Then a homogeneous or non-homogeneous experiment is conducted. A homogeneous experiment randomly divides the audience of a promotion plan into two and compares different strategies on the two halves. It may create two "pseudo-plans" for one plan, which the system then launches at the same time, with the population divided into two groups whose results are compared. A non-homogeneous experiment selects a group of people to experiment on. It often has no control group and is judged longitudinally: the results of the current strategy are compared with the results from before the experiment was run.

⑤ Independent A/B experiment with advertising budget

An advertising experiment platform runs into problems in specific advertising experiments that non-advertising businesses do not encounter, because the advertising field involves three parties: A, U, and C. These three parties form an effective complete loop.
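A homogeneous advertising-side experiment with an independent budget can be sketched as follows: one plan becomes two pseudo-plans, each user is assigned to one of them by a stable hash, and the budget is divided equally so that one arm cannot spend the other's money. The `Plan` class, the `#A`/`#B` naming, the strategy names, and the `charge` helper are hypothetical illustrations.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class Plan:
    plan_id: str
    budget: float
    strategy: str


def split_plan(plan: Plan, control: str, treatment: str):
    """Create two pseudo-plans that share the original plan's settings but not its budget."""
    half = plan.budget / 2
    return [Plan(f"{plan.plan_id}#A", half, control),
            Plan(f"{plan.plan_id}#B", half, treatment)]


def arm_for_user(plan_id: str, user_id: str) -> int:
    """Stable 50-50 audience split: the same user always sees the same pseudo-plan."""
    digest = hashlib.md5(f"{plan_id}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % 2


def charge(pseudo_plan: Plan, cost: float) -> bool:
    """Spend from this arm only; returns False once its own budget is exhausted."""
    if pseudo_plan.budget < cost:
        return False
    pseudo_plan.budget -= cost
    return True


plan = Plan("plan_9", 1000.0, "baseline")
arms = split_plan(plan, "baseline", "new_bid_strategy")
chosen = arms[arm_for_user(plan.plan_id, "user_1")]
print(chosen.plan_id, chosen.budget)
```

If the two arms drew from a single shared budget instead, the faster-spending strategy would consume money that the slower arm would otherwise have used, and the comparison between them would no longer be fair.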
A simple loop is: plan delivery -> spend money -> ad goes offline. Budget is involved here, and an ad goes offline once its budget is spent. This creates a problem. In a simple traffic bucketing experiment, traffic is split into two parts that use different strategies, and the two strategies consume advertisers' budgets at different rates. With a 50-50 split, if bucket A spends quickly over some period and the budgets are not independent, the budget that bucket B would have used is pulled over, so more of the advertiser's money is spent in bucket A. With a small share of traffic the effect may look very good, but as the traffic share increases the measured effect of the strategy becomes very small.

Therefore, budget independence is taken into account when running experiments on the advertiser side. For example, a plan can be divided into two parts with the budget allocated equally, so that the two are independent and do not affect each other. Finally, we are considering applying budget-independent experiments to traffic bucketing. This has not been done yet, but it has been shown to be feasible.

⑥ Experimental testing and effect evaluation

Effect evaluation runs in real time, with a data delay of (conservatively) within 5 minutes. A delay of 1 minute is also achievable, but the problem lies in how advertising effects are measured. Take CTR = interactions / exposures: interactions may arrive late, which makes the computed result fluctuate. To stabilize the experimental results, they are therefore computed every 5 minutes.

For analyzing experimental effects that span multiple days, the mechanism is offline cross-day processing plus a real-time stream analysis engine.
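The 5-minute CTR calculation described above can be sketched as a simple windowed aggregation. Events are grouped into 300-second windows per experiment bucket, and CTR is computed as interactions divided by exposures within each window. The event tuple layout and the function name are assumptions for illustration; a real deployment would run this in a stream processing engine.

```python
from collections import defaultdict

WINDOW_SECONDS = 300  # the 5-minute window mentioned in the text


def windowed_ctr(events):
    """events: iterable of (timestamp_seconds, bucket, kind),
    where kind is 'exposure' or 'interaction'."""
    counts = defaultdict(lambda: [0, 0])  # (window_start, bucket) -> [exposures, interactions]
    for ts, bucket, kind in events:
        key = (ts // WINDOW_SECONDS * WINDOW_SECONDS, bucket)
        if kind == "exposure":
            counts[key][0] += 1
        elif kind == "interaction":
            counts[key][1] += 1
    return {key: (inter / expo if expo else 0.0)
            for key, (expo, inter) in counts.items()}


events = [
    (0, "A", "exposure"), (10, "A", "exposure"), (20, "A", "exposure"),
    (100, "A", "interaction"), (299, "A", "exposure"),
    (300, "A", "exposure"), (301, "A", "exposure"), (310, "A", "interaction"),
]
print(windowed_ctr(events))  # {(0, 'A'): 0.25, (300, 'A'): 0.5}
```

A wider window gives late-arriving interactions more chance to land in the same window as their exposures, which is one way to read the preference for 5 minutes over 1 minute.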
Because the real-time streaming result is not as accurate as the offline one, we made some optimizations. For example, once the offline jobs covering data up to a few hours ago have run, their batch results replace the real-time streaming data for the earlier part of today. Real-time streaming is then used only for the most recent data, such as the last one or two hours. From the perspective of experimental decision-making, real-time streaming mainly serves to detect whether an experiment is causing serious defects so that it can be terminated in time, while the reports produced by offline processing are mainly used to evaluate the experimental results.

Experiments are also monitored for availability (for example, monitoring of traffic buckets, since different strategies have different availability at the traffic level) to ensure that the two experimental buckets are at the same availability level. The experimental platform supports business indicators ranging from nearly 40 million to 100 million, and supports custom indicators. The platform also supports version tracking and nodiff control experiments.

2. Advertising Lean Insights

The problem Lean Insight aims to solve is this: with so many strategies running online, how can the system's operating status be made visible from outside while the strategies run, so that strategy operation is as transparent as water? The lean matrix description method was proposed for this. The overall business is divided into multi-level business stages, with complete insight at each stage.

The figure above is the system architecture diagram. After the data lands, it enters the real-time streaming mechanism, where it is classified by category and the categories are associated with one another through keys.
Log construction is componentized and covers online debugging logs, user information logs, online strategy and model running logs, advertising logs, and so on. The logs are then processed and stored, with storage divided into offline and online. Online storage uses columnar storage such as PG and ClickHouse for instant online access.

The overall presentation is as follows. Each layer is a business strategy stage. Taken as a whole, the processing of a request, or of an ad recall set, is a funnel, shown in two dimensions in terms of the number of ads. The size of the ad recall set keeps decreasing as the strategies run. Each level can be examined at a finer granularity, such as the distribution of bidding types and the overall bidding level, to identify performance differences between strategy layers and to understand the specific impact of a strategy.

Author: DataFunTalk
Source: DataFunTalk