WOT lecturer, Taobao mobile technical expert Chen Wu: The big data collection system behind Taobao mobile's billions of UVs

The rise of the mobile Internet means that traditional PC data collection can no longer meet current business needs. Collecting, processing and analyzing mobile big data has become an unavoidable problem for every enterprise. Compared with the PC era, when terminals were relatively uniform, mobile data collection is constrained by device diversity, data traffic, network conditions, device capabilities and other factors. How can data be collected accurately and quickly without affecting the user experience?

Chen Wu, head of the infrastructure line of Alibaba's wireless business unit and an invited speaker at 51CTO's [WOT2015 "Internet+" Era Big Data Technology Summit], shared the data collection methods Taobao Mobile uses to cope with billions of UVs every day.

Chen Wu, nicknamed Aoi, is the technical lead of the infrastructure line in Alibaba Group's wireless business unit. He previously worked at 91 Wireless and Tencent. He joined Alibaba in 2013 and worked on business development. After seeing the pain points the business lines faced with event tracking, he led his team in building the tracking monitoring system for Taobao Mobile.

The following is the interview transcript:

Q: How does Taobao Mobile collect data?

Chen Wu: Currently, Taobao Mobile collects data through client-side tracking. There are two common approaches. One is page tracking, which is usually handled at the architecture level. The other is tracking of control clicks and custom events, which developers usually instrument manually.

Q: What are the differences between mobile big data collection and PC big data collection?

Chen Wu: First, from the perspective of devices, China has 700 million mobile Internet users, and the number is growing rapidly. The log volume of Taobao Mobile has roughly doubled every year. As we move into the DT (data technology) era, mobile devices have gradually become the mainstream data producers. The huge traffic of the mobile Internet poses a great challenge to the connection and computing capacity of servers. In addition, mobile devices are close to their users, so mobile data carries more user attributes: location, gender, voice, health status and even emotions can all be digitized on a mobile terminal. Mobile data is therefore far more privacy-sensitive than PC data, and users' concern for data security requires us to be stricter about security on mobile than on PC. Mobile devices are also more diverse than PCs. Chinese mobile phones range in price from a few hundred yuan to a few thousand yuan. On top of that, China's distinctive shanzhai (copycat) phone market means the solution design has to take into account different models as well as power consumption, memory, CPU usage and other performance indicators.

Second, there are the network characteristics of the mobile Internet. On a PC, the traffic consumed by data collection is basically not a concern, but on a phone the traffic cost directly affects what the user pays, so it has to be considered. PC networks are relatively stable, while the mobile Internet has to deal with frequent switching between strong and weak networks. How to guarantee transmission on weak networks is another point we need to consider.

Third is the interaction level. The interaction logic on mobile is much more complicated than on PC. The phone screen is relatively small, which leads to deep page hierarchies. Taobao Mobile attaches great importance to the entire transaction link, and passing link context parameters through that link is also much more complicated than on PC. For example, we need to record the traffic map for the Taobao Mobile detail page. When instrumenting, we must track where this traffic comes from, because it serves as the settlement basis for different business parties.

There is also the issue of the technical framework. The mobile framework as a whole is much more complex than the Web framework. On the Web side there are standard HTML tags, and automated tracking is easy to implement through the spm.webx framework. On the mobile side, however, the framework has no particularly standard semantics. Add to that the React Native framework, web app frameworks, nested Native and H5, mixed C and Objective-C or C and Java code, and the flexible switching among native-to-H5 downgrade solutions during Double Eleven, and tracking in Taobao Mobile becomes considerably more complex.

In short, how to strike a balance between data volume, data quality, performance, success rate and real-time performance is the most challenging issue facing mobile data collection.

Q: How does Taobao Mobile solve the difficulties of collecting data on mobile devices?

Chen Wu: First, for performance issues, our tracking SDK has its own monitoring, which records performance and arrival-rate metrics for key code paths. On the server side, we establish monitoring baselines per version to make sure the performance of the SDK itself keeps improving. There are many specific optimizations. For example, we use incremental configuration updates to save traffic and improve the downlink arrival rate, improve computing performance by aggregating tracking points, apply different upload strategies according to event priority to ensure real-time delivery, and ensure bandwidth utilization through dynamic window adjustment algorithms. Second, for transaction link issues, we use Alibaba's SPM (super position model) and TPK (transparent key) flags to pass context through at the framework layer. Finally, for mixed-language programming issues, we provide C-layer and JS-layer tracking bridges in the SDK so that downstream business parties can easily call our tracking code.
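The interview does not describe how the dynamic window adjustment works. As a rough illustration only, the sketch below assumes an additive-increase, multiplicative-decrease rule for the upload batch size; the class name `UploadWindow`, its fields and all numeric values are hypothetical and are not taken from the Taobao SDK.

```python
from dataclasses import dataclass


@dataclass
class UploadWindow:
    """Hypothetical batch-size controller (assumed AIMD rule, not the real SDK)."""
    size: int = 20       # events sent per upload batch
    min_size: int = 5    # never shrink below this
    max_size: int = 200  # never grow beyond this
    step: int = 10       # additive increase after a successful upload

    def on_result(self, success: bool) -> int:
        """Grow the window slowly on success, halve it on failure."""
        if success:
            self.size = min(self.max_size, self.size + self.step)
        else:
            self.size = max(self.min_size, self.size // 2)
        return self.size


window = UploadWindow()
sizes = [window.on_result(ok) for ok in (True, True, False, True)]
```

Growing slowly and shrinking quickly is one common way to probe available bandwidth on unstable networks; other rules would fit the same interface.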

Q: How to store the collected data?

Chen Wu: This involves the backend architecture. Our first layer is the adash server, which receives and decodes the data streams. After decoding, the data is split into different streams according to the business scenario that will consume it. The second layer is Alibaba's TT stream, which consumes the adash data. The other end of TT connects to our real-time computing system, galaxy, and to the offline storage system, ODPS (cloud ladder), and the data finally lands on our cloud servers. The business systems responsible for data products can then produce real-time monitoring reports from galaxy and offline monitoring reports from ODPS.
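The decode-then-split step in the first layer can be pictured with a small sketch. The wire format is not described in the interview, so JSON lines with a `scenario` field are assumed here purely for illustration, and `split_streams` is a hypothetical helper, not part of adash.

```python
import json
from collections import defaultdict


def split_streams(raw_lines):
    """Decode each line and group events by consuming business scenario.

    Assumes JSON-encoded events with an optional "scenario" field;
    anything that fails to decode goes to a "malformed" stream.
    """
    streams = defaultdict(list)
    for line in raw_lines:
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            streams["malformed"].append(line)
            continue
        if not isinstance(event, dict):
            streams["malformed"].append(line)
            continue
        streams[event.get("scenario", "default")].append(event)
    return dict(streams)


raw = [
    '{"scenario": "realtime", "event": "pv"}',
    '{"scenario": "offline", "event": "click"}',
    '{"event": "custom"}',
    "not json",
]
routed = split_streams(raw)
```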

Q: How to handle abnormal data?

Chen Wu: We will evaluate the cost of processing abnormal data. If it is low-priority data, we may discard it directly. If it is high-priority data (such as financial report data), we will do some post-cleaning of the data.

From past cases, we usually encounter two situations: duplicate data and dirty data. To clean duplicate data, for example, we make the simple assumption that one phone will not generate two identical logs in the same millisecond. Based on that assumption, we run a round of deduplication over the problematic hourly table in ODPS, keyed on device ID and time. This kind of cleaning usually carries a high computing cost.

The other type is cleaning on the client side. For example, suppose the detail page business has a custom tracking point, and a later version upgrade changes the page name from shop to sp. The PV statistics will then be wrong. In this case we can push a configuration to the tracking SDK so that the erroneous logs are corrected at the source. This saves a lot of computing cost on the server side, but the arrival rate and timeliness of the configuration must be considered.

In our daily work, we also accumulate cleaning rules by building intermediate tables. For data in these tables that matches the rules, we can choose to clean it or retain it.
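The deduplication assumption described above can be sketched in a few lines. This is an in-memory illustration rather than the ODPS job itself, and the field names `device_id` and `ts_ms` are assumed for the example.

```python
def dedupe(logs):
    """Keep the first log seen for each (device ID, millisecond timestamp) pair."""
    seen = set()
    unique = []
    for log in logs:
        key = (log["device_id"], log["ts_ms"])
        if key not in seen:
            seen.add(key)
            unique.append(log)
    return unique


logs = [
    {"device_id": "a1", "ts_ms": 1000, "event": "pv"},
    {"device_id": "a1", "ts_ms": 1000, "event": "pv"},  # duplicate upload
    {"device_id": "a1", "ts_ms": 1001, "event": "pv"},
    {"device_id": "b2", "ts_ms": 1000, "event": "pv"},
]
cleaned = dedupe(logs)
```

On a full hourly table the same key would drive a distributed group-by, which is where the computing cost mentioned above comes from.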
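As an illustration of correcting logs at the source, the sketch below applies a pushed rename table to each event before upload. The config shape and the function name `apply_corrections` are hypothetical; the interview only says that a configuration is sent to the SDK.

```python
def apply_corrections(event, page_renames):
    """Return a copy of the event with its page name normalized.

    page_renames is an assumed server-pushed mapping from the
    mistaken page name to the name the statistics expect.
    """
    fixed = dict(event)
    fixed["page"] = page_renames.get(event["page"], event["page"])
    return fixed


# Map the renamed page "sp" back to "shop" so PV statistics stay consistent.
pushed_config = {"sp": "shop"}
corrected = apply_corrections({"page": "sp", "event": "pv"}, pushed_config)
untouched = apply_corrections({"page": "cart", "event": "pv"}, pushed_config)
```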

Q: How to ensure the balance between SDK performance and data volume?

Chen Wu: First is data [classification]. We divide Taobao's data into performance data and business data, and then assign priorities to that data. Data such as UV, GMV and real-time calculation data has very high priority. We transmit data according to priority, and low-priority data can be handled by lower-priority threads. Next is [sampling]. Our network performance tracking points generate tens of billions of records a day. These points are not involved in business settlement and are only used to monitor the network success rate, so we can monitor on a sample by configuring a controlled sampling rate. Finally, performance can be improved through [aggregation]. Taking counter points as an example, we only need to accumulate the values of the same point over a period of time and then report the multiple points from that period as a single aggregated point, which keeps the logging frequency well under control. Aggregation improves performance, but if the app exits abnormally during aggregation, some data may be lost, so we only aggregate low-priority points.
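Sampling and aggregation as described above can be sketched as follows. The names `should_sample` and `CounterAggregator` are hypothetical, and the flush trigger (a timer, a size limit) is left out because the interview does not specify it.

```python
import random
from collections import Counter


def should_sample(rate, rng=random):
    """Return True for roughly `rate` of calls (rate between 0.0 and 1.0)."""
    return rng.random() < rate


class CounterAggregator:
    """Accumulate counter points in memory and report them as one batch."""

    def __init__(self):
        self._counts = Counter()

    def record(self, name, value=1):
        self._counts[name] += value

    def flush(self):
        """Return the accumulated counts and reset; unflushed data is lost on a crash."""
        batch = dict(self._counts)
        self._counts.clear()
        return batch


agg = CounterAggregator()
for name in ("net_ok", "net_ok", "net_fail", "net_ok"):
    agg.record(name)
batch = agg.flush()
```

Four raw points become one report with two counters, which is the frequency reduction the answer describes, at the cost of losing whatever has not been flushed when the app exits abnormally.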

Q: What are you most worried about on Double Eleven?

Chen Wu: Traffic surges during Double 11. There will also be a large-scale gala this year, so an extreme surge in traffic is foreseeable. What worries me most is therefore disaster recovery. Will the increase in traffic overwhelm the servers or the network cards? Will downstream data output be delayed when the data volume surges? We routinely simulate heavy traffic, run full-link stress tests, and identify the performance bottlenecks of each node in order to safeguard Double 11.

Q: How does Taobao Mobile handle disaster recovery?

Chen Wu: Disaster recovery requires the client and the server to work together. Client-side disaster recovery comes down to controlling the amount of data and the number of requests, but these two goals pull against each other. Reducing the number of requests means the client has to lengthen the interval between log uploads. The advantage is that the server's QPS drops, but the logs accumulated on the client grow larger, the server's decoding load grows, and downstream output takes longer. The upload interval and data volume therefore have to be balanced against the computing power of the downstream servers. During Double Eleven, we prepare a multi-level downgrade plan for the client and push different levels of configuration according to the server's water level, so that the client's high-priority data is uploaded in full and low-priority data is uploaded as far as possible.

In addition, for server-side disaster recovery, our core businesses all run active in multiple sites, so traffic can be migrated quickly in the event of a failure to keep services continuously available. At the business layer, clusters are isolated by business. Double 11 business systems related to Taobao Mobile, Tmall and Juhuasuan have their own clusters, and other businesses are placed in other clusters. Important data is stored in multiple copies. If a collection server loses data, partial recovery is possible from some of the log streams.

Finally, there is the guarantee provided by system stress testing. We ran multiple full-link stress tests with amplified traffic before Double Eleven. There are also downgrade plans for the various downstream businesses.
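A multi-level downgrade plan driven by the server's water level might look like the sketch below. The thresholds, intervals and priority scheme are invented for illustration; the interview gives no concrete values.

```python
# Hypothetical levels, checked from most to least severe:
# (minimum server load, upload interval in seconds, lowest priority still uploaded)
# Priority 2 is the highest (e.g. UV, GMV); priority 0 is the lowest.
DOWNGRADE_LEVELS = [
    (0.9, 300, 2),
    (0.7, 120, 1),
    (0.0, 30, 0),
]


def pick_policy(server_load):
    """Choose the client config to push for a server load between 0.0 and 1.0."""
    load = max(0.0, server_load)
    for threshold, interval_s, min_priority in DOWNGRADE_LEVELS:
        if load >= threshold:
            return {"upload_interval_s": interval_s, "min_priority": min_priority}


def should_upload(event_priority, policy):
    """High-priority events always pass; lower ones wait until load drops."""
    return event_priority >= policy["min_priority"]


policy = pick_policy(0.95)
```

Under the heaviest level only priority 2 events are uploaded and requests are spaced five minutes apart, which reflects the trade-off between QPS and client-side backlog described above.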

WOT in Shenzhen this November: talking about big data

The [WOT2015 "Internet+" Era Big Data Technology Summit], a high-end technology summit hosted by 51CTO, opens in Shenzhen on November 28 and 29. 42 prominent industry guests will gather to analyze key applications of big data technology. The organizer will also invite more lecturers to the "WOT Lecturer Interview Room" for in-depth technical discussions.
