[Original article from 51CTO.com] The WOT2016 Big Data Summit will be held at the Beijing JW Marriott Hotel on November 25 and 26, 2016. Dozens of front-line experts and data technology pioneers will gather on site for in-depth discussion of cutting-edge topics such as machine learning, real-time computing, system architecture, and NoSQL practice, and to share the latest practices and most popular industry applications in big data. A 51CTO reporter conducted an exclusive interview with Tian Chao, R&D director of Yidian Zixun's big data platform, who will speak at the conference, to get an early look at his views on Yidian Zixun's large-scale real-time click feedback platform.

Tian Chao is currently the technical director of the big data center at Yidian Zixun, responsible for infrastructure and big data platform work. He holds a master's degree from the Institute of Computing Technology of the Chinese Academy of Sciences, and has previously worked as an engineer at the Yahoo Beijing R&D Center, CTO of Synchronous Disk, and senior technical manager at AutoNavi Software.

Big data technology refers to the ability to process massive amounts of data and to the data applications built on that ability. Since Hadoop became widely adopted, the industry has been able to build large-scale data storage and computing systems. As the technology has developed, upper-level applications have increasingly needed to process massive amounts of data in real time, which led to the emergence of real-time computing frameworks and systems such as Storm. More recent technologies, including Spark and Google Dataflow, aim to unify offline (batch) computing and online (streaming) computing more closely.
Real-time data processing is an essential capability for a modern Internet company. Online machine learning, real-time user profiling, real-time data warehouses, real-time statistical analysis, and similar systems all need to compute over large-scale feedback data in real time. The real-time computing parts of these systems have a good deal in common, along with some system-specific parts. From the start of its design, Yidian Zixun's real-time feedback platform abstracted the computing models and data structures that these systems share. The design drew on Google's Mesa system, and the result is a scalable platform that supports the real-time computing tasks of all of these systems within Yidian Zixun.

Many information platforms serve only readers. Yidian Zixun serves readers and also feeds information back to authors. The system analyzes user behavior to discover what users are interested in and how well those interests are being met. This data, together with in-depth data mining, gives Yidian Zixun a bird's-eye view for building its content ecosystem, letting it observe group behavior and content trends from a higher vantage point. Yidian Zixun also has a system called Yidian Insight, currently in invitation-only testing, which maps knowledge of user interests onto different fields and presents it through a variety of data visualizations.

Search engines emphasize the user's query, which amounts to the user leading the content. In pure recommendation, by contrast, the user is entirely passive and does not express intent: users are first shown general content, their preferences are then inferred from their click behavior, and content is recommended accordingly. Search engines and recommendation engines are different systems with similar structures.
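The click feedback loop described above (show content, observe clicks, infer preferences) can be illustrated with a minimal Python sketch. This is not Yidian Zixun's implementation, which the article does not detail; the class `ClickFeedback`, its method names, and the smoothing constants are all hypothetical, chosen only to show the idea of keeping running counters per user and category and ranking by a smoothed click-through rate.

```python
from collections import defaultdict


class ClickFeedback:
    """Hypothetical in-memory click feedback counters keyed by (user, category)."""

    def __init__(self, prior_clicks=1.0, prior_views=10.0):
        # Assumed smoothing prior: categories with few impressions are pulled
        # toward prior_clicks / prior_views instead of an extreme 0% or 100%.
        self.prior_clicks = prior_clicks
        self.prior_views = prior_views
        self.views = defaultdict(int)
        self.clicks = defaultdict(int)

    def record(self, user, category, clicked):
        """Fold one feedback event (an impression, clicked or not) into the counters."""
        key = (user, category)
        self.views[key] += 1
        if clicked:
            self.clicks[key] += 1

    def ctr(self, user, category):
        """Smoothed click-through rate for one user and category."""
        key = (user, category)
        clicks = self.clicks.get(key, 0) + self.prior_clicks
        views = self.views.get(key, 0) + self.prior_views
        return clicks / views

    def rank(self, user, categories):
        """Order candidate categories by the user's smoothed click-through rate."""
        return sorted(categories, key=lambda c: self.ctr(user, c), reverse=True)


if __name__ == "__main__":
    feedback = ClickFeedback()
    # Simulated event stream: (user, category, clicked)
    events = [
        ("u1", "tech", True), ("u1", "tech", True),
        ("u1", "tech", False), ("u1", "tech", True),
        ("u1", "sports", False), ("u1", "sports", False),
        ("u1", "sports", False), ("u1", "sports", False),
    ]
    for user, category, clicked in events:
        feedback.record(user, category, clicked)
    print(feedback.rank("u1", ["sports", "finance", "tech"]))
```

A production platform of the kind the article describes would keep such counters in a distributed, persistent store and update them from a stream rather than in one process; the sketch only shows the shape of the aggregation.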
The core goal of Yidian Zixun's interest engine is to integrate search technology and recommendation technology. In the interest engine, the underlying data from users' search behavior and recommendation behavior is fully connected. The engine makes full use of both users' active expressions and their passive behavioral signals, continuously learns and mines user interests with artificial intelligence techniques, and distributes content based on those interests.

Tian Chao believes that the progression from big data to artificial intelligence is a natural result of the industry's growing ability to process and use data. In the early days, most technology in the industry dealt with business result data: data volumes were at the GB level, databases were used for storage, and the ability to acquire, store, and compute over data was at an early stage. As infrastructure such as Hadoop matured, big data technology developed with it. Engineers now processed not only business result data but also the logs describing user behavior, analyzing them in greater depth to support business computation. In this era, data volumes grew to the PB level and were stored in distributed file systems, and offline computing, stream computing, and graph computing models developed alongside big data applications. Today, with better computing models and even larger volumes of data, the use of data has deepened further, and combining artificial intelligence and deep learning with big data makes it possible to build more intelligent applications.

[51CTO original article, please indicate the original author and source as 51CTO.com when reprinting on partner sites]