[51CTO.com original article]

Activity description: Aiti Tribe is a service community that offers core developers in-depth technical exchange, solutions to development needs, and resource sharing. Through this community, we invite industry technology experts to work through development problems one-on-one and clear the stumbling blocks developers hit along the way, answering their questions as professionally and efficiently as possible.

Topic keywords: big data, Spark, data analysis, data portraits

Tribe lineup: Xu Tao, director of big data at Longzhu Live; Wang Jin, co-founder of Shuguo Technology

Target audience: junior development engineers, data analysts, operations and maintenance engineers

How to participate: Join the 51CTO developer QQ exchange group 370892523. If you have a technical question, ask it in the group or send it to the group owner.

Event details:

Nanjing-Shi Guojun-Java: Are there any good resources for learning Spark?

Xu Tao: I recommend studying the official Spark documentation. Spark books may not keep up with the pace of Spark updates.

Beijing-robingao-Java: When using Spark for offline analysis, how do you analyze Nginx logs? What specific dimensions do you look at?

Xu Tao: For offline analysis I recommend Hive + MapReduce, which is more stable than Spark. Nginx logs are generally used for traffic monitoring and operations alerting, both of which are highly time-sensitive, so Spark Streaming can be used for them.

Beijing-robingao-Java: Do you have any experience to share on customer profiling? Please be specific.

Xu Tao: A user portrait is built by "labeling" users. Labels can be divided into static labels and dynamic labels. Static labels are indicators that are rarely updated or almost never change, such as a user's personal information. Dynamic labels are user behavior labels, such as a user's favorite categories on a live streaming site.
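As a minimal sketch of the static/dynamic split described here, the Python below keeps rarely changing profile fields as static labels and derives one dynamic label, favorite categories, from viewing events. The names `UserPortrait`, `favorite_categories`, and `build_portrait`, the event layout `(user_id, category)`, and the sample data are all assumptions made for illustration; they are not taken from the speaker's system.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class UserPortrait:
    """Hypothetical container for one user's labels."""
    user_id: str
    static_labels: dict = field(default_factory=dict)   # rarely updated
    dynamic_labels: dict = field(default_factory=dict)  # derived from behavior


def favorite_categories(events, top_n=2):
    """Return the top_n most frequent categories in (user_id, category) events."""
    counts = Counter(category for _, category in events)
    return [category for category, _ in counts.most_common(top_n)]


def build_portrait(user_id, profile, events):
    """Combine static profile fields with labels computed from the user's events."""
    own_events = [event for event in events if event[0] == user_id]
    return UserPortrait(
        user_id=user_id,
        static_labels=dict(profile),
        dynamic_labels={"favorite_categories": favorite_categories(own_events)},
    )


# Made-up sample data: a registration profile and a viewing log.
profile = {"registered_city": "Nanjing"}
events = [
    ("u1", "gaming"), ("u1", "gaming"), ("u1", "music"),
    ("u2", "sports"), ("u1", "gaming"), ("u1", "music"),
    ("u1", "outdoor"),
]
portrait = build_portrait("u1", profile, events)
```

In this sketch the static labels are copied once from the profile, while the dynamic label would be recomputed whenever new events arrive.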
Labels are added through user behavior logs and transaction flow data. Some websites and apps hold only a small amount of personal information about their users but, through labeling, have a large volume of user behavior logs to work with. Cluster analysis on those logs can predict a user's gender, age group, city type, job type, and so on. Labels more specific to live streaming sites include favorite streamers, habitual online hours, and whether the user signs in.

Nanjing-Shi Guojun-Java: To submit multiple SQL statements to a Spark cluster at the same time, is there a way to do it without spark-submit?

Xu Tao: I recommend submitting them from the spark-sql client.

Chongqing-Xiaobao-Android: Regarding streaming media, are there any Android cases I could look at, such as video and voice streaming?

Xu Tao: That topic is too broad. Cases related to live streaming include live playback, mic linking, and H5 live streaming players.

Guangzhou-Zhao Hui-Big Data: What is the value of multi-source data fusion in big data?

Wang Jin: Without integration across multiple sources, the value of the data is very limited and the real core value of big data cannot be realized. The value of multi-source data integration shows most clearly in industries such as finance, e-commerce, and insurance.

Zhuhai-Xiaoyuan-Java: Does 51CTO have any special topics on big data?

51CTO: Yes, you can subscribe to the Big Data Journal. To subscribe, go to Homepage, then Personal Homepage, and click Subscriptions.

Zhuhai-Xiaoyuan-Java: Are there any security-related topics?
51CTO: Security topics include: HPE Security, the data bodyguard behind "Kung Fu Panda"; a focus on the US network paralysis incident and the questions it raises about Internet of Things security; a special report on the 2016 National Cyber Security Awareness Week; a special report on the 11th (ISC)2 Asia Pacific Information Security Summit; and why prevention is still the best way to avoid ransomware attacks.

Beijing-Yang Kai-Network Engineer: I want to learn about cloud computing.

51CTO: You can refer to this article: "re:Invent 2016: AWS's five cloud computing superpowers".

Nanjing-Xiaopang-Android: What is the relationship between cloud computing and big data?

51CTO: Features of cloud computing: through dynamic scheduling of computing, network, and storage resources and rapid deployment of applications, virtualization technology improves the utilization of IT equipment, which saves resources, improves efficiency, centralizes management, enables information sharing, and reduces fiscal expenditure. Cloud computing platforms mainly host application systems and store massive amounts of data, providing services for e-government, social management, public services, and so on.

Do you still have questions about these answers? You are welcome to join the 51CTO developer QQ exchange group 370892523 for discussion.

Next event: December 26

Keywords: mobile, Android, Internet of Things, front end

[51CTO original article. When reprinting on partner sites, please credit the original author and cite 51CTO.com as the source.]