How the AI behind TikTok works

TikTok is a video-sharing app that allows users to create and share short videos. It impresses users with personalized recommendations "just for you". It is highly addictive and popular among Generation Z, and artificial intelligence is the main technology behind it.

TikTok Architecture

The architecture of TikTok's recommendation system consists of three components: a big data framework, machine learning, and a microservice architecture.

(1) The big data framework is the starting point of the recommendation system. It provides real-time data stream processing, data calculation, and data storage.

(2) Machine learning is the brain of the recommendation system. A range of machine learning and deep learning algorithms and techniques are applied to build models and generate recommendations tailored to individual preferences.

(3) Microservice architecture is the underlying infrastructure that enables the entire system to provide fast and efficient services.

Big Data Framework

Without data, there is no wisdom. Most of TikTok's data comes from users' smartphones, including details such as the operating system and installed applications. More importantly, TikTok pays special attention to users' activity logs: viewing time, swipes, likes, shares, and comments.

Log data is collected and aggregated by Flume and Scribe and piped into Kafka queues. Apache Storm then processes the data stream in real time, alongside other components of the Apache Hadoop ecosystem.

The Apache Hadoop ecosystem is a distributed system for data processing and storage. It includes MapReduce, the first-generation distributed data processing system, which processes data in parallel in batches; YARN, a framework for job scheduling and cluster resource management; HDFS, a distributed file system; HBase, a scalable distributed database that supports structured data storage for large tables; Hive, a data warehouse infrastructure that provides data aggregation and querying; and ZooKeeper, a high-performance coordination service.
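To make the MapReduce batch model concrete, here is a minimal single-process sketch in Python. It is not Hadoop code: the log record layout, the `LOGS` sample, and every function name are assumptions made for illustration. It only shows the three phases (map, shuffle, reduce) that a real cluster would distribute across many machines.

```python
from collections import defaultdict
from itertools import chain

# Hypothetical activity-log records: (user_id, video_id, action, seconds_watched)
LOGS = [
    ("u1", "v1", "view", 12.0),
    ("u2", "v1", "view", 3.5),
    ("u1", "v2", "like", 0.0),
    ("u3", "v1", "like", 0.0),
    ("u2", "v2", "view", 20.0),
]


def map_phase(record):
    """Emit one (key, value) pair per record: ((video, action), 1)."""
    _user, video, action, _seconds = record
    yield (video, action), 1


def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values that share a key into a single result."""
    return key, sum(values)


def run_job(records):
    """Run map, shuffle, and reduce in sequence over a batch of records."""
    mapped = chain.from_iterable(map_phase(r) for r in records)
    grouped = shuffle(mapped)
    return dict(reduce_phase(k, v) for k, v in grouped.items())


counts = run_job(LOGS)
print(counts)
```

Because each map call looks at one record and each reduce call looks at one key, both phases can run in parallel, which is what makes the model suitable for very large batches.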

With the rapid growth of data volume, real-time data processing frameworks have emerged. Apache Spark is a third-generation framework that supports near-real-time distributed processing of big data workloads; it improves on MapReduce's performance by processing data in memory. In the past few years, TikTok has adopted Flink, a fourth-generation framework designed natively for real-time stream processing.
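In contrast to batch jobs, a stream processor updates its results as each event arrives. The sketch below illustrates one common streaming building block, a tumbling (fixed, non-overlapping) time window, in plain Python. It does not use the Flink or Spark APIs; the event layout, `WINDOW_SECONDS`, and the function name are assumptions, and real engines additionally handle out-of-order events, distributed state, and fault tolerance.

```python
from collections import defaultdict

WINDOW_SECONDS = 60  # assumed window size, chosen only for the example


def tumbling_window_counts(events, window_seconds=WINDOW_SECONDS):
    """Yield (window_start, {video_id: count}) each time a window closes.

    `events` is an iterable of (timestamp_seconds, video_id) pairs that is
    assumed to arrive in time order. Windows with no events are not emitted.
    """
    current_start = None
    counts = defaultdict(int)
    for ts, video_id in events:
        window_start = int(ts // window_seconds) * window_seconds
        if current_start is None:
            current_start = window_start
        elif window_start != current_start:
            # The event belongs to a later window: emit the finished one.
            yield current_start, dict(counts)
            current_start = window_start
            counts = defaultdict(int)
        counts[video_id] += 1
    if current_start is not None:
        yield current_start, dict(counts)  # flush the last open window


events = [(5, "v1"), (30, "v2"), (59, "v1"), (61, "v1"), (130, "v3")]
windows = list(tumbling_window_counts(events))
print(windows)
```

The generator keeps only the state of the current window in memory, which is the property that lets a streaming job run indefinitely over unbounded input.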

Database systems include MySQL, MongoDB, and others.

Machine Learning

This is the core of the hyper-personalized, addictive algorithm that made TikTok a household name. Once the massive data sets come in, three kinds of analysis follow: content analysis, user analysis, and scene analysis. Deep learning frameworks such as TensorFlow are used for computer vision and natural language processing. Computer vision interprets the images in photos and videos. Natural language processing covers classification, labeling, and evaluation.

Classic machine learning algorithms are used, including logistic regression, convolutional neural networks, recurrent neural networks, and gradient-boosted decision trees. Common recommendation methods are applied as well, such as content-based filtering, collaborative filtering, and the more advanced matrix factorization.
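As an illustration of matrix factorization, the sketch below learns a small vector for each user and each video so that their dot product approximates an observed engagement score, using plain stochastic gradient descent. This is a textbook formulation, not TikTok's model; the toy `ratings` data, the hyperparameters, and all function names are assumptions.

```python
import math
import random


def factorize(ratings, n_users, n_items, k=2, lr=0.05, reg=0.02, epochs=300, seed=0):
    """Learn user factors P and item factors Q with SGD.

    `ratings` is a list of (user_index, item_index, score) triples.
    """
    rng = random.Random(seed)
    P = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - predict(P, Q, u, i)
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step on squared error with L2 regularization.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q


def predict(P, Q, u, i):
    """Predicted score: dot product of the user and item vectors."""
    return sum(pf * qf for pf, qf in zip(P[u], Q[i]))


def rmse(ratings, P, Q):
    """Root mean squared error over the observed scores."""
    total = sum((r - predict(P, Q, u, i)) ** 2 for u, i, r in ratings)
    return math.sqrt(total / len(ratings))


# Hypothetical engagement scores in [0, 1], e.g. fraction of a video watched.
ratings = [
    (0, 0, 1.0), (0, 1, 0.9), (0, 2, 0.1),
    (1, 0, 0.9), (1, 2, 0.2), (1, 3, 0.1),
    (2, 1, 0.1), (2, 2, 1.0), (2, 3, 0.9),
]

before = rmse(ratings, *factorize(ratings, 3, 4, epochs=0))
P, Q = factorize(ratings, 3, 4)
after = rmse(ratings, P, Q)
print(f"RMSE before training: {before:.3f}, after: {after:.3f}")
print(f"Predicted score for user 1 on unseen video 1: {predict(P, Q, 1, 1):.3f}")
```

The useful part is the last line: once the factors are learned, the same dot product gives a score for user and video pairs that were never observed, which is what a recommender ranks on.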

TikTok’s secret weapons for reading people’s minds are:

(1) Algorithm experimentation platform: Engineers experiment with mixtures of machine learning algorithms such as logistic regression and convolutional neural networks, then run A/B tests and make adjustments.

(2) Extensive classification and labeling: Models are based on user engagement, such as viewing time, swipes, and the usual likes and shares (what people do is often a reflection of their subconscious). The number of user features, vectors, and categories exceeds that of most recommendation systems in the world, and it keeps growing.

(3) User feedback engine: It updates the model after collecting user feedback over multiple iterations. The experience management platform is built on top of this engine and ultimately acts on the flaws and suggestions that users report.
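The A/B testing mentioned in item (1) can be sketched in two parts: assigning each user to a variant in a stable way, and checking whether the difference between variants is larger than chance would explain. The code below is a generic illustration, not TikTok's experimentation platform; the experiment name, the sample counts, and the function names are all assumptions.

```python
import hashlib
import math


def assign_variant(user_id, experiment, variants=("control", "treatment")):
    """Deterministically map a user to a variant by hashing.

    The same user always lands in the same variant of a given experiment,
    while different experiments split users independently.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    return variants[int(digest, 16) % len(variants)]


def two_proportion_z(success_a, total_a, success_b, total_b):
    """Z statistic for the difference between two conversion rates."""
    p_a = success_a / total_a
    p_b = success_b / total_b
    pooled = (success_a + success_b) / (total_a + total_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    return (p_b - p_a) / se


# Hypothetical outcome: users who liked at least one recommended video.
z = two_proportion_z(success_a=480, total_a=10_000, success_b=560, total_b=10_000)
print(assign_variant("user-42", "ranker-v2"))
print(f"z = {z:.2f}, significant at the 5% level: {abs(z) > 1.96}")
```

Hash-based assignment avoids storing a lookup table of who is in which group, and the z-test is only one of several ways to compare variants; a production platform would also guard against peeking and multiple comparisons.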

To solve the cold start problem in recommendation, a recall strategy is used: from tens of millions of videos, it selects thousands of candidates that have already proven to be popular and high quality.
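A recall stage of this kind can be pictured as a cheap top-k selection over the whole catalog, run before any expensive personalized ranking. The sketch below is an assumption-laden illustration: the `Video` fields, the scoring weights, and the `min_views` threshold are invented for the example and are not TikTok's actual signals.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class Video:
    video_id: str
    completion_rate: float  # fraction of viewers who watched to the end
    like_rate: float        # likes per view
    views: int


def quality_score(video, min_views=1000):
    """Score a video by engagement; ignore videos without enough evidence."""
    if video.views < min_views:
        return 0.0
    # Hypothetical weights: completion matters more than likes.
    return 0.7 * video.completion_rate + 0.3 * video.like_rate


def recall(videos, k):
    """Return the k highest-scoring videos without sorting the full catalog."""
    return heapq.nlargest(k, videos, key=quality_score)


catalog = [
    Video("a", completion_rate=0.8, like_rate=0.10, views=5_000),
    Video("b", completion_rate=0.9, like_rate=0.20, views=200),
    Video("c", completion_rate=0.6, like_rate=0.30, views=9_000),
    Video("d", completion_rate=0.3, like_rate=0.05, views=12_000),
]

candidates = recall(catalog, k=2)
print([v.video_id for v in candidates])
```

Video "b" has the best rates but too few views to count as proven, so it is excluded. `heapq.nlargest` keeps only k items in memory at a time, which matters when the catalog holds tens of millions of entries.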

At the same time, some AI work has been moved to the client side for ultra-fast responses. This includes real-time training, modeling, and inference on the device. The client side uses machine learning frameworks such as TensorFlow Lite or ByteNN.

Microservices Architecture

TikTok uses cloud-native infrastructure. Recommendation components such as user analysis, prediction, cold start, recall, and the user feedback engine are exposed as APIs. These services are hosted on cloud platforms such as Amazon AWS and Microsoft Azure. The resulting curated videos are pushed to users through the cloud.

TikTok uses containerization technology based on Kubernetes. Kubernetes is known as a container orchestrator, which is a toolset for automating the application lifecycle. Kubeflow is dedicated to deploying machine learning workflows on Kubernetes.

As part of the cloud native stack, service mesh is another tool that handles service-to-service communication. It controls how different parts of an application share data with each other. It inserts functions or services at the platform layer rather than the application layer.

Because high concurrency is required, these services are built with Go and gRPC. At TikTok, Go has become the dominant language for service development thanks to its good built-in networking and concurrency support. gRPC is a remote procedure call (RPC) framework for efficiently building and connecting services.
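The reason concurrency matters here is that one recommendation request typically fans out to several backend services, and waiting for them one after another would add up their latencies. The production services described above are written in Go with gRPC; the sketch below uses Python's `asyncio` only to show the fan-out pattern. The two service functions are stand-ins that simulate network calls, and their names and return values are assumptions.

```python
import asyncio


async def fetch_user_profile(user_id):
    """Stand-in for an RPC to a hypothetical user-analysis service."""
    await asyncio.sleep(0.01)  # simulates network latency
    return {"user_id": user_id, "interests": ["cooking", "travel"]}


async def fetch_candidates(user_id):
    """Stand-in for an RPC to a hypothetical recall service."""
    await asyncio.sleep(0.01)
    return ["v1", "v2", "v3"]


async def recommend(user_id, limit=2, timeout=0.5):
    """Call both services concurrently and combine their answers."""
    profile, candidates = await asyncio.wait_for(
        asyncio.gather(fetch_user_profile(user_id), fetch_candidates(user_id)),
        timeout=timeout,  # fail fast rather than stall the whole request
    )
    return {"user_id": profile["user_id"], "videos": candidates[:limit]}


result = asyncio.run(recommend("u1"))
print(result)
```

Because the two calls overlap, the request takes roughly as long as the slower call rather than the sum of both. In Go the same idea would be expressed with goroutines and channels or an error group.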

TikTok's success lies in its efforts to provide the best user experience. It builds internal tools to maximize low-level (system-level) performance. For example, ByteMesh is an improved version of a service mesh, KiteX is a high-performance Golang gRPC framework, and Sonic is an enhanced Golang JSON library. Other internal tools or systems include parameter servers, ByteNN, and abase.

As TikTok's head of machine learning said, sometimes the underlying infrastructure is more important than the (machine learning) algorithms on top of it.
