How did Facebook achieve 800,000 people watching live broadcasts at the same time?

Only a handful of companies in the world know how to run planet-scale distribution services, fewer even than the number of countries with nuclear weapons today. Facebook is one of them, and its new live video streaming product, Facebook Live, is a showcase of distribution at that scale.

Facebook CEO Mark Zuckerberg:

We finally decided to shift the focus of our video services to live streaming, because live streaming is an emerging format rather than the online video model of the past five to ten years. We are entering the golden age of video. Fast forward five years, and most of the content people see on Facebook will likely be video.

The power of Facebook Live was on display in a 45-minute video in which two people wrapped rubber bands around a watermelon until it exploded. At its peak the video was watched by as many as 800,000 people at the same time, and it drew over 300,000 comments. Numbers like these are possible because of Facebook's user base of roughly 1.5 billion people.

For comparison: the 2015 Super Bowl was watched by about 114 million people in the United States, of whom roughly 2.36 million watched via live stream. On the Twitch platform, viewership at the 2015 E3 gaming event peaked at 840,000. During the Republican debate on September 13, the number of simultaneous live viewers reached 921,000.

Measured against today's technology, those numbers put Facebook well ahead in live streaming, and it is worth remembering that Facebook runs a very large number of live broadcasts at the same time.

Chris Cox, Director of Product at Facebook:

An article in Wired quoted Cox as saying that Facebook has a department of more than 100 people responsible for live streaming (it started with fewer than 12 people and has since grown to about 150 engineers).

Facebook needs to serve millions of simultaneous live broadcasts without failures, support the millions of viewers watching those broadcasts, and handle seamless connections across different devices and service providers. “This is a really hard technical problem that requires a lot of infrastructure,” Cox said.

Are you curious about how these infrastructure problems are solved?

Federico Larumbe, from Facebook's traffic team, works on the caching software that powers Facebook's content delivery network (CDN) and its global load balancing system. He gave an excellent talk explaining in detail how live broadcasting is achieved.

The starting point of live broadcast technology

Facebook Live is a new feature that lets users share video in real time. The live streaming service started in April 2015 and was initially available only to celebrities through the Mentions app, mainly as a way to interact with their fans. Over the following year the service was expanded and the protocol changed. They started using HLS (HTTP Live Streaming), which is supported by the iPhone and allowed them to use their existing CDN architecture. At the same time, Facebook began studying RTMP (Real-Time Messaging Protocol), a TCP-based protocol that sends a video stream and an audio stream from the phone to the live broadcast servers.

Advantages: RTMP has lower latency between the broadcaster and viewers. Unlike traditional broadcasting, people on a live broadcast platform communicate with each other, so low latency noticeably improves the user experience.

Disadvantages: Because it is not HTTP-based, a whole new architecture is required, and it takes time for an RTMP-based delivery path to be built and to scale.

Facebook is also working on MPEG-DASH (Dynamic Adaptive Streaming over HTTP).

Advantages: Compared with HLS, it is about 15% more compact. It also supports adaptive bit rates: the encoding quality can vary with the available network throughput.

Live video is different, and that causes problems

Live streams such as the watermelon video show a very spiky traffic pattern: traffic rises sharply as soon as the broadcast starts, reaches more than 100 requests per second within a few minutes, and keeps growing until the broadcast ends, after which it drops off sharply. In other words, the traffic changes are very drastic.

Live streaming differs from ordinary video in that it produces very extreme traffic patterns. Live broadcasts attract more attention and are typically watched by about three times as many people as regular videos. Broadcasts ranked higher in the News Feed draw even more viewers, and notifications about a broadcast are sent to all of a page's fans, which attracts more viewers still. This extreme traffic causes problems for the caching system and the global load balancing system:

Cache issues

Many people may want to watch the same live video at the same time. This is the classic thundering herd problem. The extreme traffic pattern puts heavy pressure on the caching system. Video is broken into one-second segment files, and when traffic spikes, the servers that cache these segments can become overloaded.

Global load balancing issues

Facebook has points of presence (PoPs) all over the world, and its traffic is distributed across many countries. The problem Facebook needs to solve is how to keep any single PoP from being overloaded.

Overall architecture

How is live content delivered to millions of viewers? The broadcaster starts a live video on their phone. The phone sends an RTMP stream to a live streaming server, which decodes the video and transcodes it into multiple bit rates. For each bit rate, a continuous set of one-second MPEG-DASH segments is produced. These segments are stored in a data center cache and then distributed to the caches at the points of presence. Viewers receive the broadcast from there: the player on each device pulls segments from the PoP cache at a rate of about one per second.
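To make the last step of that pipeline concrete, here is a minimal sketch of a viewer-side fetch loop, assuming a hypothetical URL scheme in which the PoP cache serves sequentially numbered one-second MPEG-DASH segments per stream and bit rate. The endpoint and names are illustrative, not Facebook's actual API.

```python
import time
import urllib.error
import urllib.request

# Hypothetical PoP cache endpoint; not a real Facebook URL.
POP_CACHE = "https://pop-cache.example.com"

def feed_to_decoder(data: bytes) -> None:
    # Stand-in for handing the segment bytes to the device's video decoder.
    print(f"received segment of {len(data)} bytes")

def play_live(stream_id: str, bitrate: str = "720p") -> None:
    segment = 0
    while True:
        url = f"{POP_CACHE}/{stream_id}/{bitrate}/segment_{segment}.m4s"
        try:
            with urllib.request.urlopen(url) as resp:
                feed_to_decoder(resp.read())   # one second of video
            segment += 1                       # move on to the next segment
        except urllib.error.URLError:
            # The segment is not available yet (we caught up to the live edge),
            # or the PoP is unreachable: back off briefly and retry.
            time.sleep(0.2)
            continue
        time.sleep(1.0)                        # segments are one second long
```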

How does it work?

There is a multiplier between the data center cache and the many caches at the points of presence: users hit the PoP caches, which are distributed around the world, rather than the data center directly.

Another multiplier exists inside each point of presence. A PoP has two layers: a proxy layer and a cache layer. A viewer requests a segment from a proxy; the proxy checks whether the segment is in the PoP cache. If it is, the segment is returned to the viewer; if not, the request is forwarded to the data center. Different segments are stored on different cache hosts, so the load is balanced across them.
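The talk does not spell out how segments are mapped to cache hosts within a PoP; one common way to spread different keys across different hosts is consistent hashing. A minimal sketch, assuming segment paths are used as keys and the host names are invented:

```python
import bisect
import hashlib

class ConsistentHashRing:
    """Minimal consistent hash ring: maps segment keys to cache hosts so that
    different segments land on different hosts and the load is spread out.
    (Illustrative only; not Facebook's actual scheme.)"""

    def __init__(self, hosts, replicas=100):
        self._ring = []                      # sorted list of (hash, host)
        for host in hosts:
            for i in range(replicas):        # virtual nodes smooth the distribution
                self._ring.append((self._hash(f"{host}#{i}"), host))
        self._ring.sort()

    @staticmethod
    def _hash(key: str) -> int:
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def host_for(self, segment_key: str) -> str:
        h = self._hash(segment_key)
        idx = bisect.bisect(self._ring, (h, "")) % len(self._ring)
        return self._ring[idx][1]

ring = ConsistentHashRing(["cache-1", "cache-2", "cache-3"])
print(ring.host_for("stream42/720p/segment_17.m4s"))
```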

Protecting Data Centers from Thundering Herd Effects

What happens if all viewers request the same segment at the same time? If the segment is not yet in the cache, every viewer's request would be forwarded to the data center, so the data center needs to be protected.

Request coalescing

Requests are coalesced in the PoP cache, which reduces their number: only the first request for a segment is forwarded to the data center. The other requests are held until the first one is answered, and the data is then sent to every waiting viewer. An additional caching layer in the proxy servers prevents hot-server problems: if all coalesced requests for a segment were parked on a single cache host while waiting for the segment to come back, that host could be overloaded.
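A minimal sketch of this request coalescing, assuming a single-process cache and a hypothetical fetch_from_origin callback standing in for the trip back to the data center:

```python
import threading

class CoalescingCache:
    """Only the first miss for a key goes to the origin; concurrent callers for
    the same key wait for that one in-flight fetch. (Illustrative sketch only.)"""

    def __init__(self, fetch_from_origin):
        self._fetch = fetch_from_origin      # e.g. an HTTP call to the data center
        self._cache = {}
        self._inflight = {}                  # key -> threading.Event
        self._lock = threading.Lock()

    def get(self, key):
        with self._lock:
            if key in self._cache:
                return self._cache[key]
            event = self._inflight.get(key)
            if event is None:                # we are the first requester (the leader)
                event = threading.Event()
                self._inflight[key] = event
                leader = True
            else:
                leader = False
        if leader:
            value = self._fetch(key)         # single trip to the origin
            with self._lock:
                self._cache[key] = value
                del self._inflight[key]
            event.set()                      # wake up the waiting followers
            return value
        event.wait()                         # followers wait for the leader's result
        with self._lock:
            return self._cache[key]
```

In the real system the cache and the set of in-flight requests live in the PoP's proxy and cache layers rather than in one process, but the idea is the same: one leader fetches, everyone else waits for its result.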

Global Load Balancing Solution

While the data centers are well protected from the thundering herd, the points of presence are still at risk. Live traffic can spike so quickly that a PoP may already be overloaded before its load measurements reach the load balancer.

Each point of presence has a limited number of servers and limited connectivity. How do we keep a PoP from being overloaded? A system called Cartographer solves this well: it maps networks to PoPs by measuring the latency between each network and each PoP, and each user is sent to the nearest PoP with available capacity. Counters on the servers measure how much load they are receiving, and aggregating these counters gives the load of each PoP.
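A minimal sketch of that assignment decision, sending each user to the lowest-latency PoP that still has spare capacity. The PoP names, loads, and latencies are invented for illustration; the real Cartographer system is far more involved.

```python
pops = {
    # PoP name: [current load, capacity]  (invented numbers)
    "pop-europe":  [7000, 10000],
    "pop-us-east": [9500, 10000],
    "pop-asia":    [3000, 8000],
}
latency_ms = {  # measured latency from this user's network to each PoP
    "pop-europe": 35, "pop-us-east": 95, "pop-asia": 180,
}

def choose_pop(latency_ms, pops):
    for pop in sorted(latency_ms, key=latency_ms.get):   # nearest PoP first
        load, capacity = pops[pop]
        if load < capacity:                               # capacity constraint
            pops[pop][0] += 1                             # account for the new viewer
            return pop
    raise RuntimeError("all PoPs are at capacity")

print(choose_pop(latency_ms, pops))   # -> "pop-europe"
```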

Optimization Problems with Capacity Constraints and Risk Reduction

Because this is a control system, there are delays both in measurement and in reaction. They shortened the load measurement window from 90 seconds to 3 seconds, but some delay remains, so the solution is to predict the load before it happens. A capacity estimator extrapolates the future load of each point of presence from its previous and current load.

How does the estimator predict the load? If the load is rising now, will it keep rising or start to fall?

A cubic spline can capture more complex traffic patterns than linear interpolation. The load curve is fitted with a cubic spline, and the first and second derivatives are taken. If the first derivative (the speed) is positive, the load is rising. If the second derivative (the acceleration) is negative, the speed is decreasing and will eventually reach zero, after which the load starts to fall.
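As an illustration of this derivative-based reasoning, here is a small sketch using SciPy's cubic spline on invented sample numbers; the real estimator's inputs, window, and horizon are not public.

```python
import numpy as np
from scipy.interpolate import CubicSpline

t = np.array([0.0, 30.0, 60.0, 90.0])        # last few 30-second measurement points
load = np.array([4000, 5200, 6100, 6500])    # measured requests/sec (invented numbers)

spline = CubicSpline(t, load)
now = t[-1]
speed = spline(now, 1)          # first derivative: how fast the load is changing
accel = spline(now, 2)          # second derivative: whether that change is slowing

predicted = spline(now + 30.0)  # extrapolate 30 seconds ahead
print(f"speed={speed:.1f} req/s per s, accel={accel:.2f}, predicted={predicted:.0f} req/s")

if speed > 0 and accel < 0:
    print("load is rising but decelerating; it should level off soon")
```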

Avoiding oscillations

The cubic-spline interpolation also solves the oscillation problem.

Delays in measuring and reacting mean decisions are made on stale data. Interpolation reduces the error, gives more accurate predictions, and dampens oscillation, so the actual load stays closer to the capacity target. The current prediction is based on the last three intervals of 30 seconds each, which makes the measured load almost instantaneous.
