When optimizing to scale to multiple cores

"There is no silver bullet in software development. All we can do is choose and balance;"

In the previous article, we discussed the five directions of single-threaded program optimization (see "Five Directions of Program Optimization"). Once a single core has been pushed to its limit, it is time to go multi-task.

The idea sounds simple enough: break a single task into multiple tasks, let multiple CPUs work at the same time and execute in parallel, and efficiency naturally improves.

However, it may not be that simple.

Granularity of task decomposition

First, we need to determine whether the task can be decomposed at all. Parsing many files, for example, splits naturally into independent sub-tasks. But a time-consuming serial computation, where each later step depends on the previous result, is hard to split; work like that may need to be partitioned at a higher level instead.
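The file-parsing case above can be sketched as an "embarrassingly parallel" job handed to a worker pool. This is a minimal illustration, not a real parser: `parse_one` is a hypothetical stand-in for per-file work.

```python
# Each input is independent, so the task splits cleanly across workers.
from concurrent.futures import ThreadPoolExecutor

def parse_one(name: str) -> int:
    # Hypothetical parser: here it just "parses" the name length.
    return len(name)

def parse_all(names):
    # Submit every independent sub-task and collect results in order.
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(parse_one, names))

if __name__ == "__main__":
    print(parse_all(["a.log", "bb.log", "ccc.log"]))  # [5, 6, 7]
```

Note that `pool.map` preserves input order, so decomposition here costs nothing in bookkeeping; the serial-dependency case has no such easy shape.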

Data race

Programming is computation plus data. The computation now runs in parallel, but the data is still read from a single source, and access to a shared resource means contention.

Left uncontrolled, the same data may be computed repeatedly (in read-heavy scenarios), or dirty data may be produced (in write-back scenarios).
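The write-back case is the classic "lost update". The sketch below forces the race deterministically by pausing between read and write, so both threads read the same stale value; real races are timing-dependent, not this reliable.

```python
# Two threads perform read-modify-write on a shared counter with no lock.
# The sleep forces their cycles to overlap: both read 0, both write 1.
import threading
import time

counter = 0

def unsafe_increment():
    global counter
    local = counter          # read
    time.sleep(0.1)          # window in which the other thread also reads
    counter = local + 1      # write back a stale value

t1 = threading.Thread(target=unsafe_increment)
t2 = threading.Thread(target=unsafe_increment)
t1.start(); t2.start()
t1.join(); t2.join()
print(counter)  # 1, not 2: one increment was lost
```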

Introducing locks

To keep data access orderly and prevent dirty data, locks need to be introduced.

Controlling the granularity of those locks is a topic that deserves careful thought.

For example, in read-heavy, write-light scenarios, a read-write lock can be significantly more efficient than locking every access equally.
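Python's standard library has no built-in read-write lock, so here is a minimal reader-preference sketch built from two plain locks: many readers may hold it at once, while a writer needs exclusive access. (Production code would also handle writer starvation, which this sketch ignores.)

```python
import threading

class RWLock:
    def __init__(self):
        self._readers = 0
        self._reader_gate = threading.Lock()   # protects the reader count
        self._writer_gate = threading.Lock()   # held while anyone writes

    def acquire_read(self):
        with self._reader_gate:
            self._readers += 1
            if self._readers == 1:             # first reader blocks writers
                self._writer_gate.acquire()

    def release_read(self):
        with self._reader_gate:
            self._readers -= 1
            if self._readers == 0:             # last reader lets writers in
                self._writer_gate.release()

    def acquire_write(self):
        self._writer_gate.acquire()

    def release_write(self):
        self._writer_gate.release()
```

The payoff is exactly the granularity point above: readers never block each other, so in a read-heavy workload most accesses proceed in parallel and only writes serialize.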

Among the products we use every day, databases are masters of locking. When updating data, whether the engine locks a row, a page, or a whole table, the different granularities perform very differently.

Thundering Herd

Consider a thread pool in which multiple threads wait on the same lock; whichever thread acquires it starts working.

When the lock is released, waking up every waiting thread at once causes the thundering herd problem: all of them wake, one wins the lock, and the rest go straight back to sleep.

Solution:

Use a single dedicated thread to accept connections and wait for the lock, so that only one thread is blocked on the lock at any given time.

For more details, see the "Client-Server Programming Methods": a thread pool is created in advance, and each thread takes its turn calling accept.

Data replication

Let each thread use its own data. With nothing shared, resource contention disappears and the locks can go away.

Duplicate the data into multiple copies, so that each worker accesses only its own copy and contention is reduced.

But this introduces a new problem: if every thread writes its copy back, how do we keep the copies consistent?

After all, they all represent the same underlying data.

And once data consistency enters the picture, synchronizing the multiple copies becomes a problem of its own.
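A common shape for this trade-off is per-thread accumulators: each worker counts into a private copy with no locks in the hot loop, and the copies are merged once at the end, which is exactly where the consistency question reappears.

```python
# Each worker accumulates into its own Counter (no sharing, no locks),
# then publishes its copy once; the merge is the single consistency point.
import threading
from collections import Counter

words = ["a", "b", "a", "c", "b", "a"] * 100

def count_slice(slice_, out, idx):
    local = Counter()            # private copy: no contention
    for w in slice_:
        local[w] += 1
    out[idx] = local             # publish once, at the end

n = 3
chunks = [words[i::n] for i in range(n)]
partials = [None] * n
threads = [threading.Thread(target=count_slice, args=(chunks[i], partials, i))
           for i in range(n)]
for t in threads:
    t.start()
for t in threads:
    t.join()

total = sum(partials, Counter())  # merge step: reconcile the copies
print(total["a"])  # 300
```

Here the merge is trivial because counts are additive; when the copies can conflict (two workers updating the same record), the merge step is where the hard consistency decisions live.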

Data Sharding

Well, let's change our thinking: instead of data replication, use data sharding. The idea is easy to arrive at: since the "computation" was divided into small tasks, the data can be divided the same way.

The data is split into shards, each holding different content with nothing in common.

Now there is no contention on data access, and because the shards hold different data, there is no consistency problem to synchronize.
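A minimal sketch of the idea: route keys to shards by a stable hash, give each worker exclusive ownership of one shard, and no two workers ever touch the same key, so neither locks nor copy synchronization are needed. The key scheme (`user:N`) and the byte-sum hash are illustrative choices, not a recommendation.

```python
# Keys are assigned to shards by hash; each worker owns exactly one shard.
import threading

NUM_SHARDS = 4

def shard_of(key: str) -> int:
    # Stable hash so the same key always lands on the same shard.
    return sum(key.encode()) % NUM_SHARDS

shards = [dict() for _ in range(NUM_SHARDS)]

def worker(shard_id, items):
    store = shards[shard_id]     # this worker's private shard
    for key, value in items:
        store[key] = value       # no lock: nobody else touches this shard

events = [("user:%d" % i, i * 10) for i in range(20)]
buckets = [[] for _ in range(NUM_SHARDS)]
for key, value in events:
    buckets[shard_of(key)].append((key, value))  # route by key

threads = [threading.Thread(target=worker, args=(i, buckets[i]))
           for i in range(NUM_SHARDS)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# A query that needs *all* the data must now visit every shard:
print(sum(len(s) for s in shards))  # 20
```

The last line already hints at the cost discussed below: any whole-dataset operation has to fan out across every shard.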

However, sharding is far from as good as it sounds.

With sharding, each thread sees only a fragment of the data rather than the whole set, so it can process only its own part. The computations lose their interchangeability between threads: certain tasks can run only on specific threads.

And if a task needs to access all of the data, things get even more complicated.

It turns out that sharding pushes the problem up a level: above the threads, the business logic now has to deal with it.

That may be more complicated still.

So: if you want more speed from multi-core processing, you must face more problems.

At the architecture level, the problems we face when scaling a single machine out to a multi-machine cluster are much the same.

Reference: Reading Notes on "Large-Scale Website Technology Architecture" [2] - Architecture Patterns

There is no silver bullet in software development; all we can do is choose and balance.
