Keep-alive implementation principle

This article was provided by the great Hong Yang and is quoted and shared by the author.

Keeping App processes alive has always been the eternal pursuit of major manufacturers, especially leading application developers. After all, once the App process dies, it can no longer conduct any business on the user's phone, and every business model becomes useless on the user side.

The early Android system was imperfect and left many loopholes for apps to exploit, so they had various ways to stay alive. For example, before Android 5.0, native processes forked by an App were not controlled by the system: when the system killed an App, it killed only the Java process the App had started. As a result, a large number of "cancers" were born. They forked a native process and, when the App's Java process was killed, restarted the App through the am command, thus achieving immortality.

At that time, Android was full of evil spirits and demons. The system could not control applications at all, so it was long criticized for power consumption and lag. The weakness of the system also gave rise to a series of frameworks and apps for controlling background processes, such as the Xposed framework, blocking operation, Green Guardian, Black Domain, and Refrigerator. However, as the Android system has developed, everything has been evolving in a positive direction.
However, the devil always stays one step ahead of the good. As the system keeps evolving, so do the methods for keeping alive. About four years ago, MarsDaemon appeared. This library keeps processes alive with a dual-process daemon approach, and it was very popular for a while. The good times did not last long, though: in the Android 8.0 era, the library gradually died out.

Generally speaking, Android process keep-alive is divided into two aspects:
As the Android system becomes more and more complete, it is increasingly difficult for an App to stay alive purely by its own efforts; as a result, there are basically two ways to "keep yourself alive":
Of course, there is another, ultimate method: establish a "PY" (backroom) relationship with the major system manufacturers and get yourself added to the system's memory-cleanup whitelist, as the national application WeChat does. Of course, ordinary people are not qualified to take this path.

About a year ago, the great gityuan published on his blog a method used by TIM to keep alive, which can be called the "ultimate immortality technique"; under the current Android kernel implementation, this method greatly improves a process's survival rate. The author studied the principle behind this keep-alive idea and provided a reference implementation, Leoric. Next, I will share the implementation principle of this ultimate keep-alive black technology.

The underlying technical principle of keep-alive

Know yourself and know your enemy, and you can fight a hundred battles with no danger of defeat. Since we want to stay alive, we must first know how we die. Generally speaking, the system kills a process in one of two ways, both provided by ActivityManagerService:
On native systems, processes are usually killed with the first method, unless the user actively taps "Force Stop" on the App's settings page. However, domestic manufacturers and ROMs such as OnePlus and Samsung now generally use the second method. The first method is too gentle to rein in applications that want to cause trouble; the second is far more forceful. Generally speaking, once an App is force-stopped, it can only wait to die.

Therefore, to achieve keep-alive, we need to know how force-stop works. Let's trace the execution of the system's forceStopPackage method.

First comes the forceStopPackage method in ActivityManagerService. Here we can see that the system force-stops processes by uid, so whether a process is native or Java, force-stop kills them all.

Next comes forceStopPackageLocked. Its implementation is very clear: first kill all of the App's processes, then clean up the information about the four major components that remains in system_server. We care about how the processes are killed, so we continue into killPackageProcessesLocked. This method eventually calls removeProcessLocked in ProcessList, and removeProcessLocked calls the kill method of ProcessRecord.

In this kill method, the target process is killed first, and then the target's process group is killed by uid. If only the target process were killed, dual-process daemonization would be enough to stay alive. The key lies in killProcessGroup. Further tracing shows that it is a native method whose final implementation lives in libprocessgroup. Note the strange number that appears there: 40. Tracing one step further shows what the system does: it loops 40 times, killing processes continuously and waiting 5 ms after each kill.
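The retry loop described above can be sketched in a few lines. This is a Python model for illustration only, not the libprocessgroup source: the names `kill_process_group`, `kill_once` and `sleep_ms` are hypothetical, and only the figures 40 and 5 ms come from the article. The sketch also assumes that the loop stops early once a pass finds nothing left to kill.

```python
from typing import Callable, List, Tuple

MAX_PASSES = 40   # the "strange number" noted in the article
SLEEP_MS = 5      # pause after each kill pass, per the article


def kill_process_group(kill_once: Callable[[], int],
                       sleep_ms: Callable[[int], None]) -> Tuple[int, int]:
    """Model of the retry loop: kill, wait 5 ms, look again.

    kill_once() signals every process currently in the group and returns
    how many it found. Returns (passes_run, found_in_last_pass).
    Assumption: the loop ends early once a pass finds nothing.
    """
    found = 0
    for passes in range(1, MAX_PASSES + 1):
        found = kill_once()
        if found == 0:
            return passes, 0
        sleep_ms(SLEEP_MS)   # the window a dying app can try to exploit
    return MAX_PASSES, found


# Case 1: an ordinary app. The first pass finds 3 processes, the second none.
remaining = [3, 0]
slept: List[int] = []
print(kill_process_group(lambda: remaining.pop(0), slept.append), sum(slept))

# Case 2: an app that has a fresh process ready after every single sleep.
slept_respawn: List[int] = []
print(kill_process_group(lambda: 1, slept_respawn.append), sum(slept_respawn))
```

In the second case the loop runs out of passes while a process is still present, after a total of 40 x 5 ms = 200 ms of sleeping. Passing the sleep function in as a parameter keeps the model instantaneous and deterministic.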
Once the loop completes, the kill is over. Seeing this code, anyone would ask: if the App still has a process alive after the system has killed 40 times in a row, hasn't it escaped?

Implementation

So, how do we achieve this? Look at the critical 5 ms. Suppose that after an App process is killed, the App can start a bunch of new processes fast enough (within 5 ms). Then, after the system has killed all the old processes in one pass and slept for 5 ms, it runs into a bunch of new processes, and this repeats 40 times. As long as we can start a new process every time, our App escapes the system's pursuit and achieves immortality. Yes, a purgatory-like 200 ms (40 rounds of 5 ms): survive it and we overcome the tribulation and ascend. Have you ever played Whack-A-Mole? The process is very similar: you hit one and another pops up, and as long as one pops up fast enough every time, we win.

Now the crux of the problem: how do we start a bunch of new processes within 5 ms?

Looking back at the original keep-alive methods, they start the process through the am command. This command is actually a Java program: it starts a process, boots an ART virtual machine, obtains the binder proxy of AMS, and then makes a synchronous binder call to AMS. This is far too slow for the 5 ms race against death. Later, MarsDaemon proposed a new method that uses a binder reference to send a Parcel directly to AMS. This is much faster than the am command and greatly improves the success rate. There is still room for improvement, though, because the call is still made from the Java layer.
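The 5 ms race can be put into a toy model. The sketch below assumes, as a simplification, that the App survives a round only if its restart request completes inside the 5 ms sleep, and that it has to win all 40 rounds. The function name and every latency figure are hypothetical placeholders chosen for illustration, not measurements.

```python
from typing import Iterable, Optional

WINDOW_MS = 5.0   # sleep between kill passes, per the article
ROUNDS = 40       # number of passes, per the article


def first_lost_round(latencies_ms: Iterable[float]) -> Optional[int]:
    """Return the 1-based round in which the restart came too late.

    latencies_ms holds, for each round, how long the restart request took.
    Returns None if every round was won.
    """
    for round_no, latency in enumerate(latencies_ms, start=1):
        if latency >= WINDOW_MS:
            return round_no
    return None


# Hypothetical latencies, only to illustrate the argument.
slow_path = [300.0] * ROUNDS                   # e.g. booting a new VM each time
fast_path = [1.0] * ROUNDS                     # e.g. a direct binder transaction
one_stall = [1.0] * 16 + [8.0] + [1.0] * 23    # a single pause in round 17

print(first_lost_round(slow_path))
print(first_lost_round(fast_path))
print(first_lost_round(one_stall))
```

The third case is the interesting one: a restart path that is fast on average still loses if it stalls past the window even once.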
The Java language has a much-criticized feature in situations with such strict real-time requirements: garbage collection (GC). Although the chance of hitting a GC pause within those 5 ms is very small, the existence of GC means that Java code running in ART contains many checkpoints. Imagine you are a courier with important military information to report, but you face many checkpoints on the way and may be ordered to halt at any of them. That is unacceptable. Therefore, the best approach is to send the binder call to AMS from native code. Of course, going one level lower, we could even send data directly to the binder driver through ioctl to complete the call, but this has poor compatibility and is not as worry-free as the native approach.

By sending a binder message to AMS at the native layer to start the process, we have solved the problem of "starting the process quickly". But this is still not enough. Go back to Whack-A-Mole. If a new mole pops up each time you hit one, and you can hit it every time, your chance of winning is still fairly high. But what if, every time you hit a mole, all the other moles pop up? That is much harder. If any one of our processes dying causes all the others to be pulled back up, the system will find it hard to kill us. The new keep-alive technology uses two mechanisms to ensure that processes pull each other up:
Specifically, two processes, p1 and p2, are created and associated with each other through file locks: when one is killed, the other starts it again. At the same time, p1 forks twice to produce an orphan process c1, and p2 forks twice to produce an orphan process c2, and a file-lock association is also established between c1 and c2. This way, if p1 is killed, p2 senses it immediately. And since p1 and c1 belong to the same process group, killing p1 also triggers the killing of c1; once c1 dies, c2 senses it immediately and starts p1. The four processes thus form an "iron triangle" that secures the survival rate.

After this analysis, the general principle of the solution should be clear. Based on these principles, I wrote a simple PoC (the Leoric reference implementation mentioned above).
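Independently of the PoC, the file-lock association and the two-fork orphan described above can be illustrated with a minimal sketch. This is not the PoC's code. It is written in Python for a desktop Linux or Unix system purely to stay short and runnable, whereas the real thing would be native code on Android. It assumes `flock` as the locking primitive (the article only says "file locks"), and all function names and the lock file name are hypothetical.

```python
import fcntl
import os
import signal
import tempfile
import time


def spawn_orphan_holder(lock_path: str) -> int:
    """Fork twice; the grandchild locks lock_path and idles. Returns its pid."""
    read_fd, write_fd = os.pipe()
    middle = os.fork()
    if middle == 0:
        # Intermediate child: fork the grandchild, then exit at once so the
        # grandchild is orphaned and re-parented away from the caller.
        try:
            os.close(read_fd)
            if os.fork() == 0:
                fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o600)
                fcntl.flock(fd, fcntl.LOCK_EX)   # held until this process dies
                os.write(write_fd, str(os.getpid()).encode())
                os.close(write_fd)
                while True:
                    time.sleep(60)               # idle until killed
        finally:
            os._exit(0)
    os.close(write_fd)
    os.waitpid(middle, 0)                        # reap the intermediate child
    pid = int(os.read(read_fd, 32).decode())     # arrives once the lock is held
    os.close(read_fd)
    return pid


def is_locked(lock_path: str) -> bool:
    """True while some other process holds the exclusive lock."""
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o600)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        return False
    except BlockingIOError:
        return True
    finally:
        os.close(fd)


def wait_for_death(lock_path: str) -> None:
    """Block until the holder dies; the kernel drops its lock at that point."""
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o600)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)
    finally:
        os.close(fd)


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        lock_path = os.path.join(tmp, "c1.lock")
        c1 = spawn_orphan_holder(lock_path)
        held_while_alive = is_locked(lock_path)
        os.kill(c1, signal.SIGKILL)      # stand-in for the system killing c1
        wait_for_death(lock_path)        # returns as soon as c1 is gone
        # A real watcher would restart its peer at this point.
        return held_while_alive, is_locked(lock_path)


if __name__ == "__main__":
    print(demo())
```

The watcher never polls: it simply blocks in `flock`, and the call returning is itself the death notification. Each process opens the lock file on its own after the fork, because a descriptor inherited across `fork` shares one lock with its parent and would not block.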
Those who are interested can take a look at the PoC.

AMS kills processes one ProcessRecord at a time (https://android.googlesource.com/platform/frameworks/base/+/4f868ed/services/core/java/com/android/server/am/ActivityManagerService.java#5766), which means that killProcessGroup in libprocessgroup is executed multiple times. So while the processes of one cgroup are being killed, the App survives as long as it manages, even once, to start a component whose android:process names a different process. Because the new process corresponds to a new ProcessRecord, it is not killed in the loop above. In addition, the 40-iteration loop leaves a very long time to start a new process: the logs show that the interval between killProcessGroup calls ranges from tens of milliseconds to more than a hundred.

Room for improvement

The principle of this solution is relatively simple and intuitive, but stable keep-alive requires many additional details, especially in the 5 ms race against death, which must be optimized at all costs to raise the success rate. Specifically, the current implementation makes its binder calls in the Java layer, and they should be made in the native layer instead. I have implemented this before, but such a library is essentially detrimental to users' interests, so I do not intend to make the code public. Here is a brief introduction to the implementation ideas, for learning purposes:

How to perform binder communication at the native layer?
libbinder is an NDK public library. Get the corresponding header files and link against it dynamically.
Difficulty: there are many dependencies, and stripping down the header files is manual labor.

How to organize the data for binder communication?
The communication data is really a binary stream, represented concretely by a (C++/Java) Parcel object. The native layer has no corresponding Parcel for Intent, so compatibility is poor.
Plan:
How to deal with it?

Today I am making this implementation principle public and providing PoC code. This is not to encourage everyone to keep their apps alive this way, but in the hope that major system manufacturers become aware of this black technology and push their systems to solve the problem completely.

I knew about this solution two years ago, but it was little known at the time. In the past month, I found that many apps have adopted it, which has caused terrible trouble for my Android phone. After all, I have nearly 800 apps installed; if every one of them stayed alive this way, the system would be unusable.

How does the system respond?

If we compare the system killing a process to a beheading, then the essence of this keep-alive solution is to grow a new head quickly. The countermeasure is therefore simple: when killing a process, make the other processes hold still so they cannot cause trouble. There are many concrete ways to implement this, which will not be elaborated here.

How do users respond?

Until manufacturers come up with a fix, users have some options for mitigating rogue apps that stay alive this way. Two applications are recommended here:
The freezer and Island's deep sleep can completely prevent such an App from keeping itself alive. Of course, if you prefer other "freezing" apps, such as the Black Room or Tai Chi's Yin Yang Gate, those are fine too. Applications that do not suppress background activity through a "freezing" mechanism will, in theory, have very limited effect against this keep-alive solution.

Summary

1. There is nothing wrong with black technology in itself. It is simply a means of fighting the system through a deep understanding of its underlying principles. Many people ask what use it is to understand the underlying principles of the system. This article should give an answer: it lets you build features that others can never achieve, advance products through technology, and thereby generate huge commercial value.

2. Powerful as black technology is, it should not exist in this world. Without rules, there is no order. Black technology can work for a while, but not forever. Improving a product's survival rate ultimately depends on the product itself. Respecting users and improving the experience is the right way.