Watchdog mechanism source code analysis


Preface

Linux provides a watchdog. When the watchdog is started in the Linux kernel, a timer is armed. If nothing is written to /dev/watchdog before the timeout expires, the system restarts. Because this watchdog is implemented with a timer, it is a software-level mechanism.
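
The timer-and-write behaviour described above can be sketched in a few lines of Java. This is only an illustration of the idea, not kernel or Android code: the class name SoftwareWatchdog and its methods are hypothetical, and the current time is passed in explicitly so the logic is easy to follow.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical illustration of a software watchdog; not kernel or Android code.
final class SoftwareWatchdog {
    private final long timeoutMillis;
    private final AtomicLong lastKickMillis;

    SoftwareWatchdog(long timeoutMillis, long nowMillis) {
        this.timeoutMillis = timeoutMillis;
        this.lastKickMillis = new AtomicLong(nowMillis);
    }

    // Plays the role of a write to /dev/watchdog: restarts the countdown.
    void kick(long nowMillis) {
        lastKickMillis.set(nowMillis);
    }

    // True once a full timeout has elapsed since the last kick.
    boolean isExpired(long nowMillis) {
        return nowMillis - lastKickMillis.get() >= timeoutMillis;
    }
}
```

A monitored program would call kick() regularly. If it hangs and stops calling kick(), isExpired() eventually returns true, which is the point at which a real watchdog would restart the system.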

Android has its own software-level Watchdog that protects important system services. When one of them fails, Android usually restarts. Because of this mechanism, the system_server process is sometimes killed by the Watchdog, which makes the phone restart.

This article walks through how it works.

1. Detailed explanation of the Watchdog startup mechanism

The ANR mechanism covers applications. For system processes that stay unresponsive for a long time, Android provides the Watchdog mechanism instead. If a monitored thread or service stays unresponsive past the timeout, the Watchdog makes system_server kill itself.

Watchdog is a class that extends Thread. SystemServer.java obtains the singleton Watchdog object through getInstance.

1. Start in SystemServer.java

    private void startOtherServices() {
        // ...
        traceBeginAndSlog("InitWatchdog");
        final Watchdog watchdog = Watchdog.getInstance();
        watchdog.init(context, mActivityManagerService);
        traceEnd();
        // ...
        traceBeginAndSlog("StartWatchdog");
        Watchdog.getInstance().start();
        traceEnd();
    }

Because Watchdog is a thread, starting it only requires calling start().

2. The Watchdog constructor

    private Watchdog() {
        super("watchdog");
        // Initialize handler checkers for each common thread we want to check. Note
        // that we are not currently checking the background thread, since it can
        // potentially hold longer running operations with no guarantees about the timeliness
        // of operations there.

        // The shared foreground thread is the main checker. It is where we
        // will also dispatch monitor checks and do other work.
        mMonitorChecker = new HandlerChecker(FgThread.getHandler(),
                "foreground thread", DEFAULT_TIMEOUT);
        mHandlerCheckers.add(mMonitorChecker);
        // Add checker for main thread. We only do a quick check since there
        // can be UI running on the thread.
        mHandlerCheckers.add(new HandlerChecker(new Handler(Looper.getMainLooper()),
                "main thread", DEFAULT_TIMEOUT));
        // Add checker for shared UI thread.
        mHandlerCheckers.add(new HandlerChecker(UiThread.getHandler(),
                "ui thread", DEFAULT_TIMEOUT));
        // And also check IO thread.
        mHandlerCheckers.add(new HandlerChecker(IoThread.getHandler(),
                "i/o thread", DEFAULT_TIMEOUT));
        // And the display thread.
        mHandlerCheckers.add(new HandlerChecker(DisplayThread.getHandler(),
                "display thread", DEFAULT_TIMEOUT));

        // Initialize monitor for Binder threads.
        addMonitor(new BinderThreadMonitor());

        mOpenFdMonitor = OpenFdMonitor.create();

        // See the notes on DEFAULT_TIMEOUT.
        assert DB ||
                DEFAULT_TIMEOUT > ZygoteConnectionConstants.WRAPPED_PID_TIMEOUT_MILLIS;

        // mtk enhance
        exceptionHWT = new ExceptionLog();
    }

Two fields matter here: mMonitorChecker and mHandlerCheckers.

The mHandlerCheckers list gets its elements from two places:

Inside the constructor: the checkers for FgThread, the main thread, UiThread, IoThread, and DisplayThread;

From outside: Watchdog.getInstance().addThread(handler);

The monitor list of mMonitorChecker (mMonitors) is filled from:

From outside: Watchdog.getInstance().addMonitor(monitor);

Inside the constructor: addMonitor(new BinderThreadMonitor());
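
The two registration paths can be summarised with a small stand-alone sketch. It is a simplification under stated assumptions: CheckerRegistrySketch and Checker are hypothetical names standing in for Watchdog and HandlerChecker, threads are identified by a plain string instead of a Handler, and only two of the built-in checkers are modelled.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the registration side of Watchdog.
final class CheckerRegistrySketch {
    interface Monitor { void monitor(); }

    // Stand-in for HandlerChecker: one instance per watched thread.
    static final class Checker {
        final String name;
        final List<Monitor> monitors = new ArrayList<>();
        Checker(String name) { this.name = name; }
    }

    final List<Checker> handlerCheckers = new ArrayList<>();
    final Checker monitorChecker = new Checker("foreground thread");

    CheckerRegistrySketch() {
        // The monitor checker is itself the first element of the checker list.
        handlerCheckers.add(monitorChecker);
        handlerCheckers.add(new Checker("main thread"));
    }

    // addMonitor: every monitor lands on the single foreground checker.
    synchronized void addMonitor(Monitor monitor) {
        monitorChecker.monitors.add(monitor);
    }

    // addThread: each extra thread gets its own checker with no monitors.
    synchronized void addThread(String threadName) {
        handlerCheckers.add(new Checker(threadName));
    }
}
```

The point of the split is that monitors (lock checks) all run on one thread, while every other checker only verifies that its own thread still processes messages.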

3. The run method of Watchdog

    public void run() {
        boolean waitedHalf = false;
        boolean mSFHang = false;
        while (true) {
            // ...
            synchronized (this) {
                // ...
                for (int i = 0; i < mHandlerCheckers.size(); i++) {
                    HandlerChecker hc = mHandlerCheckers.get(i);
                    hc.scheduleCheckLocked();
                }
                // ...
            }
            // ...
        }
    }

On every pass, the loop schedules a check on each element of mHandlerCheckers.

4. scheduleCheckLocked in HandlerChecker

    public void scheduleCheckLocked() {
        if (mMonitors.size() == 0 && mHandler.getLooper().getQueue().isPolling()) {
            // If the target looper has recently been polling, then
            // there is no reason to enqueue our checker on it since that
            // is as good as it not being deadlocked. This avoid having
            // to do a context switch to check the thread. Note that we
            // only do this if mCheckReboot is false and we have no
            // monitors, since those would need to be executed at this point.
            mCompleted = true;
            return;
        }
        if (!mCompleted) {
            // we already have a check in flight, so no need
            return;
        }
        mCompleted = false;
        mCurrentMonitor = null;
        mStartTime = SystemClock.uptimeMillis();
        mHandler.postAtFrontOfQueue(this);
    }

When mMonitors.size() == 0, the checker only has to decide whether its thread is stuck. If mHandler.getLooper().getQueue().isPolling() returns true, the looper is idle and waiting for messages, so the thread is treated as healthy and nothing is posted to it.

mMonitorChecker always has at least one monitor (BinderThreadMonitor is added in the constructor), so it never takes this shortcut. For it, the important call is mHandler.postAtFrontOfQueue(this), which queues the HandlerChecker's own run method on the target thread:

    public void run() {
        final int size = mMonitors.size();
        for (int i = 0; i < size; i++) {
            synchronized (Watchdog.this) {
                mCurrentMonitor = mMonitors.get(i);
            }
            mCurrentMonitor.monitor();
        }
        synchronized (Watchdog.this) {
            mCompleted = true;
            mCurrentMonitor = null;
        }
    }
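
The schedule-then-complete handshake between scheduleCheckLocked and run can be tried outside Android with a rough sketch. Everything here is an assumption made for illustration: QueueCheckerSketch is a hypothetical class, a single-thread ExecutorService stands in for the Handler and Looper, and tasks are appended to the back of the queue rather than posted at the front.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for HandlerChecker; an executor plays the watched thread.
final class QueueCheckerSketch implements Runnable {
    private final ExecutorService target;
    private boolean completed = true;
    private CountDownLatch done = new CountDownLatch(0);

    QueueCheckerSketch(ExecutorService target) { this.target = target; }

    // Like scheduleCheckLocked(): do nothing if a check is still in flight.
    synchronized void scheduleCheck() {
        if (!completed) return;
        completed = false;
        done = new CountDownLatch(1);
        target.execute(this);
    }

    // Runs on the watched thread, so it is only reached if that thread makes progress.
    @Override
    public synchronized void run() {
        completed = true;
        done.countDown();
    }

    synchronized boolean isCompleted() { return completed; }

    // Waits for the in-flight check; returns false on timeout or interrupt.
    boolean awaitCheck(long timeoutMillis) {
        CountDownLatch latch;
        synchronized (this) { latch = done; }
        try {
            return latch.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    // Usage: block the worker, observe a stuck check, then unblock it.
    static String demo() {
        ExecutorService worker = Executors.newSingleThreadExecutor();
        CountDownLatch release = new CountDownLatch(1);
        worker.execute(() -> {
            try { release.await(); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        });
        QueueCheckerSketch checker = new QueueCheckerSketch(worker);
        checker.scheduleCheck();
        boolean stuck = !checker.awaitCheck(200);
        release.countDown();
        boolean recovered = checker.awaitCheck(5000);
        worker.shutdown();
        return "stuck=" + stuck + " recovered=" + recovered;
    }
}
```

While the worker is blocked, the posted check never runs and the completed flag stays false, which is exactly the condition the Watchdog loop later interprets as a hang.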

The HandlerChecker run method calls monitor() on each entry in mMonitors. In practice only mMonitorChecker has a non-empty mMonitors list, because system services register themselves on it through addMonitor. For example:

    // ActivityManagerService.java
    Watchdog.getInstance().addMonitor(this);
    // InputManagerService.java
    Watchdog.getInstance().addMonitor(this);
    // PowerManagerService.java
    Watchdog.getInstance().addMonitor(this);
    // WindowManagerService.java
    Watchdog.getInstance().addMonitor(this);

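The monitor method itself is very simple. For example, in ActivityManagerService:

    public void monitor() {
        synchronized (this) { }
    }

It only tries to acquire the service's lock. If another thread holds that lock for too long, or the lock is part of a deadlock, monitor() never returns and the check stays unfinished.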
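
The effect of an empty synchronized block can be demonstrated with plain Java threads. This is a sketch under assumptions: LockProbeSketch, Service, and probe are hypothetical names, and the 200 ms and 5 s timeouts are arbitrary values chosen only for the demonstration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical demonstration of a lock-probing monitor() method.
final class LockProbeSketch {
    // Stand-in for a system service that guards its state with its own lock.
    static final class Service {
        void monitor() {
            synchronized (this) { }
        }
    }

    // Returns true if monitor() finished within timeoutMillis on a probe thread.
    static boolean probe(Service service, long timeoutMillis) {
        CountDownLatch finished = new CountDownLatch(1);
        Thread t = new Thread(() -> {
            service.monitor();
            finished.countDown();
        }, "probe");
        t.setDaemon(true);
        t.start();
        try {
            return finished.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    // Usage: probe once while the lock is held, and once after it is released.
    static String demo() {
        Service service = new Service();
        boolean whileHeld;
        synchronized (service) {
            // The calling thread holds the lock, so the probe thread must wait.
            whileHeld = probe(service, 200);
        }
        boolean afterRelease = probe(service, 5000);
        return "whileHeld=" + whileHeld + " afterRelease=" + afterRelease;
    }
}
```

The probe cannot tell a long critical section from a true deadlock. It only reports that the lock could not be taken in time, which is also all the Watchdog knows.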

BinderThreadMonitor, an inner class of Watchdog, follows the call down into native code:

    private static final class BinderThreadMonitor implements Watchdog.Monitor {
        @Override
        public void monitor() {
            Binder.blockUntilThreadAvailable();
        }
    }

    // android.os.Binder.java
    public static final native void blockUntilThreadAvailable();

    // android_util_Binder.cpp
    static void android_os_Binder_blockUntilThreadAvailable(JNIEnv* env, jobject clazz)
    {
        return IPCThreadState::self()->blockUntilThreadAvailable();
    }

    // IPCThreadState.cpp
    void IPCThreadState::blockUntilThreadAvailable()
    {
        pthread_mutex_lock(&mProcess->mThreadCountLock);
        while (mProcess->mExecutingThreadsCount >= mProcess->mMaxThreads) {
            ALOGW("Waiting for thread to be free. mExecutingThreadsCount=%lu mMaxThreads=%lu\n",
                    static_cast<unsigned long>(mProcess->mExecutingThreadsCount),
                    static_cast<unsigned long>(mProcess->mMaxThreads));
            pthread_cond_wait(&mProcess->mThreadCountDecrement, &mProcess->mThreadCountLock);
        }
        pthread_mutex_unlock(&mProcess->mThreadCountLock);
    }

This function only checks that the number of Binder threads currently executing in the process is below mMaxThreads. Once the count reaches the maximum (31 for system_server), the caller has to wait until a thread becomes free.

    // ProcessState.cpp
    #define DEFAULT_MAX_BINDER_THREADS 15

    // SystemServer.java raises the limit:
    // maximum number of binder threads used for system_server
    // will be higher than the system default
    private static final int sMaxBinderThreads = 31;

    private void run() {
        // ...
        BinderInternal.setMaxThreads(sMaxBinderThreads);
        // ...
    }
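
The wait loop in blockUntilThreadAvailable is a standard condition-variable pattern, and a Java analogue may make it easier to experiment with. The sketch below is not the Binder implementation: ThreadPoolGateSketch, onThreadBusy, and onThreadIdle are hypothetical names, and a ReentrantLock with a Condition replaces the pthread mutex and condition variable.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical Java analogue of the executing-thread counter in IPCThreadState.
final class ThreadPoolGateSketch {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition countDecremented = lock.newCondition();
    private final int maxThreads;
    private int executingThreads;

    ThreadPoolGateSketch(int maxThreads) { this.maxThreads = maxThreads; }

    // Called when a pool thread starts handling a transaction.
    void onThreadBusy() {
        lock.lock();
        try { executingThreads++; } finally { lock.unlock(); }
    }

    // Called when a pool thread finishes; wakes anyone waiting for capacity.
    void onThreadIdle() {
        lock.lock();
        try {
            executingThreads--;
            countDecremented.signalAll();
        } finally { lock.unlock(); }
    }

    // Same loop shape as the native code: wait while the pool is saturated.
    void blockUntilThreadAvailable() throws InterruptedException {
        lock.lock();
        try {
            while (executingThreads >= maxThreads) {
                countDecremented.await();
            }
        } finally { lock.unlock(); }
    }

    // Usage: saturate a pool of two, watch the caller block, then free one thread.
    static String demo() {
        ThreadPoolGateSketch gate = new ThreadPoolGateSketch(2);
        gate.onThreadBusy();
        gate.onThreadBusy();
        CountDownLatch passed = new CountDownLatch(1);
        Thread t = new Thread(() -> {
            try {
                gate.blockUntilThreadAvailable();
                passed.countDown();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        t.setDaemon(true);
        t.start();
        try {
            boolean blocked = !passed.await(200, TimeUnit.MILLISECONDS);
            gate.onThreadIdle();
            boolean released = passed.await(5000, TimeUnit.MILLISECONDS);
            return "blocked=" + blocked + " released=" + released;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "interrupted";
        }
    }
}
```

If every pool thread stays busy, the waiting call never returns. In the Watchdog this means the BinderThreadMonitor check stays unfinished and is eventually reported as overdue.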

5. Exit after timeout

    public void run() {
        // ...
        Process.killProcess(Process.myPid());
        System.exit(10);
        // ...
    }

The Watchdog kills the process it runs in (system_server) and exits.

2. Principle Explanation

1. Every system service that needs monitoring registers with the Watchdog: addMonitor adds a Monitor to the mMonitors list of mMonitorChecker, and addThread adds a new HandlerChecker for the given thread to the mHandlerCheckers list.

2. Once the Watchdog thread is started, its run method enters an infinite loop:

  • First, it calls HandlerChecker#scheduleCheckLocked on every element of mHandlerCheckers.
  • Second, it periodically checks for a timeout. The interval between checks is set by the CHECK_INTERVAL constant, which is 30 seconds. Each check calls evaluateCheckerCompletionLocked() to evaluate the completion state of the HandlerCheckers:
      • COMPLETED: the check has finished.
      • WAITING and WAITED_HALF: the check is still pending but has not timed out. A trace is dumped once when WAITED_HALF is reached.
      • OVERDUE: the check has timed out. The default timeout is 1 minute.

3. If a HandlerChecker is still unfinished when the timeout is reached (OVERDUE), the Watchdog collects the blocked HandlerCheckers through getBlockedCheckersLocked(), builds a description of them, and saves logs that include runtime stack information.

4. Finally, it kills the SystemServer process.
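
The four completion states map onto elapsed time in a simple way, which the following sketch tries to capture. It is an approximation written for this article: CompletionStateSketch, stateOf, and worst are hypothetical names, and the thresholds assume that WAITED_HALF begins at half of the timeout, matching the 30 second and 1 minute figures given above.

```java
// Hypothetical model of how a checker's state follows from elapsed time.
final class CompletionStateSketch {
    // Declared from best to worst so that ordinal() can be used for comparison.
    enum State { COMPLETED, WAITING, WAITED_HALF, OVERDUE }

    static final long DEFAULT_TIMEOUT_MS = 60_000;

    // State of a single checker, given whether it finished and how long it has waited.
    static State stateOf(boolean completed, long elapsedMs, long timeoutMs) {
        if (completed) return State.COMPLETED;
        if (elapsedMs < timeoutMs / 2) return State.WAITING;
        if (elapsedMs < timeoutMs) return State.WAITED_HALF;
        return State.OVERDUE;
    }

    // The overall verdict is the worst state found across all checkers.
    static State worst(State a, State b) {
        return a.ordinal() >= b.ordinal() ? a : b;
    }
}
```

For example, a checker that has been pending for 45 seconds is in WAITED_HALF, and a single OVERDUE checker is enough to make the overall result OVERDUE.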

Summary

Watchdog is a thread that monitors whether system services are running normally and are not deadlocked.

HandlerChecker checks a Handler's thread and runs the registered monitors.

A Monitor tries to acquire a lock to determine whether there is a deadlock.

After 30 seconds without a response, the Watchdog writes a trace to the log. After 60 seconds, it kills system_server and the system restarts.

Because Watchdog kills its own process, the system_server process ID changes after such a restart.

This article is reproduced from the WeChat public account "Android Development Programming"
