Android advanced obfuscation and code protection technology

This article is about protecting Android code. It introduces several advanced techniques for code obfuscation and for hindering reverse engineering. Everyone is busy, and I am in a hurry to get back to my new app, so I will keep the talk short and the substance dense.

Before I begin, it is worth mentioning that this article runs to more than 5,000 words and was written entirely in Pure Writing, a tool I developed. Pure Writing focuses on security, on the writing experience, and on never losing content. So, in the spirit of cherishing life, I used it to write this article.

The article has two parts: the first covers obfuscation, and the second covers security measures that build on obfuscation. The guiding principle is to raise the obfuscation strength and protect the code as much as possible without making trouble for yourself, which means you must still be able to read exception logs normally.

Original article address: http://drakeet.me/android-advanced-proguard-and-security/

Obfuscation

The Android build tools integrate ProGuard for code obfuscation. Explanations of ProGuard's rules are easy to find and mostly say the same things, so I will not repeat them. I will only cover some special and useful tips.

The default configuration in an Android Gradle build file is usually:

  proguardFiles getDefaultProguardFile('proguard-android.txt'), 'proguard-rules.pro'

Many people do not understand this line. It specifies two ProGuard rules files: the first is the official rules file, whose path is returned by the getDefaultProguardFile() method, and the second is the proguard-rules.pro file in the same directory as the current Gradle build file.

The latter lives in our project and is written by us, so there is not much to say about it. What deserves attention is the former, the default ProGuard file. What does it contain? Have you ever looked? If not, search your system for proguard-android.txt; you should be able to find it and read it yourself. I will only cover the key points. This default file declares many rules for us, including rules covering classes that extend View, classes that extend Activity, JavascriptInterface methods, native method declarations, and anything annotated with @Keep.

That is why, by default, your custom Views and Activities are preserved even if you add no rules of your own; at least their class names are not obfuscated.

So why does the official default file do this for us? Why are Views and Activities kept by default?

In short, ProGuard was originally built for plain Java, so it cannot find out which Java classes are referenced in AndroidManifest.xml, layout files, and other XML files. If a class is renamed in the Java code but the reference in the XML file stays the same, looking the class up by name (reflection) fails at runtime. The classes used from XML therefore have to be kept.

To solve this problem, the Ele.me team provides a little-known Gradle plugin that obfuscates Activities and Views without breaking them. The project is called Mess: https://github.com/eleme/Mess. You can read its documentation and tutorials later. In short, Mess makes up for ProGuard's inability to process XML files and helps ProGuard complete the renaming and mapping of Activities and Views.

That said, I suggest you read the default obfuscation configuration file line by line, because only then do you know what the obfuscation tool is doing on your behalf. Once you understand it, one approach I recommend is to copy the default file into your project directory, delete getDefaultProguardFile('proguard-android.txt'), and reference your own copy instead. The first advantage is that the file becomes easy to modify, since some of its content is unnecessary or can be changed, although most of it can stay as it is. The second advantage is that the copy cannot be updated by an external party, which could otherwise introduce unexpected changes after you have started relying on it.

The proguardFiles configuration item (actually a Gradle method) accepts multiple rules file paths; its parameter is a variable-length list of strings. To keep lines from growing too wide, however, I prefer another method called proguardFile (note the missing s). It accepts a single path, and each call adds one more rules file. Here is my configuration for reference:

  release {
      debuggable false
      minifyEnabled true
      zipAlignEnabled true
      shrinkResources true
      signingConfig signingConfigs.release
      proguardFile 'proguard-common.pro'
      proguardFile 'proguard-rules.pro'
      proguardFile 'proguard-rules-google-ads.pro'
  }

The proguard-common.pro file is the copy of the official default configuration file mentioned above. I put it in the current module directory, alongside proguard-rules.pro. This layout is clear and easy to reuse.

With the basics covered, here are two particularly useful ProGuard rules:

-repackageclasses

-repackageclasses is a very powerful rule. It moves all of your code, and all of the third-party library code you use, into a single package. Some people know this option, but knowing it alone does not bring out its full potential. By default, you only write -repackageclasses in the rules file, and it moves all of those classes into the root package. When someone decompiles your APK, they see thousands of class files listed directly under the root package. However, because we sometimes have to keep certain classes, the package hierarchy of your application still exists, and the classes that are not fully obfuscated remain under your package name, where they are poorly protected. So here is a little trick: follow -repackageclasses with the package name of your application, for example:

-repackageclasses com.drakeet.purewriter.debug

With this in place, ProGuard moves all class files, including those of third-party libraries, into your package. This is hiding a leaf in the forest: the classes you could not fully obfuscate are now hidden in a sea of files, and the other class files are renamed to letter combinations such as a, b, c, d.

Note that the "-repackageclasses plus your package name" approach has an obfuscation bug, while the default -repackageclasses without a package name does not. Test carefully the first time you use this method, and fall back to the default form if you hit the problem. I will not go into the details of the bug, as that would take too long.

The second useful rule: -obfuscationdictionary

-obfuscationdictionary is followed by the path of a plain text file, which ProGuard then uses as the obfuscation dictionary. By default, names in our code are obfuscated into letter combinations such as a, b, c, d, e, f, g. With this option you can change the dictionary to garbled or Chinese content. In ProGuard this option covers field and method names; the companion options -classobfuscationdictionary and -packageobfuscationdictionary do the same for class and package names. Garbled names can make the person decompiling your app question their life choices. Chinese names can break some decompilation tools, and certain Chinese phrases are dazzling in their own right. For example, there is a dictionary, fairly popular on GitHub, made from the sayings of a certain elder. It is inconvenient to post it here (it may be dangerous), so search for it yourself, and do not blame me if you cannot find it. Used as code names, these words can immerse the reader so deeply that they lose all interest in analyzing the code :P.
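
As a rough illustration of the "garbled" dictionary idea, here is a minimal sketch that generates look-alike names and writes them one per line to a text file. The class name DictionaryGenerator, the character set, the fixed width of 8, and the output file name are all my own assumptions for the example; any list of distinct, valid Java identifiers should serve as dictionary content.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of ProGuard: builds hard-to-read identifiers.
final class DictionaryGenerator {
    // Characters that look alike in many fonts; all are legal inside Java identifiers.
    private static final char[] ALPHABET = {'I', 'l', 'O', '0', '1'};
    private static final int WIDTH = 8;

    static List<String> generate(int count) {
        List<String> words = new ArrayList<>();
        int limit = (int) Math.pow(ALPHABET.length, WIDTH); // number of distinct digit strings
        for (int n = 0; n < limit && words.size() < count; n++) {
            // Write n as a fixed-width number in base ALPHABET.length, so every n gives a distinct word.
            StringBuilder sb = new StringBuilder(WIDTH);
            for (int v = n, i = 0; i < WIDTH; i++, v /= ALPHABET.length) {
                sb.append(ALPHABET[v % ALPHABET.length]);
            }
            // An identifier cannot start with a digit, so skip words beginning with '0' or '1'.
            if (Character.isJavaIdentifierStart(sb.charAt(0))) {
                words.add(sb.toString());
            }
        }
        if (words.size() < count) {
            throw new IllegalArgumentException("count is too large for width " + WIDTH);
        }
        return words;
    }

    public static void main(String[] args) throws IOException {
        // Assumed file name; reference it with: -obfuscationdictionary proguard-dictionary.txt
        Files.write(Paths.get("proguard-dictionary.txt"), generate(2000));
    }
}
```

Run it once, commit the generated file next to your rules files, and point the dictionary option at it.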

Finally, obfuscation still leaves one weakness: resource files. ProGuard does not touch our resource files at all, so if resource file names are left unprotected, it is easy to trace them to the associated Java code. For this, the WeChat team provides a useful resource obfuscation tool, which not only obfuscates resource files thoroughly but also reduces their overall size. The tool is called AndResGuard, and it is open source: https://github.com/shwenzhang/AndResGuard

That covers the key points about obfuscation. There are many smaller topics, such as using consumerProguardFiles to attach a rules file to a library or SDK project. When an app depends on your library, it then needs no extra obfuscation configuration, because the build automatically reads the required keep rules from the file configured through consumerProguardFiles. This is a very useful feature for library developers. I will not go into more detail here.

Security

Code obfuscation alone is not enough. We need more techniques to protect our code, especially as SDK developers, who must obfuscate and yet expose many APIs. Obfuscation is the foundation; code security is a matter of awareness.

First, we need to know how obfuscated code gets cracked. For someone decompiling an app, the easiest starting point is a string search. The string values we hard-code are restored exactly as written during decompilation, so they are our primary concern. To avoid being cracked through strings, do the following:

First, do not hard-code string values. If you must, at least create a separate class, such as HardStrings, to hold these strings statically. The person decompiling can then find your constants class, but it is harder for them to find where each string constant is referenced.
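
One caveat when building such a constants class, sketched below: javac treats a static final String initialized with a literal as a compile-time constant and copies the value into every class that reads it, so the literal would still show up at each use site. Initializing the field through a method call avoids that at the source-compiler level. The class and field names here are hypothetical, and an optimizing shrinker may still propagate the value on its own, so treat this as an illustration and check your decompiled output.

```java
// Hypothetical constants holder; every name and value here is illustrative only.
final class HardStrings {
    private HardStrings() {}

    // Compile-time constant: javac copies "register_first" into every class that reads it.
    static final String INLINED = "register_first";

    // Not a compile-time constant, because the initializer is a method call.
    // Callers read this field from HardStrings at run time instead of embedding the literal.
    static final String NOT_INLINED = String.valueOf("register_first");
}

final class HardStringsDemo {
    // In the compiled class file, this method contains a field read, not the string itself.
    static String describe() {
        return HardStrings.NOT_INLINED;
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```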

Second, strip the logging code from release builds. The -assumenosideeffects rule removes the listed log calls before the APK is built. This helps performance, and more importantly, logging code often reveals our intentions and carries many strings that survive decompilation. Note that ProGuard applies this rule only when optimization is enabled, so it has no effect if your configuration contains -dontoptimize:

  -assumenosideeffects class android.util.Log {
      public static boolean isLoggable(java.lang.String, int);
      public static int d(...);
      public static int w(...);
      public static int v(...);
      public static int i(...);
  }

Third, for the hard-coded strings and log content that you have to leave in, replace the text with codes. For example, you can decide that "4001" stands for a certain error instead of writing a descriptive error string in your code. You then need a place to record the mapping between codes and meanings, and there is a trick for this: create another constants class made up of static string fields, use the real error message as the field name, and use the code as its value, as follows:

  public static final String SHOULD_REGISTER_FIRST_ERROR = "ssrrffe";

When you read the unobfuscated source that references this static field, its meaning is clear at a glance. What the person decompiling sees is:

  public static final String abc = "ssrrffe";

They can make sense of neither the name nor the value.

Fourth, hide particularly sensitive strings, such as an AppKey, in a native .so file.

That is about it for string techniques, and doing this much is good enough. There are more extreme methods that I will not cover: they hinder attackers, but they also make your own work very troublesome. That double-edged sword is not the result we want.

Next, let's look at another weakness of obfuscated code: the content we are forced to keep. If you develop a closed-source SDK, you have to keep even more, which means almost all public classes, fields, and methods. What can we do about this? Here is one method:

Set a delegator for the content that needs to be kept, and then throw the delegator into the sea.

It sounds mysterious, right? Haha, that makes it easier to remember. It is the same idea as hiding a leaf in the forest from the obfuscation part. If a class has to be kept, hand all of its work to a private or internal object. The code of this delegate class can be fully obfuscated, and the obfuscation tool then hides it among a large amount of other code. Compared with reading the logic directly in the kept class, the person decompiling now has a much harder time locating the real implementation.

So once you know this method, you can protect your Activities and Views well even without the Activity and View obfuscation tool from Ele.me.

In general, though, we do not need to protect everything. Delegating the key, core content is enough.
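
The delegator idea can be sketched as follows. SdkEntry and SdkEntryDelegate are hypothetical names, and the string normalization is only placeholder logic standing in for whatever you want to protect. The assumption is that only the entry class is covered by a keep rule, so the delegate is free to be renamed and repackaged with everything else.

```java
import java.util.Locale;

// Kept API class: its name and method signatures must survive obfuscation.
// In a real SDK this would be a public class in its own file.
final class SdkEntry {
    // The only thing visible in the kept class is a forwarding call.
    private final SdkEntryDelegate delegate = new SdkEntryDelegate();

    String normalize(String input) {
        return delegate.normalize(input);
    }
}

// Package-private implementation with no keep rule, so the obfuscator can rename it
// and move it in among thousands of other classes.
final class SdkEntryDelegate {
    String normalize(String input) {
        // Placeholder for the real, sensitive logic.
        return input.trim().toLowerCase(Locale.ROOT);
    }
}
```

Callers keep using the kept class, for example `new SdkEntry().normalize("  Hello ")`, while the body that does the work lives under an obfuscated name.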

Finally, we need to stop attackers from repackaging the app, which should be made a complete dead end for them. What we can do is add signature verification to the code and make the two depend on each other. For this I have written something similar to Alibaba's black box, which checks the signature and encrypts and decrypts content in native code. I plan to tidy it up and open-source it in the future, so I will not say more here for now.
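
To make the signature check concrete, here is a minimal sketch of its comparison step in plain Java. It is not the native implementation mentioned above. The class name SignatureCheck and the placeholder byte arrays are assumptions for illustration; on Android the certificate bytes would come from the platform's package manager, and the expected digest would be computed once from the release certificate and embedded in the app.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper showing only the digest-and-compare step of a signature check.
final class SignatureCheck {
    // Returns the SHA-256 digest of the given certificate bytes.
    static byte[] sha256(byte[] data) {
        try {
            return MessageDigest.getInstance("SHA-256").digest(data);
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is a required algorithm on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    // True when the certificate hashes to the expected digest.
    static boolean matches(byte[] certificate, byte[] expectedDigest) {
        return MessageDigest.isEqual(sha256(certificate), expectedDigest);
    }

    public static void main(String[] args) {
        // Placeholder bytes standing in for real signing certificates.
        byte[] releaseCert = "release-cert-placeholder".getBytes(StandardCharsets.UTF_8);
        byte[] repackagedCert = "repackaged-cert-placeholder".getBytes(StandardCharsets.UTF_8);
        byte[] expected = sha256(releaseCert);

        System.out.println(matches(releaseCert, expected));
        System.out.println(matches(repackagedCert, expected));
    }
}
```

A check like this is easy to patch out if it sits in one obvious place, which is why it pays to have core logic depend on its result and to hide it behind a delegate as described earlier.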

In addition, I have written an article called "Theoretical Guide to Android Key Protection and C/S Network Transmission Security". If you are interested, you can read it later.

In short, code security and obfuscation are a matter of awareness and technique, and neither is difficult. Mastering the content above is good enough. That is the end of this article. If you have any questions, feel free to contact me.
