Case Study

In this case study, we examine a malicious sample of the Nexus banking trojan (file MD5: d87e04db4f4a36df263ecbfe8a8605bd). Nexus is a framework sold on underground forums that can steal funds from a number of banking apps on Android phones. A report published by Cyble provides more details about the framework and a thorough analysis of the sample.

Opening the sample (d87...) in jadx, the AndroidManifest.xml file shows that the application requests access to sensitive information on the device, such as SMS messages, contacts, and phone calls. The main activity declared in AndroidManifest.xml is not present in the application at first because it is unpacked later. However, the manifest names another class, "com.toss.soda.RWzFxGbGeHaKi", which extends the Application class and is therefore the first class to run in the application.

The onCreate() callback in the Application subclass "com.toss.soda.RWzFxGbGeHaKi" references two additional methods, melodynight() and justclinic(), which in turn calls another method, bleakperfect().

The bleakperfect() method, like several other methods in the application, contains a large amount of dead code that assigns values to variables and performs arithmetic on them in multiple loops, although those variables are ultimately never used. The method's actual purpose is to decode strings referenced elsewhere in the code. It does this by XORing one byte array (the encoded string) with another byte array (the XOR key) and storing the result in a third byte array, which is then converted to a string.

Patching methods like this one, by removing the redundant code and replacing the lengthy XOR operations with a plain string return, can make analysis of the application much easier and more efficient. To do this, we must first understand how this code is represented in the DEX file.

DEX Overview

Android applications are primarily written in Java.
To run on an Android device, Java code is compiled into Java bytecode, which is then converted into Dalvik bytecode. Dalvik bytecode is stored in the DEX (Dalvik Executable) file of the APK. An APK (Android Package File) is essentially a ZIP file containing the application code and required resources, so the DEX file can be examined by extracting the contents of the APK.

DEX files are divided into several sections, including the header, string tables, class definitions, method code, and other data. Most sections are divided into equal-sized chunks, each holding multiple values that define one item in the section. To show how common Java concepts, such as classes or strings, are represented in DEX files, we will use the class_defs section as an example.

About Classes

The class_defs section consists of class_def_items, each of which is 32 bytes long. The name of the class is stored as follows: the class_def_item contains an index to an item in the type_ids section (class_idx), which in turn contains an index to an item in string_ids (descriptor_idx). The value stored in the string_id_item is an offset from the beginning of the file that points to the start of the string_data_item, which contains the actual class name string (data) preceded by its length (utf16_size).

class_def_item has another member, class_data_off, which is an offset to a class_data_item representing the data associated with the class. It contains information about the static and instance fields and the direct and virtual methods of the class, with a matching encoded_field or encoded_method item for each field and method.

About Methods

direct_methods and virtual_methods each contain a series of encoded_method items. In the first encoded_method item of each method type, the method_idx_diff value holds the index of the matching item in the method_ids section.
However, in subsequent items, this value is the difference relative to the previous item: to calculate the method_ids index, the difference must be added to the method_ids index of the previous item.

Finally, the method name in method_id_item is stored under name_idx, much like the class name in type_id_item, and the string value of the method name is retrieved through the corresponding string_id_item.

In an Android application, each method has a prologue (the code_item), which specifies information about the method size, input and output parameters, and exception handling data. The offset of this prologue in the DEX file is stored in the code_off value of the encoded_method item mentioned above. The first two bytes of the prologue give the register size, i.e. how many registers the bytecode uses, followed by the word sizes of the input and output parameters, while the last four bytes hold the bytecode size (insns_size). The bytecode size is counted in 16-bit code units, so the total number of bytes in the bytecode is this value multiplied by two. The Dalvik bytecode for a method starts directly after the prologue.

About Strings

So far, we have seen two examples where string_id_items are used to look up class names and method names in the string table of the DEX file. string_id_items are just as important in Dalvik bytecode, where they are referenced whenever a string value is used in the application code. For example, a bytecode sequence that loads a string constant and returns it (const-string followed by return-object) returns the "sampleValue" string, where 0xABCD is the index of the string_id_item of "sampleValue" in the string_ids section. This leads to one obstacle when patching the bytecode of a malicious sample: the decoded strings that the patched methods should return do not exist in the string table of the DEX file.
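
The string-returning sequence described above can be sketched in a few lines of Python. This is a minimal illustration and not code taken from dexmod: the helper name build_return_string is hypothetical, and the opcode values (0x1a for const-string, 0x11 for return-object) are taken from the Dalvik bytecode reference.

```python
import struct

CONST_STRING = 0x1A   # const-string vAA, string@BBBB (2 code units)
RETURN_OBJECT = 0x11  # return-object vAA (1 code unit)


def build_return_string(string_idx: int, register: int = 0) -> bytes:
    """Encode `const-string vR, string@idx; return-object vR` (hypothetical helper)."""
    if not 0 <= string_idx <= 0xFFFF:
        # Larger indices would need const-string/jumbo, which is not handled here.
        raise ValueError("const-string only addresses 16-bit string indices")
    if not 0 <= register <= 0xFF:
        raise ValueError("register number must fit in 8 bits")
    load = struct.pack("<BBH", CONST_STRING, register, string_idx)
    ret = struct.pack("<BB", RETURN_OBJECT, register)
    return load + ret


patch = build_return_string(0xABCD)
print(patch.hex(" "))   # 1a 00 cd ab 11 00
print(len(patch) // 2)  # 3 (16-bit code units)
```

The index is stored little-endian, which is why 0xABCD appears as the bytes cd ab in the output.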
The decoded strings must therefore be added to the file after decoding, so that each one has a matching string_data_item and a string_id_item index that the code can reference. Naturally, adding these strings causes some of the file's sizes, indices, and offsets to change. This creates another obstacle: as shown above, a DEX file has multiple dependencies between different items, and changing an index or offset that these items refer to will cause them to be parsed incorrectly or to have incorrect member values. This is why, when patching a method, it is important to ensure that the rest of the DEX file remains intact.

About Patching

To achieve this, we created dexmod, a Python helper tool that patches DEX files according to user-specified deobfuscation logic. In addition to patching, the tool also supports operations such as looking up methods by bytecode pattern and adding strings.

For the obfuscated methods in the Nexus sample to return the decoded string, dexmod must first be used to decode the string and add it to the file. The string-returning bytecode sequence described above is then placed at the beginning of the bytecode of each obfuscated method, paired with the corresponding string_id_item index. Any remaining bytes in the method can be replaced with 0x00 (NOP) for additional code cleanup, but this is not necessary.

The prologue of each method also needs to be updated to reflect these changes: the register size is reduced to 1, since only one register (v0) is used, and the bytecode size is updated to 3, since the method now consists of only three 16-bit code units (6 bytes). The other values in the prologue can remain unchanged, since the items they represent are not affected.

In the header of the DEX file, the checksum and SHA-1 signature values must also be updated; otherwise, verification of the file contents will fail. After implementing these steps with dexmod, the DEX file can be re-checked with jadx.
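
The in-place part of these steps can be sketched as follows. This is a rough illustration under stated assumptions, not dexmod's actual implementation: the function names patch_method and fix_header are hypothetical, the field offsets follow the published DEX format (a 16-byte code_item header; checksum at file offset 8, SHA-1 signature at offset 12), and the sketch assumes the patched method has no try blocks and that any new strings have already been added to the file.

```python
import hashlib
import struct
import zlib

CODE_ITEM_HEADER_SIZE = 16  # bytes in the prologue before the instructions


def patch_method(dex: bytearray, code_off: int, new_insns: bytes,
                 registers: int = 1) -> None:
    """Overwrite a method body in place and update its prologue."""
    # insns_size is the last four bytes of the prologue, in 16-bit units.
    old_units, = struct.unpack_from("<I", dex, code_off + 12)
    if len(new_insns) % 2 or len(new_insns) > old_units * 2:
        raise ValueError("patch must be whole code units and fit the old body")
    start = code_off + CODE_ITEM_HEADER_SIZE
    # Zero the old body first (0x0000 decodes as NOP), then write the patch.
    dex[start:start + old_units * 2] = bytes(old_units * 2)
    dex[start:start + len(new_insns)] = new_insns
    struct.pack_into("<H", dex, code_off, registers)                 # register size
    struct.pack_into("<I", dex, code_off + 12, len(new_insns) // 2)  # bytecode size


def fix_header(dex: bytearray) -> None:
    """Recompute the SHA-1 signature first, then the Adler-32 checksum."""
    # The signature covers everything after the signature field itself.
    dex[12:32] = hashlib.sha1(dex[32:]).digest()
    # The checksum covers everything after the checksum field, signature included.
    struct.pack_into("<I", dex, 8, zlib.adler32(dex[12:]) & 0xFFFFFFFF)
```

The order inside fix_header matters: the checksum is computed over a range that includes the signature, so the signature has to be written before the checksum is calculated.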
The once-obfuscated methods are now free of dead code and simply return the decoded string.

Since the obfuscated method in the Nexus sample is called through another method rather than directly, another possibility is to patch the caller so that it returns the string, skipping the obfuscated method altogether. Doing so would spare researchers from repeatedly jumping between methods during analysis.

Summary

This case study shows how useful Dalvik bytecode patching can be for researchers, and how it can be done using free, open-source tools. As with other deobfuscation solutions, packers and obfuscation techniques are updated frequently, so it is difficult to find a patching solution that works for a large number of applications over a long period of time. In addition, while searching an application's bytecode can be an efficient way to identify code patterns, modifying a DEX file without damaging some part of it can be a challenge.

Appendix (dexmod)

The dexmod tool contains the following scripts:
Create a method object with the following properties:
- methodIdx: method_idx value, used in Dalvik bytecode to call the method
- offset: file offset of the method bytecode
- name: the name of the method
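
To illustrate how such a method object could be filled in from the encoded_method items described earlier, here is a minimal sketch. It is not dexmod's code: the Method class and both function names are hypothetical, the name is left empty because resolving it requires the method_ids and string_ids lookups, and the offset is assumed to be code_off plus the 16-byte prologue.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Method:
    methodIdx: int  # index into method_ids
    offset: int     # assumed: file offset of the first instruction
    name: str = ""  # resolved separately via method_ids and string_ids


def read_uleb128(buf: bytes, pos: int) -> Tuple[int, int]:
    """Decode one unsigned LEB128 value and return (value, next position)."""
    result = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def read_encoded_methods(buf: bytes, pos: int, count: int) -> Tuple[List[Method], int]:
    """Decode `count` encoded_method items, accumulating method_idx_diff."""
    methods: List[Method] = []
    method_idx = 0
    for _ in range(count):
        diff, pos = read_uleb128(buf, pos)
        _access_flags, pos = read_uleb128(buf, pos)
        code_off, pos = read_uleb128(buf, pos)
        method_idx += diff  # the first diff is the absolute index
        # A code_off of 0 means the method has no code (abstract or native).
        offset = code_off + 16 if code_off else 0
        methods.append(Method(method_idx, offset))
    return methods, pos
```

Each list (direct_methods and virtual_methods) would be decoded with its own call, since the running index starts again from zero for each method type.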
The dexmod tool leverages dexterity, an open-source library that parses DEX files and assists in adding strings to them while fixing references to affected string IDs and other section offsets. The dexterity library has some limitations: it does not fix all of the string indexes referenced in the bytecode, and some changes were made to its code during this case study so that the strings could be added properly. Dexterity open source library address: https://github.com/rchiossi/dexterity