WeChat Mini Program "Decompile" Practice (Part 2): Source Code Restoration

In the previous article, "WeChat Mini Program "Decompile" Practice (Part 1): Unpacking", we described in detail how to obtain a mini program's .wxapkg package, analyzed the structure of the package, and used a script to extract the files it contains: the code and resource files produced when the mini program is "compiled". Most of these files are obfuscated and hard to read, so this article analyzes them further and restores the contents of the .wxapkg package, as far as possible, to what they were before "compilation".

Note: This article contains a fair amount of source code analysis, which is hard to read on a small phone screen. Reading it on a computer is recommended.

Special thanks: The restore tool used below comes from the open source project wxappUnpacker on GitHub. Many thanks to its original author for this generous contribution.

Overview

Front-end web pages are built with HTML + CSS + JS: HTML describes the structure of the page, CSS describes its appearance, and JS usually handles page logic and user interaction. Mini programs have the same division of roles. A mini program project mainly contains the following types of files:

  • JSON configuration file with .json suffix
  • WXML template file with .wxml suffix
  • WXSS style file with .wxss suffix
  • JavaScript script logic file with .js suffix

For example, the source code project structure of the "Knowledge Collection" mini program is as follows:

However, as shown in the previous article, unpacking the .wxapkg of the "Knowledge Collection" mini program yields the following files:

These mainly include app-config.json, app-service.js, page-frame.html, the per-page *.html files, and resource files. They have been "compiled", obfuscated, merged, and compressed, so WeChat developer tools cannot recognize them and we cannot debug, compile, or run them directly.

Therefore, we first analyze the structure and purpose of each file extracted from the .wxapkg, then show how to use a script tool to restore them in one step to the source code before "compilation" and run that code in WeChat developer tools.

File Analysis

This section uses the files unpacked from the .wxapkg of the "Knowledge Collection" mini program as its example.

You can also skip this analysis and go straight to the next section, which shows how to "decompile" and restore the source code with scripts.

app-config.json

A mini program project contains three main types of JSON configuration file: the tool configuration project.config.json, the global configuration app.json, and the page configuration page.json. Among them:

  • project.config.json personalizes the developer tools and holds some basic settings of the mini program project, so it is not "compiled" into the .wxapkg package;
  • app.json is the global configuration of the mini program, including all page paths, interface appearance, network timeouts, the bottom tab bar, and so on;
  • page.json configures the window appearance of a single page. Its settings override the same settings in the window section of app.json.

Therefore, the "compiled" file app-config.json is in effect a merged summary of app.json and the configuration files of each page. Its content is roughly as follows:

    {
      "page": { // Configuration of each page
        "pages/index/index.html": { // Path of one page
          "window": { // Specific configuration of that page
            "navigationBarTitleText": "Knowledge Collection",
            "enablePullDownRefresh": true
          }
        },
        // Omitted here...
      },
      "entryPagePath": "pages/index/index.html", // Mini program entry page
      "pages": ["pages/index/index", "pages/detail/detail", "pages/search/search"], // Page list
      "global": { // Global page configuration
        "window": {
          "navigationBarTextStyle": "black",
          "navigationBarTitleText": "Knowledge Collection",
          "navigationBarBackgroundColor": "#F8F8F8",
          "backgroundColor": "#F8F8F8"
        }
      }
    }

Comparing this with the app.json and per-page page.json files of the original project yields a simple merge rule for app-config.json, which makes it easy to split the file back into the corresponding .json files from before "compilation".
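The split can be sketched in a few lines of Node.js. This is only an illustration that assumes the layout of the sample above (the keys page, global, pages, and entryPagePath). The function name splitAppConfig and the shape of its return value are made up for this example and are not part of wxappUnpacker or any WeChat tooling; a real app-config.json can contain further keys that this sketch ignores.

```javascript
// Sketch: split a parsed app-config.json into app.json plus one JSON per page.
// Assumes only the keys shown in the sample above; anything else is ignored.
function splitAppConfig(config) {
  const files = {};

  // app.json: the page list plus the global window settings.
  // The entry page is moved to the front, because app.json treats the first
  // item of "pages" as the start page.
  const entry = (config.entryPagePath || "").replace(/\.html$/, "");
  const pages = [...(config.pages || [])];
  const idx = pages.indexOf(entry);
  if (idx > 0) {
    pages.splice(idx, 1);
    pages.unshift(entry);
  }
  files["app.json"] = {
    pages,
    window: (config.global && config.global.window) || {},
  };

  // page.json: every key under "page" is "<route>.html", and its "window"
  // object becomes the content of "<route>.json".
  for (const [htmlPath, pageConfig] of Object.entries(config.page || {})) {
    files[htmlPath.replace(/\.html$/, ".json")] = pageConfig.window || {};
  }
  return files;
}

const sample = {
  page: {
    "pages/index/index.html": {
      window: {
        navigationBarTitleText: "Knowledge Collection",
        enablePullDownRefresh: true,
      },
    },
  },
  entryPagePath: "pages/index/index.html",
  pages: ["pages/index/index", "pages/detail/detail", "pages/search/search"],
  global: { window: { navigationBarTextStyle: "black" } },
};

const files = splitAppConfig(sample);
console.log(JSON.stringify(files, null, 2));
```

Each key of the returned object is a file path relative to the project root, so writing the result to disk is a matter of one JSON.stringify per entry.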

app-service.js

In a mini program project, JS files are responsible for the interaction logic. They mainly include app.js, the page.js of each page, JS files written by the developer, and imported third-party JS files. After "compilation", all of these JS files are merged into app-service.js, whose structure is as follows:

    // Declaration of some global variables
    var __wxAppData = {};
    var __wxRoute;
    var __wxRouteBegin;
    var __wxAppCode__ = {};
    var global = {};
    var __wxAppCurrentFile__;
    var Component = Component || function () {};
    var definePlugin = definePlugin || function () {};
    var requirePlugin = requirePlugin || function () {};
    var Behavior = Behavior || function () {};

    // Version of the compiler (wcc) used to build the mini program
    /*v0.6vv_20180125_fbi*/
    global.__wcc_version__ = 'v0.6vv_20180125_fbi';
    global.__wcc_version_info__ = { "customComponents": true, "fixZeroRpx": true, "propValueDeepCopy": false };

    // Third-party or custom JS source code in the project
    define("utils/util.js", function (require, module, exports, window, document, frames, self, location, navigator, localStorage, history, Caches, screen, alert, confirm, prompt, XMLHttpRequest, WebSocket, Reporter, webkit, WeixinJSCore) {
      "use strict";
      // ... specific source code content
    });

    // ...

    // Definition of the app.js source code
    define("app.js", function (/* same parameter list as above */) {
      "use strict";
      // ... app.js source code content
    });
    require("app.js");

    // Definition of the JS source code of each page
    __wxRoute = 'pages/index/index'; // Page route
    __wxRouteBegin = true;
    define("pages/index/index.js", function (/* same parameter list as above */) {
      "use strict";
      // ... page.js source code content
    });
    require("pages/index/index.js");

In this file, each JS file of the original mini program project is registered through the define method. The definition contains both the path and the content of the JS file:

    define("path/to/xxx.js", function (/* ... */) {
      "use strict";
      // ... xxx.js source code content
    });

Therefore, we can easily extract the source code of these JS files and write each one back to its original path. The contents have been obfuscated and compressed, so we can run them through a tool such as UglifyJS to beautify them. Some original variable names cannot be recovered, but this rarely gets in the way of reading and using the code.

page-frame.html

In a mini program, WXML files describe the page structure and WXSS files describe the page style. The project contains an app.wxss file that defines global styles and is automatically applied to every page. Each page also has its own page.wxml and page.wxss describing its structure and style. In addition, developers often write shared xxxCommon.wxss style files and shared xxxTemplate.wxml template files for reuse across pages; these are usually imported from each page's page.wxss and page.wxml.

After the mini program is "compiled", all .wxml files, app.wxss, and the shared xxxCommon.wxss style files are merged into page-frame.html, while the page.wxss style file of each page generates a page.html file at that page's own path.

The structure of page-frame.html is as follows:

    <!DOCTYPE html>
    <html lang="zh-CN">
    <head>
      <meta charset="UTF-8" />
      <meta name="viewport" content="width=device-width, user-scalable=no, initial-scale=1.0, maximum-scale=1.0, minimum-scale=1.0" />
      <meta http-equiv="Content-Security-Policy" content="script-src 'self' 'unsafe-inline'">
      <link rel="icon" href="data:image/ico;base64,aWNv">
      <script>
        // Declaration of some global variables
        var __pageFrameStartTime__ = Date.now();
        var __webviewId__;
        var __wxAppCode__ = {};
        var __WXML_GLOBAL__ = {
          entries: {},
          defines: {},
          modules: {},
          ops: [],
          wxs_nf_init: undefined,
          total_ops: 0
        };

        // Version of the compiler (wcc) used to build the mini program
        /*v0.6vv_20180125_fbi*/
        window.__wcc_version__ = 'v0.6vv_20180125_fbi';
        window.__wcc_version_info__ = {
          "customComponents": true,
          "fixZeroRpx": true,
          "propValueDeepCopy": false
        };

        var $gwxc
        var $gaic = {}
        $gwx = function (path, global) {
          // $gwx method definition (core)
        }

        var BASE_DEVICE_WIDTH = 750;
        var isIOS = navigator.userAgent.match("iPhone");
        var deviceWidth = window.screen.width || 375;
        var deviceDPR = window.devicePixelRatio || 2;
        function checkDeviceWidth() {
          // checkDeviceWidth method definition
        }
        checkDeviceWidth()

        var eps = 1e-4;
        function transformRPX(number, newDeviceWidth) {
          // transformRPX method definition
        }

        var setCssToHead = function (file, _xcInvalid) {
          // setCssToHead method definition
        }
        setCssToHead([])(); // Clear the CSS in the head first
        setCssToHead([ /* ... */ ]); // Insert the content of app.wxss into the head; the elided argument is the content of app.wxss in the mini program project
        var __pageFrameEndTime__ = Date.now()
      </script>
    </head>
    <body>
      <div></div>
    </body>
    </html>

Compared with the other files, page-frame.html is more complicated. WeChat "compiles" the .wxml files and some of the .wxss files directly into obfuscated JS code and places that code in this file. Calling the code then builds a virtual DOM, from which the page is rendered.

The two most important methods are $gwx and setCssToHead.

$gwx produces the content of every .wxml file from JS code: the structure of each .wxml file is defined, in obfuscated form, inside the $gwx method. Passing it the .wxml path of a page yields that page's content, which needs only light processing to be restored to its form before "compilation".

Inside $gwx there is an array x that lists the .wxml files of the current mini program. For the "Knowledge Collection" mini program, x is:

    var x = ['./pages/detail/detail.wxml', '/towxml/entry.wxml', './pages/index/index.wxml', './pages/search/search.wxml', './towxml/entry.wxml', '/towxml/renderTemplate.wxml', './towxml/renderTemplate.wxml'];

We can now open page-frame.html in Chrome and enter the following command in the Console to get the content of index.wxml. The output is a JS object; traversing it restores the content of the .wxml file.

    $gwx("./pages/index/index.wxml")
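To make "traversing this object" concrete, here is a small sketch of such a traversal. The node shape it walks, { tag, attr, children } with plain strings as text nodes, is an assumption chosen for illustration, and toWxml is a made-up name. The object $gwx actually returns also has to encode data bindings and control attributes, which this sketch does not attempt to handle.

```javascript
// Sketch: turn a node tree back into WXML-like markup.
// Assumed node shape (hypothetical): { tag, attr, children }, where children
// are nested nodes or plain strings for text content.
function toWxml(node, depth = 0) {
  const pad = "  ".repeat(depth);
  if (typeof node === "string") {
    return pad + node + "\n";
  }
  const attrs = Object.entries(node.attr || {})
    .map(([name, value]) => ` ${name}="${value}"`)
    .join("");
  const children = node.children || [];
  if (children.length === 0) {
    return `${pad}<${node.tag}${attrs}/>\n`;
  }
  const inner = children.map((child) => toWxml(child, depth + 1)).join("");
  return `${pad}<${node.tag}${attrs}>\n${inner}${pad}</${node.tag}>\n`;
}

const tree = {
  tag: "view",
  attr: { class: "container" },
  children: [
    { tag: "text", attr: {}, children: ["Hello"] },
    { tag: "image", attr: { src: "/images/logo.png" }, children: [] },
  ],
};

const wxml = toWxml(tree);
console.log(wxml);
```

The recursion mirrors the tree: each node prints its opening tag, then its children one level deeper, then its closing tag.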

The setCssToHead method generates .wxss code from arrays of split-up style strings and inserts it into the HTML head. The method also embeds, in its _C variable, the style arrays of all imported .wxss files (the shared xxxCommon.wxss style files) and records which files reference the data in _C. In addition, at the end of page-frame.html this method is called to generate the content of the global app.wxss and insert it into the head.

Therefore, at each place where setCssToHead is called, we can extract the corresponding .wxss content from its arguments and restore it.
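As a rough illustration of restoring .wxss from such an array, the sketch below joins the pieces back together. The encoding is an assumption, since the article elides the array contents: plain strings are taken as literal CSS text, and a pair of the form [0, n] is taken to mean the length "n rpx". The name rebuildWxss is made up, and any other kind of entry (for example a reference into _C) is left visible as a comment instead of being guessed at.

```javascript
// Sketch: rebuild .wxss text from the kind of array passed to setCssToHead.
// Assumed encoding (not confirmed by the article): strings are literal CSS
// text, and a pair [0, n] stands for the length "n rpx".
function rebuildWxss(parts) {
  return parts
    .map((part) => {
      if (typeof part === "string") return part;
      if (Array.isArray(part) && part[0] === 0) return `${part[1]}rpx`;
      // Unknown markers are kept visible so they can be resolved by hand.
      return `/* unhandled: ${JSON.stringify(part)} */`;
    })
    .join("");
}

const parts = [".title{font-size:", [0, 32], ";margin:", [0, 20], " auto;}"];
const wxss = rebuildWxss(parts);
console.log(wxss);
```

The result can then be passed through a CSS beautifier to get back readable, multi-line rules.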

For a more detailed analysis of the two methods $gwx and setCssToHead in the page-frame.html file, you can refer to this article.

In addition, the checkDeviceWidth method detects the screen width, and the transformRPX method uses the result to convert rpx units to px.

The full name of rpx is responsive pixel. It is a size unit defined by mini programs that adapts to the width of the current device screen: every screen is defined to be 750rpx wide, so the number of actual pixels that 1rpx represents depends on the real width of the screen. On a screen 375px wide, for example, 1rpx equals 0.5px.
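The conversion itself is a simple proportion. The sketch below reuses the two constants that appear in page-frame.html (BASE_DEVICE_WIDTH = 750 and eps = 1e-4), but the body of transformRPX is elided in the listing above, so the function here is only an assumption about its core arithmetic and not WeChat's implementation. The name rpxToPx is made up, and any special handling the real method may apply to very small values is left out.

```javascript
const BASE_DEVICE_WIDTH = 750; // every screen is treated as 750rpx wide
const EPS = 1e-4; // guards against floating-point results such as 49.999999

// Sketch: convert an rpx length to whole px for a device width given in px.
function rpxToPx(rpx, deviceWidth) {
  return Math.floor((rpx / BASE_DEVICE_WIDTH) * deviceWidth + EPS);
}

console.log(rpxToPx(750, 375)); // 375: the full width of a 375px screen
console.log(rpxToPx(100, 375)); // 50
console.log(rpxToPx(100, 414)); // 55 (55.2 rounded down)
```

When restoring .wxss files the direction is reversed: the goal is to keep the original rpx values, not the px values computed for one particular device.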

*.html

As mentioned above, after "compilation" the page.wxss style file of each page produces a page.html file at that page's path. The structure of each page.html is as follows:

    <style></style>
    <page></page>
    <script>
      var __setCssStartTime__ = Date.now();
      setCssToHead([ /* ... */ ])() // Insert the content of search.wxss
      var __setCssEndTime__ = Date.now();
      document.dispatchEvent(new CustomEvent("generateFuncReady", {
        detail: {
          generateFunc: $gwx('./pages/search/search.wxml')
        }
      }))
    </script>

In this file, the .wxss style content is inserted into the head by calling setCssToHead, so we can likewise recover the page.wxss of each page from the arguments passed to setCssToHead.
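One way to read those arguments is to run the script of a page.html with stubs in place of setCssToHead and $gwx and record what they receive. The sketch below does this with Node's built-in vm module. The function name inspectPageHtml, the stub objects, and the sample page are all illustrative, and the style array in the sample is a simplified placeholder. Since vm is not a security boundary, this should only be run on files you are prepared to execute.

```javascript
const vm = require("vm");

// Sketch: run the <script> of a page.html with stubs and record what it
// passes to setCssToHead and $gwx.
function inspectPageHtml(html) {
  const script = /<script>([\s\S]*?)<\/script>/.exec(html);
  if (!script) return null;

  const captured = { styleParts: null, wxmlPath: null };
  const sandbox = {
    // setCssToHead(parts) returns a function that is called immediately.
    setCssToHead(parts) {
      captured.styleParts = parts;
      return function () {};
    },
    $gwx(path) {
      captured.wxmlPath = path;
      return function () {};
    },
    // Minimal stand-ins for the browser objects the script touches.
    document: { dispatchEvent() {} },
    CustomEvent: function CustomEvent() {},
  };
  vm.runInNewContext(script[1], sandbox);
  return captured;
}

const pageHtml = `<style></style>
<page></page>
<script>
var __setCssStartTime__ = Date.now();
setCssToHead([".search{height:80px;}"])()
var __setCssEndTime__ = Date.now();
document.dispatchEvent(new CustomEvent("generateFuncReady", {
  detail: {
    generateFunc: $gwx('./pages/search/search.wxml')
  }
}))
</script>`;

const result = inspectPageHtml(pageHtml);
console.log(result.wxmlPath);
console.log(JSON.stringify(result.styleParts));
```

The recorded wxmlPath also tells us where to write the restored style file: the same path with the .wxss suffix.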

Resource Files

After "compilation", images, audio, and other resource files in the mini program project are copied into the .wxapkg package unchanged, with their original paths preserved, so we can use them directly.

"Decompile"

In the previous section we briefly analyzed almost every file in the .wxapkg package. Now let's see how to restore the mini program's source code with Node.js scripts.

Thanks again to the author of wxappUnpacker for providing the restore tool, which lets us "stand on the shoulders of giants" and complete the "decompilation" easily. It is used as follows:

  • node wuConfig.js: splits the content of app-config.json into app.json and the page.json of each page;
  • node wuJs.js: splits app-service.js back into the original individual JS files and uses the Uglify-ES beautifier to restore the code as closely as possible to its form before "compilation";
  • node wuWxml.js [-m]: extracts and restores each page's .wxml, along with app.wxss and the shared .wxss style files, from page-frame.html;
  • node wuWxss.js: takes the directory in which the .wxapkg was unpacked as its argument, then extracts and restores the page.wxss style file of each page from the corresponding page.html.

The author also provides a script that unpacks and restores in one step. Given the .wxapkg file of a mini program, run the following command:

    node wuWxapkg.js [-d]

The script automatically unpacks the .wxapkg file and restores the "compiled" and obfuscated files in the package to their original state, including the directory structure.

PS: The tool depends on Node.js packages such as uglify-es, vm2, esprima, cssbeautify, and css-tree, so you may need to install them with npm install before it runs correctly.

For more detailed usage and related issues, please refer to the GitHub repo of this open source project.

Finally, we create a new empty mini program project in WeChat developer tools, import the restored directories and files into it, and then compile and run it. The following figure shows the project restored from the .wxapkg of the "Knowledge Collection" mini program:

That’s it!

Summary

This article analyzed in detail the files obtained by unpacking a .wxapkg and showed how to use scripts to restore the source code of any mini program in one step.

For simple mini programs written in the native style described in WeChat's official documentation, the tools above can usually restore source code that runs as is. For mini programs with complex logic, or ones built with frameworks such as WePY and Vue, the restored source code may have minor problems that have to be analyzed and fixed by hand.

Follow-up

This article's analysis of the structure and purpose of each file produced by "compiling" a mini program is fairly scattered, and it does not examine how the files depend on one another or how they are loaded. Later articles will explain how the WeChat client parses, loads, and runs a mini program's .wxapkg package.
