Amap APP full-link source code dependency analysis project

1. Background

After years of development, the Amap App codebase has grown to millions of lines, supporting the complex business functions of Amap Maps. However, as the team expanded and the business grew more complex, increasingly fragmented code and tangled dependencies between modules have created many maintenance problems. The most prominent include:

  • We dare not casually modify or retire an exposed interface or component, because we do not know where it is depended on and what would be affected. As a result, dead code accumulates and the package keeps growing.
  • When a module is integrated unchanged into a new client version, its features still require a full regression test, because we do not know whether the modules it depends on have changed.
  • It is hard to judge whether the shift of Native code from business implementation toward underlying support is trending in a reasonable direction and whether governance is effective.

These problems have reached the point where we must address them, and the key to solving them is understanding the dependencies between code.

2. Amap APP Platform Architecture

Before discussing the implementation of dependency analysis, let us briefly describe the platform architecture of the Amap APP, so that the terms and scenarios below have some context.

From a language-platform perspective, the Amap APP can be divided into four parts. The JS layer is mainly responsible for business logic and the UI framework. The C++ layer in the middle handles high-performance rendering (mainly map rendering) and implements some aspect APIs, so that a single implementation can serve both ends. The Android and iOS layers act mainly as adaptation layers, connecting to operating-system interfaces and smoothing out the differences between the two platforms (as much as possible).

The "aspect" here refers to the boundary between the JS layer and the Native/C++ layers. A series of aspect APIs are implemented at this boundary: interfaces for interaction between the JS layer and the Native/C++ layers, such as the Bluetooth interface and the system-information interface. Each interface is implemented by the Native/C++ layer, exposed to the JS layer, and called from JS.

3. Basic Implementation Principles

The most basic and important data in the entire project is the dependency relationship. The simplest example is that file A depends on a method in file B.

Finding this relationship generally requires two steps.

Step 1: Compile the source code and obtain the AST

Traverse all source files and generate an Abstract Syntax Tree (AST) through syntax analysis. Taking the JS scanner as an example, we use the typescript module as the compiler, which supports both JS(X) and TS(X), and generate the AST through ts.createSourceFile. Beyond JS, iOS uses Clang, Android uses bytecode analysis, and C++ uses symbol-table analysis.
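As a rough illustration of this step (not the project's actual scanner code), assuming the npm typescript package, a single file can be parsed into an AST without any type checking:

```typescript
import * as ts from "typescript";

const source = `import { foo } from "@bundleA/utils";
export const bar = foo() + 1;`;

// ts.createSourceFile performs lexing + parsing only, which is all a
// dependency scanner needs; no full ts.Program or type checker is required.
const sourceFile = ts.createSourceFile(
  "example.ts",            // file name, used for diagnostics
  source,                  // source text
  ts.ScriptTarget.ES2019,  // language version to parse against
  /* setParentNodes */ true
);

// The first top-level statement is the import declaration.
console.log(ts.SyntaxKind[sourceFile.statements[0].kind]); // "ImportDeclaration"
```

Because only syntax is analyzed, this scales to scanning a whole repository far faster than a full type-checked compilation would.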

Step 2: Extract paths and resolve dependencies

From the AST, we can find all reference and export expressions, for example import/require and export/module.exports in JS. Expressions are found by recursively traversing all syntax nodes; in JS, we use ts.forEachChild from the TypeScript compiler to traverse, and identify syntax node types through ts.SyntaxKind.

After finding an expression, we resolve the dependency path to the specific file. Taking JS as an example, a module can reference identifiers (identifierName) from a file (fileName) in another module (bundleName) via const { identifierName } = require('@bundleName/fileName'); we need to locate the specific identifier from this expression.
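The two halves of this step can be sketched together, again assuming the typescript package; the `@bundleName/fileName` splitting below is an illustrative reading of the convention described above, not the project's real resolver:

```typescript
import * as ts from "typescript";

interface Dep { bundleName: string; fileName: string; }

// Split the hypothetical "@bundleName/fileName" path convention.
function parseDepPath(path: string): Dep | null {
  const m = /^@([^/]+)\/(.+)$/.exec(path);
  return m ? { bundleName: m[1], fileName: m[2] } : null;
}

function collectDeps(sourceFile: ts.SourceFile): Dep[] {
  const deps: Dep[] = [];
  const visit = (node: ts.Node): void => {
    // Case 1: import { x } from "@bundle/file"
    if (ts.isImportDeclaration(node) && ts.isStringLiteral(node.moduleSpecifier)) {
      const dep = parseDepPath(node.moduleSpecifier.text);
      if (dep) deps.push(dep);
    }
    // Case 2: const { x } = require("@bundle/file")
    if (
      ts.isCallExpression(node) &&
      ts.isIdentifier(node.expression) &&
      node.expression.text === "require" &&
      node.arguments.length === 1 &&
      ts.isStringLiteral(node.arguments[0])
    ) {
      const dep = parseDepPath((node.arguments[0] as ts.StringLiteral).text);
      if (dep) deps.push(dep);
    }
    ts.forEachChild(node, visit); // recurse into every child syntax node
  };
  visit(sourceFile);
  return deps;
}

const sf = ts.createSourceFile(
  "page.ts",
  `import { render } from "@uiKit/render";
   const { request } = require("@network/http");`,
  ts.ScriptTarget.ES2019,
  true
);
console.log(collectDeps(sf));
// [{ bundleName: "uiKit", fileName: "render" }, { bundleName: "network", fileName: "http" }]
```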

Cross-aspect dependencies require one more step. Aspect API data is split into a calling side and a declaring side: calling-side data is extracted from the JS-layer AST, while declaring-side data (the identifiers implementing each aspect API) is extracted at the Native/C++ layer. The two sides are then associated by version number to achieve full dependency-link penetration.
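The association step amounts to a join keyed by version and API name. The record shapes below (AspectCall, AspectDecl) are illustrative, not the project's real schema:

```typescript
// Call-side records come from the JS scanner; declare-side records come
// from the Native/C++ scanners. Joining on (version, api) links the chain.
interface AspectCall { version: string; api: string; callerFile: string; }
interface AspectDecl { version: string; api: string; implSymbol: string; }

function joinAspect(calls: AspectCall[], decls: AspectDecl[]) {
  const byKey = new Map<string, AspectDecl>();
  for (const d of decls) byKey.set(`${d.version}#${d.api}`, d);
  // Keep only calls whose declaration exists in the same app version.
  return calls.flatMap((c) => {
    const d = byKey.get(`${c.version}#${c.api}`);
    return d
      ? [{ version: c.version, api: c.api, callerFile: c.callerFile, implSymbol: d.implSymbol }]
      : [];
  });
}

const linked = joinAspect(
  [{ version: "10.0", api: "bluetooth.scan", callerFile: "pages/device.ts" }],
  [{ version: "10.0", api: "bluetooth.scan", implSymbol: "BluetoothModule::scan" }]
);
console.log(linked[0].implSymbol); // "BluetoothModule::scan"
```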

We save these relationships, together with some metadata, as the source data for analysis.

4. Project Architecture

The overall project structure is as follows:

We built the dependency-analysis service with Node.js and the group's egg.js framework. Given the variability and diversity of data-usage scenarios, we chose GraphQL as the query interface for exposing the data types we defined, which upper-level applications then encapsulate. When multiple upper-level applications need similar data, we also consolidate it for reuse.

The data-processing module is an independent module written in Node.js. It can be reused in other projects, and we plan to reuse it in the IDE and elsewhere in the future.

On the left are our data consumers, only a few of which are listed here; on the right is our database, which stores the analysis results; at the bottom are the four platform scanners and the triggers. Each of the four ends produces source data for its own platform's source code. Triggers support release-process triggering, event triggering, scheduled triggering, front-end triggering (the application's front end, not the Web front end), manual triggering, and so on.

5. Application scenarios and implementation principles

There are endless possibilities for using full-link dependencies. Here are a few examples.

Impact scope judgment (reverse dependency analysis)

The first application scenario is impact-scope judgment, which was also the original motivation for the project. Anyone who maintains an interface (or component) finds that as more and more places use it, the risk of iterating on it grows. We need to know exactly where the interface is called in order to decide how many functions to regression test, how to release, and how to stay compatible. This requires reverse dependency analysis.

Reverse dependency is defined relative to the dependencies produced by the scanners. Scanner output is called forward dependency and answers "which modules does this module depend on"; reverse dependency answers "which modules depend on this module". Naturally, our reverse dependencies are computed from the forward-dependency data.
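In graph terms, this computation is just the forward-dependency graph with every edge flipped. A minimal, dependency-free sketch (module names are illustrative):

```typescript
// forward: module -> the modules it depends on
type DepGraph = Map<string, string[]>;

function invert(forward: DepGraph): DepGraph {
  const reverse: DepGraph = new Map();
  for (const [from, targets] of forward) {
    for (const to of targets) {
      if (!reverse.has(to)) reverse.set(to, []);
      reverse.get(to)!.push(from); // "to" is depended on by "from"
    }
  }
  return reverse;
}

const forward: DepGraph = new Map([
  ["search", ["mapEngine", "uiKit"]],
  ["route", ["mapEngine"]],
]);
console.log(invert(forward).get("mapEngine")); // ["search", "route"]
```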

(Reverse dependency query page)

Based on the reverse-dependency data across multiple versions, we can also compute the "number of consecutive unreferenced versions" to measure how safe it is to take an interface offline.
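One plausible way to compute this metric (a sketch, not the project's exact definition): given an API's reference counts per version, ordered from oldest to newest, count the trailing run of zeros.

```typescript
// refCountsByVersion[i] = how many call sites reference this API in version i
function consecutiveUnreferenced(refCountsByVersion: number[]): number {
  let streak = 0;
  // Walk backwards from the newest version until a referenced version appears.
  for (let i = refCountsByVersion.length - 1; i >= 0; i--) {
    if (refCountsByVersion[i] !== 0) break;
    streak++;
  }
  return streak;
}

// Reference counts of one aspect API across five releases:
console.log(consecutiveUnreferenced([7, 3, 0, 0, 0])); // 3
```

The larger the streak, the more confident the maintainer can be that retiring the interface is safe.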

(The number of consecutive unused versions of some aspect API)

Maintainers of component libraries, frameworks, and aspect APIs are the heaviest users of this capability. It gives them data on how their modifications will affect other modules, informing change decisions, release decisions, and regression testing.

Analysis of changes between versions

When a version is submitted for testing, we can compare the dependency chains of the two versions, analyze the changed files and their entire impact chain, and give QA data support so they know more precisely which functions need regression testing and which do not.

There are many scenarios for version-change analysis. Besides normal version iteration, another common scenario is a module being integrated unchanged into a new version of the Amap APP. Here, "the released code is unchanged, but the modules it depends on have changed", especially Native/C++ and common modules. Testing needs to know what has changed in the modules the current module depends on, what impact those changes have on this module, and which functional points need regression testing.
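Computing the impact chain can be sketched as a breadth-first walk over the reverse-dependency graph, starting from the modules that changed. The graph data here is illustrative:

```typescript
// reverse: module -> the modules that depend on it
function impactedBy(changed: string[], reverse: Map<string, string[]>): Set<string> {
  const impacted = new Set<string>(changed);
  const queue = [...changed];
  while (queue.length > 0) {
    const current = queue.shift()!;
    for (const dependent of reverse.get(current) ?? []) {
      if (!impacted.has(dependent)) {
        impacted.add(dependent); // dependent transitively consumes a change
        queue.push(dependent);
      }
    }
  }
  return impacted;
}

const reverse = new Map<string, string[]>([
  ["mapEngine", ["search", "route"]], // mapEngine is depended on by search, route
  ["route", ["navigation"]],
]);
console.log([...impactedBy(["mapEngine"], reverse)]);
// ["mapEngine", "search", "route", "navigation"]
```

Everything in the returned set is a regression-test candidate; everything outside it can, in principle, be skipped.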

The main consumers of this data are QA engineers, who use it to improve testing efficiency and catch regression points that would otherwise be missed.

Trend change judgment

As mentioned before, because the Amap APP has a long history and there were no restrictions in the past, some business logic is still implemented in Native. We hope to gradually migrate it to the JS or C++ layer, leaving Native only for adaptation.

To judge the progress and effect of this governance, two kinds of data are needed. One is the number of lines of code on each platform, for which we have a dedicated service, so we will not cover it here; the other is the interface trend, which splits into the calling side and the declaring side. Given the direction of our governance, the expected effect is a curve in which the number of calls to Native business aspect APIs decreases steadily over versions/time. Once an API's call count drops to 0, that API can be taken offline, producing a second curve: the number of declared Native business aspect APIs also decreasing steadily.

(Since a certain version, the number of aspect API calls has been continuously reduced)

(Aspect API not used in a certain version)

Engineers working on architecture governance and aspect-API governance are the main consumers of this data. With it, they can judge whether governance is trending in the right direction, whether a given aspect API can be taken offline, and so on.

Package size optimization: finding unused and duplicate files

We have also contributed to package-size optimization. From the dependency data, we can find files that are never referenced, as well as files with identical content (the same MD5 hash), both of which take up significant space.
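Duplicate detection by content hash can be sketched as follows, assuming a Node.js environment; file contents are inlined here, whereas a real scanner would read them from disk:

```typescript
import { createHash } from "node:crypto";

// Group files by the MD5 digest of their contents; any group with more
// than one member is a set of duplicate candidates.
function findDuplicates(files: Record<string, string>): string[][] {
  const byHash = new Map<string, string[]>();
  for (const [name, content] of Object.entries(files)) {
    const md5 = createHash("md5").update(content, "utf8").digest("hex");
    if (!byHash.has(md5)) byHash.set(md5, []);
    byHash.get(md5)!.push(name);
  }
  return [...byHash.values()].filter((group) => group.length > 1);
}

const dupes = findDuplicates({
  "icon@2x.png": "PNGDATA",
  "icon@3x.png": "PNGDATA", // same bytes masquerading as a @3x asset
  "logo@2x.png": "OTHER",
});
console.log(dupes); // [["icon@2x.png", "icon@3x.png"]]
```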

Using the dependency-analysis service, we found thousands of such images, with @1x/@2x/@3x variants being the hardest-hit area: many images were masquerading as a different resolution. (We even pushed designers to standardize their output and added verification tools.)

6. Final Thoughts

The above is a basic overview of Amap's full-link dependency-analysis project. The concrete implementation involves countless details: assorted historical issues, multi-level version handling that produces exponentially many code snapshots, change analysis that produces exponentially many results, and so on. It also draws heavily on compiler principles and on data structures and algorithms (especially graphs), testing programming ability, trade-off judgment, and, above all, resilience. Everyone is welcome to discuss and bring new ideas and new scenarios!
