Mobile IM development: technology selection and common problems

I am currently working on an iOS IM SDK, and the beta version is out. For details, see http://netease.im. During the internal trial, other departments and partners kept asking about various technical details, so I wrote this article to record and introduce the main aspects of building an IM app: technology selection (communication method, network connection method, protocol selection) and common problems.

Communication method selection

There are only two options for IM communication: direct device connection (P2P) and transfer through a server.

P2P

P2P is more common in chat tools within a local area network. Typical applications include Flying Pigeon, Maze (you know), etc. Software of this type generally does two things after it starts:

  • Perform a UDP broadcast: send its own information and receive information from other terminals on the same LAN

  • Open a TCP listener: wait for other peers to connect

For the detailed process, refer to the source code of Flying Pigeon. However, this method has various limitations and inconveniences. On the one hand, it is only suitable for point-to-point messages between online peers and does not support offline messages, groups, or similar services. On the other hand, because of NAT, it is much harder to connect machines on different LANs, and under certain network types (symmetric NAT) a connection cannot be established at all.

Server Transfer

Almost all Internet IM products use server-transfer to transmit messages. Compared with the P2P method, it has the following advantages:

  • Able to support more services that P2P cannot support or cannot support well, such as offline messaging, groups, and chat room services

  • Facilitates the expansion of business logic and compatibility between old and new versions

Of course, it also has its own problems: the server architecture is complex and the concurrency requirements are high.

Network connection method

There are two main ways to connect to the IM network:

  • Long connection based on TCP

  • HTTP short connection PULL method

The latter is common in web IM systems (although many web IMs are now implemented on WebSocket). Its advantages are simple implementation and easy development. Its problems are high traffic, high server load, poor message timeliness, and poor support for large numbers of users. It is better suited to small IM systems, such as the customer service system of a small website.

A TCP long connection can better support a large number of users, but the client and server implementations are relatively complicated. There are also variants, such as using MQTT for server notifications and downlink messages and HTTP short connections for uploading instructions and messages. This approach ensures the timeliness of downlink messages and instructions, but slow uplink remains a serious problem on weak networks. Early versions of Laiwang were based on this method.

Protocol Selection

The general principles of IM protocol selection are: easy to extend, able to cover a variety of business logic, and relatively economical with traffic. The last requirement is especially important for mobile IM.

Common protocols are:

  • XMPP

  • SIP

  • MQTT

  • Private Protocol

The advantages of the XMPP protocol are that it is an open protocol, is highly extensible, and has implementations in various languages on every end (including the server), making it easy for developers to adopt. However, it also has many disadvantages: XML has weak expressiveness, carries too much redundant information, uses a lot of traffic, and has many pitfalls in actual use.

The SIP protocol is mostly used in VoIP-related modules. It is a text protocol. Since I have not actually used it, I will not comment on it, but from the fact that it is a text protocol, it can almost be concluded that its traffic will not be small.

The advantages of MQTT are a simple protocol and low traffic, but it was not designed specifically for IM and is mostly used for push.

Almost all mainstream IM apps on the market use private protocols. A well-designed private protocol generally has the following advantages: high efficiency, traffic saving (usually using binary protocol), high security, and difficult to crack. The disadvantage is that there are no existing samples to refer to in the early stage of development, which places high demands on designers.

A good protocol needs to meet the following conditions: high efficiency, simplicity, good readability, traffic saving, easy to expand, and able to match the current team's technology stack. Based on the above principles, we can conclude that: If the team is small and the team's technology accumulation in IM is not enough, you can consider using XMPP or MQTT+HTTP short connection implementation. Otherwise, you can consider designing and implementing a private protocol yourself.

Design of private protocol

Serialization selection

Compared with wired networks, the main characteristics of the mobile Internet are low bandwidth, high latency, high packet loss, poor stability, and high traffic costs. Therefore, private protocols generally use binary serialization instead of text. Common binary serialization libraries include protobuf and MessagePack. Of course, you can also implement your own binary serialization and deserialization, as Mogujie's TeamTalk does. However, the former two are far better than TeamTalk in terms of extensibility and readability (TeamTalk does not even support varints, so an int always occupies 4 bytes on the wire), so in most cases it is not recommended to implement binary serialization and deserialization yourself.
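To make the size difference concrete, here is a minimal sketch of a protobuf-style base-128 varint encoder next to a fixed 4-byte encoding. The function names encodeVarint and encodeFixed32 are made up for illustration; in a real project the serialization library does this work for you.

```cpp
#include <cstdint>
#include <vector>

// Protobuf-style varint: 7 payload bits per byte, least significant group
// first; the high bit of a byte is set when more bytes follow.
std::vector<uint8_t> encodeVarint(uint32_t value) {
    std::vector<uint8_t> out;
    while (value >= 0x80) {
        out.push_back(static_cast<uint8_t>((value & 0x7F) | 0x80));
        value >>= 7;
    }
    out.push_back(static_cast<uint8_t>(value));
    return out;
}

// Fixed-width encoding: always 4 bytes (little-endian here), however small
// the value is.
std::vector<uint8_t> encodeFixed32(uint32_t value) {
    std::vector<uint8_t> out(4);
    for (int i = 0; i < 4; ++i) {
        out[i] = static_cast<uint8_t>((value >> (8 * i)) & 0xFF);
    }
    return out;
}
```

With these helpers, encodeVarint(1) produces a single byte while encodeFixed32(1) always produces four, which is the overhead the text attributes to a serializer without varint support.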

Protocol format design

Application layer protocols based on TCP are generally divided into a packet header and a packet body (as in HTTP), and the IM protocol is no exception. The packet header carries the part common to every request/response, such as packet length, request type, and return code. The packet body carries the information specific to each request/response.
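Because TCP delivers a byte stream rather than discrete packets, the receiver has to use the length in the header to cut the stream back into packets. The sketch below assumes, purely for illustration, that every packet starts with a 4-byte big-endian length that counts the bytes following it; the FrameDecoder class and this wire layout are hypothetical, not the format of any particular product.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Splits a TCP byte stream into packets. Assumed wire format (hypothetical):
// a 4-byte big-endian length N, followed by N bytes of header fields and body.
class FrameDecoder {
public:
    // Append bytes as they arrive from the socket. A chunk may hold a partial
    // packet or several packets glued together.
    void feed(const std::vector<uint8_t>& chunk) {
        buffer_.insert(buffer_.end(), chunk.begin(), chunk.end());
    }

    // Move the next complete packet (without its length prefix) into `packet`.
    // Returns false when the buffer does not yet hold a whole packet.
    // A production decoder would also reject absurdly large lengths.
    bool next(std::vector<uint8_t>& packet) {
        if (buffer_.size() < 4) return false;
        const uint32_t length = (uint32_t(buffer_[0]) << 24) |
                                (uint32_t(buffer_[1]) << 16) |
                                (uint32_t(buffer_[2]) << 8) |
                                uint32_t(buffer_[3]);
        if (buffer_.size() - 4 < length) return false;
        packet.assign(buffer_.begin() + 4, buffer_.begin() + 4 + length);
        buffer_.erase(buffer_.begin(), buffer_.begin() + 4 + length);
        return true;
    }

private:
    std::vector<uint8_t> buffer_;
};
```

The caller feeds whatever the socket returns and then calls next() in a loop until it returns false, so half packets and merged packets are handled the same way.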

A minimal packet header can be defined as:

    #include <cstdint>

    struct PackHeader
    {
        int32_t length_;   // packet length
        int32_t serial_;   // packet serial number
        int32_t command_;  // packet request type
        int32_t code_;     // return code
    };

Take the heartbeat packet as an example. Assume the current serial is 1 and the command of the heartbeat packet is 10. When serialized with MessagePack, length=4, serial=1, command=10, and code=0 each occupy one byte; the packet body is empty, so the whole packet needs only 4 bytes.
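The 4-byte figure follows from MessagePack's integer formats, in which values from 0 to 127 are stored as a single "positive fixint" byte. The sketch below hand-rolls just enough of that encoding to show it; packUint and packHeader are illustrative names, the fields are assumed to be non-negative, and writing them back to back without an array wrapper is an assumption about the layout. A real client would use a MessagePack library instead.

```cpp
#include <cstdint>
#include <vector>

struct PackHeader {
    int32_t length_;   // packet length
    int32_t serial_;   // packet serial number
    int32_t command_;  // packet request type
    int32_t code_;     // return code
};

// Append one non-negative integer using MessagePack's unsigned formats:
// positive fixint (1 byte), uint 8 (0xcc), uint 16 (0xcd) or uint 32 (0xce).
// Multi-byte values are big-endian.
void packUint(std::vector<uint8_t>& out, uint32_t v) {
    if (v <= 0x7F) {
        out.push_back(static_cast<uint8_t>(v));
    } else if (v <= 0xFF) {
        out.push_back(0xCC);
        out.push_back(static_cast<uint8_t>(v));
    } else if (v <= 0xFFFF) {
        out.push_back(0xCD);
        out.push_back(static_cast<uint8_t>(v >> 8));
        out.push_back(static_cast<uint8_t>(v & 0xFF));
    } else {
        out.push_back(0xCE);
        for (int shift = 24; shift >= 0; shift -= 8) {
            out.push_back(static_cast<uint8_t>((v >> shift) & 0xFF));
        }
    }
}

// Serialize the four header fields one after another.
// This sketch assumes every field is non-negative.
std::vector<uint8_t> packHeader(const PackHeader& h) {
    std::vector<uint8_t> out;
    packUint(out, static_cast<uint32_t>(h.length_));
    packUint(out, static_cast<uint32_t>(h.serial_));
    packUint(out, static_cast<uint32_t>(h.command_));
    packUint(out, static_cast<uint32_t>(h.code_));
    return out;
}
```

For the heartbeat header {4, 1, 10, 0} this yields the four bytes 04 01 0A 00. Once a field exceeds 127 it grows by a type byte, so the length field has to be computed with that in mind.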

Of course, this is the simplest example. Real business logic requires more information in each packet, and developers need to factor the common parts into the header based on their own business logic, such as a protocol version number added for compatibility or a module ID added for load balancing.

Other issues

The above is a rough selection process of an IM system: communication method, connection method, protocol selection, and protocol design. However, there are still a lot of problems to be dealt with in the actual development process.

Protocol encryption

To make the protocol harder to crack, almost all mainstream IMs on the market encrypt the protocol in transit. The common process is similar to HTTPS: after the connection is established, the client and the server negotiate, the client ends up with a session key, and subsequent data is encrypted and decrypted with this key. For efficiency, a stream cipher such as RC4 is generally used. For the initial negotiation, it is recommended to use asymmetric encryption such as RSA to increase the difficulty of cracking.
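As a rough illustration of the stream cipher step, here is a compact RC4 sketch. It assumes the session key has already been negotiated (the asymmetric handshake is not shown), and the Rc4 class name is made up. RC4 appears here only because the text names it; it is no longer considered secure, so treat this as an illustration of the mechanism rather than a recommendation.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Minimal RC4 stream cipher. Each side keeps one instance per direction,
// initialized with the negotiated session key (key must be non-empty).
class Rc4 {
public:
    explicit Rc4(const std::string& key) {
        // Key-scheduling algorithm: permute the state array using the key.
        for (int n = 0; n < 256; ++n) s_[n] = static_cast<uint8_t>(n);
        uint8_t j = 0;
        for (int n = 0; n < 256; ++n) {
            const uint8_t k = static_cast<uint8_t>(key[n % key.size()]);
            j = static_cast<uint8_t>(j + s_[n] + k);
            std::swap(s_[n], s_[j]);
        }
    }

    // XOR the data with the keystream. The same call encrypts and decrypts,
    // provided both sides have processed the same number of bytes so far.
    std::vector<uint8_t> apply(const std::vector<uint8_t>& data) {
        std::vector<uint8_t> out;
        out.reserve(data.size());
        for (uint8_t byte : data) {
            i_ = static_cast<uint8_t>(i_ + 1);
            j_ = static_cast<uint8_t>(j_ + s_[i_]);
            std::swap(s_[i_], s_[j_]);
            const uint8_t k = s_[static_cast<uint8_t>(s_[i_] + s_[j_])];
            out.push_back(static_cast<uint8_t>(byte ^ k));
        }
        return out;
    }

private:
    uint8_t s_[256];
    uint8_t i_ = 0;
    uint8_t j_ = 0;
};
```

Because the cipher state advances with every byte, the sender and receiver must feed packets through their instances in exactly the same order, and the state has to be reset with a fresh session key after every reconnect.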

Quick Connect (Login)

On iOS, since apps have no true background execution, the app basically has to reconnect and log in every time it is launched (except when the user switches away only briefly), so reconnecting and logging in quickly is very important. Common optimization ideas are as follows:

  • Cache the server IP address locally and refresh it periodically. For mobile network tuning, please refer to "iOS Mobile Network Tuning".

  • Merge some requests. For example, the encryption negotiation and the login can be merged into one operation, which saves an unnecessary network round trip.

  • Simplify the synchronization requests after login. Some of them can be deferred until the UI needs the data, such as refreshing group member information.
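The first idea can be sketched as a small cache that remembers the last server address and reports when it is due for a refresh. ServerAddressCache, its methods, and the refresh interval are all hypothetical, and time is passed in as plain seconds so the logic stays independent of any platform API.

```cpp
#include <cstdint>
#include <string>

// Remembers the last known server address together with the time it was
// stored. All names here are illustrative.
class ServerAddressCache {
public:
    explicit ServerAddressCache(int64_t ttlSeconds) : ttl_(ttlSeconds) {}

    void store(const std::string& address, int64_t now) {
        address_ = address;
        storedAt_ = now;
        valid_ = true;
    }

    // Copies the cached address into `address` and returns true if one exists.
    // Even a stale address can be tried first to avoid a lookup round trip.
    bool lookup(std::string& address) const {
        if (!valid_) return false;
        address = address_;
        return true;
    }

    // True when nothing is cached or the entry is at least ttl seconds old,
    // meaning the address list should be refreshed in the background.
    bool needsRefresh(int64_t now) const {
        return !valid_ || now - storedAt_ >= ttl_;
    }

private:
    int64_t ttl_;
    int64_t storedAt_ = 0;
    bool valid_ = false;
    std::string address_;
};
```

On launch the client would connect to the cached address immediately and schedule a refresh only when needsRefresh() says so, keeping the address lookup off the critical path of login.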

Keep connected

Generally, an app keeps its connection alive with an application layer heartbeat and reconnects when a heartbeat times out or when other conditions occur (such as a network switch). This raises two questions: why use an application layer heartbeat, and how should it be designed?

As we all know, TCP has a KEEPALIVE option. Once it is enabled, the TCP stack sends a keepalive probe to the peer after the connection has been idle for N seconds (the default is 7200 s). In practice, however, we usually use application layer heartbeats, for the following reasons:

  • KEEPALIVE puts a lot of pressure on the server (or so the server side says...)

  • SOCKS proxies do not support KEEPALIVE

  • KEEPALIVE fails in some complex situations, such as when a router crashes or the network is cut off outright

In order to save traffic and power, mobile terminals usually make some small optimizations on heartbeat packets in actual operation.

  • Keep the heartbeat packet as small as possible, ideally within 10 bytes

  • Send heartbeat packets only when idle (a heartbeat is sent only if no packet has been received within n seconds of the last data packet)

  • Adjust the heartbeat interval according to whether the app is in the foreground or background (mainly on Android)
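The second and third points can be combined into a small idle timer. The sketch below is one possible shape, with time passed in as plain seconds; HeartbeatScheduler and its methods are hypothetical names, and the concrete intervals are left to the caller.

```cpp
#include <cstdint>

// Decides when a heartbeat is due. Any received packet counts as proof that
// the connection is alive, so heartbeats are only sent after idle periods.
class HeartbeatScheduler {
public:
    HeartbeatScheduler(int64_t foregroundInterval, int64_t backgroundInterval)
        : foreground_(foregroundInterval), background_(backgroundInterval) {}

    // Call whenever any packet arrives from the server.
    void onPacketReceived(int64_t now) { lastActivity_ = now; }

    // Call after a heartbeat has been sent so the idle timer restarts.
    void onHeartbeatSent(int64_t now) { lastActivity_ = now; }

    // Call when the app moves between foreground and background.
    void setForeground(bool foreground) { inForeground_ = foreground; }

    // True when the connection has been idle for a full interval.
    bool shouldSendHeartbeat(int64_t now) const {
        const int64_t interval = inForeground_ ? foreground_ : background_;
        return now - lastActivity_ >= interval;
    }

private:
    int64_t foreground_;
    int64_t background_;
    int64_t lastActivity_ = 0;
    bool inForeground_ = true;
};
```

A client would call shouldSendHeartbeat() from its timer tick. A separate timeout on the heartbeat reply, not shown here, is what actually triggers the reconnect.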

Message reachability

In mobile networks, packet loss and reconnection are very common. To ensure that messages are delivered, a message receipt and resend mechanism is generally required. In Yixin, for example, each message is resent up to 3 times, with a timeout of 15 seconds. The connection status is also checked before sending: if the connection is not properly established, the cached message is rechecked periodically (every 2 seconds, up to 15 times). Therefore, in the worst case, a message is retried for more than 2 minutes to ensure that it is delivered.
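The resend half of this mechanism can be sketched as a small state machine per unacknowledged message. The limits below mirror the numbers quoted above (3 resends, 15-second timeout), but PendingMessage, RetryAction, markSent, and onTick are illustrative names, and the wait for a connection to be established is left out.

```cpp
#include <cstdint>

// Resend bookkeeping for one outgoing message that has not been acknowledged.
struct PendingMessage {
    int resendsLeft = 3;   // resend at most 3 times
    int64_t deadline = 0;  // time at which the current attempt times out
};

enum class RetryAction { Wait, Resend, GiveUp };

const int64_t kAckTimeoutSeconds = 15;

// Record that the message was just written to the connection.
void markSent(PendingMessage& m, int64_t now) {
    m.deadline = now + kAckTimeoutSeconds;
}

// Decide what to do with a still-unacknowledged message at time `now`.
RetryAction onTick(PendingMessage& m, int64_t now) {
    if (now < m.deadline) return RetryAction::Wait;
    if (m.resendsLeft == 0) return RetryAction::GiveUp;
    --m.resendsLeft;
    markSent(m, now);  // the caller resends; the timeout window restarts
    return RetryAction::Resend;
}
```

When a receipt arrives, the message is simply removed from the pending set. GiveUp is the point at which the UI would mark the message as failed.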

Because of resending, the receiver may occasionally get the same message twice and needs to deduplicate. The usual practice is to give each message its own unique message ID (usually a UUID).
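A minimal sketch of receiver-side deduplication follows. Keeping only a bounded window of recent IDs is an assumption made here to cap memory use, and MessageDeduplicator is a hypothetical name.

```cpp
#include <cstddef>
#include <deque>
#include <string>
#include <unordered_set>

// Remembers the most recently seen message IDs so that resent copies can be
// dropped. Older IDs are forgotten once `capacity` is exceeded.
class MessageDeduplicator {
public:
    explicit MessageDeduplicator(std::size_t capacity) : capacity_(capacity) {}

    // Returns true the first time an ID is seen, false for a duplicate.
    bool accept(const std::string& messageId) {
        if (seen_.count(messageId) != 0) return false;
        seen_.insert(messageId);
        order_.push_back(messageId);
        if (order_.size() > capacity_) {
            seen_.erase(order_.front());  // forget the oldest ID
            order_.pop_front();
        }
        return true;
    }

private:
    std::size_t capacity_;
    std::unordered_set<std::string> seen_;
    std::deque<std::string> order_;
};
```

The receiver should still send a receipt for a duplicate, since the usual reason for a resend is that the first receipt was lost.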

File upload optimization

IM products (including their SNS modules) involve a lot of file uploads, so optimizing file upload is a fairly large topic. Common optimization ideas include the following:

  • Start uploading earlier: upload audio while it is still being recorded, and pre-upload pictures for Moments. After selecting pictures, users usually type some text, and during this time the selected pictures can be uploaded silently in the background.

  • Provide instant upload: the server deduplicates files based on MD5.

  • Optimize the connection to the upload server (see Quick Connect) and support connection reuse.

  • Upload files in chunks: because mobile networks suffer from severe packet loss, chunked upload keeps each chunk to a reasonable number of TCP packets, which lowers the probability and cost of retries and makes uploads more likely to succeed.

  • Support pipelined upload on top of chunking to avoid unnecessary network waiting time.

  • Support resuming interrupted uploads
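Chunked upload and resuming can share the same bookkeeping. The sketch below plans the chunks for a file and, after an interruption, keeps only the chunks still to be sent. Chunk, planChunks, and remainingChunks are illustrative names, and the sketch assumes the server confirms whole chunks in order and can report how many bytes it already holds.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct Chunk {
    uint64_t offset;  // position of the chunk within the file
    uint64_t size;    // number of bytes in the chunk
};

// Split a file of `fileSize` bytes into chunks of at most `chunkSize` bytes.
std::vector<Chunk> planChunks(uint64_t fileSize, uint64_t chunkSize) {
    std::vector<Chunk> chunks;
    if (chunkSize == 0) return chunks;
    for (uint64_t offset = 0; offset < fileSize; offset += chunkSize) {
        const uint64_t remaining = fileSize - offset;
        chunks.push_back({offset, remaining < chunkSize ? remaining : chunkSize});
    }
    return chunks;
}

// Resume support: drop every chunk that starts before `confirmedBytes`,
// the byte count the server reports as already stored.
std::vector<Chunk> remainingChunks(const std::vector<Chunk>& all,
                                   uint64_t confirmedBytes) {
    std::vector<Chunk> todo;
    for (const Chunk& c : all) {
        if (c.offset >= confirmedBytes) todo.push_back(c);
    }
    return todo;
}
```

Pipelining then amounts to sending several entries of the remaining list without waiting for each confirmation in turn. The right chunk size depends on the network and is best tuned by measurement.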
