This article first analyzes the security deficiencies of the HTTP protocol and then explains the key technologies and principles that HTTPS uses to achieve secure communication. Next, it analyzes the handshake and communication process of the HTTPS protocol by capturing packets. Finally, it summarizes the HTTPS-related problems I encountered during development and gives a systematic solution to the HTTPS problems in the current project.

1. Shortcomings of the HTTP protocol

When HTTP/1.x transmits data, all content is sent in plain text, and neither the client nor the server can verify the identity of the other party. This leads to three problems, each discussed below: communications may be eavesdropped, the communicating party may be impersonated, and messages may be tampered with.
These problems are not unique to HTTP; they exist in other unencrypted protocols as well.

(1) Communication in plain text may be eavesdropped

Because of how the TCP/IP protocol suite works, communications can be eavesdropped at any point along their path across the Internet. The HTTP protocol itself has no encryption capability, so everything is transmitted in plain text. Encrypted communications can be captured just as easily as unencrypted ones. Encryption only means that an eavesdropper may be unable to decipher the content of a message; the encrypted message itself can still be seen.

(2) Failure to verify the identity of the communicating party may result in impersonation

HTTP has no step for confirming the communicating party, so anyone can initiate a request, and the server returns a response to any request it receives regardless of who sent it. Without such confirmation, either side can be impersonated: the client cannot be sure that the response came from the intended server, and the server cannot be sure which client sent the request.
(3) The integrity of a message cannot be proven, so it may have been tampered with

Integrity refers to the accuracy of information. If integrity cannot be proven, there is usually no way to tell whether the information is accurate. HTTP cannot prove the integrity of a message: if a request or response is altered between being sent and being received, neither side has any way of knowing.

For example, when downloading content from a website, the client cannot determine whether the file it received is identical to the file stored on the server. The content may have been changed in transit, and the receiving client cannot detect this. An attack in which the attacker intercepts and tampers with a request or response in transit is called a man-in-the-middle (MITM) attack.

(4) Features that a secure version of HTTP should have

Because of the problems above, an HTTP security technology is needed that provides the following:
(1) Server authentication (clients know they are talking to the real server and not a fake one);
(2) Client authentication (servers know they are talking to the real client and not a fake one);
(3) Integrity (data exchanged between the client and the server cannot be modified without detection);
(4) Encryption (the conversation between the client and the server is private, with no need to worry about eavesdropping);
(5) Efficiency (the algorithms run fast enough to be used by low-end clients and servers);
(6) Ubiquity (nearly all clients and servers support the protocol).

2. Key technologies of HTTPS

HTTPS was created to meet these requirements.
The main functions of HTTPS rely on the TLS/SSL protocol, which provides identity authentication, encryption, and integrity verification and thereby addresses the security problems of HTTP. This section focuses on the key technical points of the HTTPS protocol.

(1) Encryption technology

Encryption algorithms generally fall into two types:
Symmetric encryption: the encryption and decryption keys are the same. The DES algorithm is a representative example;
Asymmetric encryption: the encryption and decryption keys are different. The RSA algorithm is a representative example.

Symmetric encryption is strong and generally impractical to crack, but it has one big problem: the key has to be generated, shared, and stored securely. If the client and the server used one fixed key for every conversation, anyone who obtained that key could read all of the traffic. Before asymmetric key exchange algorithms appeared, there was no good way to solve this. Asymmetric key exchange exists mainly to make the generation and use of keys safer. However, it is also the main "culprit" behind the performance and speed cost of HTTPS.

HTTPS therefore uses a hybrid mechanism that combines the two. Asymmetric encryption is used in the key exchange phase, and symmetric encryption is used for the communication that follows.

(2) Authentication: a certificate that proves the public key is genuine

The biggest problem with asymmetric encryption is that it cannot, by itself, prove that a public key is genuine. For example, when you are about to communicate with a server using public key encryption, how can you prove that the public key you received is the one the server originally issued?
The real public key may have been replaced by an attacker while it was being transmitted. If the authenticity of the public key is not verified, there are at least two problems: man-in-the-middle attacks and repudiation of messages.

To solve these problems, public key certificates issued by a digital certificate authority (CA) and its related agencies can be used. The process of using a CA is as follows:
(1) The server operator submits its public key and identifying information to a CA and applies for a certificate;
(2) The CA verifies the authenticity of the information provided by the applicant through various online and offline means, such as whether the organization exists, whether the enterprise is legally registered, and whether it owns the domain name;
(3) If the information is approved, the CA digitally signs the submitted public key and issues a public key certificate that binds the signature and the public key together. The certificate contains the applicant's public key, the applicant's organizational and personal information, the issuing CA's information, the validity period, the certificate serial number, and other plain-text information, together with a signature. The signature is generated as follows: a hash function computes a digest of the plain-text information, and the digest is then encrypted with the CA's private key. The resulting ciphertext is the signature;
(4) During the HTTPS handshake, the client sends a request to the server, and the server returns its certificate file;
(5) The client reads the plain-text information in the certificate and computes its digest with the same hash function. It then decrypts the signature with the public key of the corresponding CA and compares the result with the digest it computed.
If the two are consistent, the certificate is confirmed to be legitimate, which means the public key is legitimate;
(6) The client then verifies the domain name, the validity period, and other information in the certificate;
(7) The client has the certificate information (including the public key) of trusted CAs built in. If the CA is not trusted, no corresponding CA certificate can be found and the server certificate is judged to be invalid.

A few points to note about this process:
(1) The private key is not submitted when applying for a certificate, which ensures that only the server controls the private key;
(2) The legitimacy of the certificate still relies on asymmetric encryption algorithms; the certificate mainly adds server information and a signature;
(3) The certificate of a built-in CA is called a root certificate. Its issuer and subject are the same, so the certificate is signed by itself; this is called a self-signed certificate;
(4) Certificate = public key + applicant and issuer information + signature.

3. HTTPS protocol principle

(1) History of HTTPS

Brief history of the HTTPS protocol:
(2) Protocol implementation

At a high level, TLS is implemented as a record protocol. The record protocol is responsible for exchanging all low-level messages over the transport connection and can be configured for encryption. Each TLS record starts with a short header that contains the content type (or subprotocol), the protocol version, and the length of the record content. The message data follows the header.

The main TLS specification defines four core subprotocols: the handshake protocol, the change cipher spec protocol, the application data protocol, and the alert protocol.
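The record header described above is small enough to decode by hand. The following is a minimal sketch, not taken from the original article: the class name TlsRecordHeader is hypothetical, and the content type numbers (20 to 23) and the 5-byte layout are those defined by the TLS specification.

```java
import java.nio.ByteBuffer;

/** Minimal sketch: decodes the 5-byte header that starts every TLS record. */
final class TlsRecordHeader {
    final int contentType;   // 20, 21, 22 or 23 for the four core subprotocols
    final int majorVersion;  // 3 for SSL 3.0 and for TLS 1.0 to 1.2
    final int minorVersion;  // 0 = SSL 3.0, 1 = TLS 1.0, 2 = TLS 1.1, 3 = TLS 1.2
    final int length;        // number of payload bytes that follow the header

    private TlsRecordHeader(int contentType, int majorVersion, int minorVersion, int length) {
        this.contentType = contentType;
        this.majorVersion = majorVersion;
        this.minorVersion = minorVersion;
        this.length = length;
    }

    /** Reads type (1 byte), version (2 bytes) and length (2 bytes, big-endian). */
    static TlsRecordHeader parse(byte[] record) {
        if (record.length < 5) {
            throw new IllegalArgumentException("a TLS record header is 5 bytes long");
        }
        ByteBuffer buffer = ByteBuffer.wrap(record); // ByteBuffer is big-endian by default
        int type = buffer.get() & 0xFF;
        int major = buffer.get() & 0xFF;
        int minor = buffer.get() & 0xFF;
        int length = buffer.getShort() & 0xFFFF;
        return new TlsRecordHeader(type, major, minor, length);
    }

    /** Maps the content type byte to the subprotocol that owns the record. */
    String subprotocol() {
        switch (contentType) {
            case 20: return "change_cipher_spec";
            case 21: return "alert";
            case 22: return "handshake";
            case 23: return "application_data";
            default: return "unknown";
        }
    }
}
```

For example, the bytes 16 03 03 00 2A (hex) at the start of a captured record decode to a handshake record with version 3.3 (TLS 1.2) and a 42-byte payload.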
(3) Handshake protocol

The handshake is the most complex part of the TLS protocol. During the handshake, the two parties negotiate the connection parameters and complete identity verification. Depending on the features used, the process usually requires 6 to 10 messages, and the exchange can vary considerably with the configuration and the supported protocol extensions. Three flows are commonly observed in practice: a full handshake with server authentication, an abbreviated handshake that resumes an earlier session, and a handshake with both client and server authentication.
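(4) One-way authentication handshake process

This section takes the login process of QQ mailbox as an example and analyzes the one-way verification handshake by capturing packets. A complete one-way verification handshake has four main steps: exchanging capabilities and agreeing on the connection parameters; verifying the presented certificate, or authenticating by other means; agreeing on a shared master secret that protects the session; and verifying that the handshake messages have not been modified by a third party.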
The following is a detailed analysis of this process.

1. ClientHello

ClientHello is the first message in the handshake. It conveys the client's capabilities and preferences to the server, including the SSL/TLS version supported by the client and the list of cipher suites (the encryption algorithms used, the key lengths, and so on).

2. ServerHello

The ServerHello message sends the connection parameters selected by the server back to the client. Its structure is similar to ClientHello, except that each field contains only one option. The server chooses the cipher suite and the compression method from those offered by the client.

3. Certificate

The server then sends a Certificate message containing its public key certificate. The server must ensure that the certificate it sends is consistent with the selected cipher suite. The Certificate message is optional, however, because not all suites use authentication and not all authentication methods require certificates.

4. ServerKeyExchange

The ServerKeyExchange message carries additional data for the key exchange. Its content differs depending on the negotiated cipher suite. In some scenarios the server has nothing to send, in which case the ServerKeyExchange message is omitted.

5. ServerHelloDone

The ServerHelloDone message indicates that the server has sent all expected handshake messages. After this, the server waits for the client to send a message.

6. ClientKeyExchange

The ClientKeyExchange message carries all the information that the client contributes to the key exchange. Its content depends on the negotiated cipher suite.
7. ChangeCipherSpec

The ChangeCipherSpec message indicates that the sender has obtained enough information to generate the connection parameters, has generated the encryption keys, and is switching to encrypted mode. Both the client and the server send this message once they are ready. Note: ChangeCipherSpec is not a handshake message. It belongs to a separate subprotocol that consists of this single message.

8. Finished

The Finished message means that the handshake is complete. Its content is encrypted, so that both parties can securely exchange the data needed to verify the integrity of the entire handshake. Both the client and the server send this message once they are ready.

(5) Two-way authentication handshake process

Scenarios with higher security requirements may call for two-way verification. Compared with the one-way verification process, two-way verification has two more messages, CertificateRequest and CertificateVerify; the rest of the process is roughly the same.

1. CertificateRequest

CertificateRequest is an optional feature of TLS that lets the server authenticate the client. The server uses it to ask the client to send a certificate. The server should send the CertificateRequest message immediately after ServerKeyExchange. The message structure is as follows:
The server can optionally include a list of the certificate authorities it accepts, identified by their distinguished names.

2. CertificateVerify

When client authentication is required, the client sends a CertificateVerify message to prove that it really holds the private key of the client certificate. This message is only sent when the client certificate has signing capability. CertificateVerify must follow ClientKeyExchange. The message structure is as follows:
(6) Application data protocol

The application data protocol carries application messages, which are just data buffers from the point of view of TLS. The record layer packages, fragments, and encrypts these messages using the current connection security parameters. In a packet capture, the transmitted data therefore appears only in encrypted form.

(7) Alert protocol

The purpose of an alert is to inform the other end of an abnormal condition through a simple notification mechanism. Alerts are usually used to report errors; the exception is close_notify, which is sent when the connection is being closed. An alert is very simple, with only two fields: the alert level and the alert description.
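The two alert fields are one byte each, so an unencrypted alert payload can be decoded as in the following minimal sketch. The class name TlsAlert is hypothetical; the numeric codes shown (level 1 = warning, 2 = fatal, and the handful of description values) come from the TLS specification, and the list of descriptions is deliberately incomplete.

```java
/** Minimal sketch: decodes the two-byte payload of a TLS alert (level, description). */
final class TlsAlert {
    final int level;        // 1 = warning, 2 = fatal
    final int description;  // numeric alert code, for example 0 = close_notify

    private TlsAlert(int level, int description) {
        this.level = level;
        this.description = description;
    }

    static TlsAlert parse(byte[] payload) {
        if (payload.length != 2) {
            throw new IllegalArgumentException("an alert payload is exactly 2 bytes long");
        }
        return new TlsAlert(payload[0] & 0xFF, payload[1] & 0xFF);
    }

    boolean isFatal() {
        return level == 2;
    }

    /** Returns a readable form such as "warning: close_notify". */
    String describe() {
        String levelName = level == 1 ? "warning" : level == 2 ? "fatal" : "unknown level " + level;
        String name;
        switch (description) {
            case 0:  name = "close_notify"; break;
            case 10: name = "unexpected_message"; break;
            case 20: name = "bad_record_mac"; break;
            case 40: name = "handshake_failure"; break;
            case 42: name = "bad_certificate"; break;
            case 48: name = "unknown_ca"; break;
            case 70: name = "protocol_version"; break;
            default: name = "code " + description; break; // codes not listed in this sketch
        }
        return levelName + ": " + name;
    }
}
```

Once encryption is active, alerts are encrypted like any other record, so this decoding only applies to alerts sent early in the handshake or to payloads that have already been decrypted.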
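4. Common issues when using HTTPS in Android

(1) Server certificate verification error

This is the most common problem, and it usually surfaces as an SSLHandshakeException. It is typically caused by one of three things: the CA that issued the server certificate is unknown to the system, the server certificate is self-signed rather than signed by a CA, or the server configuration is missing an intermediate CA.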
SSLHandshakeException occurs when the CA of the server certificate is not trusted by the system. This may be because the certificate was purchased from a relatively new CA that the Android system does not trust yet, or because the server uses a self-signed certificate (common during testing). A common solution is to make HttpsURLConnection trust a specific set of CAs. Part 5 of this article explains in detail how to make an Android application trust a set of self-signed certificates or skip certificate verification.

(2) Domain name verification failed

An SSL connection has two key parts. The first is verifying that the certificate comes from a trusted source, and the second is making sure that the server you are talking to presents the right certificate. If it does not, the connection fails with a hostname verification error.

This usually happens because the domain name configured in the server certificate does not match the domain name requested by the client. There are two solutions:
(1) Regenerate the server certificate using the real domain name;
(2) Provide a custom HostnameVerifier. During the handshake, if the hostname in the URL does not match the hostname that identifies the server, the verification mechanism calls back into the implementation of this interface to decide whether the connection should be allowed. A whitelist can be implemented by customizing HostnameVerifier.
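A minimal sketch of such a whitelist verifier follows. It is an illustration rather than the original listing: the class name WhitelistHostnameVerifier and the example host names are hypothetical, and only the standard javax.net.ssl.HostnameVerifier interface is assumed.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;
import javax.net.ssl.HostnameVerifier;
import javax.net.ssl.SSLSession;

/** Sketch: accepts whitelisted hosts and defers every other decision to a fallback verifier. */
final class WhitelistHostnameVerifier implements HostnameVerifier {
    private final Set<String> allowedHosts = new HashSet<>();
    private final HostnameVerifier fallback;

    WhitelistHostnameVerifier(HostnameVerifier fallback, String... hosts) {
        this.fallback = fallback;
        for (String host : hosts) {
            allowedHosts.add(host.toLowerCase(Locale.ROOT)); // host names are case-insensitive
        }
    }

    @Override
    public boolean verify(String hostname, SSLSession session) {
        if (hostname != null && allowedHosts.contains(hostname.toLowerCase(Locale.ROOT))) {
            return true; // explicitly whitelisted, for example a test server
        }
        return fallback.verify(hostname, session); // otherwise keep the normal rules
    }
}
```

It could be installed on a single connection with connection.setHostnameVerifier(new WhitelistHostnameVerifier(HttpsURLConnection.getDefaultHostnameVerifier(), "test.example.com")). Keeping the whitelist short and delegating everything else to the default verifier limits how much of the hostname check is actually bypassed.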
(3) Client certificate verification

SSL also lets the server confirm the identity of the client by verifying a client certificate. The technique is similar to the TrustManager approach. Part 5 of this article explains how to make an Android application support client certificate verification.

(4) TLS version compatibility issues on Android

During interface integration testing, the test team reported that HTTPS requests failed on systems below Android 4.4 but worked normally on systems above 4.4. The corresponding error is as follows:
According to the official documentation, the Android system supports the following versions of the SSL protocol: TLS 1.1 and TLS 1.2 are available from API 16 onward, but in practice they are only enabled by default from API 20 onward, so on older systems they are not used unless the application enables them explicitly. Based on several answers on Stack Overflow, a workable solution is as follows:
For systems below 4.4, use a custom TLSSocketFactory that enables TLS 1.1 and TLS 1.2 on every socket it creates.
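The idea can be sketched as a wrapper around the platform's SSLSocketFactory that adjusts the enabled protocols on each new socket. This is an illustrative version, not the original listing: the class name TlsSocketFactory and the helper selectProtocols are hypothetical, and only standard javax.net.ssl calls are used.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.Socket;
import java.security.GeneralSecurityException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLSocket;
import javax.net.ssl.SSLSocketFactory;

/** Sketch: a factory that switches on TLS 1.1 and TLS 1.2 for every socket it creates. */
final class TlsSocketFactory extends SSLSocketFactory {
    private static final String[] WANTED = {"TLSv1.1", "TLSv1.2"};
    private final SSLSocketFactory delegate;

    TlsSocketFactory() throws GeneralSecurityException {
        SSLContext context = SSLContext.getInstance("TLS");
        context.init(null, null, null); // platform defaults for keys, trust and randomness
        this.delegate = context.getSocketFactory();
    }

    /** Keeps only the wanted protocol names that the socket actually supports. */
    static String[] selectProtocols(String[] supported) {
        List<String> available = Arrays.asList(supported);
        List<String> enabled = new ArrayList<>();
        for (String protocol : WANTED) {
            if (available.contains(protocol)) {
                enabled.add(protocol);
            }
        }
        return enabled.toArray(new String[0]);
    }

    private Socket enableTls(Socket socket) {
        if (socket instanceof SSLSocket) {
            SSLSocket sslSocket = (SSLSocket) socket;
            String[] protocols = selectProtocols(sslSocket.getSupportedProtocols());
            if (protocols.length > 0) { // leave the defaults alone if neither version exists
                sslSocket.setEnabledProtocols(protocols);
            }
        }
        return socket;
    }

    @Override public String[] getDefaultCipherSuites() { return delegate.getDefaultCipherSuites(); }
    @Override public String[] getSupportedCipherSuites() { return delegate.getSupportedCipherSuites(); }

    @Override public Socket createSocket() throws IOException {
        return enableTls(delegate.createSocket());
    }
    @Override public Socket createSocket(Socket s, String host, int port, boolean autoClose) throws IOException {
        return enableTls(delegate.createSocket(s, host, port, autoClose));
    }
    @Override public Socket createSocket(String host, int port) throws IOException {
        return enableTls(delegate.createSocket(host, port));
    }
    @Override public Socket createSocket(String host, int port, InetAddress localHost, int localPort) throws IOException {
        return enableTls(delegate.createSocket(host, port, localHost, localPort));
    }
    @Override public Socket createSocket(InetAddress host, int port) throws IOException {
        return enableTls(delegate.createSocket(host, port));
    }
    @Override public Socket createSocket(InetAddress address, int port, InetAddress localAddress, int localPort) throws IOException {
        return enableTls(delegate.createSocket(address, port, localAddress, localPort));
    }
}
```

An application would typically install it only on old systems, for example with httpsURLConnection.setSSLSocketFactory(new TlsSocketFactory()) behind a check of Build.VERSION.SDK_INT. Note that this sketch enables only TLS 1.1 and 1.2, so a server limited to TLS 1.0 would no longer be reachable through it.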
5. Code implementation

This section provides a relatively systematic solution to the common problems described in Section 4 that arise when using HTTPS in Android applications.

(1) Overall structure

Whether the goal is to trust a self-signed certificate or to support client authentication, the core is to create your own KeyStore and then use it to create a custom SSLContext. MySSLContext in the class diagram is used when connecting to the server with HttpsURLConnection.
The core is to install the custom verification logic through httpsURLConnection.setSSLSocketFactory. The overall design uses the strategy pattern to decide which verification mechanism to use.
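The following minimal sketch shows how such a strategy-based design might look. All names here (SslContextStrategy, SystemTrustStrategy, HttpsConnections) are hypothetical and are not the classes from the author's class diagram; only HttpsURLConnection.setSSLSocketFactory and the standard SSLContext API are assumed.

```java
import java.io.IOException;
import java.net.URL;
import java.security.GeneralSecurityException;
import javax.net.ssl.HttpsURLConnection;
import javax.net.ssl.SSLContext;

/** Hypothetical strategy interface: each verification mechanism builds its own SSLContext. */
interface SslContextStrategy {
    SSLContext create() throws GeneralSecurityException;
}

/** Strategy that keeps the platform's default trust decisions. */
final class SystemTrustStrategy implements SslContextStrategy {
    @Override
    public SSLContext create() throws GeneralSecurityException {
        SSLContext context = SSLContext.getInstance("TLS");
        context.init(null, null, null); // null selects the default key and trust managers
        return context;
    }
}

final class HttpsConnections {
    /** Opens (but does not yet connect) an HTTPS connection that uses the chosen strategy. */
    static HttpsURLConnection open(String url, SslContextStrategy strategy)
            throws IOException, GeneralSecurityException {
        HttpsURLConnection connection = (HttpsURLConnection) new URL(url).openConnection();
        connection.setSSLSocketFactory(strategy.create().getSocketFactory());
        return connection;
    }
}
```

Other strategies, such as one that trusts a bundled certificate set or one that presents a client certificate, would implement the same interface, so the calling code only changes the strategy object it passes in.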
(2) One-way verification with a custom set of trusted certificates

In the app, put the server certificate in a resource file (usually in the assets directory, because the certificate is the same for every user and does not change often). It can also be placed on the device's external storage.
serverCertificateNames defines the names of the certificates trusted by the app (the certificate files must be placed in the specified path under these names). The server certificate chain can then be loaded into a KeyStore. Once you have a KeyStore that contains the trusted server certificates, you can use it to initialize a custom SSLContext.
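The two steps above (loading the certificates into a KeyStore, then building an SSLContext from it) can be sketched as follows. The class name CustomTrust and its method names are hypothetical. The sketch takes already opened certificate streams keyed by name, on the assumption that an Android app would obtain them with something like context.getAssets().open(name) for each entry in serverCertificateNames.

```java
import java.io.IOException;
import java.io.InputStream;
import java.security.GeneralSecurityException;
import java.security.KeyStore;
import java.security.cert.Certificate;
import java.security.cert.CertificateFactory;
import java.util.Map;
import javax.net.ssl.SSLContext;
import javax.net.ssl.TrustManagerFactory;

/** Sketch: builds an SSLContext that trusts only the certificates the app ships with. */
final class CustomTrust {
    /** Loads each named X.509 certificate stream into a fresh, empty KeyStore. */
    static KeyStore loadTrustedCertificates(Map<String, InputStream> certificates)
            throws GeneralSecurityException, IOException {
        KeyStore keyStore = KeyStore.getInstance(KeyStore.getDefaultType());
        keyStore.load(null, null); // initialise an empty in-memory store
        CertificateFactory factory = CertificateFactory.getInstance("X.509");
        for (Map.Entry<String, InputStream> entry : certificates.entrySet()) {
            try (InputStream in = entry.getValue()) {
                Certificate certificate = factory.generateCertificate(in);
                keyStore.setCertificateEntry(entry.getKey(), certificate); // alias = file name
            }
        }
        return keyStore;
    }

    /** Creates an SSLContext whose trust decisions are based only on the given KeyStore. */
    static SSLContext createSslContext(KeyStore trusted) throws GeneralSecurityException {
        TrustManagerFactory factory =
                TrustManagerFactory.getInstance(TrustManagerFactory.getDefaultAlgorithm());
        factory.init(trusted);
        SSLContext context = SSLContext.getInstance("TLS");
        context.init(null, factory.getTrustManagers(), null);
        return context;
    }
}
```

The socket factory of the resulting context is what gets passed to httpsURLConnection.setSSLSocketFactory. Because the KeyStore starts empty, servers whose certificates chain only to the system CAs are no longer trusted by connections that use this context.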
(3) Skipping certificate verification

The process is similar to the one above, except that the TrustManager used here needs no set of trusted certificates and accepts any server certificate by default.
Then construct the corresponding SSLContext from this TrustManager.
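A minimal sketch of both pieces, assuming only the standard X509TrustManager and SSLContext APIs; the class names TrustAllManager and InsecureSslContext are hypothetical. Because nothing is verified, a connection built this way offers no protection against man-in-the-middle attacks, so it is only suitable for a test environment.

```java
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.security.cert.X509Certificate;
import javax.net.ssl.SSLContext;
import javax.net.ssl.TrustManager;
import javax.net.ssl.X509TrustManager;

/** Sketch for testing only: a TrustManager that accepts every certificate chain. */
final class TrustAllManager implements X509TrustManager {
    @Override
    public void checkClientTrusted(X509Certificate[] chain, String authType) {
        // Intentionally empty: no exception means the chain is accepted.
    }

    @Override
    public void checkServerTrusted(X509Certificate[] chain, String authType) {
        // Intentionally empty: every server certificate is accepted without checks.
    }

    @Override
    public X509Certificate[] getAcceptedIssuers() {
        return new X509Certificate[0]; // no particular issuers are advertised
    }
}

final class InsecureSslContext {
    /** Builds an SSLContext that skips certificate verification entirely. */
    static SSLContext create() throws GeneralSecurityException {
        SSLContext context = SSLContext.getInstance("TLS");
        context.init(null, new TrustManager[] {new TrustAllManager()}, new SecureRandom());
        return context;
    }
}
```

As with the other variants, InsecureSslContext.create().getSocketFactory() is the value handed to httpsURLConnection.setSSLSocketFactory.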