Architecture design: a design concept for remote call services (an application practice of zookeeper)


Before studying Zookeeper in depth, I would like to introduce an application example built around it, which I call a remote call service. Walking through this example should give us a clearer picture of the scenarios in which Zookeeper is useful.

A remote call is a communication mechanism between systems; another way to understand it is as inter-process communication. Remote call technology is at the core of distributed system development. It lets a group of computer systems form a network and provide services to the outside world as a whole, so that the group behaves like one larger, higher-performance computer system.

In a previous blog post I introduced a distributed website architecture in which a component written with Netty serves as the communication medium between the front-end system and the server system. A large Internet company will have many such websites. If every website is developed the way that post describes, maintaining and managing inter-system communication, and allocating network resources to each system, becomes a problem. An example may make this clearer. Suppose an Internet company runs several public-facing websites, some with heavy traffic and others with relatively little, while the company's bandwidth is limited. We would like to manage and allocate that bandwidth dynamically. If each website's communication function is tightly coupled to the website itself, allocating these resources becomes complicated, cumbersome, and error-prone. There are many problems of this kind, and I will not analyze them all here. A useful principle in software development is that when a function is generally applicable and needs unified management, we should extract it into an independent system or component and give that component some enhanced features. Done well, this improves the robustness, availability, and efficiency of the whole system.

The communication technology I described for the distributed website is a kind of remote call technology. Remote call technology handles communication between a client and a server, so it can be regarded as a client/server (C/S) architecture technology. Java has many excellent frameworks for remote calls, such as the RMI that ships with Java, the HttpInvoker that ships with Spring, and web services. However, these existing technologies do not fully meet the remote call needs of Internet companies. Today I will describe a remote call design that I conceived myself, based on the practices of some similar software in our company.

This framework is mainly for Java, and other languages are not currently supported. First, I would like to summarize the technologies that a remote call framework should include:

  • Communication technology: A remote call uses the network to join different systems into a whole, so communication technology is the focus. The technology I choose here is Netty. Netty is an asynchronous, event-driven network application framework with tools for quickly developing high-performance, highly reliable network servers and clients. It makes communication programs simple to develop, performs very well, and supports a variety of network protocols.
  • Serialization and deserialization technology: Java serialization converts objects into byte data that can later be restored to Java objects; the restoration process is called deserialization. The mechanism automatically handles differences between operating systems, so an object serialized on Windows can be rebuilt on Linux. The JDK ships with its own serialization and deserialization mechanism. People familiar with Hadoop know that Hadoop designed its own mechanism instead. Why did the Hadoop authors not use the one that comes with Java? Because Java's built-in serialization is very complex, and that complexity makes it slow. Another important disadvantage is that the binary data it produces is very large, because Java includes a great deal of information about the object when serializing, and excessive data volume hurts network transmission efficiency. Communication between Hadoop nodes is also a remote call mechanism, so good serialization and deserialization technology is clearly very important for remote calls. Our company's remote call framework supports two serialization technologies: the mechanism that comes with Java, and Hessian, which is more efficient.
  • Compression technology: In network programming, the scarcest resource is bandwidth. If the transmitted data is large, data compression becomes very important. Here I recommend Snappy, an efficient compression and decompression library that is widely used within Google.
  • High concurrency technology: A remote call framework must be multi-threaded; only then can it handle multiple concurrent requests. Java 1.5 introduced the Executor framework, which brings the concept of tasks into thread development and makes multi-threaded programs more structured and controllable. For the Executor framework, the classic book "Java Concurrency in Practice" is worth reading. To make threads more efficient, pooling is also indispensable. Apache Commons Pool is a very good pooling library: we can create all the threads in advance and hand them to a Commons Pool pool for management.
  • Non-intrusive: This can also be called loose coupling. For Java web development, the best way to decouple is to use Spring. Once the remote call framework is added to our system and the relevant parameters are configured, we can declare the remotely callable methods in the Spring configuration file and then obtain the bean directly from Spring in the program. Developing a remote call is then no different from calling a server-side method in an action. The following is example configuration:
    <!-- Service provider configuration -->
    <bean id="serverProvider" class="cn.com.sharpxiajun.RmifSpringProviderBean">
        <property name="interface" value="cn.com.ITest"></property><!-- Remote call interface -->
        <property name="target" ref="clsTest"></property><!-- clsTest is the bean id of the class that implements ITest -->
    </bean>

    <!-- Service caller configuration -->
    <bean id="clientConsumer" class="cn.com.sharpxiajun.RmifSpringConsumerBean">
        <property name="interface" value="cn.com.clsTest"></property><!-- value is the interface implementation class of the target defined by the Provider -->
        <property name="seriaType" value="hessian"></property><!-- Serialization method -->
        <property name="compress" value="true"></property><!-- Compression flag -->
    </bean>
  • Load balancing: Distributed systems cannot do without load balancing. Good load balancing makes full use of the computing resources of different servers and so improves the system's concurrency and computing power. For websites with fewer than 10 servers (our company does not have many website servers at present), two simple strategies are available. The first is round-robin (simple polling): with 6 servers, the first request goes to the first server, the second request to the second, and so on; after all 6 servers have been used, we start again from the first. The second is random selection, that is, choosing a server with a random function. I do not know which strategy works better with more servers, and I would welcome recommendations from anyone who does.
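
The two load balancing strategies in the last item can be sketched in a few lines of Java. This is only a minimal illustration: the `LoadBalancer` interface and both class names are hypothetical and are not part of any existing framework.

```java
import java.util.List;
import java.util.Random;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical abstraction: picks one server address for each request. */
interface LoadBalancer {
    String select(List<String> servers);
}

/** Round-robin: request 1 goes to server 1, request 2 to server 2, then wrap around. */
class RoundRobinBalancer implements LoadBalancer {
    private final AtomicInteger counter = new AtomicInteger();

    @Override
    public String select(List<String> servers) {
        // floorMod keeps the index non-negative even after the counter overflows.
        int index = Math.floorMod(counter.getAndIncrement(), servers.size());
        return servers.get(index);
    }
}

/** Random: every server is equally likely to be chosen for a request. */
class RandomBalancer implements LoadBalancer {
    private final Random random;

    RandomBalancer(Random random) {
        this.random = random;
    }

    @Override
    public String select(List<String> servers) {
        return servers.get(random.nextInt(servers.size()));
    }
}
```

The atomic counter lets several request threads share one round-robin balancer without sending two consecutive requests to the same slot by accident.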

In addition to the above functions, I hope that the remote call framework I designed here can also have a heartbeat management mechanism, a timeout management mechanism, and a service classification management mechanism, that is, it can adjust network resources according to the importance of the service or the busyness of the system.
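
As a rough idea of what heartbeat and timeout management could look like, here is a small sketch. The `HeartbeatMonitor` class, its method names, and the choice to pass the current time in as a parameter are all assumptions made for illustration, not part of an existing implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical heartbeat bookkeeping for service providers. */
class HeartbeatMonitor {
    private final long timeoutMillis;
    // provider address -> time (in ms) at which its last heartbeat arrived
    private final Map<String, Long> lastSeen = new ConcurrentHashMap<>();

    HeartbeatMonitor(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    /** Records a heartbeat received from a provider. */
    void onHeartbeat(String provider, long nowMillis) {
        lastSeen.put(provider, nowMillis);
    }

    /** A provider is healthy if it is known and its last heartbeat is recent enough. */
    boolean isHealthy(String provider, long nowMillis) {
        Long last = lastSeen.get(provider);
        return last != null && nowMillis - last <= timeoutMillis;
    }

    /** Removes and returns every provider whose heartbeat has timed out. */
    List<String> evictExpired(long nowMillis) {
        List<String> expired = new ArrayList<>();
        for (Map.Entry<String, Long> entry : lastSeen.entrySet()) {
            if (nowMillis - entry.getValue() > timeoutMillis) {
                expired.add(entry.getKey());
            }
        }
        for (String provider : expired) {
            lastSeen.remove(provider);
        }
        return expired;
    }
}
```

In a running system `nowMillis` would typically come from `System.currentTimeMillis()`, and `evictExpired` would be called periodically by a scheduled task.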

Haha, I've been talking for so long, and some of you may be a little annoyed. Didn't I say that I would use Zookeeper as an example? Why haven't I seen any trace of Zookeeper yet? Don't worry, Zookeeper will be on the scene soon.

Let's start with the distributed website from my previous blog post. We can regard the server system as the service provider and the front-end system as the service caller. The provider can be compared to a merchant and the caller to a customer. Merchants and customers can trade with each other directly, but this direct way of trading is very primitive and can even be risky. In modern society, by contrast, transactions between merchants and customers are very efficient, and the reason is that there is a large, standardized market. Conducting transactions in the market makes them safer and more efficient. The biggest feature of the distributed framework I designed is that it provides a market-like role to manage service providers and service callers. I call this functional module the remote call management component.

The remote call management component is the core of this framework. Its main function is to receive registration notifications from service providers. A notification generally contains the interfaces, the implementation classes of those interfaces, and the IP address of the server. The management component records these notifications, then groups and marks the services according to its configuration. It also pushes the registered information to the service callers. The component includes a heartbeat mechanism aimed at service providers, which detects whether each provider is healthy. It does not check the health of service callers, because that is unnecessary: in this framework callers still send their requests directly to providers, so logically there is no need to care about the caller's status. This is the same as the browser in a B/S architecture, where we do not care whether the browser user is still there. The relationship between the service provider, service caller, and remote call management component is shown in the following figure:

The remote call framework runs as follows. When a service provider starts, it sends its IP address and the methods it registers to the remote call management component. The management component receives the registration information and stores it in Zookeeper. After the information is stored successfully, the management component sends a success notification back to the service provider. From then on, the management component uses heartbeats to check whether the service provider is healthy. When a service caller starts, it requests service provider information from the management component, which then pushes the relevant information to the caller. While the system is actually running, the service caller communicates directly with the service provider over Netty. If anything changes on the caller or provider side, that side first notifies the management component, which then pushes the change to the systems concerned.
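
The register, lookup, and push steps above can be sketched with a small in-memory stand-in for the management component. Everything here is hypothetical: the `ServiceRegistry` class and its methods are invented for illustration, and a plain map takes the place of Zookeeper so that the sketch runs on its own.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

/** Hypothetical in-memory stand-in for the remote call management component. */
class ServiceRegistry {
    // interface name -> provider addresses ("ip:port"), kept in registration order
    private final Map<String, Set<String>> providers = new ConcurrentHashMap<>();
    // interface name -> callers that want to be told about provider changes
    private final Map<String, List<Consumer<List<String>>>> subscribers = new ConcurrentHashMap<>();

    /** Provider startup: store the address, then push the new list to callers. */
    synchronized boolean register(String serviceInterface, String address) {
        boolean added = providers
                .computeIfAbsent(serviceInterface, key -> new LinkedHashSet<>())
                .add(address);
        if (added) {
            notifySubscribers(serviceInterface);
        }
        return added; // plays the role of the success notification
    }

    /** Provider shutdown or failed heartbeat: remove it and push the change. */
    synchronized void unregister(String serviceInterface, String address) {
        Set<String> addresses = providers.get(serviceInterface);
        if (addresses != null && addresses.remove(address)) {
            notifySubscribers(serviceInterface);
        }
    }

    /** Caller startup: return the current providers and subscribe to later changes. */
    synchronized List<String> subscribe(String serviceInterface, Consumer<List<String>> listener) {
        subscribers
                .computeIfAbsent(serviceInterface, key -> new CopyOnWriteArrayList<>())
                .add(listener);
        return snapshot(serviceInterface);
    }

    private List<String> snapshot(String serviceInterface) {
        Set<String> addresses = providers.getOrDefault(serviceInterface, Collections.emptySet());
        return new ArrayList<>(addresses);
    }

    private void notifySubscribers(String serviceInterface) {
        List<String> current = snapshot(serviceInterface);
        for (Consumer<List<String>> listener
                : subscribers.getOrDefault(serviceInterface, Collections.emptyList())) {
            listener.accept(current);
        }
    }
}
```

A caller would keep the pushed list locally and hand it to its load balancer, so the management component never sits on the request path between caller and provider.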

The remote call management component is mainly implemented on top of Zookeeper. Zookeeper has a hierarchical namespace whose model is a tree. A tree is a powerful structure that can hold almost any kind of data, so storing the registration information in Zookeeper makes the whole remote call framework easier to manage. Zookeeper is also highly reliable, as I mentioned in my previous Zookeeper article, which helps keep the whole framework stable. In practice, we would package the components as a jar, and each project would reference this jar directly, which connects the management component server with the service providers and callers. Communication between provider and caller happens directly, because the communication code is bundled in the jar, while the corresponding management mechanism is extracted to an external server for unified management.
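
To make the tree idea concrete, here is one possible way to lay out the registration information as Zookeeper paths. The `/rpc` root, the `providers` level, and the `RegistryPaths` helper are purely hypothetical naming choices; the sketch only builds path strings and does not talk to a Zookeeper server.

```java
/** Hypothetical znode layout for the registry; all path names are illustrative. */
final class RegistryPaths {
    static final String ROOT = "/rpc";

    private RegistryPaths() {
    }

    /** One node per remote interface, e.g. /rpc/cn.com.ITest */
    static String service(String serviceInterface) {
        return ROOT + "/" + serviceInterface;
    }

    /** Parent node of all providers of an interface, e.g. /rpc/cn.com.ITest/providers */
    static String providers(String serviceInterface) {
        return service(serviceInterface) + "/providers";
    }

    /** One child node per provider, e.g. /rpc/cn.com.ITest/providers/10.0.0.1:9000 */
    static String provider(String serviceInterface, String address) {
        return providers(serviceInterface) + "/" + address;
    }

    /** Recovers the provider address from the last segment of a provider path. */
    static String addressOf(String providerPath) {
        return providerPath.substring(providerPath.lastIndexOf('/') + 1);
    }
}
```

With a layout like this, listing the children of the `providers` node would give the addresses to push to callers. One option worth considering, though it is an assumption rather than part of the design above, is to create each provider node as an ephemeral node so that it disappears when the provider's Zookeeper session ends.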

This is the remote call framework I designed. Unfortunately, I have not actually implemented this idea yet. I brought it out today to demonstrate the practical application of Zookeeper and to pave the way for my later explanation of Zookeeper. As for whether it is feasible, it depends on whether there is a chance to develop a similar system in the future. By then, I guess there will be many unexpected problems to be solved.

(The design of the remote call service was based on the design of my technical friend Ma Dexin, who was once a technical architect of Taobao)
