Using RxJava to quickly obtain massive data

Suppose you need some dynamic data. You could simply hit the network every time you need it, but it is more efficient to cache the data obtained from the network on disk and in memory.

Specifically, the plan is as follows:

  1. Query the network only occasionally, and only to obtain fresh data.

  2. Read data as quickly as possible (by fetching previously cached network data).

I will implement this plan by using RxJava.

Basic Mode

Create an Observable<Data> for each data source (network, disk, and memory) and use the concat() and first() operators to construct a simple implementation.

The concat() operator takes multiple Observables and concatenates them in order, one after another. The first() operator emits only the first item of that concatenated sequence. Therefore, with concat().first(), no matter how many data sources there are, only the first item is retrieved and emitted.

    // Our sources (left as an exercise for the reader)
    Observable<Data> memory = ...;
    Observable<Data> disk = ...;
    Observable<Data> network = ...;

    // Retrieve the first source with data
    Observable<Data> source = Observable
        .concat(memory, disk, network)
        .first();

The key to this pattern is that concat() subscribes to each source Observable only when it is needed, that is, after the previous source has completed. Since first() stops the sequence early, the slower sources are never accessed if cached data exists. In other words, if memory returns a result, disk and network are not touched at all. Conversely, if neither memory nor disk has data, the network request is performed.

Note that concat() checks its sources one by one, so the order in which they are listed matters.
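The lazy, in-order behavior described above can be illustrated without RxJava. The following is only a library-free analogy, not the article's implementation: each source is modeled as a hypothetical Supplier<Optional<String>> that returns empty when it has no data, and a lazy java.util.stream pipeline plays the role of concat().first(). The names ConcatFirstSketch, source, firstAvailable, and queried are made up for this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.function.Supplier;
import java.util.stream.Stream;

// Library-free analogy for concat(memory, disk, network).first().
// This is NOT RxJava; all names here are hypothetical.
class ConcatFirstSketch {
    // Records which sources were actually queried, in order.
    static final List<String> queried = new ArrayList<>();

    // Builds a fake source; a null payload means "this source has no data".
    static Supplier<Optional<String>> source(String name, String dataOrNull) {
        return () -> {
            queried.add(name);
            return Optional.ofNullable(dataOrNull);
        };
    }

    // Queries the sources in order and stops at the first one that has data.
    @SafeVarargs
    static Optional<String> firstAvailable(Supplier<Optional<String>>... sources) {
        return Stream.of(sources)
                .map(Supplier::get)           // evaluated lazily, one source at a time
                .filter(Optional::isPresent)  // skip sources with no data
                .map(Optional::get)
                .findFirst();                 // short-circuits the remaining sources
    }

    public static void main(String[] args) {
        Optional<String> result = firstAvailable(
                source("memory", null),
                source("disk", "cached on disk"),
                source("network", "fresh from network"));
        // prints "cached on disk via [memory, disk]"
        System.out.println(result.orElse("no data") + " via " + queried);
    }
}
```

Because the disk source already had data, the network source was never queried, which mirrors the point of the pattern: slower sources are only touched when the faster ones come up empty.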

Persistent Data

Obviously, the next step is to save the data. If the results of network requests are never saved to disk, and the results of disk reads are never cached in memory, then there is no caching at all. So the next code to write persists the network data.

My solution is to have each data source save or cache the data as it is emitted.

    Observable<Data> networkWithSave = network.doOnNext(new Action1<Data>() {
        @Override public void call(Data data) {
            saveToDisk(data);
            cacheInMemory(data);
        }
    });

    Observable<Data> diskWithCache = disk.doOnNext(new Action1<Data>() {
        @Override public void call(Data data) {
            cacheInMemory(data);
        }
    });

Now, if you use networkWithSave and diskWithCache, the data will be automatically saved after loading.

(Another advantage of this strategy is that networkWithSave and diskWithCache can be used anywhere, not just in our multi-source pattern.)
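To make the save-on-load idea concrete, here is a minimal sketch that does not use RxJava. It assumes each source is a Supplier<Optional<String>> and uses a hypothetical helper, doOnValue, to stand in for doOnNext(); the two static fields are stand-ins for the real memory and disk stores, which the article leaves unspecified.

```java
import java.util.Optional;
import java.util.function.Consumer;
import java.util.function.Supplier;

// Library-free analogy for network.doOnNext(...) and disk.doOnNext(...).
// This is NOT RxJava; every name below is hypothetical.
class SaveOnLoadSketch {
    // Stand-ins for the real stores; null means "nothing stored yet".
    static String memoryCache;
    static String diskCache;

    // Wraps a source so that a side effect runs on every value it produces.
    static <T> Supplier<Optional<T>> doOnValue(Supplier<Optional<T>> source, Consumer<T> action) {
        return () -> {
            Optional<T> value = source.get();
            value.ifPresent(action); // no side effect when the source is empty
            return value;
        };
    }

    // Fake raw sources.
    static final Supplier<Optional<String>> network = () -> Optional.of("payload from network");
    static final Supplier<Optional<String>> disk = () -> Optional.ofNullable(diskCache);

    // Network results are written through to disk and memory.
    static final Supplier<Optional<String>> networkWithSave =
            doOnValue(network, data -> {
                diskCache = data;
                memoryCache = data;
            });

    // Disk results are promoted into memory.
    static final Supplier<Optional<String>> diskWithCache =
            doOnValue(disk, data -> memoryCache = data);

    public static void main(String[] args) {
        networkWithSave.get();
        System.out.println("disk=" + diskCache + ", memory=" + memoryCache);
    }
}
```

The wrapped sources return exactly what the raw sources return; the saving happens as a side effect, so the wrapper can be dropped in wherever the raw source was used.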

Stale Data

Unfortunately, the data-saving code now works a little too well: it always returns the same cached data, whether or not it is out of date. We want to go back to the server occasionally and fetch the latest data.

The solution is to pass a predicate to the first() operator so that it filters out data that is no longer worth using.

    Observable<Data> source = Observable
        .concat(memory, diskWithCache, networkWithSave)
        .first(new Func1<Data, Boolean>() {
            @Override public Boolean call(Data data) {
                return data.isUpToDate();
            }
        });

Now only data that is determined to be up to date is emitted. Whenever the data from one source has expired, we move on to the next source, until up-to-date data is found.
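The article does not show how isUpToDate() is implemented. One possible approach, sketched here purely as an assumption, is to stamp each Data object with the time it was fetched and compare its age against a freshness window. The field names and the 5-second window below are arbitrary, and the stream pipeline is again a library-free stand-in for concat().first(predicate), not RxJava itself.

```java
import java.util.Optional;
import java.util.function.Supplier;
import java.util.stream.Stream;

// Hypothetical Data type with a timestamp-based freshness check.
class StaleDataSketch {
    static final long MAX_AGE_MILLIS = 5_000; // arbitrary freshness window

    static final class Data {
        final String value;
        final long timestampMillis; // when the data was fetched

        Data(String value, long timestampMillis) {
            this.value = value;
            this.timestampMillis = timestampMillis;
        }

        // Up to date if fetched within the last MAX_AGE_MILLIS milliseconds.
        boolean isUpToDate() {
            return System.currentTimeMillis() - timestampMillis <= MAX_AGE_MILLIS;
        }
    }

    // Returns the first value that is both present and up to date,
    // querying sources in order and stopping as soon as one qualifies.
    @SafeVarargs
    static Optional<Data> firstUpToDate(Supplier<Optional<Data>>... sources) {
        return Stream.of(sources)
                .map(Supplier::get)
                .filter(Optional::isPresent)
                .map(Optional::get)
                .filter(Data::isUpToDate) // stale data falls through to the next source
                .findFirst();
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        Data stale = new Data("old copy from disk", now - 60_000);
        Data fresh = new Data("new copy from network", now);
        Optional<Data> result = firstUpToDate(
                () -> Optional.empty(),      // memory: nothing cached
                () -> Optional.of(stale),    // disk: cached but expired
                () -> Optional.of(fresh));   // network: fresh
        System.out.println(result.map(d -> d.value).orElse("no usable data"));
    }
}
```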

Comparison of first() and takeFirst() Operators

For this design pattern, either the first() or the takeFirst() operator can be used.

The difference between the two shows up when the data from every source has expired, so that no valid item is emitted. In that case first() signals a NoSuchElementException (Translator's note: this happens when the predicate passed to first() returns false for every item), while takeFirst() simply completes without raising any exception.

Which operator to use depends entirely on whether you need to explicitly handle missing data.
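The two outcomes can be compared with a small analogy built on java.util.stream rather than RxJava. The method names firstOrThrow and takeFirst are hypothetical and only mimic the behavior described above: one treats "nothing matched" as an error, the other as an ordinary empty result.

```java
import java.util.Arrays;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.Optional;
import java.util.function.Predicate;

// Library-free analogy for first(predicate) versus takeFirst(predicate).
// This is NOT RxJava; the method names are hypothetical.
class FirstVsTakeFirstSketch {
    // Like first(predicate): no matching item is treated as an error.
    static <T> T firstOrThrow(List<T> items, Predicate<T> predicate) {
        return items.stream()
                .filter(predicate)
                .findFirst()
                .orElseThrow(NoSuchElementException::new);
    }

    // Like takeFirst(predicate): no matching item just yields an empty result.
    static <T> Optional<T> takeFirst(List<T> items, Predicate<T> predicate) {
        return items.stream()
                .filter(predicate)
                .findFirst();
    }

    public static void main(String[] args) {
        List<String> allStale = Arrays.asList("stale-memory", "stale-disk", "stale-network");
        Predicate<String> isFresh = s -> !s.startsWith("stale");

        System.out.println(takeFirst(allStale, isFresh).isPresent()); // prints "false"
        try {
            firstOrThrow(allStale, isFresh);
        } catch (NoSuchElementException e) {
            System.out.println("missing data must be handled explicitly");
        }
    }
}
```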

Code Sample

You can check out a sample implementation of all the above code here: https://github.com/dlew/rxjava-multiple-sources-sample.

If you need a real-world example, check out the Gfycat App, which uses this pattern when fetching data. The project doesn’t use all of the features shown above (because it doesn’t need to), but it demonstrates the basic usage of concat().first().
