Using RxJava to quickly obtain massive data

Suppose you need some dynamic data that comes from the network. You could simply hit the network every time you need it, but caching the fetched data on disk and in memory is far more efficient.

Specifically, the plan is as follows:

  1. Hit the network only occasionally, to obtain the latest data.

  2. Read data as quickly as possible (by fetching previously cached network data).

I will implement this plan by using RxJava.

Basic Pattern

Create an Observable<Data> for each data source (network, disk, and memory) and use the concat() and first() operators to construct a simple implementation.

The concat() operator takes multiple Observables and concatenates them in order, one after another. The first() operator emits only the first item of that concatenated sequence. Therefore, with concat().first(), only the first item is retrieved and emitted, no matter how many data sources there are.

    // Our sources (left as an exercise for the reader)
    Observable<Data> memory = ...;
    Observable<Data> disk = ...;
    Observable<Data> network = ...;

    // Retrieve the first source with data
    Observable<Data> source = Observable
        .concat(memory, disk, network)
        .first();

The key to this pattern is that concat() subscribes to each source Observable only when it needs to, that is, after the previous source has completed. A source with no data simply completes without emitting anything. Since first() stops the sequence early, the slower sources are never queried if cached data is available. In other words, if memory returns a result, disk and network are not touched. Conversely, if neither memory nor disk has data, the network request is performed.

Note that the order of the sources passed to concat() matters, because they are checked one by one in that order.
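The short-circuit behaviour can be seen without RxJava at all. The following is a plain-Java analogue, not the RxJava implementation: each source is modelled as a hypothetical Supplier<Optional<T>> (empty meaning "no data"), and the names FirstAvailable, source(), and firstAvailable() are invented for this sketch. It records which sources were actually queried.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.function.Supplier;

// Plain-Java analogue of concat(memory, disk, network).first(); this is not RxJava.
class FirstAvailable {
    // Names of the sources that were actually queried, in order.
    static final List<String> queried = new ArrayList<>();

    // Hypothetical source: records that it was queried, then returns its value (or empty).
    static Supplier<Optional<String>> source(String name, String valueOrNull) {
        return () -> {
            queried.add(name);
            return Optional.ofNullable(valueOrNull);
        };
    }

    // Ask each source in order and stop at the first one that has data.
    @SafeVarargs
    static <T> Optional<T> firstAvailable(Supplier<Optional<T>>... sources) {
        for (Supplier<Optional<T>> s : sources) {
            Optional<T> result = s.get();
            if (result.isPresent()) {
                return result; // later (slower) sources are never touched
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Optional<String> data = firstAvailable(
                source("memory", null),
                source("disk", "cached on disk"),
                source("network", "fresh from network"));
        System.out.println(data.get() + " " + queried);
    }
}
```

With an empty memory source, the lookup stops at disk, so main prints "cached on disk [memory, disk]" and the network source is never invoked.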

Persistent Data

Obviously, the next step is to cache data. If you don't cache the results of network requests to disk, and cache the results of disk accesses to memory, then it's not called caching at all. The next code to write is to persist the network data.

My solution is to have each data source save or cache the data as it is emitted.

    // Persist every network result to disk and memory as it is emitted
    Observable<Data> networkWithSave = network.doOnNext(new Action1<Data>() {
        @Override
        public void call(Data data) {
            saveToDisk(data);
            cacheInMemory(data);
        }
    });

    // Promote every disk result to the memory cache as it is emitted
    Observable<Data> diskWithCache = disk.doOnNext(new Action1<Data>() {
        @Override
        public void call(Data data) {
            cacheInMemory(data);
        }
    });

Now, if you use networkWithSave and diskWithCache, the data is saved automatically as it is loaded.

(Another advantage of this strategy is that networkWithSave and diskWithCache can be used anywhere, not just in our multi-source pattern.)

Stale Data

Unfortunately, our data-saving code now works a little too well. It always returns the same cached data, no matter how out of date it is. We want to go back to the server occasionally and grab the latest data.

The solution is to have first() filter as well: pass it a predicate so that it rejects stale data.

    Observable<Data> source = Observable
        .concat(memory, diskWithCache, networkWithSave)
        .first(new Func1<Data, Boolean>() {
            @Override
            public Boolean call(Data data) {
                return data.isUpToDate();
            }
        });

Now only the first item that qualifies as up to date is emitted. If one source has stale data, we move on to the next source until fresh data is found.
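The article does not show what isUpToDate() looks like. A minimal sketch, assuming a hypothetical Data class that stores the time it was fetched and treats anything older than an arbitrary five-second window as stale (the fields, the constant, and the overload taking an explicit clock value are all assumptions made for illustration):

```java
// Hypothetical Data type; only the isUpToDate() call appears in the article.
class Data {
    // Assumed freshness window: data at least this old counts as stale.
    static final long STALE_MS = 5_000;

    final String value;
    final long timestampMs; // when the data was fetched, in epoch milliseconds

    Data(String value, long timestampMs) {
        this.value = value;
        this.timestampMs = timestampMs;
    }

    // True while the data is younger than STALE_MS at the given time.
    boolean isUpToDate(long nowMs) {
        return nowMs - timestampMs < STALE_MS;
    }

    // Convenience overload using the wall clock, matching the call in the predicate.
    boolean isUpToDate() {
        return isUpToDate(System.currentTimeMillis());
    }
}
```

Passing the current time in explicitly makes the staleness rule easy to test; how long data should stay fresh depends entirely on the application.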

Comparison of first() and takeFirst() Operators

For this pattern, either the first() or the takeFirst() operator can be used.

The difference shows up when every source holds stale data, so that the predicate rejects every item and nothing valid is emitted: first() signals a NoSuchElementException, while takeFirst() simply completes without any error.

Which operator to use depends entirely on whether you need to explicitly handle missing data.
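The same trade-off exists in plain Java, which may make the choice easier to picture. The sketch below is only an analogue built on java.util.stream, not RxJava: the class and method names are invented, integers stand in for data items, and "up to date" is modelled as "at least a minimum value". In RxJava the exception arrives through onError rather than being thrown at the caller.

```java
import java.util.List;
import java.util.NoSuchElementException;
import java.util.Optional;

// Plain-Java analogue of the first() vs. takeFirst() choice; this is not RxJava.
class MissingDataDemo {
    // Like takeFirst(predicate): finding nothing is a normal, silent outcome.
    static Optional<Integer> takeFirstLike(List<Integer> items, int minimum) {
        return items.stream().filter(i -> i >= minimum).findFirst();
    }

    // Like first(predicate): finding nothing is reported as NoSuchElementException.
    static int firstLike(List<Integer> items, int minimum) {
        return takeFirstLike(items, minimum)
                .orElseThrow(() -> new NoSuchElementException("no up-to-date item"));
    }
}
```

If missing data is an expected state, for example on first launch while offline, the takeFirst() style keeps the happy path clean. If it indicates a bug or a condition the caller must handle, the first() style forces the issue.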

Code Sample

You can check out a sample implementation of all the above code here: https://github.com/dlew/rxjava-multiple-sources-sample.

If you need a real-world example, check out the Gfycat App, which uses this pattern when fetching data. The project doesn’t use all of the features shown above (because it doesn’t need to), but it demonstrates the basic usage of concat().first().
