Python Development: Introduction to Cache Mechanism

[51CTO Quick Translation] In today's article, we will start with a simple example to learn how to use the cache mechanism. After that, we will further use the functools module of the Python standard library to create a cache that suits our needs. Without further ado, let's get started.

Caching is a way of storing quantitative data in preparation for subsequent requests, in order to speed up data retrieval. In today's article, we will start with a simple example to learn how to use the cache mechanism. After that, we will further use the functools module of the Python standard library to create a cache that suits our needs. As a starting point, we first create a class to build our cache dictionary, and then expand it as needed. The following is the specific code:

 #####################################################################   
 
 class MyCache: 
 
 """"""   
 
 #-----------------------------------------------------------------------------   
 
 def __init__( self ): 
 
 """Constructor"""   
 
 self .cache = {} 
 
 self .max_cache_size = 10

There is nothing special in the above class example. We just create a simple class and set two class variables or attributes, cache and max_cache_size. Cache is an empty dictionary, and max_cache_size obviously represents the maximum cache capacity. Let's further flesh out the code to make it functional:

 import datetime 
 
 import random 
 
 #####################################################################   
 
 class MyCache: 
 
 """"""   
 
 #-----------------------------------------------------------------------------   
 
 def __init__( self ): 
 
 """Constructor"""   
 
 self .cache = {} 
 
 self .max_cache_size = 10   
 
 #-----------------------------------------------------------------------------   
 
 def __contains__( self , key): 
 
 """ 
 
 Returns True or False depending on whether the key exists in the cache. 
 
 """   
 
 return key in   self .cache 
 
 #-----------------------------------------------------------------------------   
 
 def update( self , key, value): 
 
 """ 
 
 Updates the cache dictionary, optionally removing the oldest entries 
 
 """   
 
 if key not   in   self .cache and len( self .cache) >= self .max_cache_size: 
 
 self .remove_oldest() 
 
 self .cache[key] = { 'date_accessed' : datetime.datetime.now(), 
 
 'value' : value} 
 
 #-----------------------------------------------------------------------------   
 
 def remove_oldest( self ): 
 
 """ 
 
 Delete the input data with the earliest access date 
 
 """   
 
 oldest_entry = None   
 
 for key in   self .cache: 
 
 if oldest_entry == None : 
 
                oldest_entry = key 
 
 elif   self .cache[key][ 'date_accessed' ] < self .cache[oldest_entry][ 
 
 'date_accessed' ]: 
 
                oldest_entry = key 
 
 self .cache.pop(oldest_entry) 
 
 #-----------------------------------------------------------------------------   
 
 @property   
 
 def size( self ): 
 
 """ 
 
 Returns the cache capacity 
 
 """   
 
 return len( self .cache)

Here we import the datetime and random modules, and we can see the class we created earlier. This time, we add a few methods to it. One of the methods does the magic, called _contains_. Although it is not necessary to use this method here, the basic idea is that it allows us to check the class instance to see if it contains the key we are looking for. In addition, the update method is responsible for updating the cache dictionary with the new key/value pair. Once the maximum capacity of the cache is reached or exceeded, it will also delete the oldest input data. In addition, the remove_oldest method is responsible for the specific removal of early data in the dictionary. Finally, we also introduced an attribute called size, which can return the specific capacity of the cache.

After adding the following code, we can test that the cache works as expected:

 if __name__ == '__main__' : 
 
 #Test cache   
 
    keys = [ 'test' , 'red' , 'fox' , 'fence' , 'junk' , 
 
 'other' , 'alpha' , 'bravo' , 'cal' , 'devo' , 
 
 'ele' ] 
 
 s = 'abcdefghijklmnop'   
 
 cache = MyCache() 
 
 for i, key in enumerate(keys): 
 
 if key in cache: 
 
 continue   
 
 else : 
 
            value = '' .join([random.choice(s) for i in range( 20 )]) 
 
            cache.update(key, value) 
 
 print ( "#%s iterations, #%s cached entries" % (i+ 1 , cache.size)) 
 
 print

In this example, we set up a number of predefined keys and loops. If the key does not exist yet, we add it to the cache. However, the sample code above does not mention how to update the access date, so you can explore it as an exercise. After running this code, you will notice that when the cache is full, it will correctly delete the older entries.

Now, let’s move on and see how to leverage another way to create a cache using Python’s built-in functools module.

Using functools.lru_cache

Python's functools module provides a very useful decorator, lru_cache. It should be noted that it was only added in version 3.2. According to the documentation, this decorator can "pack functions into callable memory to reduce the size of the most recently called functions." Next, we will write a basic function based on the example mentioned in the documentation, which contains multiple network pages. In this case, we can get the pages directly from the Python documentation site.

 import urllib.error 
 
 import urllib.request 
 
 from functools import lru_cache 
 
 @lru_cache (maxsize= 24 ) 
 
 def get_webpage(module): 
 
 """ 
 
 Get a specific Python module network page 
 
 """       
 
    webpage = "https://docs.python.org/3/library/{}.html" .format(module) 
 
 try : 
 
        with urllib.request.urlopen(webpage) as request: 
 
 return request.read() 
 
 except urllib.error.HTTPError: 
 
 return   None   
 
 if __name__ == '__main__' : 
 
    modules = [ 'functools' , 'collections' , 'os' , 'sys' ] 
 
 for module in modules: 
 
        page = get_webpage(module) 
 
 if page: 
 
 print ( "{} module page found" .format(module))

In the code above, we decorate the get_webpage function with lru_cache and set its maximum size to 24 calls. After that, we set a webpage string variable and pass it the module we want the function to get. In my experience, it works better if you run this in a Python interpreter, such as IDLE. This way, we can run multiple loops through the function. You can see that when you run the code first, the output is relatively slow. But if you run it again in the same session, it will be much faster, which means that lru_cache has cached the call correctly. You can experiment with your own interpreter instance and see the results for yourself.

Additionally, we can pass a typed parameter to the decorator. This is a Boolean that tells the decorator to cache different types of parameters separately when typed is set to True.

Summarize

Now you have a basic understanding of how to write your own cache mechanism in Python. This is a fun tool and can be very useful if you have a lot of high-intensity I/O calls or want to cache commonly used information such as login credentials.

Original title: Python Development: Introduction to Cache Mechanism

[Translated by 51CTO. Please indicate the original translator and source as 51CTO.com when reprinting on partner sites]

<<: When developing an app, what does a product manager need to do from beginning to end? -- Before the project starts

>>: Carelessness leads to failure: Lee Sedol loses in first round of man vs. machine match

How to conduct big data analysis and capture the fickle hearts of users?

China's first set of carbon-neutral Olympic award-winning equipment is full of details and is low-carbon and environmentally friendly!

Produced by: Science Popularization China Author:...

Python Development: Introduction to Cache Mechanism

How to conduct big data analysis and capture the fickle hearts of users?

Don’t worry if you forget to bring your ID card! A guide to using the 12306 temporary bus certificate

From now on, the best Android phones come from Google

Alipay launches a wave of new features: it is completely indispensable

Under the curled and yellowed leaves, a "war" is taking place

Use your phone to adjust the photo to look high-end——Snapseed mobile color adjustment tutorial

Brand Promotion Guide on Station B!

11 commonly used data analysis methods for product operations!

Analysis of user fission growth in Internet communities

4 modules, a brief analysis of "New User"

Recommend

9 steps to write a good title quickly!

What is the content of online promotion work and what are the specific responsibilities of promotion work?

How much does it cost to make a preschool education app in Alar?

How much is the price for being an agent of Shiyan Toy Mini Program? Shiyan Toys Mini Program Agent Price Inquiry

The hidden worries behind Netflix's stock price frenzy

How to do marketing promotion? Master these 14 methods!

How to develop a suitable user growth strategy?

Tencent's "Voice Emoji" is coming: multiple people dub one expression together

Unlocking agricultural black technology, farming is like playing a game

China's first set of carbon-neutral Olympic award-winning equipment is full of details and is low-carbon and environmentally friendly!

Kuaishou live broadcast promotion and traffic generation skills!

AI consciousness has "awakened", but humans are still kept in the dark?

Fortunately, there are still Mogao Grottoes in the world

Master super search skills from scratch in 21 days

Facebook engineers develop tools to improve VR content development efficiency