Python Development: Introduction to Cache Mechanism

[51CTO Quick Translation] In today's article, we will start with a simple example to learn how to use the cache mechanism. After that, we will further use the functools module of the Python standard library to create a cache that suits our needs. Without further ado, let's get started.

Caching is a way of storing a limited amount of data so that subsequent requests for it can be served faster. As a starting point, we create a class that wraps a dictionary to serve as our cache, and then expand it as needed. The following is the code:

    class MyCache:
        """A simple dictionary-backed cache."""

        def __init__(self):
            """Constructor"""
            self.cache = {}
            self.max_cache_size = 10

There is nothing special in the class above. We just create a simple class and set two instance attributes, cache and max_cache_size. The cache attribute is an empty dictionary, and max_cache_size is the maximum number of entries the cache may hold. Let's flesh out the code to make it functional:

    import datetime
    import random


    class MyCache:
        """A simple dictionary-backed cache with a maximum size."""

        def __init__(self):
            """Constructor"""
            self.cache = {}
            self.max_cache_size = 10

        def __contains__(self, key):
            """
            Returns True or False depending on whether the key exists in the cache.
            """
            return key in self.cache

        def update(self, key, value):
            """
            Updates the cache dictionary, removing the oldest entry first if
            the cache is full.
            """
            if key not in self.cache and len(self.cache) >= self.max_cache_size:
                self.remove_oldest()

            self.cache[key] = {'date_accessed': datetime.datetime.now(),
                               'value': value}

        def remove_oldest(self):
            """
            Removes the entry with the oldest access date.
            """
            oldest_entry = None
            for key in self.cache:
                if oldest_entry is None:
                    oldest_entry = key
                elif (self.cache[key]['date_accessed']
                      < self.cache[oldest_entry]['date_accessed']):
                    oldest_entry = key
            self.cache.pop(oldest_entry)

        @property
        def size(self):
            """
            Returns the number of entries currently in the cache.
            """
            return len(self.cache)

Here we import the datetime and random modules, followed by the class we created earlier. This time, we add a few methods to it. One of them is a magic method called __contains__. Although it is not strictly necessary here, the basic idea is that it lets us check a class instance with the in operator to see whether it contains the key we are looking for. The update method updates the cache dictionary with a new key/value pair. If the cache has already reached its maximum capacity, it first deletes the oldest entry. The remove_oldest method does the actual work of removing the oldest entry from the dictionary. Finally, we add a property called size, which returns the number of entries currently in the cache.
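To make the role of __contains__ concrete, here is a minimal sketch. TinyCache is a hypothetical stand-in for MyCache, trimmed down to the one method under discussion; it is not part of the original example.

```python
class TinyCache:
    """Hypothetical, stripped-down stand-in for MyCache."""

    def __init__(self):
        self.cache = {}

    def __contains__(self, key):
        # Python calls this method when evaluating `key in instance`.
        return key in self.cache


tiny = TinyCache()
tiny.cache['red'] = 'some value'

print('red' in tiny)  # True
print('fox' in tiny)  # False
```

Without __contains__, the same check would have to reach into the instance and test the dictionary directly.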

After adding the following code, we can test that the cache works as expected:

    if __name__ == '__main__':
        # Test the cache
        keys = ['test', 'red', 'fox', 'fence', 'junk',
                'other', 'alpha', 'bravo', 'cal', 'devo',
                'ele']
        s = 'abcdefghijklmnop'
        cache = MyCache()
        for i, key in enumerate(keys):
            if key in cache:
                continue
            else:
                value = ''.join([random.choice(s) for _ in range(20)])
                cache.update(key, value)
            print("#%s iterations, #%s cached entries" % (i + 1, cache.size))
        print()

In this example, we define a number of predefined keys and loop over them. If a key is not yet in the cache, we add it. However, the sample code above never updates the access date when an entry is read, so that is left for you to explore as an exercise. After running this code, you will notice that once the cache is full, it correctly deletes the oldest entry.
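One possible take on that exercise is sketched below. It assumes a get method, which the original class does not have, that returns the stored value and refreshes date_accessed on every read. MyCacheWithGet is a hypothetical name, and eviction is left out to keep the sketch short.

```python
import datetime


class MyCacheWithGet:
    """Hypothetical variant of MyCache that records reads as accesses."""

    def __init__(self):
        self.cache = {}

    def update(self, key, value):
        # Same storage layout as the article's MyCache (eviction omitted).
        self.cache[key] = {'date_accessed': datetime.datetime.now(),
                           'value': value}

    def get(self, key, default=None):
        """Return the cached value and refresh its access date."""
        entry = self.cache.get(key)
        if entry is None:
            return default
        # Reading counts as an access, so move the timestamp forward.
        entry['date_accessed'] = datetime.datetime.now()
        return entry['value']


cache = MyCacheWithGet()
cache.update('red', 'abc')
stored_at = cache.cache['red']['date_accessed']
value = cache.get('red')
refreshed_at = cache.cache['red']['date_accessed']
```

With reads refreshing the timestamp, a remove_oldest method like the one above would evict the least recently used entry rather than the least recently written one.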

Now let's look at another way to create a cache, using the functools module from Python's standard library.

Using functools.lru_cache

Python's functools module provides a very useful decorator, lru_cache. Note that it was added in Python 3.2. According to the documentation, it is a "decorator to wrap a function with a memoizing callable that saves up to the maxsize most recent calls." Next, we will write a basic function, based on the example in the documentation, that fetches several web pages. In this case, we get the pages directly from the Python documentation site.

    import urllib.error
    import urllib.request

    from functools import lru_cache


    @lru_cache(maxsize=24)
    def get_webpage(module):
        """
        Gets the documentation web page for the given Python module
        """
        webpage = "https://docs.python.org/3/library/{}.html".format(module)
        try:
            with urllib.request.urlopen(webpage) as response:
                return response.read()
        except urllib.error.HTTPError:
            return None


    if __name__ == '__main__':
        modules = ['functools', 'collections', 'os', 'sys']
        for module in modules:
            page = get_webpage(module)
            if page:
                print("{} module page found".format(module))

In the code above, we decorate the get_webpage function with lru_cache and set its maximum size to 24 calls. Inside the function, we build the webpage URL string by formatting in the name of the module we want to fetch. In my experience, this works better if you run it in an interactive Python interpreter, such as IDLE, because you can then call the function several times in the same session. The first time you run the code, the output appears relatively slowly. If you run it again in the same session, it is much faster, which means lru_cache has cached the calls correctly. Try it in your own interpreter session and see the results for yourself.
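If you would rather confirm the caching without timing network requests, a decorated function exposes cache_info() and cache_clear(). The sketch below uses slow_square, a hypothetical stand-in for an expensive call such as get_webpage, together with a counter that records how often the function body really runs.

```python
from functools import lru_cache

call_count = 0  # counts how often the function body actually runs


@lru_cache(maxsize=24)
def slow_square(n):
    """Hypothetical stand-in for an expensive call such as get_webpage."""
    global call_count
    call_count += 1
    return n * n


for n in [2, 3, 2, 3, 2]:
    slow_square(n)

info = slow_square.cache_info()
print(info)        # CacheInfo(hits=3, misses=2, maxsize=24, currsize=2)
print(call_count)  # 2

slow_square.cache_clear()  # empties the cache and resets the statistics
```

Only the first call for each distinct argument reaches the function body; the other three calls are answered from the cache.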

Additionally, we can pass a typed parameter to the decorator. It is a Boolean: when typed is set to True, arguments of different types are cached separately. For example, f(3) and f(3.0) are then treated as distinct calls.
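A small sketch of the difference follows. The functions add_typed and add_untyped are hypothetical and take two arguments on purpose. The untyped result shown is what CPython produces; the documentation only promises that untyped calls are "usually" treated as equivalent, so treat that line as an observation rather than a guarantee.

```python
from functools import lru_cache


@lru_cache(maxsize=None, typed=True)
def add_typed(a, b):
    return a + b


@lru_cache(maxsize=None)  # typed defaults to False
def add_untyped(a, b):
    return a + b


for func in (add_typed, add_untyped):
    func(3, 1)    # int arguments
    func(3.0, 1)  # equal value, but the first argument is a float

typed_info = add_typed.cache_info()
untyped_info = add_untyped.cache_info()
print(typed_info.misses, typed_info.hits)      # 2 0
print(untyped_info.misses, untyped_info.hits)  # 1 1 on CPython
```

One consequence of the untyped hit is that add_untyped(3.0, 1) returns the cached int result 4 rather than the float 4.0, which is the kind of surprise typed=True is there to prevent.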

Summary

Now you have a basic understanding of how to write your own cache mechanism in Python. Caching is a fun tool, and it can be very useful if you make many expensive I/O calls or want to keep commonly used information, such as login credentials, close at hand.

Original title: Python Development: Introduction to Cache Mechanism

[Translated by 51CTO. Please indicate the original translator and source as 51CTO.com when reprinting on partner sites]
