PrizmDoc® v14.2 Release - Updated
    Implement Caching Strategies

    Introduction

    This topic covers common questions and recommendations to consider when implementing your caching strategy.

    Why does PrizmDoc Server Cache Files?

    The power behind PrizmDoc Services’ ability to deliver viewable web content quickly and efficiently lies with its cache management. Viewing a multipage document requires that each document page be converted into a web compatible format such as JPEG, PNG or ideally SVG (which gives the highest fidelity upon scaling). Unfortunately, the conversion process is not instantaneous, which means there is some delay before a page can be made viewable. Because PrizmDoc Server assumes a document will be viewed by more than one person over multiple sessions, it converts all the pages into web viewable intermediate objects that are stored in its cache folders.

    The conversion process begins when the viewing session is started or with the first request to view a document page by a given viewing session. Typically, the viewable page data that is generated will then be made available to any subsequent request for the same pages, reducing the time to view to only the time it takes to download the page data to the browser. To summarize, the cached files help deliver viewing performance because the viewing objects are pre-generated and stored in the cache folders.

    What is the Cost of the PrizmDoc Server Cache?

    Cached files occupy storage for as long as they are retained. Files created for viewing can take up a considerable amount of space, so the growth of the cache needs to be controlled. PrizmDoc Server provides options for controlling both where the files are stored and how long they are stored there. The cache also contains folders with different purposes, and these can be relocated to different devices to spread the storage burden if necessary.

    How do I Optimize Cache Performance?

    The majority of the PrizmDoc Server cache is made up of pre-generated document pages which are readily available on demand. Caching these files already helps performance when the same document is viewed repeatedly. There are three configurable cache folder locations, and placing certain ones on more responsive media can result in a better viewing experience with less burden on the server hosting the PrizmDoc Server service. Solid state drives (SSD) or Shared Memory (Linux only) minimize input/output (I/O) latency and access times for cached files, but these storage devices typically have much less storage capacity.

    When a Document is Uploaded to PAS, Where is the Original Document Cached, and for How Long?

    The original document is stored in the cache directory, cache.directory, which is a user configurable location. The duration for which the original document is stored is also configurable:

    When a Document is Converted using PrizmDoc Server, Where is the Converted Content Stored, and for How Long?

    The converted content is stored in the cache directory, cache.directory, which is a user configurable location. The duration for which the converted content is stored can be extended:

    • See minSecondsAvailable and serverCaching properties in the Viewing Sessions topic.
    • If you reuse a document before it expires, then the new operation will automatically extend the duration for which it is stored.

    NOTE: For related information on caching, see the Adjust Caching Parameters topic.

    Cache Strategies and Tradeoff Scenarios

    Several scenarios are described below with proposed cache configuration solutions. You should be familiar with the central configuration file settings as outlined in Central Configuration Options. In addition to the central configuration file, there is a property in the JSON object that the application posts when requesting a new viewing session from PrizmDoc Server (refer to the How To Adjust Caching Parameters for PrizmDoc Server topic).

    With the default settings in the central configuration file, viewing sessions time out after 20 minutes and cached files expire after one day. Also by default, the PrizmDoc Server cache folders are all created within the same parent directory on the root drive. These defaults give a reader 20 minutes to read a document once the viewing session is started. After that time period, a new viewing session must be created for the reader to continue, either by refreshing the browser or through another mechanism you implement in your application.

    The next time the same document is viewed, PrizmDoc Server will simply deliver the viewing objects that were created in the first viewing session to the same reader, or to any other reader viewing the same document, for about 24 hours after the first viewing session was created. When a reader (same or new) requests to read the document a day later, the cache process starts over because PrizmDoc will have already deleted the cached pages and will have to re-generate all the viewable content of the document again.

    NOTE: If you set the cache to 1 day, the timer will start over if someone accesses a file that is in the cache.
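    The restarting timer described in the note above can be modeled as a sliding expiration. The following is a minimal sketch, not PrizmDoc Server's actual implementation: the function name cleanup_eligible_at is hypothetical, and the model assumes a cached file becomes eligible for cleanup one cache lifetime after its most recent access.

```python
from datetime import datetime, timedelta


def cleanup_eligible_at(access_times: list[datetime], cache_lifetime: timedelta) -> datetime:
    """Return the earliest time a cached file could expire under a sliding timer.

    Assumption (illustrative model only): every access restarts the timer,
    so only the most recent access matters.
    """
    if not access_times:
        raise ValueError("at least one access time is required")
    return max(access_times) + cache_lifetime


# A file first viewed at 09:00 and viewed again at 15:00 the same day,
# with a one-day cache lifetime, stays cached until at least 15:00 the next day.
first_view = datetime(2024, 1, 1, 9, 0)
second_view = datetime(2024, 1, 1, 15, 0)
eligible = cleanup_eligible_at([first_view, second_view], timedelta(days=1))
```

    Under this model, a document that is viewed at least once per cache lifetime is never regenerated. Actual removal happens some time after the computed moment, because cleanup runs on a schedule.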

    To manually delete the cache:

    1. Stop the Prizm service.
    2. Go to the cache folder. On Windows: C:\Prizm\cache. On Linux: <data_folder>/cache, where <data_folder> is the folder that you mapped as /data when creating the container. See Installing / Using Docker for more details.
    3. You can delete all files and folders within the cache folder.
    4. Start the Prizm service again. PrizmDoc will generate a new cache.
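    Step 3 of the procedure above can be scripted. The following is a minimal sketch, assuming the Prizm service has already been stopped by other means; the function name clear_cache_folder is hypothetical, and the script does not stop or start the service itself.

```python
import shutil
from pathlib import Path


def clear_cache_folder(cache_dir: str) -> int:
    """Delete every file and folder inside cache_dir, keeping cache_dir itself.

    Returns the number of top-level entries removed.
    Assumption: the caller has stopped the Prizm service first (step 1).
    """
    root = Path(cache_dir)
    if not root.is_dir():
        raise NotADirectoryError(f"cache folder not found: {cache_dir}")
    removed = 0
    for entry in root.iterdir():
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)  # remove a subfolder and everything in it
        else:
            entry.unlink()  # remove a regular file or a symlink
        removed += 1
    return removed
```

    For example, clear_cache_folder(r"C:\Prizm\cache") would empty the default Windows cache folder named in step 2. Start the Prizm service again afterwards so that PrizmDoc generates a new cache.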

    The file paths for the Central Configuration file are:

    • Docker: <config_folder>/prizm-services-config.yml, where <config_folder> is the folder that you mapped as /config when creating the container. See Installing / Using Docker for more details.
    • Windows: C:\Prizm\prizm-services-config.yml (assuming the standard install location.)

    Scenario 1:

    Viewing response appears slow even with caching enabled as lots of readers are interested in viewing the document.

    Solution:

    Set the cache.directory setting in the central configuration file to a folder on a faster SSD device or, in Linux environments, to a folder on the Shared Memory device (e.g., /dev/shm).

    Example for Shared Memory Device

    cache.directory: /dev/shm/Accusoft/Prizm/
    
    

    The above setting in the central configuration file places the cache directories in Shared Memory on a Linux OS environment. Because Shared Memory is faster than standard disk drives, PrizmDoc Server typically responds more quickly, with less overall stress on the server when delivering viewing content.

    Scenario 2:

    Viewing clients are getting errors, and the storage device used for the PrizmDoc Server cache is reporting errors because it is full.

    Solution:

    Depending on available storage capacity of the selected device, the cache expiration period specified by viewing.cacheLifetime in central configuration may need to be shortened to accommodate cache load. Please note that the time period for viewing.cacheLifetime should not be any shorter than the viewing.sessionLifetime time period. Otherwise, the viewing.sessionLifetime will take precedence and the cache expiration period will be forced to the same value. The viewing.sessionLifetime time period can be shortened but at the penalty of reducing the amount of time a user has to read a document in a single viewing session.

    Rather than shortening the viewing session timeout period, consider increasing the capacity of the (fast) storage device.
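    Before changing lifetimes or adding capacity, it can help to measure how full the device holding the cache is. The following is a minimal sketch using Python's standard shutil.disk_usage; the function names and the 90% threshold are illustrative assumptions, not PrizmDoc settings.

```python
import shutil


def cache_usage_fraction(cache_dir: str) -> float:
    """Return the used fraction (0.0 to 1.0) of the device that holds cache_dir."""
    usage = shutil.disk_usage(cache_dir)
    if usage.total == 0:
        return 0.0  # some virtual filesystems report no capacity
    return usage.used / usage.total


def is_cache_device_nearly_full(cache_dir: str, threshold: float = 0.9) -> bool:
    """Return True when the device is at or above the threshold.

    The default threshold of 0.9 is an arbitrary illustrative value.
    """
    return cache_usage_fraction(cache_dir) >= threshold
```

    Running a check like this periodically against the configured cache.directory gives early warning before viewing clients start to see errors.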

    Example for Quicker Cache Cleanup

    viewing.sessionLifetime: 15m
    viewing.cacheLifetime: 20m
    
    

    The above settings set the viewing session timeout to 15 minutes and the life expectancy of any cached file to 20 minutes. After approximately 35 to 45 minutes, the cached files for a given document will be deleted. The exact time of cleanup can vary based on the scheduled nature of the cleanup processes and current load on the server.
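    The precedence rule and the cleanup estimate in this scenario can be checked with a short calculation. The following is a minimal sketch: the function names are hypothetical, the parser accepts only the m, h, and d suffixes that appear in this topic's examples, and the "session lifetime plus cache lifetime" lower bound is inferred from those examples rather than taken from a documented formula.

```python
import re
from datetime import timedelta

# Only the unit suffixes shown in this topic's examples are handled.
_UNITS = {"m": "minutes", "h": "hours", "d": "days"}


def parse_duration(text: str) -> timedelta:
    """Parse a value such as '15m', '1h', or '1d' into a timedelta."""
    match = re.fullmatch(r"(\d+)([mhd])", text.strip())
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    value, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(value)})


def effective_cache_lifetime(session_lifetime: str, cache_lifetime: str) -> timedelta:
    """Apply the rule that the cache lifetime is never shorter than the session lifetime."""
    return max(parse_duration(session_lifetime), parse_duration(cache_lifetime))


def earliest_cleanup(session_lifetime: str, cache_lifetime: str) -> timedelta:
    """Estimate the minimum delay before cleanup (an inference, not a guarantee)."""
    session = parse_duration(session_lifetime)
    return session + effective_cache_lifetime(session_lifetime, cache_lifetime)
```

    With the values above, earliest_cleanup("15m", "20m") gives 35 minutes, the lower end of the stated range. Setting viewing.cacheLifetime to 10m with a 15m session would yield an effective cache lifetime of 15 minutes, because the session lifetime takes precedence.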

    Scenario 3:

    Your application views a lot of large documents and users are not able to read them in time before they get a viewing session timeout error.

    Solution:

    The default setting in the central configuration file for viewing.sessionLifetime is 20 minutes. It can be increased to a larger value but that means PrizmDoc Server will have more resources to track at any given moment which could affect performance and host server capacity.

    Example of Longer Viewing Session Duration

    viewing.sessionLifetime: 1h
    viewing.cacheLifetime: 1d
    
    

    The above settings give users an hour to read a given document. Cache resources for the document will be removed 25+ hours later. As above, the exact cleanup time varies with the scheduled nature of the cleanup processes and the current load on the server.

    Scenario 4:

    The documents served are fairly random and not typically shared with others.

    Or:

    The image is watermarked uniquely for each Viewer and should not be shared.

    Solution:

    In this scenario, the cached resources are unlikely to be needed by anyone except the initial user. A property in the JSON object that the application posts when requesting a new viewing session from PrizmDoc Server can be used to disable caching on a per-viewing-session basis. Set the property, serverCaching, explicitly to the string value none when the application sends the POST request for a new viewing session ID. Each document uploaded to PrizmDoc Server is then converted without PrizmDoc Server looking for an existing copy of the document. After the viewing session times out, the cached items for the document are removed on a predetermined schedule, which should be fairly quick because no other viewing sessions are using the data. For example:

    Example

    POST /ViewingSession
    {
    ...
        "serverCaching": "none",
    ...
    }
    
    

    After the viewing session timeout, the cache items should be removed fairly soon.
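    The request in the example above could be built from application code as follows. This is a minimal sketch using only the Python standard library: the function name and the base URL are hypothetical, and the other fields of the JSON object, which the example elides, are passed in by the caller rather than guessed here.

```python
import json
import urllib.request


def build_viewing_session_request(base_url: str, other_fields: dict) -> urllib.request.Request:
    """Build (but do not send) a POST /ViewingSession request with caching disabled.

    other_fields holds whatever else your application already posts;
    this sketch does not assume any particular field names.
    """
    body = dict(other_fields)
    body["serverCaching"] = "none"  # disable caching for this viewing session only
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/ViewingSession",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical host name, for illustration only.
request = build_viewing_session_request("http://prizmdoc.example", {})
```

    Passing the resulting object to urllib.request.urlopen would send it. Building the request separately keeps the serverCaching override in one place, so it can be applied only to documents that are watermarked per viewer or unlikely to be shared.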

    Summary

    The PrizmDoc Server cache provides a mechanism to deliver document content in a timely manner. However, each application is different and may tax server resources differently or have more demanding requirements. Balancing resource constraints against user experience can be a difficult task that may require compromises. Faster hardware, specifically high-speed storage devices, coupled with an understanding of the options for adjusting how the PrizmDoc Server cache behaves, should allow you to reach a desired level of performance while maintaining a good user experience.