VirtualViewer uses a memory cache (EhCache v3.3.1) to reduce document retrieval time. When VirtualViewer retrieves a document from the content handler, the document is inserted into the cache to speed up subsequent retrieval.
What is cached?
VirtualViewer’s most-used cache is the document cache. This holds two different types of objects. The first is a wrapper for document data. The second is a highly complex object representing the layout of certain document formats, including Office formats. Each VirtualViewer instance uses its own cache. To stay synchronized among instances, VirtualViewer removes a document from the cache when it is modified and saved.
VirtualViewer also uses two other memory caches, though to a lesser degree:
- VirtualViewer will cache OCR data: once OCR is complete, it returns a PDF defining positional text data, and this PDF is cached to avoid redoing OCR in the same session.
- VirtualViewer maintains a cache called the validation cache; if the content handler allows or disallows use of the document cache for a certain document, that response will be stored in the validation cache.
The memory cache is configured in the file WEB-INF/ehcache.xml. Three caches are configured in ehcache.xml. The first is the main cache, labelled “vvDocumentCache”; if VirtualViewer documentation mentions a cache, it is referring to this main document cache. The second is the OCR cache, labelled “vvOcrCache.” This caches OCR data, so OCR does not have to be repeated in a session. Finally, “vvValidationCache” caches responses from the content handler about whether the document cache may be used for a document.
Each cache is described by an alias attribute in the main cache tag and by two tags, one of which is value-type. None of these values may be changed.
The expiry tag and the resources tag may be modified. By default, the document cache will remove entries that haven’t been used in 60 minutes, and the validation cache will remove entries after 5 minutes; the OCR cache will keep entries without an expiry time.
Within the resources tag, the heap tag configures the maximum size of the cache. In each cache, the heap is described in the unit “entries.” This means that the cache limits how much it can store based on the count of entries rather than their size. While it is possible to set the unit attribute to a memory unit like MB or GB, this is not recommended.
Using a unit other than “entries” will cause Ehcache to work out how large each entry is by walking that entry's entire object tree when it is inserted into the cache. This significantly decreases performance and increases memory usage.
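To illustrate what limiting a cache by entry count means, the following minimal Java sketch bounds a map by the number of entries. The limit needs only a size check on insert and never inspects the stored objects. This is not Ehcache's implementation, and Ehcache's own eviction strategy may differ; the class name EntryBoundedCache and the least-recently-used policy are assumptions made for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Hypothetical illustration of a cache limited by entry count.
 * Not part of VirtualViewer or Ehcache.
 */
final class EntryBoundedCache<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;

    private final int maxEntries;

    EntryBoundedCache(int maxEntries) {
        // accessOrder = true keeps the least recently used entry first.
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Only the entry count is checked; entry sizes are never measured.
        return size() > maxEntries;
    }
}
```

With a limit of 2, inserting a third entry evicts whichever of the first two was used least recently.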
Additional configuration can be found in
enableDocumentCache takes a boolean. If this is set to false, the document cache will not be used. It is highly recommended to leave the document cache enabled; disabling the cache will cause significant performance degradation. The document cache should never be disabled if users are viewing document formats that use SnowDoc, like Microsoft Office formats. SnowDoc formats require the document cache for performance optimization. For other format types, however, the document cache could be disabled in favor of another cache solution implemented in the content handler.
clearCacheOnSave also takes a boolean. If this is set to true, when a user saves a document, the document will be removed from the cache. The document will then be re-requested from the content handler if it needs to be displayed again. This allows the content handler to implement synchronization of user sessions. It is recommended to keep this item set to true.
Server & Client API
Aside from configuration, there are several ways to control cache behavior dynamically.
On the client, the API functions virtualViewer.seedCache(documentId, pages, clientInstanceId) and virtualViewer.removeDocumentFromCache(documentId, clientInstanceId) will respectively add and remove documents from the document cache:
- seedCache will retrieve a document from the content handler and add it to the cache. For SnowDoc documents, this may also initiate page layout operations. This function takes three parameters. The documentId parameter is the document to be added to the cache and is mandatory. The pages parameter is optional and only affects Sparse Documents; it holds an array of page numbers to add to the cache. Finally, the clientInstanceId parameter is optional and directly passes a clientInstanceId, a piece of data that is passed all the way to the content handler.
- removeDocumentFromCache will manually remove a document from the cache. It takes two parameters: the mandatory ID of the document to remove, and the optional clientInstanceId.
On the server, implementing the content handler interface CacheValidator allows fine-grained control over which documents are allowed to enter the cache. The interface defines one function, validateCache, which is called before each document is stored in or retrieved from the VirtualViewer document cache. It can confirm the operation or prevent it on a document-by-document basis.
The response for each document and operation is cached for a short time in VirtualViewer so that it does not ask about the same operation multiple times in quick succession. In other words, if a specific document is prevented from being cached, VirtualViewer will not ask again for a few minutes and the document will remain uncached for that time. To modify how long a response from validateCache will be remembered, configure the expiry time of the validation cache in ehcache.xml.
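The remember-then-ask-again behavior described above can be pictured as a small time-limited lookup. This is only a sketch under stated assumptions, not VirtualViewer's code: ExpiringDecisionCache, its key format, and the injected clock are hypothetical, and the five-minute lifetime in the usage note mirrors the default validation cache expiry mentioned earlier.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongSupplier;
import java.util.function.Predicate;

/** Hypothetical time-limited store for allow/deny answers. */
final class ExpiringDecisionCache {
    private static final class Entry {
        final boolean allowed;
        final long expiresAtMillis;

        Entry(boolean allowed, long expiresAtMillis) {
            this.allowed = allowed;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Entry> entries = new HashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock; // injected so the sketch can be tested without waiting

    ExpiringDecisionCache(long ttlMillis, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    /**
     * Returns the remembered answer for the key while it is still fresh;
     * otherwise asks the validator again and remembers the new answer.
     */
    boolean isAllowed(String key, Predicate<String> validator) {
        long now = clock.getAsLong();
        Entry cached = entries.get(key);
        if (cached != null && now < cached.expiresAtMillis) {
            return cached.allowed;
        }
        boolean allowed = validator.test(key);
        entries.put(key, new Entry(allowed, now + ttlMillis));
        return allowed;
    }
}
```

Constructed with a lifetime of 300,000 milliseconds, the sketch consults the validator once per key and then reuses that answer until five minutes have passed, which is why a denied document stays uncached for that long.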
Like all content handler API functions, validateCache takes a ContentHandlerInput object and returns a ContentHandlerResult object. The ContentHandlerInput object contains the following values:
- The key KEY_CACHE_ACTION gets a value representing either a GET or a PUT (for example, ContentHandlerInput.VALUE_CACHE_PUT), the action to be confirmed for the specified document. GET asks whether the document should be retrieved from the cache, while PUT asks if it should be stored.
- The key KEY_DOCUMENT_ID stores the ID value that represents the document. This can be retrieved with the code:
  String documentId = input.getDocumentId();
- The key KEY_CLIENT_INSTANCE_ID stores a custom configurable value used to pass data from the client to the content handler. If it is not set, it will be the session ID. This can be retrieved with the code:
  String clientInstanceId = input.getClientInstanceId();
- The key KEY_HTTP_SERVLET_REQUEST stores the request that called this method. This can be retrieved with the code:
  HttpServletRequest request = input.getHttpServletRequest();
The returned ContentHandlerResult must contain one value:
- The key KEY_USE_OF_CACHE_ALLOWED must store a boolean value. True allows the operation to continue, and false prevents it. This response will be remembered for a few minutes.
- The key
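The decision a validateCache implementation has to make can be sketched separately from the VirtualViewer classes. The example below is a minimal illustration, not the product's API: SketchCacheAction and SketchCachePolicy are hypothetical stand-ins, and the rules themselves (a restricted ID prefix and a flag that pauses storing) are invented for the example. In a real handler, the action and document ID would be read from the ContentHandlerInput, and the returned boolean would be stored in the ContentHandlerResult under KEY_USE_OF_CACHE_ALLOWED.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical stand-in for the action carried under KEY_CACHE_ACTION. */
enum SketchCacheAction { GET, PUT }

/** Hypothetical policy that a validateCache implementation could delegate to. */
final class SketchCachePolicy {
    private final Set<String> restrictedPrefixes;
    private final boolean storingPaused;

    SketchCachePolicy(Collection<String> restrictedPrefixes, boolean storingPaused) {
        this.restrictedPrefixes = new HashSet<>(restrictedPrefixes);
        this.storingPaused = storingPaused;
    }

    /** Returns the boolean a handler would report as "use of cache allowed". */
    boolean isUseOfCacheAllowed(SketchCacheAction action, String documentId) {
        if (documentId == null || isRestricted(documentId)) {
            return false; // restricted documents are neither stored nor served from the cache
        }
        if (action == SketchCacheAction.PUT && storingPaused) {
            return false; // existing entries may still be read, but nothing new is stored
        }
        return true;
    }

    private boolean isRestricted(String documentId) {
        for (String prefix : restrictedPrefixes) {
            if (documentId.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }
}
```

Because each answer is remembered for a few minutes, a policy like this takes effect with a delay when its rules change.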