PrizmDoc® v14.3 Preview Release - Updated
    Administrator Guide (Self-Hosted)

    Introduction

    This section explains how to deploy and administer your own self-hosted PrizmDoc® backend.

    Architecture Overview

    The PrizmDoc backend consists of two tiers, PAS and PrizmDoc Server:

    • PrizmDoc Server, the technical heart of the product, is a document processing and conversion engine. It is compute intensive but has no permanent storage.

    • PAS is a layer in front of PrizmDoc Server responsible for viewing concerns, such as saving and loading of annotations. As such, PAS requires access to a storage system which you manage (there are several supported options; see more below).

    An application can send REST API requests to both of these tiers:

    • For viewing, an application will send HTTP requests to PAS (and PAS will, in turn, send requests to PrizmDoc Server).

    • For document processing (like file conversion or redaction), an application will send HTTP requests directly to PrizmDoc Server.
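
    The split between the two tiers can be sketched as a small routing helper. This is an illustration only: the host names and task labels below are hypothetical placeholders and are not part of the PrizmDoc API.

```python
# Hypothetical base URLs; substitute the addresses of your own deployment.
PAS_BASE_URL = "http://pas.example.internal"
PRIZMDOC_SERVER_BASE_URL = "http://prizmdoc-server.example.internal"

# Illustrative task labels (assumptions), not PrizmDoc API identifiers.
VIEWING_TASKS = {"viewing", "annotations"}
PROCESSING_TASKS = {"conversion", "redaction"}


def base_url_for(task: str) -> str:
    """Return the base URL of the tier an application should call for a task."""
    if task in VIEWING_TASKS:
        # Viewing concerns go to PAS, which in turn calls PrizmDoc Server.
        return PAS_BASE_URL
    if task in PROCESSING_TASKS:
        # Document processing goes directly to PrizmDoc Server.
        return PRIZMDOC_SERVER_BASE_URL
    raise ValueError(f"unknown task: {task!r}")
```

    In a load-balanced deployment, each base URL would point at the load balancer in front of the corresponding tier rather than at an individual instance.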

    Database Requirements

    This section gives an overview of the database requirements for PrizmDoc Server and PAS, organized by release version:

    • PrizmDoc Release v14.0 (and previous versions)
    • PrizmDoc Release v14.1 (and later versions) - This includes the database requirements from v14.0 (and previous versions), plus the new requirements listed below.

    Release v14.0 & Previous Versions

    For release v14.0 and previous versions, a database is required for the PrizmDoc Server and PAS functions noted below:

    • PrizmDoc Server

      • Work file sharing requires a database for storing file metadata and S3 storage for storing the files themselves. Shared work file storage, if enabled by workFiles.sharedStorage.type, may use up to 1 KB of database storage per work file. The number of work files kept by PrizmDoc can be roughly estimated as the number of documents uploaded by users per day, multiplied by workFiles.lifetime days.
      • Online and Offline Metered licensing (MySQL only; this may use up to 100 KB of storage).
    • PAS

      • V2 viewing packages require a database for storing file metadata (MySQL or SQL Server). File content can be stored on disk, in S3, or in Azure Blob Storage.
      • Shared document and markup cache file metadata can be stored either locally or in the database (MySQL or SQL Server). File content can be stored on disk, in S3, or in Azure Blob Storage.
      • Online and Offline Metered licensing (MySQL or SQL Server).
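
    The work file estimate for PrizmDoc Server can be worked through as simple arithmetic. The sketch below assumes the 1 KB upper bound stated above; the function name and the example inputs are hypothetical.

```python
def estimate_work_file_storage_kb(
    docs_per_day: int,
    lifetime_days: int,
    kb_per_work_file: float = 1.0,
) -> float:
    """Upper-bound estimate of database storage used by shared work files.

    work files kept ~= documents uploaded per day * workFiles.lifetime (days)
    storage         ~= work files kept * up to 1 KB each
    """
    work_files_kept = docs_per_day * lifetime_days
    return work_files_kept * kb_per_work_file


# Made-up inputs: 10,000 uploads per day, kept for 2 days.
upper_bound_kb = estimate_work_file_storage_kb(10_000, 2)  # 20000.0 KB
```

    With those made-up inputs the metadata stays at or below roughly 20 MB, which suggests that shared work file metadata is a small part of the overall database footprint.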

    Release v14.1 & Later

    This section uses FAQs to explain in detail why a database is now required by default. Note that release v14.1 (and later) retains all of the database requirements from v14.0 (and previous). The new requirements do not replace the previous ones.

    New Requirement

    • What is the new requirement in v14.1? In v14.1, PrizmDoc requires a MySQL database by default.
    • Why was the database requirement introduced with v14.1 and not v14.0? With v14.0, we focused on delivering improved Microsoft Office rendering, modular licensing, and security improvements. We wanted to announce future breaking changes in the v14.0 release to allow our customers ample time to prepare for a smooth transition.
    • Are there plans to support other SQL Servers in the future? PAS supports Microsoft’s SQL Server. Currently, PrizmDoc Server does not, but that may change in a future release.
    • Do you foresee any issue with the MySQL database existing in the AWS Aurora service? No, there shouldn’t be any issue using Aurora since it was designed to support MySQL.

    Benefits

    • What are the benefits of the new database requirement? Currently, you can use Hybrid Viewing to view documents faster and with less storage space. For more details, refer to Get Started with Hybrid Viewing. For the future, we want to increase reliability, security, performance, scalability, and ease of use:
      • Eliminate server affinity by default. Removing the need for server affinity simplifies the API and allows PrizmDoc to more effectively balance the load between servers in a cluster. Currently, if a server in a cluster does some work on a file, only that server has access to that work. Anything else involving that file has to be sent to that specific server. If we can store that work in the shared database instead, then any server in the cluster can do more work on that file, and the API user doesn't need to pass that affinity token everywhere to say "only this server can work on this file".
      • Make viewing packages a default part of all viewing activities, to allow a seamless upgrade to new product versions that introduce significant rendering improvements that otherwise would cause annotation misalignment issues (v14 - Office and HTML).
      • Make it easier to add other enterprise solutions that may require database use.
    • What are the security benefits for moving to v14.x that will not exist in future releases of v13.28? The v14.0 release contains a substantial update to LibreOffice that addresses a number of security issues. These issues will not be addressed in v13.28, because the updates would result in breaking changes to fidelity. Similarly, the updates to HTML rendering made in v14.1 will not be applied to v13.28 due to breaking changes.

    Size & Storage

    • How many databases are required? Only a single database is required in most cases, although it is possible to run different databases for PrizmDoc Server and PAS. If you have a hybrid cluster (for example, multiple instances of PrizmDoc on both Windows and Linux), we recommend the Windows and non-Windows instances use different databases.
    • What size database do I need and how much storage is required? For both PrizmDoc Server and PAS, a MySQL database running on a system with 2-4 cores and 8-16 GB of RAM will meet most customers' needs. Storage will depend on licensing type, the use of shared work file storage, and the complexity of documents when creating Viewing Packages. For example, assuming an average of 20 pages per document, expect roughly 200 KB of database storage per legacy viewing package and roughly 5 KB per PDF-only viewing package. We recommend testing with a representative set of customer documents to assess the storage required.
    • What is stored in the database?
      • PrizmDoc Server - Temporarily stores information on shared work files and metered licensing usage information:
        • Work file sharing requires a database for storing file metadata and S3 storage for storing the files themselves. Shared work file storage, if enabled by workFiles.sharedStorage.type, may use up to 1 KB of database storage per work file. The number of work files kept by PrizmDoc can be roughly estimated as the number of documents uploaded by users per day, multiplied by workFiles.lifetime days.
        • Online and Offline Metered licensing (MySQL only; this may use up to 100 KB of storage).
      • PAS - Metered licensing usage information, Annotations, and Viewing Package metadata if viewing packages are used:
        • V2 viewing packages require a database for storing file metadata (MySQL or SQL Server). File content can be stored on disk, in S3, or in Azure Blob Storage.
        • Shared documents and markup are stored only as files, on a file system, in S3, or in Azure Blob Storage. Refer to Storage Entities & Storage Providers for more information.
        • Online and Offline Metered licensing (MySQL or SQL Server).
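
    The per-package figures quoted in this section (roughly 200 KB per legacy viewing package and 5 KB per PDF-only viewing package, at an average of 20 pages per document) can be turned into a rough sizing calculation. This is a back-of-the-envelope sketch; the function name and example package counts are hypothetical, and real usage should be measured with representative documents.

```python
# Rough per-package database storage, assuming ~20 pages per document.
LEGACY_KB_PER_PACKAGE = 200
PDF_ONLY_KB_PER_PACKAGE = 5


def estimate_viewing_package_db_kb(legacy_packages: int, pdf_only_packages: int) -> int:
    """Approximate database storage (in KB) consumed by viewing package metadata."""
    return (
        legacy_packages * LEGACY_KB_PER_PACKAGE
        + pdf_only_packages * PDF_ONLY_KB_PER_PACKAGE
    )


# Made-up inputs: the same 50,000 documents stored as each package type.
legacy_kb = estimate_viewing_package_db_kb(50_000, 0)    # 10_000_000 KB, about 10 GB
pdf_only_kb = estimate_viewing_package_db_kb(0, 50_000)  # 250_000 KB, about 250 MB
```

    The forty-fold difference between the two results reflects the 200 KB versus 5 KB per-package figures.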

    Usage

    • Do we need to create the database in MySQL by ourselves or will it be created while installing PrizmDoc Server or PAS? You need to provide three things: an empty database, a user with permissions for that database, and a running database server. No setup scripts need to be run; PrizmDoc Server and PAS will set up the database contents automatically. For more information on the permissions required, refer to Configure the Central Database for PrizmDoc Server and Database Admin & Maintenance for PAS.
    • Where do we specify the connection string for the MySQL database? The connection string goes in the central configuration file of each product: prizm-services-config.yml for PrizmDoc Server and pas\pcc.win.yml for PAS.
      • If you're installing on Windows, the PrizmDoc Server and PAS wizards will prompt you for the connection string, so you don't need to modify the configuration manually.
      • If you are installing on Linux, we recommend you use a separate Docker container with the database (PrizmDoc Server - Docker image and PAS - Docker image). You will need to create a user with the necessary permissions so that PrizmDoc can use it to access the database over the network. You will also need to modify the PrizmDoc Server and PAS central configuration files. For Kubernetes, our hello-prizmdoc-viewer-with-kubernetes sample provides an example manifest for the database. Note that you can also use a database service provided by a cloud provider, such as AWS Aurora.
    • Do we need to consume or update anything in the database while using PrizmDoc Server? No, you don’t need to manually update or use anything in the database. It's for internal PrizmDoc use only.
    • Will there be any changes on the API side or will it be the same as how we used them in the previous versions? The API has not changed due to the database requirement. The API may change in the future for improvements like removing affinity tokens.
    • Are there any viewer side changes needed to support MySQL in PrizmDoc v14.1+? PAS also now requires a database, but apart from that there are no client or viewer changes due to requiring a database.
    • For the PAS database setup, a script is provided that advanced users can review to better understand the use case. Is there a similar script our database administrator could review for the Central Database? PAS requires setting up database tables in advance, so we are shipping those scripts for customers to run. PrizmDoc Server sets up all database tables automatically, so we are not shipping database setup scripts for it. Once PrizmDoc Server starts up, a database administrator can inspect the tables it created.
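
    The "empty database plus a user with permissions" prerequisite could be prepared with a few standard MySQL statements. The sketch below only generates those statements as text. The database and user names are hypothetical, and GRANT ALL PRIVILEGES is an assumption made for brevity: the exact permissions required are listed in the configuration topics referenced above and should take precedence.

```python
import re
from typing import List

# Conservative patterns so that names can be embedded in SQL text safely.
_NAME = re.compile(r"^[A-Za-z0-9_]+$")
_HOST = re.compile(r"^[A-Za-z0-9_.%-]+$")


def empty_database_sql(database: str, user: str, host: str = "%") -> List[str]:
    """Build MySQL statements that create an empty database and a user for it."""
    if not _NAME.match(database) or not _NAME.match(user):
        raise ValueError("database and user may only contain letters, digits and '_'")
    if not _HOST.match(host):
        raise ValueError("unexpected characters in host")
    return [
        f"CREATE DATABASE IF NOT EXISTS `{database}`;",
        # '<password>' is a placeholder for the administrator to fill in.
        f"CREATE USER IF NOT EXISTS '{user}'@'{host}' IDENTIFIED BY '<password>';",
        # Assumption: broad privileges on this one database only.
        f"GRANT ALL PRIVILEGES ON `{database}`.* TO '{user}'@'{host}';",
    ]


for statement in empty_database_sql("prizmdoc", "prizmdoc_user"):
    print(statement)
```

    No tables are created here, which matches the statement above that PrizmDoc Server sets up its own tables on startup.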

    Load Balancing and Cluster Management

    It is common in a production deployment to run multiple instances of PAS and PrizmDoc Server. The PAS and PrizmDoc Server tiers are conceptually independent, and each should be fronted by a load balancer of your choice.

    Application developers needing to make PAS or PrizmDoc Server REST API calls should only send their requests to the load balancer sitting in front of the tier they are calling.

    For PAS, any incoming request to the PAS tier can be routed to any PAS instance. You can use any off-the-shelf load balancer in front of your PAS instances (there is no need for anything like sticky sessions). Any PAS instance is capable of directly handling any request.

    For PrizmDoc Server, the process is similar. Any incoming request to the PrizmDoc Server tier can be sent to any PrizmDoc Server instance. You can use any off-the-shelf load balancer in front of your PrizmDoc Server instances (there is no need for anything like sticky sessions). However, this is not because any PrizmDoc Server is capable of directly handling a request. Rather, this is because PrizmDoc Server instances are capable of routing any incoming request to the correct PrizmDoc Server instance in the cluster (the instance where document processing is actually occurring).

    In order for PrizmDoc Server's automatic routing to work correctly, each PrizmDoc Server instance MUST have an up-to-date, accurate list of all of the instances in the PrizmDoc Server cluster. Each time you add instances to or remove instances from your cluster, you must inform each instance that the server list has changed (you can do this with a REST API call to each instance). For more information about this, see Cluster Server Environments > PrizmDoc Server.
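
    The "inform each instance" step amounts to sending the same updated server list to every instance in the cluster. The sketch below builds one request per instance without sending anything. The request path, the PUT method, and the JSON payload shape are all hypothetical placeholders; the real endpoint and body format are documented in Cluster Server Environments > PrizmDoc Server.

```python
import json
import urllib.request
from typing import List


def build_server_list_requests(
    instance_urls: List[str],
    update_path: str,
) -> List[urllib.request.Request]:
    """Build one request per instance, each carrying the full cluster list."""
    # Hypothetical payload shape: every instance receives the complete list.
    payload = json.dumps({"servers": instance_urls}).encode("utf-8")
    return [
        urllib.request.Request(
            url=base.rstrip("/") + update_path,
            data=payload,
            method="PUT",  # assumption; check the cluster documentation
            headers={"Content-Type": "application/json"},
        )
        for base in instance_urls
    ]


# Hypothetical instance addresses and path, for illustration only.
cluster = ["http://10.0.0.1:18681", "http://10.0.0.2:18681", "http://10.0.0.3:18681"]
requests_to_send = build_server_list_requests(cluster, "/hypothetical/server-list")
# Each request could then be sent with urllib.request.urlopen(request).
```

    Building the list once and sending it to every instance, including newly added ones, keeps all instances in agreement about cluster membership.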

    Supported Storage Options for PAS

    Your PAS instances will need to be configured to use some sort of shared storage. PAS supports several options for this:

    • File system (such as network attached storage)
    • Microsoft SQL Server
    • MySQL
    • Amazon S3
    • Azure Blob Storage (beta)

    For more information, see PAS Configuration.

    Licensing

    Self-hosted licensing is based upon your use of PrizmDoc Server. Each PrizmDoc Server instance you run must be configured with your license key (PAS instances do not require a license key to run).

    For your convenience, PrizmDoc Server automatically runs in evaluation mode when no license is configured: images may be displayed with a watermark, and dialogs may occasionally appear reminding you that PrizmDoc is in evaluation mode with a fixed feature set. When you are ready to deploy to production, a license will be required to run PrizmDoc Server in your production environment. Please contact info@accusoft.com to obtain a deployment license. For more information, see Deployment Licensing.

    Deployment Options

    For both PAS and PrizmDoc Server, we offer ready-to-run Docker images as well as Windows installers.

    PAS

    For PAS, deploying is easiest with Docker, and we recommend you use the PAS Docker image.

    Deployment Option    PAS Pre-Installed
    Docker               Yes
    Windows Installer    No

    PrizmDoc Server

    For PrizmDoc Server, the best deployment option largely depends on whether you need to render with Microsoft Office. If you do, then your only option is to deploy to a Windows machine with our Windows installer. However, if you plan to use PrizmDoc Server's built-in LibreOffice rendering, you can use any of our deployment options, and we recommend the PrizmDoc Server Docker image.

    Deployment Option    PrizmDoc Server Pre-Installed    Office Rendering
    Docker               Yes                              LibreOffice
    Windows Installer    No                               LibreOffice or Microsoft Office
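
    The decision described in this section reduces to a single question about Microsoft Office rendering. As a minimal sketch, with a hypothetical function name:

```python
from typing import List


def prizmdoc_server_deployment_options(needs_microsoft_office: bool) -> List[str]:
    """List the PrizmDoc Server deployment options that fit the rendering need."""
    if needs_microsoft_office:
        # Microsoft Office rendering is only available via the Windows installer.
        return ["Windows Installer"]
    # Built-in LibreOffice rendering works with every deployment option.
    return ["Docker", "Windows Installer"]
```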

    Minimal Backend Quick Start

    If you just want to quickly set up a single instance of PAS and PrizmDoc Server running on a single machine (perhaps for evaluation), check out our Minimal Backend Quick Start.