Introduction

Workers are designed to scale out well. By default, multiple workers can be connected to an instance of the Process Manager, which will automatically distribute work to idle worker instances without the need for additional load balancing.

Instantiate as many Worker Docker containers as needed.
Ensure that each instance is configured to connect to a shared Process Manager instance.
Workers rely on PrizmDoc Server for workfile retrieval and processing. Ensure that all Workers connected to a given Process Manager instance are also connected to the same PrizmDoc Server instance.
In most cases, each Process Manager instance needs only one Worker at a time, but additional Worker instances can be run if needed, for example to handle high request volume.

Automatic Scaling

When request volume is expected to be high but periodic, an autoscaler can be configured to increase the number of Worker instances automatically in response to operating conditions.

In Kubernetes, the simplest way to do this is to configure a HorizontalPodAutoscaler that responds to CPU utilization, but this approach has downsides. Each request to a Worker incurs a significant processing burden, usually driving CPU utilization to 100% while the image is being scanned for recognizable text. Because a single request is enough to saturate a Worker, an autoscaler that responds to CPU utilization will likely bring additional Workers online even when no requests are waiting, and those Workers will then sit idle.
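The over-provisioning described above can be illustrated with the replica calculation that the Kubernetes HorizontalPodAutoscaler documents: desired replicas = ceil(current replicas × current metric value / target metric value). The following is a minimal sketch of that arithmetic only. The function name and the 50% target are assumptions chosen for illustration, not recommended settings.

```python
import math


def hpa_desired_replicas(current_replicas: int,
                         current_utilization: float,
                         target_utilization: float) -> int:
    """Replica count from the documented HPA rule:
    ceil(currentReplicas * currentMetricValue / desiredMetricValue)."""
    return math.ceil(current_replicas * current_utilization / target_utilization)


# One Worker handling a single request at 100% CPU, with nothing queued.
# The 50% target is an assumed value for illustration.
print(hpa_desired_replicas(1, 100.0, 50.0))  # 2
```

With these assumed numbers, a second Worker is requested even though no request is waiting for it.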

A more efficient but more complicated option is to base the autoscaler on the process_manager_processes_queued metric, which the Process Manager provides when the GENERATE_OPTIONAL_METRICS environment variable is passed to it on startup. By querying this metric for the number of pending requests, as in the following example queries, Worker instances can be started only when there are surplus requests waiting for processing:

OCR Metrics example:

process_manager_processes_queued{process_type="ocrReaders", state="pending"}

AI Hub Metrics example:

process_manager_processes_queued{process_type="aiTasks", state="pending"}
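As a rough sketch of how a pending count could drive a scale-up decision, the example below reads the metric from Prometheus text-format output and computes how many additional Workers to start. The sample metrics text, the function names, the one-Worker-per-pending-request rule, and the max_workers cap are all assumptions made for illustration. They are not part of the Process Manager or of any autoscaler.

```python
import re

# Hypothetical sample of Prometheus text-format output; the real output of the
# Process Manager may contain other labels and series.
SAMPLE_METRICS = '''\
process_manager_processes_queued{process_type="ocrReaders",state="pending"} 3
process_manager_processes_queued{process_type="ocrReaders",state="running"} 1
'''


def pending_count(metrics_text: str, process_type: str) -> float:
    """Return the pending value for one process_type, or 0.0 if no series matches."""
    series = re.compile(
        r'^process_manager_processes_queued\{([^}]*)\}\s+(\S+)\s*$', re.MULTILINE
    )
    for labels, value in series.findall(metrics_text):
        pairs = dict(re.findall(r'(\w+)="([^"]*)"', labels))
        if pairs.get("process_type") == process_type and pairs.get("state") == "pending":
            return float(value)
    return 0.0


def workers_to_start(pending: float, current_workers: int, max_workers: int) -> int:
    """Start one Worker per pending request (assumed rule), capped at max_workers."""
    headroom = max(0, max_workers - current_workers)
    return min(int(pending), headroom)


pending = pending_count(SAMPLE_METRICS, "ocrReaders")
print(workers_to_start(pending, current_workers=1, max_workers=3))  # 2
```

Because the decision depends only on queued work, a single busy Worker with an empty queue does not trigger a scale-up in this sketch.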