PrizmDoc v13.3 - Updated
HTML5 Viewing

Available URLs

URL Description
GET /PCCIS/V1/Document/q/Attributes Gets a page count for the source document of a viewing session.
GET /PCCIS/V1/Page/q/{PageNumber}/Attributes Gets metadata for a page of the source document of a viewing session.
GET /PCCIS/V1/Page/q/{PageNumber} Gets SVG or an image for a page of the source document of a viewing session.
GET /PCCIS/V1/Page/q/{PageNumber}/Tile/{X}/{Y}/{Width}/{Height} Gets a "tile" image, a part of a page, for a page of the source document of a viewing session.
GET /PCCIS/V1/Page/q/{PageNumber}/{Width}x{Height} Gets a thumbnail image for a page of the source document of a viewing session.
GET /PCCIS/V1/Document/q/{PageNumberBegin}-{PageNumberEnd}/Text Gets currently-available text and text metadata for a range of pages for the source document of a viewing session.
GET /v2/viewingSessions/{viewingSessionId}/revisionData Gets objects which describe known changes between the two documents used as input to a comparison viewing session.

GET /PCCIS/V1/Document/q/Attributes?DocumentID=[e,u]{ViewingSessionId}&DesiredPageCountConfidence={DesiredPageCountConfidence}

Gets a page count for the source document of a viewing session.

Request

Query String Parameters

Parameter Description
DocumentID Required. The viewingSessionId which identifies the viewing session, prefixed with u if in unencoded plaintext form or prefixed with e if base-64 encoded.
DesiredPageCountConfidence An integer from 0 to 100 inclusive which specifies the minimum required confidence in the page count before a value is returned. A value of 50 or lower is more likely to result in an estimated page count being returned. Default is 100, requiring that the actual page count be returned.
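
As a minimal sketch of the two DocumentID forms described above, the hypothetical helper below builds the query string value from a viewingSessionId. The function name is illustrative only. It assumes the standard base-64 alphabet over the UTF-8 bytes of the id for the e form; this page does not say which base-64 variant the server expects, so treat that detail as an assumption.

```python
import base64


def document_id(viewing_session_id: str, encoded: bool = False) -> str:
    """Build a DocumentID query string value (hypothetical helper).

    "u" + id when sent as plaintext, "e" + base-64 of the id otherwise.
    Assumption: standard base-64 alphabet over the UTF-8 bytes of the id.
    """
    if not encoded:
        return "u" + viewing_session_id
    raw = viewing_session_id.encode("utf-8")
    return "e" + base64.b64encode(raw).decode("ascii")


print(document_id("XYZ"))                # plaintext form
print(document_id("XYZ", encoded=True))  # base-64 form
```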

Successful Response

JSON metadata about the document page count.

Example

GET prizmdoc_server_base_url/PCCIS/V1/Document/q/Attributes?DocumentID=uXYZ...
HTTP/1.1 200 OK
Content-Type: application/json;charset=utf-8

{
    "pageCount": 3,
    "pageCountConfidence": 100
}
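
A client usually needs to know whether the returned count is final. The sketch below reads the JSON body shown above and reports the count together with a flag. It assumes that a pageCountConfidence of 100 means the actual page count and that any lower value is an estimate; the function name is hypothetical.

```python
import json


def read_page_count(body: str) -> tuple[int, bool]:
    """Return (page_count, is_exact) from the Attributes response body.

    Assumption: a pageCountConfidence of 100 means the count is the actual
    page count; anything lower is treated as an estimate.
    """
    data = json.loads(body)
    return data["pageCount"], data["pageCountConfidence"] >= 100


body = '{"pageCount": 3, "pageCountConfidence": 100}'
print(read_page_count(body))
```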

GET /PCCIS/V1/Page/q/{PageNumber}/Attributes?DocumentID=[e,u]{ViewingSessionId}&ContentType={ContentType}

Gets metadata for a page of the source document of a viewing session.

Request

URL Parameters

Parameter Description
{PageNumber} Zero-indexed page number to get information about.

Query String Parameters

Parameter Description
DocumentID Required. The viewingSessionId which identifies the viewing session, prefixed with u if in unencoded plaintext form or prefixed with e if base-64 encoded.
ContentType Used to indicate whether you want attributes for SVG page content or raster page content. Use svg (or svga or svgb) to get page attributes for SVG content or png to get page attributes for raster content. Default is svg.

Successful Response

JSON metadata about the page.

Examples

Get attributes for the SVG form of page 0. The two most valuable properties are "imageHeight" and "imageWidth", which indicate the height and width of the SVG page in unspecified units. The rest are hard-coded values:

GET prizmdoc_server_base_url/PCCIS/V1/Page/q/0/Attributes?DocumentID=uXYZ...&ContentType=svg
HTTP/1.1 200 OK
Content-Type: application/json;charset=utf-8

{
  "version": "7.1",
  "contentType": "jpeg,png,svg",
  "imageBitDepth": 16,
  "imageHeight": 842,
  "imageWidth": 595,
  "imageXResolution": 90,
  "imageYResolution": 90
}

Get attributes for the raster form of page 0:

GET prizmdoc_server_base_url/PCCIS/V1/Page/q/0/Attributes?DocumentID=uXYZ...&ContentType=png
HTTP/1.1 200 OK
Content-Type: application/json;charset=utf-8

{
  "version": "7.1",
  "contentType": "jpeg,png",
  "imageBitDepth": 8,
  "imageHeight": 1755,
  "imageWidth": 1240,
  "imageXResolution": 150,
  "imageYResolution": 150
}
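
When a request carries more than one query string parameter, only the first is introduced with ? and the rest are joined with &. One way to avoid mistakes is to let a URL library assemble the query string. The sketch below uses Python's standard urllib.parse.urlencode; the helper name and the example base URL are hypothetical.

```python
from urllib.parse import urlencode


def page_attributes_url(base_url: str, page_number: int,
                        document_id: str, content_type: str = "svg") -> str:
    """Build the page Attributes URL (hypothetical helper).

    urlencode joins parameters with "&" and percent-encodes their values,
    so a base-64 DocumentID containing "+", "/" or "=" is transmitted safely.
    """
    query = urlencode({"DocumentID": document_id, "ContentType": content_type})
    return f"{base_url}/PCCIS/V1/Page/q/{page_number}/Attributes?{query}"


# "https://prizmdoc.example.com" stands in for prizmdoc_server_base_url.
print(page_attributes_url("https://prizmdoc.example.com", 0, "uXYZ", "png"))
```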

GET /PCCIS/V1/Page/q/{PageNumber}?DocumentID=[e,u]{ViewingSessionId}&Scale={Scale}&ContentType={ContentType}

Gets SVG or an image for a page of the source document of a viewing session.

Request

Request Headers

Name Description
Accept-Encoding Specify gzip to allow gzip compression of the response. Gzip compression will only be applied to SVG responses (it is not used for PNG and JPEG responses) and it may be skipped if the SVG is small. If a response is compressed it will contain a Content-Encoding: gzip response header. Because all modern browsers support Content-Encoding: gzip responses, we recommend you always provide an Accept-Encoding: gzip request header.

URL Parameters

Parameter Description
{PageNumber} Zero-indexed page number whose content should be returned.

Query String Parameters

Parameter Description
DocumentID Required. The viewingSessionId which identifies the viewing session, prefixed with u if in unencoded plaintext form or prefixed with e if base-64 encoded.
ContentType Type of content to be returned. Default is png. Possible values:
  • svgb - Fully-optimized SVG (uses a Unicode inline font to store glyph definitions). Smallest possible SVG, but may not be compatible with some browsers. Recommended whenever possible.
  • svga - Partially-optimized SVG (uses a non-Unicode inline font to store only the most frequently-occurring glyph definitions). May not be compatible with some browsers. Use only if svgb content is not compatible with the target browser.
  • svg - Unoptimized SVG (no font is used; glyph definitions are expressed as SVG path operations). Broadest compatibility with browsers but typically much larger and slower than svgb and svga. Not recommended. Use only as a fallback if both svgb and svga are not compatible with the target browser.
  • png - PNG image.
  • jpeg - JPEG image.
Scale Scaling factor to apply when returning a PNG or JPEG image. The image will be resized by multiplying its width and height by this value. A value of 1.0 leaves the image unscaled, values less than 1.0 make the image smaller, and values greater than 1.0 make the image larger. For example, a value of 2.0 would return an image whose width and height have been doubled. A value of 0.5 would return an image whose width and height have been halved. Only applies when ContentType is png or jpeg, ignored otherwise. Default is 1.0 (no scaling applied).

Successful Response with Page Content

Response Headers

Name Description
Content-Type The type of content returned. Possible values:
  • image/svg+xml - When the request query string parameter ContentType was svg, svga, or svgb.
  • image/png - When the request query string parameter ContentType was png.
  • image/jpeg - When the request query string parameter ContentType was jpeg.
Content-Encoding May be set to gzip if the request used Accept-Encoding: gzip.
Accusoft-Data-Encrypted Indicates whether or not page content has been encrypted. true when page content is encrypted, false otherwise. See Enabling Content Encryption.

Response Body

SVG, PNG, or JPEG for the requested page.
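
Browsers and most high-level HTTP clients undo Content-Encoding: gzip automatically. If you read raw response bytes yourself, you would need to check the header before using the body. The following is a minimal sketch under that assumption; the function name is hypothetical and the headers are passed in as a plain dictionary.

```python
import gzip


def decode_body(headers: dict[str, str], body: bytes) -> bytes:
    """Return the response body, gunzipping it only when the server says so."""
    # Header names are case-insensitive in HTTP, so normalize before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    if normalized.get("content-encoding", "").strip().lower() == "gzip":
        return gzip.decompress(body)
    return body


svg = b'<svg xmlns="http://www.w3.org/2000/svg"></svg>'
print(decode_body({"Content-Encoding": "gzip"}, gzip.compress(svg)) == svg)
print(decode_body({"Content-Type": "image/png"}, b"raw") == b"raw")
```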

Examples

Get page 0 as SVG

Request:

GET prizmdoc_server_base_url/PCCIS/V1/Page/q/0?DocumentID=uXYZ...&ContentType=svgb

Response:

HTTP/1.1 200 OK
Content-Type: image/svg+xml
Content-Encoding: gzip
Accusoft-Data-Encrypted: false

<svg height="842" style="font-family:qsnvcgduoqekywbefqyyjjhodpw;font-size:12px;" version="1.2" viewBox="0 0 595 842" width="595"
  xmlns="http://www.w3.org/2000/svg"
  xmlns:xlink="http://www.w3.org/1999/xlink">
  ...
</svg>

Get page 0 as a PNG

Request:

GET prizmdoc_server_base_url/PCCIS/V1/Page/q/0?DocumentID=uXYZ...&ContentType=png

Response:

HTTP/1.1 200 OK
Content-Type: image/png
Accusoft-Data-Encrypted: false

<<PNG bytes>>

Response When SVG Content is Not Available

SVG content is typically preferred but not always available. For example, if the source document is raster (such as a TIFF), then only raster page content (PNG and JPEG) will be available. It is common for a client viewer to always try to request SVG content first. Then, once it becomes clear that SVG is not available, the client viewer can fall back to requesting only PNG or JPEG page content.

For this reason, if SVG is requested but is not available, we respond with a successful HTTP 200 but with a JSON body indicating that SVG is not available. Additionally, we include raster page attributes metadata in the JSON so that the viewer does not need to issue an additional request for page attributes.

Example

Request:

GET prizmdoc_server_base_url/PCCIS/V1/Page/q/0?DocumentID=uXYZ...&ContentType=svgb

Response when SVG is not available:

HTTP/1.1 200 OK
Content-Type: application/json;charset=utf-8

{
  "errorCode": "SvgNotAvailable",
  "pageAttributes": {
    "version": "7.1",
    "contentType": "jpeg,png",
    "imageBitDepth": 8,
    "imageHeight": 280,
    "imageWidth": 593,
    "imageXResolution": 72,
    "imageYResolution": 72
  }
}
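
Because the "SVG not available" case also arrives as HTTP 200, a client has to tell the two outcomes apart by Content-Type and body. The sketch below shows one way to do that. The function name and the returned labels are hypothetical, and the body is assumed to have been gunzipped already if it was compressed.

```python
import json


def interpret_page_response(content_type: str, body: bytes):
    """Classify a page response as ("svg", bytes) or ("raster_fallback", attrs).

    Assumes `body` has already been gunzipped if Content-Encoding was gzip.
    """
    media_type = content_type.split(";")[0].strip().lower()
    if media_type == "image/svg+xml":
        return "svg", body
    if media_type == "application/json":
        payload = json.loads(body.decode("utf-8"))
        if payload.get("errorCode") == "SvgNotAvailable":
            # Raster attributes are included, so no extra Attributes
            # request is needed before switching to PNG or JPEG.
            return "raster_fallback", payload["pageAttributes"]
    raise ValueError(f"unexpected response: {content_type}")


fallback = json.dumps({
    "errorCode": "SvgNotAvailable",
    "pageAttributes": {"contentType": "jpeg,png", "imageHeight": 280, "imageWidth": 593},
}).encode("utf-8")
kind, attrs = interpret_page_response("application/json;charset=utf-8", fallback)
print(kind, attrs["imageWidth"], attrs["imageHeight"])
```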

GET /PCCIS/V1/Page/q/{PageNumber}/Tile/{X}/{Y}/{Width}/{Height}?DocumentID=[e,u]{ViewingSessionId}&ContentType={ContentType}&Scale={Scale}

Gets a "tile" image, a part of a page, for a page of the source document of a viewing session.

Request

URL Parameters

Parameter Description
{PageNumber} Zero-indexed page number to extract a tile from.
{X} Where the left edge of the tile should begin, expressed as the number of pixels from the left edge of the page. Must be less than the pixel width of the (scaled) page.
{Y} Where the top of the tile should begin, expressed as the number of pixels from the top edge of the page. Must be less than the pixel height of the (scaled) page.
{Width} Width of the tile in pixels, extended right from {X}. If the right edge of the tile extends beyond the right edge of the page then the tile will be trimmed so that only actual page content is returned. You can safely use a {Width} value that goes beyond the right edge of the page, but note that the actual width of the returned image may be different than what you requested.
{Height} Height of the tile in pixels, extended down from {Y}. If the bottom edge of the tile extends beyond the bottom edge of the page then the tile will be trimmed so that only actual page content is returned. You can safely use a {Height} value that goes beyond the bottom edge of the page, but note that the actual height of the returned image may be different than what you requested.

Query String Parameters

Parameter Description
DocumentID Required. The viewingSessionId which identifies the viewing session, prefixed with u if in unencoded plaintext form or prefixed with e if base-64 encoded.
ContentType Type of image to be returned. Default is png. Possible values:
  • png
  • jpeg - Not recommended if you are requesting multiple tiles to be "stitched" together due to alignment artifacts that will occur at tile boundaries.
Scale Scaling factor to apply to the entire page before cropping to the specified tile region. The full page image will be resized by multiplying its width and height by this value. A value of 1.0 leaves the page unscaled, values less than 1.0 make the page smaller, and values greater than 1.0 make the page larger. Default is 1.0 (no scaling applied).
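
To cover a whole page with tiles, a viewer can walk the scaled page in fixed steps and request one tile per step. The sketch below is illustrative: the function names are hypothetical, the page size is taken from the raster attributes example earlier on this page (1240 x 1755 at Scale 1.0), and the clipped sizes it computes for edge tiles are an assumption based on the trimming described above.

```python
def tile_grid(page_width: int, page_height: int, tile_size: int = 512):
    """Yield (x, y, width, height) for every tile covering a scaled page.

    Edge tiles are clipped here to mirror the trimming described above; the
    sizes are an assumption about what the server returns, not a guarantee.
    """
    for y in range(0, page_height, tile_size):
        for x in range(0, page_width, tile_size):
            yield (x, y,
                   min(tile_size, page_width - x),
                   min(tile_size, page_height - y))


def tile_path(page_number: int, x: int, y: int, tile_size: int = 512) -> str:
    """URL path for one tile; the full tile size is always requested."""
    return f"/PCCIS/V1/Page/q/{page_number}/Tile/{x}/{y}/{tile_size}/{tile_size}"


tiles = list(tile_grid(1240, 1755))
print(len(tiles))                     # 3 columns x 4 rows
print(tiles[-1])                      # bottom-right tile, clipped on both axes
print(tile_path(0, *tiles[-1][:2]))
```

Requesting the full tile size for edge tiles keeps the URL logic uniform; the client only needs to be prepared for a smaller image than it asked for.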

Successful Response

Response Headers

Name Description
Content-Type The type of content returned. Possible values:
  • image/png - When the request query string parameter ContentType was png.
  • image/jpeg - When the request query string parameter ContentType was jpeg.
Accusoft-Data-Encrypted Indicates whether or not page content has been encrypted. true when page content is encrypted, false otherwise. See Enabling Content Encryption.

Response Body

Tile image.

Example

Request a 512x512 PNG tile for page 0 starting at x-position 1024 (from left) and y-position 1536 (from top):

GET prizmdoc_server_base_url/PCCIS/V1/Page/q/0/Tile/1024/1536/512/512?DocumentID=uXYZ...
HTTP/1.1 200 OK
Content-Type: image/png

<<PNG bytes>>

GET /PCCIS/V1/Page/q/{PageNumber}/{Width}x{Height}?DocumentID=[e,u]{ViewingSessionId}&ContentType={ContentType}

Gets a thumbnail image for a page of the source document of a viewing session.

The page will be resized, maintaining aspect ratio, to fit within the {Width} and {Height} specified in the URL.
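
As a worked example of fitting a page within a box while keeping its aspect ratio, the sketch below computes the dimensions you might expect for a thumbnail. The function name is hypothetical. It returns unrounded values because this page does not state how the server rounds, nor whether a page smaller than the box is enlarged.

```python
def fit_within(page_width: float, page_height: float,
               max_width: int, max_height: int) -> tuple[float, float]:
    """Scale a page to fit inside a box while keeping its aspect ratio.

    Returns unrounded dimensions; how the server rounds (and whether it ever
    enlarges a small page) is not stated in the source.
    """
    # The tighter of the two constraints decides the scale factor.
    scale = min(max_width / page_width, max_height / page_height)
    return page_width * scale, page_height * scale


# A 1240 x 1755 portrait page in a 200 x 200 box is limited by its height.
width, height = fit_within(1240, 1755, 200, 200)
print(round(width, 1), round(height, 1))
```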

Request

URL Parameters

Parameter Description
{PageNumber} Zero-indexed page number to create a thumbnail image for.
{Width} Maximum allowed width, in pixels, of the thumbnail. Must be an integer greater than 0.
{Height} Maximum allowed height, in pixels, of the thumbnail. Must be an integer greater than 0.

Query String Parameters

Parameter Description
DocumentID Required. The viewingSessionId which identifies the viewing session, prefixed with u if in unencoded plaintext form or prefixed with e if base-64 encoded.
ContentType Type of image to be returned. Default is png. Possible values:
  • png
  • jpeg

Successful Response

Response Headers

Name Description
Content-Type The type of content returned. Possible values:
  • image/png - When the request query string parameter ContentType was png.
  • image/jpeg - When the request query string parameter ContentType was jpeg.

Response Body

Thumbnail image.

Example

Request a thumbnail image of page 0 that fits within a 200x200 square:

GET prizmdoc_server_base_url/PCCIS/V1/Page/q/0/200x200?DocumentID=uXYZ...
HTTP/1.1 200 OK
Content-Type: image/png

<<PNG bytes>>

GET /PCCIS/V1/Document/q/{PageNumberBegin}-{PageNumberEnd}/Text?DocumentID=[e,u]{ViewingSessionId}

Gets currently-available text and text metadata for a range of pages for the source document of a viewing session.

NOTE: This URL is designed to support our viewer. If you simply want to extract text from a document programmatically, use the Search Contexts API instead, specifically POST /v2/searchContexts and GET /v2/searchContexts/{contextId}/records.

Request

URL Parameters

Parameter Description
{PageNumberBegin} Zero-indexed page number of the first page in the range.
{PageNumberEnd} Zero-indexed page number of the last page in the range.

Query String Parameters

Parameter Description
DocumentID Required. The viewingSessionId which identifies the viewing session, prefixed with u if in unencoded plaintext form or prefixed with e if base-64 encoded.

Request Headers

Name Description
Accept-Encoding Specify gzip to allow gzip compression of the response. Gzip compression may be skipped if the overall response size is small. If a response is compressed it will contain a Content-Encoding: gzip response header. Because all modern browsers support Content-Encoding: gzip responses, we recommend you always provide an Accept-Encoding: gzip request header.

Successful Response

Response Headers

Name Description
Content-Encoding May be set to gzip if the request used Accept-Encoding: gzip.
Accusoft-Data-Encrypted Indicates whether or not page content has been encrypted. true when page content is encrypted, false otherwise. See Enabling Content Encryption.

Response Body

JSON containing page text and text positioning metadata.

Examples

Request text for pages 0 through 9:

GET prizmdoc_server_base_url/PCCIS/V1/Document/q/0-9/Text?DocumentID=uXYZ...

Response when text is not yet available:

HTTP/1.1 200 OK
Content-Type: application/json;charset=utf-8
Accusoft-Data-Encrypted: false

{
  "pages": []
}

Response when text is available (where ... indicates that data has been omitted for brevity):

HTTP/1.1 200 OK
Content-Type: application/json

{
  "pages": [
    {
      "number": 0,
      "text": "the page text",
      "width": 648.00,
      "height": 828.00,
      "rectangles": [
        [
          202.25,
          135.05,
          27.00,
          73.26
        ],
        [
          229.25,
          135.05,
          30.00,
          73.26
        ],
        ...
      ],
      "markup": [
        {
          "changeType": "Add",
          "markType": "DocumentHyperlink",
          "properties": {
            "rectangle": {
              "height": 14.71,
              "width": 86.20,
              "y": 73.50,
              "x": 71.31
            },
            "borderHorizontalRadius": 0.0,
            "borderVerticalRadius": 0.0,
            "borderThickness": 0.0,
            "href": "http://www.google.com/",
            "borderOpacity": 255
          }
        },
        ...
      ]
    },
    ...
  ]
}
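
Since only currently-available text is returned, a client that wants text for a whole range may need to ask again for the pages it has not yet received. The sketch below works out which pages are still outstanding. It assumes that pages whose text is not ready are simply absent from the "pages" array, as in the empty response above; the function name is hypothetical.

```python
def missing_pages(begin: int, end: int, response: dict) -> list[int]:
    """Page numbers in [begin, end] that the Text response did not include.

    Assumption: a page whose text is not yet available is omitted from
    "pages" rather than returned with empty text.
    """
    received = {page["number"] for page in response.get("pages", [])}
    return [n for n in range(begin, end + 1) if n not in received]


print(missing_pages(0, 9, {"pages": []}))
print(missing_pages(0, 2, {"pages": [{"number": 0, "text": "the page text"}]}))
```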

GET /v2/viewingSessions/{viewingSessionId}/revisionData?limit={limit}&continueToken={continueToken}

Gets objects which describe known changes between the two documents used as input to a comparison viewing session.

This URL is designed to give you an array of changes in chunks as the individual changes become available. Each GET request will return the currently-known changes up to a limit (default is 100). If a response contains a continueToken, it indicates that additional changes may be available and that you should issue another GET request using that continueToken as a query string parameter to skip the changes you have already received. As long as a response contains a continueToken, use it to issue a subsequent GET for more changes. When you encounter a response which does not have a continueToken, you have received all of the changes and no more GET requests are necessary.

In order to optimize the number of network requests you make, any response which contains a continueToken will also contain a continueAfter value with a recommended number of milliseconds you should wait before sending the next GET request.

Request

Query String Parameters

Parameter Description
limit The maximum number of changes to return for this HTTP request. Must be an integer greater than 0. Default is 100.
continueToken Used to continue getting changes from the point where a previous GET request left off.

Request Headers

Name Description
Accept-Encoding Set to gzip to get a gzipped response body.

Successful Response

Response Headers

Name Description
Content-Encoding Will be set to gzip if the request used Accept-Encoding: gzip.

Response Body

JSON with any available changes.

Error Responses

Status Code JSON errorCode Description
404 (none) No viewing session with the provided viewingSessionId could be found.
480 "InvalidInput" An invalid input value was used. See errorDetails in the response body.
580 "InternalError" The server encountered an internal error when handling the request.

Example

Here is an example sequence of requests and responses illustrating how you would acquire the full set of changes for a comparison viewing session (for brevity, the total number of changes in this example is small).

You would start with an initial GET:

GET prizmdoc_server_base_url/v2/viewingSessions/luMJZGIeGQr20veYl5JQwsv77iIvaFsvHAW4x1L88lkBs8mk63aArufxZ9jaXZ0ykG5LsMlWorI6u3Ui6YApkw/revisionData
Accept-Encoding: gzip
HTTP/1.1 200 OK
Content-Type: application/json
Content-Encoding: gzip

{
  "changes": [],
  "continueToken": "luMJZGIeGQr20veYl5JQwsv77iIvaFsvHAW4x1L88lmZwRo30ojTLjaT0J2D2f8D",
  "continueAfter": 500
}

In this case, the initial response did not return any changes at all (the changes array is empty), but the presence of a continueToken indicates they may simply not have been available yet. We should issue another GET request after waiting 500 milliseconds (the amount of time recommended by continueAfter).

So, half a second later, we issue a follow-up request with the continueToken passed in as a query string parameter:

GET prizmdoc_server_base_url/v2/viewingSessions/luMJZGIeGQr20veYl5JQwsv77iIvaFsvHAW4x1L88lkBs8mk63aArufxZ9jaXZ0ykG5LsMlWorI6u3Ui6YApkw/revisionData?continueToken=luMJZGIeGQr20veYl5JQwsv77iIvaFsvHAW4x1L88lmZwRo30ojTLjaT0J2D2f8D
Accept-Encoding: gzip
HTTP/1.1 200 OK
Content-Type: application/json
Content-Encoding: gzip

{
  "changes": [
    {
      "id": 0,
      "endPageIndex": 0,
      "type": "contentInserted"
    },
    {
      "id": 1,
      "endPageIndex": 0,
      "type": "contentDeleted"
    }
  ],
  "continueToken": "luMJZGIeGQr20veYl5JQwsv77iIvaFsvHAW4x1L88lklhqP2L79Yero0nM9aoZ9r",
  "continueAfter": 500
}

This time we receive two changes. The presence of a new continueToken tells us there may be more, so we submit another request with the new continueToken.

Notice in the next response that the changes which have already been given to us are not repeated:

GET prizmdoc_server_base_url/v2/viewingSessions/luMJZGIeGQr20veYl5JQwsv77iIvaFsvHAW4x1L88lkBs8mk63aArufxZ9jaXZ0ykG5LsMlWorI6u3Ui6YApkw/revisionData?continueToken=luMJZGIeGQr20veYl5JQwsv77iIvaFsvHAW4x1L88lklhqP2L79Yero0nM9aoZ9r
Accept-Encoding: gzip
HTTP/1.1 200 OK
Content-Type: application/json
Content-Encoding: gzip

{
  "changes": [
    {
      "id": 2,
      "endPageIndex": 5,
      "type": "styleChanged"
    }
  ]
}

This time we get a new change, and the lack of a continueToken tells us we have received all of the changes, so there are no more GET requests to make.
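
The sequence above can be sketched as a small polling loop. This is an illustration under stated assumptions, not a definitive client: fetch is a hypothetical callable that performs the GET (passing the token as the continueToken query string parameter, or nothing on the first call) and returns the decoded JSON body, and sleep is injectable so the loop can be exercised without waiting.

```python
import time
from typing import Callable, Optional


def collect_changes(fetch: Callable[[Optional[str]], dict],
                    sleep: Callable[[float], None] = time.sleep) -> list:
    """Gather every change by following continueToken until it disappears.

    `fetch(token)` is a hypothetical callable that issues the GET request
    and returns the decoded JSON response body.
    """
    changes: list = []
    token: Optional[str] = None
    while True:
        body = fetch(token)
        changes.extend(body.get("changes", []))
        token = body.get("continueToken")
        if token is None:
            return changes
        # continueAfter is in milliseconds; time.sleep expects seconds.
        sleep(body.get("continueAfter", 0) / 1000)


# Replay the three responses from the example above without network access.
canned = iter([
    {"changes": [], "continueToken": "t1", "continueAfter": 500},
    {"changes": [{"id": 0}, {"id": 1}], "continueToken": "t2", "continueAfter": 500},
    {"changes": [{"id": 2}]},
])
waits: list = []
result = collect_changes(lambda token: next(canned), sleep=waits.append)
print([change["id"] for change in result], waits)
```

Stopping on the absence of a continueToken, rather than on an empty changes array, matters because the first response in the example is empty yet more changes follow.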