Introduction
Available URLs
URL | Description |
---|---|
GET /PCCIS/V1/Document/q/Attributes | Gets a page count for the source document of a viewing session. |
GET /PCCIS/V1/Page/q/{PageNumber}/Attributes | Gets metadata for a page of the source document of a viewing session. |
GET /PCCIS/V1/Page/q/{PageNumber} | Gets SVG or an image for a page of the source document of a viewing session. |
GET /PCCIS/V1/Page/q/{PageNumber}/Tile/{X}/{Y}/{Width}/{Height} | Gets a "tile" image, a part of a page, for a page of the source document of a viewing session. |
GET /PCCIS/V1/Page/q/{PageNumber}/{Width}x{Height} | Gets a thumbnail image for a page of the source document of a viewing session. |
GET /PCCIS/V1/Document/q/{PageNumberBegin}-{PageNumberEnd}/Text | Gets currently-available text and text metadata for a range of pages for the source document of a viewing session. |
GET /v2/viewingSessions/{viewingSessionId}/revisionData | Gets objects which describe known changes between the two documents used as input to a comparison viewing session. |
Deprecated URLs
URL | Description |
---|---|
GET /PCCIS/V1/License/ClientViewer | Deprecated. |
GET /PCCIS/V1/Document/q/Attributes?DocumentID=[e,u]{ViewingSessionId}&DesiredPageCountConfidence={DesiredPageCountConfidence}
Gets a page count for the source document of a viewing session.
Request
Query String Parameters
Parameter | Description |
---|---|
DocumentID | Required. The viewingSessionId which identifies the viewing session, prefixed with u if in unencoded plaintext form or prefixed with e if base-64 encoded. |
DesiredPageCountConfidence | An integer from 0 to 100 inclusive which specifies the minimum required confidence in the page count before a value is returned. A value of 50 or lower is more likely to result in an estimated page count being returned. Default is 100, requiring that the actual page count be returned. |
Successful Response
JSON metadata about the document page count.
- pageCount (Integer) - Currently determined number of pages in the document.
- pageCountConfidence (Integer) - An integer from 0 to 100 indicating the confidence that pageCount is accurate. When less than 100, pageCount is still being determined and the current value should be considered an estimate. You can repeat the request to see if a more accurate pageCount is available. When 100, the pageCount has been finalized and the value should be considered the actual number of pages in the document.
Example
GET prizmdoc_server_base_url/PCCIS/V1/Document/q/Attributes?DocumentID=uXYZ...
HTTP/1.1 200 OK
Content-Type: application/json;charset=utf-8
{
"pageCount": 3,
"pageCountConfidence": 100
}
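The repeat-until-final behavior described above can be sketched as a small polling loop. This is a minimal illustration, not part of the product: the helper names, the retry delay, and the attempt limit are assumptions, and the HTTP call is passed in as a hypothetical fetch_json callable so the sketch stays independent of any particular HTTP client. It uses the unencoded (u) form of DocumentID.

```python
import time
from typing import Callable, Dict
from urllib.parse import urlencode


def attributes_url(base_url: str, viewing_session_id: str, confidence: int = 100) -> str:
    """Build the Document Attributes URL using the unencoded ("u") DocumentID form."""
    query = urlencode({
        "DocumentID": "u" + viewing_session_id,
        "DesiredPageCountConfidence": confidence,
    })
    return f"{base_url}/PCCIS/V1/Document/q/Attributes?{query}"


def wait_for_page_count(fetch_json: Callable[[str], Dict], url: str,
                        delay_seconds: float = 0.5, max_attempts: int = 20) -> int:
    """Repeat the request until pageCountConfidence reaches 100.

    fetch_json is a hypothetical callable that performs the GET and returns
    the parsed JSON body; the delay and attempt limit are arbitrary choices.
    """
    for _ in range(max_attempts):
        body = fetch_json(url)
        if body["pageCountConfidence"] == 100:
            return body["pageCount"]
        time.sleep(delay_seconds)
    raise TimeoutError("page count was not finalized in time")
```

A caller that only needs an estimate (for example, to size a scrollbar early) could instead use the first pageCount returned and refine it later.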
GET /PCCIS/V1/Page/q/{PageNumber}/Attributes?DocumentID=[e,u]{viewingSessionId}&ContentType={ContentType}
Gets metadata for a page of the source document of a viewing session.
Request
URL Parameters
Parameter | Description |
---|---|
{PageNumber} | Zero-indexed page number to get information about. |
Query String Parameters
Parameter | Description |
---|---|
DocumentID | Required. The viewingSessionId which identifies the viewing session, prefixed with u if in unencoded plaintext form or prefixed with e if base-64 encoded. |
ContentType | Used to indicate whether you want attributes for SVG page content or raster page content. Use svg (or svga or svgb) to get page attributes for SVG content or png to get page attributes for raster content. Default is svg. |
Successful Response
JSON metadata about the page.
- version (String) - Deprecated. Value will always be "7.1".
- contentType (String) - Comma-separated types of content available, hard-coded to a specific set of values depending on the requested ContentType:
  - Will be "jpeg,png,svg" when requested ContentType is svg (or svga or svgb)
  - Will be "jpeg,png" when requested ContentType is png
- imageBitDepth (Integer) - Bit depth of raster content. Only relevant when requested ContentType is png. When requested ContentType is svg (or svga or svgb), the value will be hard-coded to 16.
- imageHeight (Integer) - Height of the page:
  - in pixels when requested ContentType is png
  - in unspecified units when requested ContentType is svg (or svga or svgb)
- imageWidth (Integer) - Width of the page:
  - in pixels when requested ContentType is png
  - in unspecified units when requested ContentType is svg (or svga or svgb)
- imageXResolution (Integer) - Relative horizontal resolution of raster content when requested ContentType is png (like pixels per inch except that the unit is unspecified and will not necessarily be inches). When requested ContentType is svg (or svga or svgb), the value will be hard-coded to 90.
- imageYResolution (Integer) - Relative vertical resolution of raster content when requested ContentType is png (like pixels per inch except that the unit is unspecified and will not necessarily be inches). When requested ContentType is svg (or svga or svgb), the value will be hard-coded to 90.
Examples
Get attributes for the SVG form of page 0. The two most valuable properties are "imageHeight" and "imageWidth", which indicate the height and width of the SVG page in unspecified units. The rest are hard-coded values:
HTTP/1.1 200 OK
Content-Type: application/json;charset=utf-8
{
"version": "7.1",
"contentType": "jpeg,png,svg",
"imageBitDepth": 16,
"imageHeight": 842,
"imageWidth": 595,
"imageXResolution": 90,
"imageYResolution": 90
}
Get attributes for the raster form of page 0:
GET prizmdoc_server_base_url/PCCIS/V1/Page/q/0/Attributes?DocumentID=uXYZ...&ContentType=png
HTTP/1.1 200 OK
Content-Type: application/json;charset=utf-8
{
"version": "7.1",
"contentType": "jpeg,png",
"imageBitDepth": 8,
"imageHeight": 1755,
"imageWidth": 1240,
"imageXResolution": 150,
"imageYResolution": 150
}
GET /PCCIS/V1/Page/q/{PageNumber}?DocumentID=[e,u]{ViewingSessionId}&Scale={Scale}&ContentType={ContentType}
Gets SVG or an image for a page of the source document of a viewing session.
Request
Request Headers
Name | Description |
---|---|
Accept-Encoding | Specify gzip to allow gzip compression of the response. Gzip compression will only be applied to SVG responses (it is not used for PNG and JPEG responses) and it may be skipped if the SVG is small. If a response is compressed it will contain a Content-Encoding: gzip response header. Because all modern browsers support Content-Encoding: gzip responses, we recommend you always provide an Accept-Encoding: gzip request header. |
URL Parameters
Parameter | Description |
---|---|
{PageNumber} | Zero-indexed page number whose content should be returned. |
Query String Parameters
Parameter | Description |
---|---|
DocumentID | Required. The viewingSessionId which identifies the viewing session, prefixed with u if in unencoded plaintext form or prefixed with e if base-64 encoded. |
ContentType | Type of content to be returned. Default is png. Possible values: |
Scale | Scaling factor to apply when returning a PNG or JPEG image. The image will be resized by multiplying its width and height by this value. A value of 1.0 leaves the image unscaled, values less than 1.0 make the image smaller, and values greater than 1.0 make the image larger. For example, a value of 2.0 would return an image whose width and height have been doubled. A value of 0.5 would return an image whose width and height have been halved. Only applies when ContentType is png or jpeg, ignored otherwise. Default is 1.0 (no scaling applied). |
Successful Response with Page Content
Response Headers
Name | Description |
---|---|
Content-Type | The type of content returned. Possible values: |
Content-Encoding | May be set to gzip if the request used Accept-Encoding: gzip. |
Accusoft-Data-Encrypted | Indicates whether or not page content has been encrypted. true when page content is encrypted, false otherwise. See Enabling Content Encryption. |
Response Body
SVG, PNG, or JPEG for the requested page.
Examples
Get page 0 as SVG
Request:
GET prizmdoc_server_base_url/PCCIS/V1/Page/q/0?DocumentID=uXYZ...&ContentType=svgb
Response:
HTTP/1.1 200 OK
Content-Type: image/svg+xml
Content-Encoding: gzip
Accusoft-Data-Encrypted: false
<svg height="842" style="font-family:qsnvcgduoqekywbefqyyjjhodpw;font-size:12px;" version="1.2" viewBox="0 0 595 842" width="595"
xmlns="http://www.w3.org/2000/svg"
xmlns:xlink="http://www.w3.org/1999/xlink">
...
</svg>
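Browsers undo Content-Encoding: gzip transparently, but some HTTP clients (Python's urllib.request, for example) hand back the compressed bytes as-is. The following sketch shows one way a non-browser client might handle both cases. The function name decode_body is an assumption for illustration, and the sketch does not deal with encrypted content (Accusoft-Data-Encrypted: true).

```python
import gzip
from typing import Mapping


def decode_body(headers: Mapping[str, str], raw: bytes) -> bytes:
    """Return the response body, un-gzipping it when Content-Encoding says so."""
    # HTTP header names are case-insensitive, so normalize before the lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    encoding = normalized.get("content-encoding", "").strip().lower()
    if encoding == "gzip":
        return gzip.decompress(raw)
    # Small SVG responses may be sent uncompressed even when gzip was accepted.
    return raw
```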
Get page 0 as a PNG
Request:
GET prizmdoc_server_base_url/PCCIS/V1/Page/q/0?DocumentID=uXYZ...&ContentType=png
Response:
HTTP/1.1 200 OK
Content-Type: image/png
Accusoft-Data-Encrypted: false
<<PNG bytes>>
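A common use of Scale is to request a raster page sized for a particular display width. As a rough sketch, the scale can be derived from the imageWidth reported by the page attributes request with ContentType=png. The function name, the rounding to four decimal places, and the use of the unencoded (u) DocumentID form are assumptions made for this illustration.

```python
from urllib.parse import urlencode


def page_image_url(base_url: str, page_number: int, viewing_session_id: str,
                   raster_width: int, target_width: int,
                   content_type: str = "png") -> str:
    """Build a page URL whose Scale makes the raster page target_width pixels wide.

    raster_width is the imageWidth from the raster page attributes.
    """
    if raster_width <= 0 or target_width <= 0:
        raise ValueError("widths must be positive")
    scale = round(target_width / raster_width, 4)  # rounding is an arbitrary choice
    query = urlencode({
        "DocumentID": "u" + viewing_session_id,
        "ContentType": content_type,
        "Scale": scale,
    })
    return f"{base_url}/PCCIS/V1/Page/q/{page_number}?{query}"
```

For the raster attributes shown earlier (imageWidth of 1240), a target width of 620 pixels yields Scale=0.5.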
Response When SVG Content is Not Available
SVG content is typically preferred but not always available. For example, if the source document is raster (such as a TIFF), then only raster page content (PNG and JPEG) will be available. It is common for a client viewer to request SVG content first. Then, once it becomes clear that SVG is not available, the client viewer can fall back to requesting only PNG or JPEG page content.
For this reason, if SVG is requested but is not available, we respond with a successful HTTP 200 but with a JSON body indicating that SVG is not available. Additionally, we include raster page attributes metadata in the JSON so that the viewer does not need to issue an additional request for page attributes.
Example
Request:
GET prizmdoc_server_base_url/PCCIS/V1/Page/q/0?DocumentID=uXYZ...&ContentType=svgb
Response when SVG is not available:
HTTP/1.1 200 OK
Content-Type: application/json;charset=utf-8
{
"errorCode": "SvgNotAvailable",
"pageAttributes": {
"version": "7.1",
"contentType": "jpeg,png",
"imageBitDepth": 8,
"imageHeight": 280,
"imageWidth": 593,
"imageXResolution": 72,
"imageYResolution": 72
}
}
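Because this case arrives as an HTTP 200, a client has to inspect the Content-Type and body rather than the status code. One possible way to separate the two outcomes is sketched below. The function name and return shape are assumptions, and the body is expected to be already decompressed and unencrypted.

```python
import json
from typing import Optional, Tuple


def interpret_page_response(content_type: str,
                            body: bytes) -> Tuple[Optional[bytes], Optional[dict]]:
    """Return (svg_bytes, None) for SVG content, or (None, page_attributes)
    when the server reports that SVG is not available."""
    media_type = content_type.split(";")[0].strip().lower()
    if media_type == "image/svg+xml":
        return body, None
    if media_type == "application/json":
        payload = json.loads(body.decode("utf-8"))
        if payload.get("errorCode") == "SvgNotAvailable":
            # The raster attributes are included, so no extra request is needed.
            return None, payload["pageAttributes"]
    raise ValueError(f"unexpected response of type {content_type!r}")
```

When the second element is not None, the viewer can switch to PNG or JPEG requests for the rest of the session and use the returned imageWidth and imageHeight for layout.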
GET /PCCIS/V1/Page/q/{PageNumber}/Tile/{X}/{Y}/{Width}/{Height}?DocumentID=[e,u]{ViewingSessionId}&ContentType={ContentType}&Scale={Scale}
Gets a "tile" image, a part of a page, for a page of the source document of a viewing session.
Request
URL Parameters
Parameter | Description |
---|---|
{PageNumber} | Zero-indexed page number to extract a tile from. |
{X} | Where the left edge of the tile should begin, expressed as the number of pixels from the left edge of the page. Must be less than the pixel width of the (scaled) page. |
{Y} | Where the top of the tile should begin, expressed as the number of pixels from the top edge of the page. Must be less than the pixel height of the (scaled) page. |
{Width} | Width of the tile in pixels, extended right from {X}. If the right edge of the tile extends beyond the right edge of the page then the tile will be trimmed so that only actual page content is returned. You can safely use a {Width} value that goes beyond the right edge of the page, but note that the actual width of the returned image may be different than what you requested. |
{Height} | Height of the tile in pixels, extended down from {Y}. If the bottom edge of the tile extends beyond the bottom edge of the page then the tile will be trimmed so that only actual page content is returned. You can safely use a {Height} value that goes beyond the bottom edge of the page, but note that the actual height of the returned image may be different than what you requested. |
Query String Parameters
Parameter | Description |
---|---|
DocumentID | Required. The viewingSessionId which identifies the viewing session, prefixed with u if in unencoded plaintext form or prefixed with e if base-64 encoded. |
ContentType | Type of image to be returned. Default is png. Possible values: |
Scale | Scaling factor to apply to the entire page before cropping to the specified tile region. The full page image will be resized by multiplying its width and height by this value. A value of 1.0 leaves the page unscaled, values less than 1.0 make the page smaller, and values greater than 1.0 make the page larger. Default is 1.0 (no scaling applied). |
Successful Response
Response Headers
Name | Description |
---|---|
Content-Type | The type of content returned. Possible values: |
Accusoft-Data-Encrypted | Indicates whether or not page content has been encrypted. true when page content is encrypted, false otherwise. See Enabling Content Encryption. |
Response Body
Tile image.
Example
Request a 512x512 PNG tile for page 0 starting at x-position 1024 (from left) and y-position 1536 (from top):
GET prizmdoc_server_base_url/PCCIS/V1/Page/q/0/Tile/1024/1536/512/512?DocumentID=uXYZ...
HTTP/1.1 200 OK
Content-Type: image/png
<<PNG bytes>>
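To cover a whole page with tiles, a client can walk a grid of {X} and {Y} offsets. The sketch below is illustrative only: the function name and the 512-pixel default are assumptions. It relies on the documented behavior that a tile extending past the right or bottom edge is trimmed, so every request can use the same {Width} and {Height}. The page dimensions passed in must be the pixel dimensions of the page after any Scale is applied, and the query string (DocumentID and so on) still has to be appended to each path.

```python
from typing import Iterator


def tile_paths(page_number: int, page_width: int, page_height: int,
               tile_size: int = 512) -> Iterator[str]:
    """Yield tile URL paths, row by row, that together cover the whole page."""
    # range() stops before the page edge, so every X and Y stays below the
    # (scaled) page width and height, as the URL parameters require.
    for y in range(0, page_height, tile_size):
        for x in range(0, page_width, tile_size):
            yield f"/PCCIS/V1/Page/q/{page_number}/Tile/{x}/{y}/{tile_size}/{tile_size}"
```

For a 1240x1755 pixel page and 512-pixel tiles this produces a grid of 3 columns by 4 rows, ending with the tile at x-position 1024 and y-position 1536 used in the example above.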
GET /PCCIS/V1/Page/q/{PageNumber}/{Width}x{Height}?DocumentID=[e,u]{ViewingSessionId}&ContentType={ContentType}
Gets a thumbnail image for a page of the source document of a viewing session.
The page will be resized, maintaining aspect ratio, to fit within the {Width} and {Height} specified in the URL.
Request
URL Parameters
Parameter | Description |
---|---|
{PageNumber} | Zero-indexed page number to create a thumbnail image for. |
{Width} | Maximum allowed width, in pixels, of the thumbnail. Must be an integer greater than 0. |
{Height} | Maximum allowed height, in pixels, of the thumbnail. Must be an integer greater than 0. |
Query String Parameters
Parameter | Description |
---|---|
DocumentID | Required. The viewingSessionId which identifies the viewing session, prefixed with u if in unencoded plaintext form or prefixed with e if base-64 encoded. |
ContentType | Type of image to be returned. Default is png. Possible values: |
Successful Response
Response Headers
Name | Description |
---|---|
Content-Type | The type of content returned. Possible values: |
Response Body
Thumbnail image.
Example
Request a thumbnail image of page 0 that fits within a 200x200 square:
GET prizmdoc_server_base_url/PCCIS/V1/Page/q/0/200x200?DocumentID=uXYZ...
HTTP/1.1 200 OK
Content-Type: image/png
<<PNG bytes>>
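If a viewer wants to reserve layout space before a thumbnail arrives, it can estimate the thumbnail size with the usual fit-within arithmetic. This is only a sketch of that arithmetic: the function name is an assumption, and the server's exact rounding and its behavior for pages smaller than the requested box are not stated here, so the returned image may differ by a pixel or more.

```python
from typing import Tuple


def fit_within(page_width: int, page_height: int,
               max_width: int, max_height: int) -> Tuple[int, int]:
    """Estimate the size of a page scaled to fit a box while keeping aspect ratio."""
    # The smaller ratio is the tighter constraint; using it keeps both
    # dimensions inside the box.
    scale = min(max_width / page_width, max_height / page_height)
    width = max(1, round(page_width * scale))
    height = max(1, round(page_height * scale))
    return width, height
```

For a 1240x1755 page and a 200x200 box, the height is the limiting dimension and the estimate is 141x200.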
GET /PCCIS/V1/Document/q/{PageNumberBegin}-{PageNumberEnd}/Text?DocumentID=[e,u]{ViewingSessionId}
Gets currently-available text and text metadata for a range of pages for the source document of a viewing session.
NOTE: This URL is designed to support our viewer. If you simply want to extract text from a document programmatically, use the Search Contexts API instead, specifically POST /v2/searchContexts and GET /v2/searchContexts/{contextId}/records.
Request
URL Parameters
Parameter | Description |
---|---|
{PageNumberBegin} | Zero-indexed page number of the first page in the range. |
{PageNumberEnd} | Zero-indexed page number of the last page in the range. |
Query String Parameters
Parameter | Description |
---|---|
DocumentID | Required. The viewingSessionId which identifies the viewing session, prefixed with u if in unencoded plaintext form or prefixed with e if base-64 encoded. |
Request Headers
Name | Description |
---|---|
Accept-Encoding | Specify gzip to allow gzip compression of the response. Gzip compression may be skipped if the overall response size is small. If a response is compressed it will contain a Content-Encoding: gzip response header. Because all modern browsers support Content-Encoding: gzip responses, we recommend you always provide an Accept-Encoding: gzip request header. |
Successful Response
Response Headers
Name | Description |
---|---|
Content-Encoding | May be set to gzip if the request used Accept-Encoding: gzip. |
Accusoft-Data-Encrypted | Indicates whether or not page content has been encrypted. true when page content is encrypted, false otherwise. See Enabling Content Encryption. |
Response Body
JSON containing page text and text positioning metadata.
- pages[] (Array of Objects) Always present in case of success. Optional in case of failure. Will contain an array of objects, each containing text data for a page, for pages where text has been successfully extracted. Note, however, that text extraction takes time and text may not yet be available for the range of pages requested. If the array is empty or contains fewer items than the number of pages included in your page range, then the text for the requested page range has not been fully extracted. Repeating the request should eventually produce an array with the expected number of items. Note also that the order of the records is not guaranteed; you must use the number property of each returned item to know its page index. Items may contain:
  - number (Integer) Always present. Page index (zero-indexed page number). The property is named simply number for backwards compatibility reasons.
  - text (String) Page text.
  - errorCode (Integer) When text cannot be extracted for this page, present with a value of 1.
  - errorDescription (String) When text cannot be extracted for this page, a descriptive error message explaining why (such as "No page data was found.") or an empty string if the cause of the error is unknown.
  - width (Number) Page width.
  - height (Number) Page height.
  - rectangles[] (Array of Arrays) Bounding boxes for individual glyphs on the page. Each item will contain four numbers:
    - [0] (Number) Distance from the left edge of the page to the left edge of the glyph bounding box.
    - [1] (Number) Distance from the top edge of the page to the top edge of the glyph bounding box.
    - [2] (Number) Width of the glyph bounding box.
    - [3] (Number) Height of the glyph bounding box.
  - markup[] (Array of Objects) Objects describing hyperlinks, if any. Each item may contain:
    - changeType (String) Value will always be "Add".
    - markType (String) Value will always be "DocumentHyperlink".
    - properties (Object) Properties of the hyperlink.
      - href (String) Destination URL.
      - rectangle (Object) Dimensions of the hyperlink bounding box on the page.
        - x (Number) Distance from the left edge of the page to the left edge of the hyperlink bounding box.
        - y (Number) Distance from the top edge of the page to the top edge of the hyperlink bounding box.
        - width (Number) Width of the hyperlink bounding box.
        - height (Number) Height of the hyperlink bounding box.
      - borderThickness (Number) Border thickness which should be applied.
      - borderHorizontalRadius (Number) Horizontal border radius which should be applied.
      - borderVerticalRadius (Number) Vertical border radius which should be applied.
      - borderOpacity (Integer) Border opacity which should be applied. Value will be from 0 to 255, where 0 represents fully transparent and 255 represents fully opaque.
- errorCode (Integer) Might be present in case of failure and missing pages[] object. When text cannot be extracted for this page, present with a value of 1.
- errorDescription (String) Might be present in case of failure and missing pages[] object. When text cannot be extracted for this page, a descriptive error message explaining why (such as "No page data was found.") or an empty string if the cause of the error is unknown.
Examples
Request text for pages 0 through 9:
GET prizmdoc_server_base_url/PCCIS/V1/Document/q/0-9/Text?DocumentID=uXYZ...
Response when text is not yet available:
HTTP/1.1 200 OK
Content-Type: application/json;charset=utf-8
Accusoft-Data-Encrypted: false
{
"pages": []
}
Response when text is available (where ... indicates that data has been omitted for brevity):
HTTP/1.1 200 OK
Content-Type: application/json
{
"pages": [
{
"number": 0,
"text": "the page text",
"width": 648.00,
"height": 828.00,
"rectangles": [
[
202.25,
135.05,
27.00,
73.26
],
[
229.25,
135.05,
30.00,
73.26
],
...
],
"markup": [
{
"changeType": "Add",
"markType": "DocumentHyperlink",
"properties": {
"rectangle": {
"height": 14.71,
"width": 86.20,
"y": 73.50,
"x": 71.31
},
"borderHorizontalRadius": 0.0,
"borderVerticalRadius": 0.0,
"borderThickness": 0.0,
"href": "http://www.google.com/",
"borderOpacity": 255
}
},
...
]
},
...
]
}
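Since text may arrive late and in any order, a client typically repeats the request until enough items are present and then indexes them by their number property. The following is a minimal sketch of that pattern. The function name, retry delay, and attempt limit are assumptions, and fetch_json is a hypothetical callable that performs the GET for the page range and returns the parsed JSON body.

```python
import time
from typing import Callable, Dict


def collect_page_text(fetch_json: Callable[[], dict], page_begin: int, page_end: int,
                      delay_seconds: float = 0.5,
                      max_attempts: int = 20) -> Dict[int, dict]:
    """Poll until every page in the inclusive range is present, then key by page index."""
    expected = page_end - page_begin + 1
    for _ in range(max_attempts):
        body = fetch_json()
        if "pages" not in body and "errorCode" in body:
            # A top-level error with no pages[] means the request itself failed.
            raise RuntimeError(body.get("errorDescription", ""))
        pages = body.get("pages", [])
        if len(pages) >= expected:
            # Order is not guaranteed, so index each item by its "number".
            return {page["number"]: page for page in pages}
        time.sleep(delay_seconds)
    raise TimeoutError("text extraction did not finish in time")
```

Individual items can still carry their own errorCode, so a caller should check for it before reading text from an item.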
GET /v2/viewingSessions/{viewingSessionId}/revisionData?limit={limit}&continueToken={continueToken}
Gets objects which describe known changes between the two documents used as input to a comparison viewing session.
This URL is designed to give you an array of changes in chunks as the individual changes become available. Each GET request will return the currently-known changes up to a limit (default is 100). If a response contains a continueToken, it indicates that additional changes may be available and that you should issue another GET request using that continueToken as a query string parameter to skip the changes you have already received. As long as a response contains a continueToken, use it to issue a subsequent GET for more changes. When you encounter a response which does not have a continueToken, you have received all of the changes and no more GET requests are necessary.
In order to optimize the number of network requests you make, any response which contains a continueToken will also contain a continueAfter value with a recommended number of milliseconds you should wait before sending the next GET request.
Request
Query String Parameters
Parameter | Description |
---|---|
limit | The maximum number of changes to return for this HTTP request. Must be an integer greater than 0. Default is 100. |
continueToken | Used to continue getting changes from the point where a previous GET request left off. |
Request Headers
Name | Description |
---|---|
Accept-Encoding | Set to gzip to get a gzipped response body. |
Successful Response
Response Headers
Name | Description |
---|---|
Content-Encoding | Will be set to gzip if the request used Accept-Encoding: gzip. |
Response Body
JSON with any available changes.
- changes (Array of Objects) Always present. Array of newly-available changes, objects which each describe a difference between the two documents being compared. If no new changes are available, this array will be empty.
  - id (Integer) Unique number assigned to this change.
  - endPageIndex (Integer) Zero-indexed page number where this change ends in the document.
  - type (String) Type of the change. Will be one of the following:
    - "contentInserted"
    - "contentDeleted"
    - "propertyChanged"
    - "paragraphNumberChanged"
    - "paragraphPropertyChanged"
    - "tablePropertyChanged"
    - "sectionPropertyChanged"
    - "styleDefinitionChanged"
    - "contentMovedFrom"
    - "contentMovedTo"
    - "tableCellInserted"
    - "tableCellDeleted"
    - "tableCellsMerged"
- continueToken (String) When present, indicates that more changes may be available. An additional GET request should be made for more changes using this value as the continueToken query string parameter. When not present, indicates that all changes have been obtained and no further changes will be available.
- continueAfter (Number) Recommended milliseconds to delay before issuing the next GET request for more changes.
Error Responses
Status Code | JSON errorCode | Description |
---|---|---|
404 | - | No viewing session with the provided viewingSessionId could be found. |
480 | "InvalidInput" | An invalid input value was used. See errorDetails in the response body. |
580 | "InternalError" | The server encountered an internal error when handling the request. |
Example
Here is an example sequence of requests and responses illustrating how you would acquire the full set of changes for a comparison viewing session (for brevity, the total number of changes in this example is small).
You would start with an initial GET:
GET prizmdoc_server_base_url/v2/viewingSessions/luMJZGIeGQr20veYl5JQwsv77iIvaFsvHAW4x1L88lkBs8mk63aArufxZ9jaXZ0ykG5LsMlWorI6u3Ui6YApkw/revisionData
Accept-Encoding: gzip
HTTP/1.1 200 OK
Content-Type: application/json
Content-Encoding: gzip
{
"changes": [],
"continueToken": "luMJZGIeGQr20veYl5JQwsv77iIvaFsvHAW4x1L88lmZwRo30ojTLjaT0J2D2f8D",
"continueAfter": 500
}
In this case, the initial response did not return any changes at all (the changes array is empty), but the presence of a continueToken indicates they may simply not have been available yet. We should issue another GET request after waiting 500 milliseconds (the amount of time recommended by continueAfter).
So, half a second later, we issue a follow-up request with the continueToken passed in as a query string parameter:
GET prizmdoc_server_base_url/v2/viewingSessions/luMJZGIeGQr20veYl5JQwsv77iIvaFsvHAW4x1L88lkBs8mk63aArufxZ9jaXZ0ykG5LsMlWorI6u3Ui6YApkw/revisionData?continueToken=luMJZGIeGQr20veYl5JQwsv77iIvaFsvHAW4x1L88lmZwRo30ojTLjaT0J2D2f8D
Accept-Encoding: gzip
HTTP/1.1 200 OK
Content-Type: application/json
Content-Encoding: gzip
{
"changes": [
{
"id": 0,
"endPageIndex": 0,
"type": "contentInserted"
},
{
"id": 1,
"endPageIndex": 0,
"type": "contentDeleted"
}
],
"continueToken": "luMJZGIeGQr20veYl5JQwsv77iIvaFsvHAW4x1L88lklhqP2L79Yero0nM9aoZ9r",
"continueAfter": 500
}
This time we receive two changes. The presence of a new continueToken tells us there may be more, so we submit another request with the new continueToken.
Notice in the next response that the changes which have already been given to us are not repeated:
GET prizmdoc_server_base_url/v2/viewingSessions/luMJZGIeGQr20veYl5JQwsv77iIvaFsvHAW4x1L88lkBs8mk63aArufxZ9jaXZ0ykG5LsMlWorI6u3Ui6YApkw/revisionData?continueToken=luMJZGIeGQr20veYl5JQwsv77iIvaFsvHAW4x1L88lklhqP2L79Yero0nM9aoZ9r
Accept-Encoding: gzip
HTTP/1.1 200 OK
Content-Type: application/json
Content-Encoding: gzip
{
"changes": [
{
"id": 2,
"endPageIndex": 5,
"type": "styleDefinitionChanged"
}
]
}
This time we get a new change, and the lack of a continueToken tells us we have received all of the changes, so there are no more GET requests to make.
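The request sequence above can be condensed into a loop that follows continueToken until it disappears and waits continueAfter milliseconds between requests. This is a sketch under stated assumptions rather than a reference client: the function name is made up, get_revision_data is a hypothetical callable that performs the GET (passing the token as the continueToken query string parameter when it is not None) and returns the parsed JSON body, and error responses such as 404, 480, and 580 are left to that callable.

```python
import time
from typing import Callable, List, Optional


def fetch_all_changes(get_revision_data: Callable[[Optional[str]], dict],
                      sleep: Callable[[float], None] = time.sleep) -> List[dict]:
    """Collect every change for a comparison viewing session."""
    changes: List[dict] = []
    token: Optional[str] = None  # the initial request carries no continueToken
    while True:
        body = get_revision_data(token)
        changes.extend(body["changes"])
        token = body.get("continueToken")
        if token is None:
            # No continueToken: all changes have been received.
            return changes
        # continueAfter is in milliseconds; sleep() expects seconds.
        sleep(body.get("continueAfter", 0) / 1000.0)
```

Passing sleep as a parameter is a design choice that keeps the loop easy to exercise without real delays.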
GET /PCCIS/V1/License/ClientViewer
NOTE: This URL has been deprecated and will be removed from the public API in a future release. It no longer functions and returns HTTP 500 Internal Server Error.