PrizmDoc v13.1 - Updated
Content Conversion Service

Content Conversion

The Content Conversion API allows you to convert files from a variety of input formats to several common output formats.

To convert a file:

  1. Upload a file you want to use as input using the WorkFile API.
  2. Start a conversion operation by using the POST URL below.
  3. Check the status of the conversion by (repeatedly) using the GET URL below.
  4. When complete, a separate output file will exist which you can download via the WorkFile API.

Available URLs

URL Purpose
POST /v2/contentConverters Create and start a content conversion
GET /v2/contentConverters/{processId} Get the status of a content conversion

Note that these URLs begin with /v2, not /PCCIS/V1.

POST /v2/contentConverters

Creates a new contentConverter resource, which represents the conversion process, and begins converting one or more input files that you have previously uploaded using the WorkFile API. A successful response will include a unique processId which identifies this contentConverter. You will use this processId in subsequent GET calls to get the status and final results of the conversion.

Request Headers

Name Value Details
Content-Type application/json required
Accusoft-Affinity-Token

Affinity token returned in the POST response body for the work file specified by the fileId parameter in the request body.

Example: "rcqmuB9pAa8+4V7fhO1SXzawy/YMQU1g8lLdNDe5l7w="

Only required if PrizmDoc is running in cluster mode.

Request Body

At a high level, your request body should be JSON containing an input object with details about the sources and dest for the conversion.

Here is a minimal example:

POST http://localhost:18681/v2/contentConverters
Content-Type: application/json
{
    "input": {
        "sources": [
            { 
                "fileId": "ek5Zb123oYHSUEVx1bUrVQ"
            }
        ],
        "dest": {
            "format": "pdf"
        }
    }
}
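
The minimal request above could be assembled from Python roughly as follows. This is only a sketch: the helper name build_conversion_request and its base_url default are illustrative assumptions, not part of PrizmDoc, and the code constructs the request object without sending it.

```python
import json
import urllib.request


def build_conversion_request(file_ids, dest_format,
                             base_url="http://localhost:18681"):
    """Build (but do not send) a POST /v2/contentConverters request.

    build_conversion_request is a hypothetical helper name; the URL path,
    header, and JSON shape follow the minimal example above.
    """
    body = {
        "input": {
            "sources": [{"fileId": file_id} for file_id in file_ids],
            "dest": {"format": dest_format},
        }
    }
    return urllib.request.Request(
        url=base_url + "/v2/contentConverters",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_conversion_request(["ek5Zb123oYHSUEVx1bUrVQ"], "pdf")
# Sending is left to the caller, e.g. urllib.request.urlopen(request).
```

Passing several file ids produces one entry per file in input.sources, which the reference below describes as supported only for pdf and tiff output.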

Additional options are available. Here is the full reference:

input.sources

The input.sources property is an array of objects, one for each input file.

Currently multiple input files are only supported when the destination format is pdf or tiff, but a future version of the product may allow you to submit multiple input source files for other destination formats.

Name Description Details
input.sources[n].fileId

The id of the WorkFile to use as input.

See Supported Input File Formats

string, required

Example: "ek5Zb123oYHSUEVx1bUrVQ"

input.sources[n].pages

Page numbers and/or page ranges separated by commas.

Currently pages is only supported when the destination format is pdf or tiff, and is ignored otherwise. We expect this to change in a future version, in which case pages will be supported for other destination formats.

string, optional

Example: 1,3,5-10

input.sources[n].password

The password to be used for a document associated with the fileId.

Currently password is only supported when the source format is PDF, MS Word, MS Excel, MS PowerPoint or OpenDocument, and is ignored otherwise. Please note that only Office Open XML versions of MS Word, MS Excel and MS PowerPoint are supported when fidelity.msOfficeDocumentsRenderer is set to "libreoffice". We expect this to change in a future version, in which case password will be supported for other source formats.

string, optional

Example: "secret"
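
To illustrate the pages syntax, the sketch below expands a specification such as 1,3,5-10 into individual page numbers on the client side. The function name expand_pages is hypothetical, and it assumes one-based page numbers and ascending ranges; the server may accept or reject forms that this sketch does not cover.

```python
def expand_pages(pages):
    """Expand a pages string such as "1,3,5-10" into a list of page numbers."""
    result = []
    for part in pages.split(","):
        part = part.strip()
        if "-" in part:
            # A range such as "5-10" covers both end points.
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start < 1 or end < start:
                raise ValueError("invalid page range: " + part)
            result.extend(range(start, end + 1))
        else:
            page = int(part)
            if page < 1:
                raise ValueError("invalid page number: " + part)
            result.append(page)
    return result


print(expand_pages("1,3,5-10"))  # [1, 3, 5, 6, 7, 8, 9, 10]
```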

input.src (deprecated)

The input.src object specifies the file to use as input. This property has been deprecated; please use input.sources instead.

Name Description Details
input.src.fileId

The id of the WorkFile to use as input.

See Supported Input File Formats

string, required

Example: "ek5Zb123oYHSUEVx1bUrVQ"

input.dest

The input.dest object specifies the destination file format and any additional details which control how the content is converted.

Name Description Details
input.dest.format

Specifies the output file format. Must be one of the following:

  • "jpeg"
  • "pdf"
  • "png"
  • "svg"
  • "tiff"
string, required
input.dest.jpegOptions Additional options when input.dest.format is "jpeg". object, optional
input.dest.pdfOptions Additional options when input.dest.format is "pdf". object, optional
input.dest.pngOptions Additional options when input.dest.format is "png". object, optional
input.dest.tiffOptions Additional options when input.dest.format is "tiff". object, optional
input.dest.header Specifies the header to be appended to each page of a document. The original page content will be left unaltered. The overall page dimensions will be expanded to allow space for the additional header content. object, optional
input.dest.footer Specifies the footer to be appended to each page of a document. The original page content will be left unaltered. The overall page dimensions will be expanded to allow space for the additional footer content. object, optional

input.dest.jpegOptions

Name Description Details
input.dest.jpegOptions.maxWidth The maximum pixel width of the output image, expressed as a CSS-style string, e.g. "800px". When specified, the output image is guaranteed to never be wider than the specified value and its aspect ratio will be preserved. This is useful if you need all of your output images to fit within a single column.

string, optional

Example: "800px"

input.dest.jpegOptions.maxHeight The maximum pixel height of the output image, expressed as a CSS-style string, e.g. "600px". When specified, the output image is guaranteed to never be taller than the specified value and its aspect ratio will be preserved. This is useful if you need all of your output images to fit within a single row.

string, optional

Example: "600px"

For CAD input, you must specify either maxWidth or maxHeight.
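
As a sketch of how these options might be filled in, the hypothetical helper raster_options below formats integer pixel sizes as CSS-style strings and applies the CAD rule on the client side. The cad_input flag is an assumption: it presumes the caller already knows whether the source is a CAD file.

```python
def raster_options(max_width=None, max_height=None, cad_input=False):
    """Build a jpegOptions-style object from pixel sizes given as integers."""
    if cad_input and max_width is None and max_height is None:
        # Mirrors the rule above: CAD input needs maxWidth or maxHeight.
        raise ValueError("CAD input requires maxWidth or maxHeight")
    options = {}
    if max_width is not None:
        options["maxWidth"] = "%dpx" % max_width
    if max_height is not None:
        options["maxHeight"] = "%dpx" % max_height
    return options


# The dest object for a JPEG conversion capped at 800 pixels wide.
dest = {"format": "jpeg", "jpegOptions": raster_options(max_width=800)}
```

Because pngOptions and tiffOptions document the same maxWidth and maxHeight properties, the same helper could serve those formats as well.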

input.dest.pdfOptions

Name Description Details
input.dest.pdfOptions.forceOneFilePerPage

If true, the conversion process will produce single-page PDF files, one file for each page of content (instead of a single PDF with multiple pages).

Default is false.

boolean, optional
input.dest.pdfOptions.ocr

Specifies the options for text recognition. Applies when the source file is raster or PDF with a single raster image per page.

object, optional
input.dest.pdfOptions.ocr.language

The language to use for text recognition. Currently, only English is supported. Value must be "english".

string, required
input.dest.pdfOptions.ocr.defaultDpi

If the input raster file does not contain DPI resolution information, these values will be used instead.

Default is { x: 300, y: 300 }

object, optional
input.dest.pdfOptions.ocr.defaultDpi.x

Horizontal DPI value.

integer, required
input.dest.pdfOptions.ocr.defaultDpi.y

Vertical DPI value.

integer, required

When converting PDF documents to a single PDF with multiple pages or a set of single-page PDF files, the resulting PDF file(s) will lose bookmarks and intra-document links due to restructuring of the PDF content.

Strikethrough text will not be recognized.
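
For orientation, a request body that enables OCR might look like the sketch below. It uses only the property names from the table above; the fileId is the sample value used elsewhere on this page, and the DPI values simply restate the documented default.

```python
import json

body = {
    "input": {
        "sources": [{"fileId": "ek5Zb123oYHSUEVx1bUrVQ"}],
        "dest": {
            "format": "pdf",
            "pdfOptions": {
                "forceOneFilePerPage": False,
                "ocr": {
                    # "english" is the only documented value.
                    "language": "english",
                    # Used only when the raster input carries no DPI info.
                    "defaultDpi": {"x": 300, "y": 300},
                },
            },
        },
    }
}

print(json.dumps(body, indent=4))
```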

input.dest.pngOptions

Name Description Details
input.dest.pngOptions.maxWidth The maximum pixel width of the output image, expressed as a CSS-style string, e.g. "800px". When specified, the output image is guaranteed to never be wider than the specified value and its aspect ratio will be preserved. This is useful if you need all of your output images to fit within a single column.

string, optional

Example: "800px"

input.dest.pngOptions.maxHeight The maximum pixel height of the output image, expressed as a CSS-style string, e.g. "600px". When specified, the output image is guaranteed to never be taller than the specified value and its aspect ratio will be preserved. This is useful if you need all of your output images to fit within a single row.

string, optional

Example: "600px"

For CAD input, you must specify either maxWidth or maxHeight.

input.dest.tiffOptions

Name Description Details
input.dest.tiffOptions.forceOneFilePerPage

If true, the conversion process will produce single-page TIFF files, one file for each page of content (instead of a single TIFF with multiple pages).

Default is false.

boolean, optional
input.dest.tiffOptions.maxWidth The maximum pixel width of the output image, expressed as a CSS-style string, e.g. "800px". When specified, the output image is guaranteed to never be wider than the specified value and its aspect ratio will be preserved. This is useful if you need all of your output images to fit within a single column.

string, optional

Example: "800px"

input.dest.tiffOptions.maxHeight The maximum pixel height of the output image, expressed as a CSS-style string, e.g. "600px". When specified, the output image is guaranteed to never be taller than the specified value and its aspect ratio will be preserved. This is useful if you need all of your output images to fit within a single row.

string, optional

Example: "600px"

For CAD input, you must specify either maxWidth or maxHeight.

input.dest.header

Name Description Details
input.dest.header.lines

This is a multi-dimensional array that allows you to easily position text in a particular line and column. The first string in any inner array will always be placed on the left (left-justified), the second string placed in the center (center-justified), and the third string placed on the right (right-justified). The number of items in the outer array defines the total number of text lines. You may provide between one and three lines of text for a header.

array, optional
input.dest.header.fontFamily

Specifies the name of the font that is used for the header (e.g. "fontFamily": "Courier"). The font name provided must be present on the server to be applied.

string, optional
input.dest.header.fontSize

Specifies the size of the font, in points. If provided, the value must be a string with a number followed by "pt" (e.g. "12pt").

string, optional
input.dest.header.color

Specifies the color of the text. Valid values are any valid CSS HEX value (e.g. "#FF0000").

string, optional

Currently, the input.dest.header property is only supported when converting all pages of a single document to either "pdf" or "tiff", and forceOneFilePerPage is false.

Text may overlap other text and/or overflow the page bounds. The caller specifies the text position and size, and the product simply renders the text. For example, if the font size is too big, text on the left may overlap text in the center, or if the text is so long it can't fit on the page width, it may overflow the page bounds.

For input.dest.header code examples refer to Conversion Input Examples.
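
As a quick orientation before consulting those examples, the sketch below assembles a header object from the properties in the table above. The helper name build_header and the sample strings are hypothetical; the one-to-three-line check restates the documented limit and is enforced here on the client side only.

```python
def build_header(lines, font_family=None, font_size_pt=None, color=None):
    """Build an input.dest.header object; each line is [left, center, right]."""
    if not 1 <= len(lines) <= 3:
        raise ValueError("a header takes between one and three lines")
    header = {"lines": [list(line) for line in lines]}
    if font_family is not None:
        header["fontFamily"] = font_family
    if font_size_pt is not None:
        # fontSize is a string such as "12pt", not a bare number.
        header["fontSize"] = "%dpt" % font_size_pt
    if color is not None:
        header["color"] = color
    return header


header = build_header(
    [["Acme Corp", "Quarterly Report", "Confidential"]],
    font_family="Courier",
    font_size_pt=12,
    color="#FF0000",
)
```

The resulting dictionary would be placed under input.dest.header in the request body.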

input.dest.footer

Name Description Details
input.dest.footer.lines

This is a multi-dimensional array that allows you to easily position text in a particular line and column. The first string in any inner array will always be placed on the left (left-justified), the second string placed in the center (center-justified), and the third string placed on the right (right-justified). The number of items in the outer array defines the total number of text lines. You may provide between one and three lines of text for a footer.

array, optional
input.dest.footer.fontFamily

Specifies the name of the font that is used for the footer (e.g. "fontFamily": "Courier"). The font name provided must be present on the server to be applied.

string, optional
input.dest.footer.fontSize

Specifies the size of the font, in points. If provided, the value must be a string with a number followed by "pt" (e.g. "12pt").

string, optional
input.dest.footer.color

Specifies the color of the text. Valid values are any valid CSS HEX value (e.g. "#FF0000").

string, optional

Currently, the input.dest.footer property is only supported when converting all pages of a single document to either "pdf" or "tiff", and forceOneFilePerPage is false.

Text may overlap other text and/or overflow the page bounds. The caller specifies the text position and size, and the product simply renders the text. For example, if the font size is too big, text on the left may overlap text in the center, or if the text is so long it can't fit on the page width, it may overflow the page bounds.

For input.dest.footer code examples refer to Conversion Input Examples.

minSecondsAvailable

Specifies the minimum number of seconds during which you can continue to GET the status of this conversion operation after the initial POST has been submitted. The default value is 60, ensuring that you have at least 60 seconds to get the result status of any conversion operation.
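
The text above does not spell out where minSecondsAvailable belongs in the request. The sketch below assumes it is a top-level property of the request body, alongside input; both that placement and the value 120 are assumptions to verify against your PrizmDoc version.

```python
import json

body = {
    "input": {
        "sources": [{"fileId": "ek5Zb123oYHSUEVx1bUrVQ"}],
        "dest": {"format": "pdf"},
    },
    # ASSUMPTION: top-level sibling of "input"; 120 is an arbitrary example.
    "minSecondsAvailable": 120,
}

payload = json.dumps(body)
```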

Response Body

A successful response will return JSON which contains:

  1. The input object submitted in the request, normalized to include default values.
  2. Information about the status of the conversion.

Here is an example:

200 OK
Content-Type: application/json
{
    "input": {
        "sources": [
            { 
                "fileId": "ek5Zb123oYHSUEVx1bUrVQ",
                "pages": ""
            }
        ],
        "dest": {
            "format": "pdf",
            "pdfOptions": {
                "forceOneFilePerPage": false
            }
        }
    },
    "expirationDateTime": "2015-12-17T20:38:39.796Z",
    "processId": "ElkNzWtrUJp4rXI5YnLUgw",
    "state": "processing",
    "percentComplete": 0
}

Conversion Status Details

Name Description Details
processId The id of the contentConverter resource which represents the file conversion operation. string
expirationDateTime The date and time (in ISO 8601 Extended Format) when the contentConverter resource will be deleted.

string

Example: "2015-12-17T20:38:39.796Z"

state

The current state of the conversion process, which will be one of the following:

  • "processing" - The conversion is still in progress.
  • "complete" - The conversion has completed successfully.
  • "error" - The conversion failed due to a problem.

For the initial POST, this value will almost always be "processing". Results are typically only available with a subsequent GET.

string
percentComplete An integer from 0 to 100 that indicates what percentage of the conversion is complete.

integer

Example: 0

errorCode An error code string if a problem occurred during the conversion process.

string

Example: "InvalidInput"

affinityToken Affinity token echoed from request header. This value will only be present if PrizmDoc is running in cluster mode.

string

Example: "rcqmuB9pAa8+4V7fhO1SXzawy/YMQU1g8lLdNDe5l7w="
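
A client might read these status fields as in the sketch below. The function name parse_status and the snake_case keys it returns are illustrative choices, and the timestamp parsing assumes the value always carries fractional seconds and a trailing Z, as in the example above.

```python
import json
from datetime import datetime, timezone

response_text = """{
    "expirationDateTime": "2015-12-17T20:38:39.796Z",
    "processId": "ElkNzWtrUJp4rXI5YnLUgw",
    "state": "processing",
    "percentComplete": 0
}"""


def parse_status(text):
    """Extract the conversion status fields from a response body."""
    data = json.loads(text)
    expires = datetime.strptime(
        data["expirationDateTime"], "%Y-%m-%dT%H:%M:%S.%fZ"
    ).replace(tzinfo=timezone.utc)
    return {
        "process_id": data["processId"],
        "state": data["state"],
        "percent_complete": data["percentComplete"],
        "expires": expires,
        # These two are absent unless an error occurred or cluster mode is on.
        "error_code": data.get("errorCode"),
        "affinity_token": data.get("affinityToken"),
    }


status = parse_status(response_text)
```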

HTTP Status Codes and Response JSON Error Codes

HTTP Status "state" in response JSON body "errorCode" in response JSON body Description
200 processing N/A The contentConverter was created and the conversion process was started.
400 error CouldNotReadRequestData Could not read request data.
405 N/A N/A POST HTTP method was not used.
480 error InvalidJson JSON error details are in errorDetails.
480 N/A InvalidDimensionValue Invalid dimension value specified for rasterization. See details in errorDetails.
480 N/A InvalidInput Invalid input. Invalid request data is referenced in the errorDetails.
480 N/A InvalidPageSyntax Invalid page specification. See errorDetails.
480 N/A ForceOneFilePerPageNotSupportedWhenUsingHeaderOrFooter forceOneFilePerPage mode is not supported when using header or footer options. Supported forceOneFilePerPage option is referenced in errorDetails.
480 N/A MaxWidthOrMaxHeightMustBeSpecifiedWhenRasterizingCadInput Max width or max height must be specified when rasterizing CAD input. See errorDetails.
480 N/A MissingInput Missing input. See errorDetails.
480 N/A MultipleSourcesAreNotSupportedForThisDestinationFormat Multiple source files or pages are not supported for this destination format.
480 N/A MultipleSourceDocumentsNotSupportedWhenUsingHeaderOrFooter Multiple source documents are not supported when using header or footer.
480 N/A PagesPropertyNotSupportedWhenUsingHeaderOrFooter Pages property is not supported for conversion with header or footer. The property is referenced in errorDetails.
480 N/A UnrecognizedExpression Unrecognized expression. See errorDetails.
480 N/A UnsupportedConversion Unsupported conversion. See errorDetails.
480 N/A UnsupportedDestinationFormatWhenUsingHeaderOrFooter Unsupported destination format when using header or footer. Supported destination formats are listed in errorDetails.
480 N/A UnsupportedSourceFileFormat Unsupported source file format. Unsupported file is referenced in errorDetails.
480 N/A UnsupportedSourceFileFormatForOCR Unsupported source file format for OCR. Unsupported file is referenced in errorDetails.
480 N/A WorkFileDoesNotExist Specified work file does not exist.
480 N/A FeatureNotLicensed The server's license does not allow the use of the requested feature. The unlicensed feature will be referenced by the errorDetails object in the response.
480 N/A LicenseCouldNotBeVerified The server's license could not be verified. Make sure your license has been correctly installed.
580 N/A InternalServiceError Internal service error. This error can be returned for a number of different reasons. Please contact support.
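
One way a client could act on the table above is to treat any status other than 200 as a failed POST and surface the errorCode and errorDetails from the body. The exception class and function names below are hypothetical, and the sketch assumes the body has already been decoded into a dictionary (or is None when no JSON body was returned).

```python
class ConversionRequestError(Exception):
    """Raised when the POST did not create a contentConverter."""

    def __init__(self, http_status, error_code, error_details):
        super().__init__("HTTP %s: %s" % (http_status, error_code))
        self.http_status = http_status
        self.error_code = error_code
        self.error_details = error_details


def check_post_response(http_status, body):
    """Return the body on 200; otherwise raise with the reported error."""
    if http_status == 200:
        return body
    body = body or {}
    raise ConversionRequestError(
        http_status, body.get("errorCode"), body.get("errorDetails")
    )


try:
    check_post_response(480, {"errorCode": "InvalidInput"})
except ConversionRequestError as error:
    print(error)  # HTTP 480: InvalidInput
```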

GET /v2/contentConverters/{processId}

Gets the status of a content conversion operation and its final output if available.

In general, the response JSON will contain:

  1. The input object submitted in the POST request, normalized to include default values.
  2. Information about the status of the conversion.
  3. Information about the output of the conversion, if available.

Requests can be sent to this URL repeatedly while the state is "processing".

When the state is "complete", the output section will list the WorkFile id of each output file, and the files themselves can be downloaded using the WorkFile API.
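
The repeated GET described above amounts to a polling loop. The sketch below keeps the HTTP call out of the loop by accepting a fetch_status callable, so the names wait_for_conversion, fetch_status, poll_seconds, and max_polls are all illustrative assumptions; the canned responses stand in for real GET results.

```python
import time


def wait_for_conversion(fetch_status, poll_seconds=1.0, max_polls=60,
                        sleep=time.sleep):
    """Call fetch_status() until the state is no longer "processing"."""
    for attempt in range(max_polls):
        if attempt:
            sleep(poll_seconds)  # pause between polls, not before the first
        status = fetch_status()
        if status["state"] != "processing":
            return status  # either "complete" or "error"
    raise TimeoutError("still processing after %d polls" % max_polls)


# Canned responses in place of real GET /v2/contentConverters/{processId} calls.
canned = iter([
    {"state": "processing", "percentComplete": 0},
    {"state": "processing", "percentComplete": 82},
    {"state": "complete", "percentComplete": 100},
])
final = wait_for_conversion(lambda: next(canned), sleep=lambda seconds: None)
print(final["state"])  # complete
```

Keep the total polling time within the lifetime of the contentConverter resource, which is reported in expirationDateTime.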

Parameters

Name Description Details
processId The processId for a particular contentConverter. This processId was returned in the response for the initial POST. string, required

Request Headers

Name Value Details
Accusoft-Affinity-Token

Affinity token returned in the POST response body for the content converter specified by the processId parameter.

Example: "rcqmuB9pAa8+4V7fhO1SXzawy/YMQU1g8lLdNDe5l7w="

Only used if PrizmDoc is running in cluster mode.
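
A status request with the optional affinity header could be built as in the sketch below. The helper name build_status_request and the base_url default are assumptions, and percent-encoding the processId is a defensive choice of this sketch rather than documented behavior. The request is constructed but not sent.

```python
import urllib.parse
import urllib.request


def build_status_request(process_id, affinity_token=None,
                         base_url="http://localhost:18681"):
    """Build (but do not send) a GET /v2/contentConverters/{processId} request."""
    headers = {}
    if affinity_token is not None:
        # Only needed when PrizmDoc is running in cluster mode.
        headers["Accusoft-Affinity-Token"] = affinity_token
    # ASSUMPTION: the id is percent-encoded in case it contains "/" or "+".
    path_id = urllib.parse.quote(process_id, safe="")
    return urllib.request.Request(
        url=base_url + "/v2/contentConverters/" + path_id,
        headers=headers,
        method="GET",
    )


request = build_status_request(
    "ElkNzWtrUJp4rXI5YnLUgw",
    affinity_token="rcqmuB9pAa8+4V7fhO1SXzawy/YMQU1g8lLdNDe5l7w=",
)
```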

Response Body

While processing, the response will return JSON with only the processing details. For example:

200 OK
Content-Type: application/json
{
    "input": {
        "sources": [
            { 
                "fileId": "ek5Zb123oYHSUEVx1bUrVQ",
                "pages": ""
            }
        ],
        "dest": {
            "format": "pdf",
            "pdfOptions": {
                "forceOneFilePerPage": false
            }
        }
    },
    "expirationDateTime": "2015-12-17T20:38:39.796Z",
    "processId": "ElkNzWtrUJp4rXI5YnLUgw",
    "state": "processing",
    "percentComplete": 82
}

Once the processing has completed, the response will return JSON showing the WorkFile id of the output file or files.

If the output format supports multiple pages (e.g. PDF or TIFF), then by default (forceOneFilePerPage is false) only a single output file will be created. For example:

200 OK
Content-Type: application/json
{
    "input": {
        "sources": [
            { 
                "fileId": "ek5Zb123oYHSUEVx1bUrVQ",
                "pages": ""
            }
        ],
        "dest": {
            "format": "pdf",
            "pdfOptions": {
                "forceOneFilePerPage": false
            }
        }
    },
    "expirationDateTime": "2015-12-17T20:38:39.796Z",
    "processId": "ElkNzWtrUJp4rXI5YnLUgw",
    "state": "complete",
    "percentComplete": 100,
    "output": {
        "results": [
            {
                "fileId": "KOrSwaqsguevJ97BdmUbXi",
                "sources": [{ "fileId": "ek5Zb123oYHSUEVx1bUrVQ", "pages": "1-3" }],
                "pageCount": 3
            }
        ]
    }
}

If the output format does not support multiple pages (e.g. JPEG), then multiple output files will be created. For example:

200 OK
Content-Type: application/json
{
    "input": {
        "sources": [
            { 
                "fileId": "ek5Zb123oYHSUEVx1bUrVQ"
            }
        ],
        "dest": {
            "format": "jpeg"
        }
    },
    "expirationDateTime": "2015-12-17T20:38:39.796Z",
    "processId": "ElkNzWtrUJp4rXI5YnLUgw",
    "state": "complete",
    "percentComplete": 100,
    "output": {
        "results": [
            {
                "fileId": "N6uDE11Ed6+JQPy0POu+8A",
                "sources": [{ "fileId": "ek5Zb123oYHSUEVx1bUrVQ", "pages": "1" }],
                "pageCount": 1
            },
            {
                "fileId": "+4b6QW90Fb9yjDak+ALFEg",
                "sources": [{ "fileId": "ek5Zb123oYHSUEVx1bUrVQ", "pages": "2" }],
                "pageCount": 1
            },
            {
                "fileId": "Lx/4z8AyJKV5eMjWKsBm5w",
                "sources": [{ "fileId": "ek5Zb123oYHSUEVx1bUrVQ", "pages": "3" }],
                "pageCount": 1
            }
        ]
    }
}

Conversion Status Details

Name Description Details
processId The id of the contentConverter resource which represents the file conversion operation. string
expirationDateTime The date and time (in ISO 8601 Extended Format) when the contentConverter resource will be deleted.

string

Example: "2015-12-17T20:38:39.796Z"

state

The current state of the conversion process, which will be one of the following:

  • "processing" - The conversion is still in progress.
  • "complete" - The conversion has completed successfully.
  • "error" - The conversion failed due to a problem.
string
percentComplete An integer from 0 to 100 that indicates what percentage of the conversion is complete.

integer

Example: 0

errorCode An error code string if a problem occurred during the conversion process.

string

Example: "CouldNotConvertFile"

affinityToken Affinity token echoed from request header. This value will only be present if PrizmDoc is running in cluster mode.

string

Example: "rcqmuB9pAa8+4V7fhO1SXzawy/YMQU1g8lLdNDe5l7w="

Conversion Output Details

Name Description Details
output.results An array of objects, one for each output file created. array
output.results[n].fileId The WorkFile id for an output file. Use this id to download the output file using the WorkFile API. string
output.results[n].pageCount The total number of pages in the output file. integer
output.results[n].sources An array of objects, one for each source file which contributed to this output file. array
output.results[n].sources[n].fileId The WorkFile id of the source input file. string
output.results[n].sources[n].pages

The page or pages used from the source file.

This will be a string value using one-based indexing. For example, if the output file represents page 2 of the source document, pages would have a value of "2". If the output file represents all 20 pages of a source document, pages would have a value of "1-20".

string

Examples: "1-3" or "2"

output.results[n].src (deprecated) An array with a single object which corresponds to input.src. This will only appear in the output if you used the deprecated input.src property instead of the new input.sources in the original POST request. array
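
To show how the output section might be consumed, the sketch below flattens output.results into one tuple per contributing source. The function name list_output_files is hypothetical, and the sample status is trimmed from the JPEG response shown earlier.

```python
def list_output_files(status):
    """Return (output fileId, source fileId, source pages) tuples."""
    rows = []
    for result in status.get("output", {}).get("results", []):
        for source in result.get("sources", []):
            rows.append((result["fileId"], source["fileId"], source["pages"]))
    return rows


status = {
    "state": "complete",
    "output": {
        "results": [
            {
                "fileId": "N6uDE11Ed6+JQPy0POu+8A",
                "sources": [{"fileId": "ek5Zb123oYHSUEVx1bUrVQ", "pages": "1"}],
                "pageCount": 1,
            },
            {
                "fileId": "+4b6QW90Fb9yjDak+ALFEg",
                "sources": [{"fileId": "ek5Zb123oYHSUEVx1bUrVQ", "pages": "2"}],
                "pageCount": 1,
            },
        ]
    },
}

for output_id, source_id, pages in list_output_files(status):
    print(output_id, "<-", source_id, "pages", pages)
```

Each output fileId can then be passed to the WorkFile API to download the converted file.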

HTTP Status Codes and Response JSON Error Codes

HTTP status "state" in response JSON body "errorCode" in response JSON body Additional "errorCode" location in response JSON body Additional "errorCode" in response JSON body Description
200 processing N/A N/A N/A The contentConverter was created and the conversion process was started.
200 complete N/A N/A N/A The conversion process was completed.
200 complete N/A output.results[n].errorCode NoSuchPage No such page. Problem fileId and page number are listed in output.results[n].sources[0].fileId, output.results[n].sources[0].page
200 error CouldNotConvert output.results[n].errorCode CouldNotConvertFile Could not convert file. Problem fileId is listed in output.results[n].sources[0].fileId
200 error CouldNotConvert output.results[n].errorCode CouldNotConvertPage Could not convert page. Problem fileId and page number are listed in output.results[n].sources[0].fileId, output.results[n].sources[0].page
200 error CouldNotConvert output.results[n].errorCode InvalidPassword Password is incorrect or missing. Problem fileId, page number and password if it was passed are listed in output.results[n].sources[0].fileId, output.results[n].sources[0].page, output.results[n].sources[0].password
200 error CouldNotConvert output.results[0].errorCode RequestedHeaderOrFooterFontIsNotAvailable Requested header or footer font is not available. Name of the font which is not available is listed in input.dest.header.fontFamily or input.dest.footer.fontFamily
200 error CouldNotConvertAllFilesOrPages output.results[n].errorCode One or more occurrences of either of the following codes: CouldNotConvertFile, CouldNotConvertPage, NoSuchPage, InvalidPassword

Could not convert all files or pages.

For CouldNotConvertFile error, problem fileId is listed in output.results[n].sources[0].fileId.

For CouldNotConvertPage, InvalidPassword and NoSuchPage errors, problem fileId and pageNumber are listed in output.results[n].sources[0].fileId, output.results[n].sources[0].page

404 N/A ContentConverterDoesNotExist N/A N/A Content converter does not exist. Invalid processId was specified in the request.
405 N/A N/A N/A N/A GET HTTP method was not used.
580 N/A InternalServiceError N/A N/A Internal service error. This error can be returned for a number of different reasons. Please contact support.
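
Since a 200 response can still carry per-result errors, a client may want to scan output.results for errorCode values. The sketch below does that; the function name collect_result_errors is hypothetical, and the sample response is a made-up illustration of the shape described in the table, not captured server output. Its sources entries carry only fileId.

```python
def collect_result_errors(status):
    """Return (result index, errorCode, sources) for each failed result."""
    errors = []
    results = status.get("output", {}).get("results", [])
    for index, result in enumerate(results):
        code = result.get("errorCode")
        if code:
            errors.append((index, code, result.get("sources", [])))
    return errors


# Hypothetical response: one page converted, one requested page missing.
sample = {
    "state": "error",
    "errorCode": "CouldNotConvertAllFilesOrPages",
    "output": {
        "results": [
            {"fileId": "KOrSwaqsguevJ97BdmUbXi",
             "sources": [{"fileId": "ek5Zb123oYHSUEVx1bUrVQ"}]},
            {"errorCode": "NoSuchPage",
             "sources": [{"fileId": "ek5Zb123oYHSUEVx1bUrVQ"}]},
        ]
    },
}

for index, code, sources in collect_result_errors(sample):
    print(index, code, sources[0]["fileId"])
```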

Appendix

Supported Input File Formats

For a complete list of image and document source types supported by CCS, please refer to: File Formats Reference.

Supported Output File Formats

Note that when a source PDF is converted to PDF, all JavaScript is removed during source file processing, so the output PDF file(s) contain no JavaScript.