PrizmDoc® v14.1 Release - Updated
    Convert Content with Content Conversion Service

    This section describes how to use the PrizmDoc Server content converters REST API and provides examples of the kinds of operations you can perform with it.

    For application development in .NET, we recommend using the PrizmDoc Server .NET SDK instead of using the PrizmDoc Server REST API directly. See the .NET SDK How to Guides for examples of how to perform file conversion, document merging, and more with the .NET SDK.

    The following steps walk you through using the PrizmDoc Server content converters REST API:

    Step 1: Upload Your Source Document

    • Upload the source document that you want to convert.
    • This can be a document of any format supported by the PrizmDoc RESTful Web Services.
    • In response to this request you will receive a file ID that is used to reference the source document in later requests.

    Example

    POST http://192.168.0.1:18681/PCCIS/V1/WorkFile?FileExtension=doc
    Content-Type: application/octet-stream
    [binary data]
    
    200 OK
    Content-Type: application/json
    {
        "fileId": "5qTYa3gzN9gYUb5SzqUhqg"
    }
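
    The upload request above can be sketched in Python using only the standard library. This is a minimal illustration, not an official client: the helper names (build_upload_request, upload_document) are hypothetical, and the host and port are copied from the example and will differ on your server.

```python
import json
import urllib.request

# Assumption: address copied from the example above; replace with your server's.
BASE_URL = "http://192.168.0.1:18681"


def build_upload_request(data: bytes, file_extension: str) -> urllib.request.Request:
    """Build the POST request that uploads a source document as a work file."""
    url = f"{BASE_URL}/PCCIS/V1/WorkFile?FileExtension={file_extension}"
    return urllib.request.Request(
        url,
        data=data,
        headers={"Content-Type": "application/octet-stream"},
        method="POST",
    )


def upload_document(path: str, file_extension: str) -> str:
    """Upload the file at `path` and return the fileId from the JSON response."""
    with open(path, "rb") as source:
        request = build_upload_request(source.read(), file_extension)
    with urllib.request.urlopen(request) as response:
        return json.load(response)["fileId"]
```

    Building the request separately from sending it makes the URL and headers easy to inspect before any network call is made.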
    
    

    Step 2: Start the Content Conversion Process

    • Using the file ID you obtained for the source document in Step 1, send a POST request to start the conversion. The request starts a process that runs asynchronously on the PrizmDoc Server to produce the converted document(s).
    • In the POST request, specify the output format to convert the source document to. This format may be SVG, JPEG, PNG, TIFF, or PDF.
    • For raster output formats (JPEG, TIFF, PNG), you may optionally specify a maxWidth and/or a maxHeight. If either attribute is present, the output document is scaled to fit it as closely as possible while maintaining the original aspect ratio. If the source document is in a vector format, at least one of maxWidth and maxHeight must be specified.
    • If the output format is PDF or TIFF, you may specify whether to convert each page of the source document to a separate output file or to convert all pages to a single document. This attribute is optional; by default, all pages are converted to a single document.
    • If the input format is PDF, MS Word, MS Excel, MS PowerPoint, or OpenDocument, you may optionally specify a password.

    Example

    POST http://192.168.0.1:18681/v2/contentConverters
    Content-Type: application/json
    {
        "input": {
            "sources": [
                {
                    "fileId": "5qTYa3gzN9gYUb5SzqUhqg"
                }
            ],
            "dest": {
                "format": "pdf",
                "pdfOptions": {
                    "forceOneFilePerPage": true
                }
            }
        }
    }
    
    200 OK
    Content-Type: application/json
    {
        "processId": "bQpcuixhvGmNqn5ElskO6Q",
        "expirationDateTime": "2014-12-03T18:30:49.460Z",
        "input": {
            "sources": [
                {
                    "fileId": "5qTYa3gzN9gYUb5SzqUhqg",
                    "pages": ""
                }
            ],
            "dest": {
                "format": "pdf",
                "pdfOptions": {
                    "forceOneFilePerPage": true
                }
            }
        },
        "state": "processing",
        "percentComplete": 0
    }
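
    The request body for this step can be assembled programmatically. The sketch below is one possible approach, assuming a single source document; the function names are hypothetical, and the "<format>Options" key pattern is inferred from the pdfOptions, pngOptions, and tiffOptions examples in this topic rather than from a formal schema.

```python
import json
import urllib.request

# Assumption: address copied from the example above; replace with your server's.
BASE_URL = "http://192.168.0.1:18681"


def build_conversion_body(file_id, dest_format, options=None, password=None):
    """Return the JSON-serializable body for POST /v2/contentConverters."""
    source = {"fileId": file_id}
    if password is not None:
        source["password"] = password
    dest = {"format": dest_format}
    if options:
        # Assumption: options live under "<format>Options", e.g. "pdfOptions".
        dest[f"{dest_format}Options"] = options
    return {"input": {"sources": [source], "dest": dest}}


def start_conversion(body):
    """Send the body to the server and return the processId of the new process."""
    request = urllib.request.Request(
        f"{BASE_URL}/v2/contentConverters",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["processId"]
```

    For example, build_conversion_body("5qTYa3gzN9gYUb5SzqUhqg", "pdf", {"forceOneFilePerPage": True}) produces the same body as the request shown above.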
    
    

    Step 3: Check Status of the ContentConverter Resource

    • The process that generates the converted document(s) runs asynchronously on the PrizmDoc Server. The POST request you sent in Step 2 returns immediately, before the output is ready, so you need to check the status of the process by sending a GET request to the resource you just created.
    • The JSON response includes a state property. When state is "complete", the response also includes an output property and you can proceed to the next step.
    • See the Content Converter API for more details about this request.

    Example

    GET http://192.168.0.1:18681/v2/contentConverters/bQpcuixhvGmNqn5ElskO6Q
    
    200 OK
    Content-Type: application/json
    {
        "processId": "bQpcuixhvGmNqn5ElskO6Q",
        "expirationDateTime": "2014-12-03T18:30:49.460Z",
        "input": {
            "sources": [
                {
                    "fileId": "5qTYa3gzN9gYUb5SzqUhqg",
                    "pages": ""
                }
            ],
            "dest": {
                "format": "pdf",
                "pdfOptions": {
                    "forceOneFilePerPage": true
                }
            }
        },
        "state": "complete",
        "percentComplete": 100,
        "output": {
            "results": [
                {
                    "fileId": "ek5Zb123oYHSUEVx1bUrVQ",
                    "sources": [
                        {
                            "fileId": "5qTYa3gzN9gYUb5SzqUhqg",
                            "pages": "1"
                        }
                    ],
                    "pageCount": 1
                },
                {
                    "fileId": "KOrSwaqsguevJ97BdmUbXi",
                    "sources": [
                        {
                            "fileId": "5qTYa3gzN9gYUb5SzqUhqg",
                            "pages": "2"
                        }
                    ],
                    "pageCount": 1
                },
                {
                    "fileId": "o349chskqw93kwaqsgfevJ",
                    "sources": [
                        {
                            "fileId": "5qTYa3gzN9gYUb5SzqUhqg",
                            "pages": "3"
                        }
                    ],
                    "pageCount": 1
                }
            ]
        }
    }
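
    Checking the status repeatedly amounts to a polling loop. The following is a rough sketch under stated assumptions: the function names, polling interval, and attempt limit are hypothetical, and treating any state other than "processing" or "complete" as a failure is an assumption, since only those two states appear in this topic.

```python
import json
import time
import urllib.request

# Assumption: address copied from the example above; replace with your server's.
BASE_URL = "http://192.168.0.1:18681"


def fetch_status(process_id):
    """GET the ContentConverter resource and return the parsed JSON."""
    url = f"{BASE_URL}/v2/contentConverters/{process_id}"
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def wait_for_conversion(process_id, fetch=fetch_status, interval=1.0, max_attempts=60):
    """Poll until state is "complete" and return the final status JSON."""
    for _ in range(max_attempts):
        status = fetch(process_id)
        if status["state"] == "complete":
            return status
        if status["state"] != "processing":
            # Assumption: any other state means the conversion did not succeed.
            raise RuntimeError(f"Conversion ended in state {status['state']!r}")
        time.sleep(interval)
    raise TimeoutError(f"Conversion {process_id} did not complete in time")
```

    Passing the fetch function as a parameter lets the loop be exercised with canned responses, without a running server.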
    
    

    Step 4: Download the Converted Document(s)

    • Once the content conversion process completes successfully, the new, converted document(s) are available for download.
    • A work file ID is made available for each successfully converted document in the output property from the JSON response retrieved in Step 3.
    • See the Work File API for more details about downloading work files.

    Example

    GET http://192.168.0.1:18681/PCCIS/V1/WorkFile/ek5Zb123oYHSUEVx1bUrVQ
    
    200 OK
    Content-Type: application/pdf
    [binary data]
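
    Collecting the output file IDs from the Step 3 response and downloading each one might look like the sketch below. The helper names and the local file naming scheme ("<fileId>.<extension>") are hypothetical choices made for illustration.

```python
import urllib.request

# Assumption: address copied from the example above; replace with your server's.
BASE_URL = "http://192.168.0.1:18681"


def output_file_ids(status):
    """Return the work file ID of every result in a completed status response."""
    return [result["fileId"] for result in status["output"]["results"]]


def work_file_url(file_id):
    """Return the download URL for a work file."""
    return f"{BASE_URL}/PCCIS/V1/WorkFile/{file_id}"


def download_results(status, extension):
    """Save each converted document locally and return the list of paths."""
    paths = []
    for file_id in output_file_ids(status):
        path = f"{file_id}.{extension}"
        with urllib.request.urlopen(work_file_url(file_id)) as response:
            with open(path, "wb") as target:
                target.write(response.read())
        paths.append(path)
    return paths
```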
    
    

    Conversion Input Examples

    Below are example JSON strings that can be used as input in Step 2 above to create various ContentConverter processes.

    Multipage Word Document to Multipage PDF

    This example will convert all pages of a Word document to a single PDF document:

    "input": {
        "sources": [
            {
                "fileId": "5qTYa3gzN9gYUb5SzqUhqg"
            }
        ],
        "dest": {
            "format": "pdf",
            "pdfOptions": {
                "forceOneFilePerPage": false
            }
        }
    }
    
    

    Multipage Password Protected Word Document to Multipage PDF

    This example will convert all pages of a password-protected Word document to a single PDF document:

    "input": {
        "sources": [
            {
                "fileId": "5qTYa3gzN9gYUb5SzqUhqg",
                "password": "secret"
            }
        ],
        "dest": {
            "format": "pdf",
            "pdfOptions": {
                "forceOneFilePerPage": false
            }
        }
    }
    
    

    Single-page Word Document to Scaled PNG

    This example will convert a single-page Word document to a PNG image, scaled to a width of 800 pixels. The height adjusts automatically to maintain the aspect ratio:

    "input": {
        "sources": [
            {
                "fileId": "5qTYa3gzN9gYUb5SzqUhqg"
            }
        ],
        "dest": {
            "format": "png",
            "pngOptions": {
                "maxWidth": "800px"
            }
        }
    }
    
    

    Multipage Word Document to Multiple PNG Images

    This example will convert a multipage Word document to multiple single-page PNG images. Because PNG is not a multipage format, each page of the Word document is converted to a separate PNG:

    "input": {
        "sources": [
            {
                "fileId": "5qTYa3gzN9gYUb5SzqUhqg"
            }
        ],
        "dest": {
            "format": "png"
        }
    }
    
    

    JPEG to PNG

    This example will convert a JPEG image to a PNG image, scaled to fit within a width of 800 pixels and a height of 600 pixels. The output PNG will be as large as it can be while maintaining the aspect ratio and remaining within these bounds:

    "input": {
        "sources": [
            {
                "fileId": "5qTYa3gzN9gYUb5SzqUhqg"
            }
        ],
        "dest": {
            "format": "png",
            "pngOptions": {
                "maxWidth": "800px",
                "maxHeight": "600px"
            }
        }
    }
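
    The arithmetic behind fitting an image within both bounds while keeping its aspect ratio can be illustrated locally. This is only a sketch of the general technique: the function name is hypothetical, and the rounding rule is an assumption, so the server's exact output dimensions may differ by a pixel.

```python
def fit_within(width, height, max_width, max_height):
    """Scale (width, height) to the largest size that fits inside the bounds."""
    # The smaller of the two ratios is the one that keeps both sides in bounds.
    scale = min(max_width / width, max_height / height)
    # Assumption: dimensions are rounded to the nearest whole pixel.
    return round(width * scale), round(height * scale)


# A 1600x900 source is limited by width: both sides are halved.
print(fit_within(1600, 900, 800, 600))   # (800, 450)
# A 600x1200 source is limited by height instead.
print(fit_within(600, 1200, 800, 600))   # (300, 600)
```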
    
    

    JPEG to Indexed TIFF

    This example will convert a JPEG image to a TIFF image with indexed color at 8 bits per pixel (256 colors). The output TIFF will have some minor quality loss, but it is usually several times smaller than a TIFF created without the color.mode: "indexed" option:

    "input": {
        "sources": [
            {
                "fileId": "5qTYa3gzN9gYUb5SzqUhqg"
            }
        ],
        "dest": {
            "format": "tiff",
            "tiffOptions": {
                "color": {
                    "mode": "indexed"
                }
            }
        }
    }
    
    

    PDF to Bitonal TIFF

    This example will convert a PDF to a bitonal (black and white, 1 bit per pixel) TIFF using Group 4 compression:

    "input": {
        "sources": [
            {
                "fileId": "5qTYa3gzN9gYUb5SzqUhqg"
            }
        ],
        "dest": {
            "format": "tiff",
            "tiffOptions": {
                "compression": {
                    "type": "group4"
                },
                "color": {
                    "mode": "bitonal"
                }
            }
        }
    }
    
    

    Single-page Document to TIFF with Specific Resolution

    This example will convert a single-page document to a TIFF with default compression and color mode at a specific resolution (in dots per inch):

    "input": {
        "sources": [
            {
                "fileId": "5qTYa3gzN9gYUb5SzqUhqg"
            }
        ],
        "dest": {
            "format": "tiff",
            "tiffOptions": {
                "compression": {
                    "type": "auto"
                },
                "color": {
                    "mode": "auto"
                },
                "dpi": 300
            }
        }
    }
    
    

    Specific Pages of Two Multipage Documents to Multipage TIFF

    This example will merge the first page of a Word document with the second and third pages of a PDF document and convert them to a single TIFF document:

    "input": {
        "sources": [
            {
                "fileId": "drxx_2sNVu9VIZTS4VH2Dg",
                "pages": "1"
            },
            {
                "fileId": "qkMQmjk6CxSzt5UEY-UdFQ",
                "pages": "2-3"
            }
        ],
        "dest": {
            "format": "tiff"
        }
    }
    
    

    All Pages of Three Multipage Documents Including Password Protected Document to Multipage PDF

    This example will merge all pages of a PDF document, all pages of a password-protected Word document, and all pages of a TIFF document, in that order, and convert them to a single PDF document:

    "input": {
        "sources": [
            {
                "fileId": "TP4TX_SxCNF86suTfHHFSw"
            },
            {
                "fileId": "oJo8CWXAgFJ0dns8UF_AzQ",
                "password": "secret"
            },
            {
                "fileId": "EYsfBhL0JbYgNk80sbnxEg"
            }
        ],
        "dest": {
            "format": "pdf"
        }
    }
    
    

    Positioning and Text Justification within Header and Footer

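    The lines property is a two-dimensional array that controls the positioning and text justification of header or footer content. Each inner array holds the text for one line, and the position of a string within the inner array determines where it is placed on the page.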

    To put an address in the top left of every page, you can use a header with lines like this:

    "input": {
        "sources": [
            {
                "fileId": "EYsfBhL0JbYgNk80sbnxEg"
            }
        ],
        "dest": {
            "format": "pdf",
            "header": {
                "lines": [
                     [ "Accusoft", "", "" ],
                     [ "4001 N Riverside Dr", "", "" ],
                     [ "Tampa, FL 33603", "", "" ]
                ],
                "fontFamily": "Courier",
                "fontSize": "12pt",
                "color": "#F57B20"
            }
        }
    }
    
    

    Text placed in the center position of the inner array is centered on the page. For example, to print CONFIDENTIAL centered at the bottom of every page, you can define a footer with lines like this:

    "input": {
        "sources": [
            {
                "fileId": "EYsfBhL0JbYgNk80sbnxEg"
            }
        ],
        "dest": {
            "format": "pdf",
            "footer": {
                "lines": [
                    [ "", "CONFIDENTIAL", "" ]
                ],
                "fontFamily": "Courier",
                "fontSize": "12pt",
                "color": "#F57B20"
            }
        }
    }
    
    

    Use the following example to apply both a header and a footer in a single call:

    "input": {
        "sources": [
            {
                "fileId": "EYsfBhL0JbYgNk80sbnxEg"
            }
        ],
        "dest": {
            "format": "pdf",
            "header": {
                "lines": [
                     [ "Accusoft", "", "" ],
                     [ "4001 N Riverside Dr", "", "" ],
                     [ "Tampa, FL 33603", "", "" ]
                ],
                "fontFamily": "Courier",
                "fontSize": "12pt",
                "color": "#F57B20"
            },
            "footer": {
                "lines": [
                    [ "", "CONFIDENTIAL", "" ]
                ],
                "fontFamily": "Courier",
                "fontSize": "12pt",
                "color": "#F57B20"
            }
        }
    }
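
    The header and footer objects shown above follow a regular shape, so they can be generated by a small helper. The sketch below is illustrative only: the function names are hypothetical, and mapping "right" to the third slot is an assumption inferred from the three-slot layout, since this topic only demonstrates the left and center slots.

```python
# Assumption: slot order is left, center, right; only left and center
# are demonstrated in the examples above.
POSITIONS = {"left": 0, "center": 1, "right": 2}


def make_lines(texts, position):
    """Return one three-slot inner array per text, filled at the given position."""
    index = POSITIONS[position]
    lines = []
    for text in texts:
        line = ["", "", ""]
        line[index] = text
        lines.append(line)
    return lines


def make_band(texts, position, font_family="Courier", font_size="12pt", color="#F57B20"):
    """Build a header or footer object using the styling from the examples above."""
    return {
        "lines": make_lines(texts, position),
        "fontFamily": font_family,
        "fontSize": font_size,
        "color": color,
    }


dest = {
    "format": "pdf",
    "header": make_band(["Accusoft", "4001 N Riverside Dr", "Tampa, FL 33603"], "left"),
    "footer": make_band(["CONFIDENTIAL"], "center"),
}
```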
    
    

    Dynamic Page Numbering and Page Count with Optional Zero Padding within Header and Footer

    You can insert the current page number and/or total page count into header or footer text using the special syntax {{pageNumber}} or {{pageCount}}.

    For example, to produce a footer showing "Page 1 of 12" for the first page of a twelve-page document, you can define a footer with lines like this:

    "input": {
        "sources": [
            {
                "fileId": "EYsfBhL0JbYgNk80sbnxEg"
            }
        ],
        "dest": {
            "format": "pdf",
            "footer": {
                "lines": [
                    [ "",  "Page {{pageNumber}} of {{pageCount}}", "" ]
                ],
                "fontFamily": "Courier",
                "fontSize": "12pt",
                "color": "#F57B20"
            }
        }
    }
    
    

    You can optionally pad page number and total page count values with zeroes to guarantee that they fit a particular character width using the syntax {{pageNumber,n}} or {{pageCount,n}}, where n is the minimum character width. If the actual page number or page count value does not meet the minimum character width, it will be left-padded with zeroes. This can be useful for Bates numbering.

    For example, the following code would produce a header with "Jones000097" in the top left of page 97:

    "input": {
        "sources": [
            {
                "fileId": "EYsfBhL0JbYgNk80sbnxEg"
            }
        ],
        "dest": {
            "format": "pdf",
            "header": {
                "lines": [
                    [ "Jones{{pageNumber,6}}", "", "" ]
                ],
                "fontFamily": "Courier",
                "fontSize": "12pt",
                "color": "#F57B20"
            }
        }
    }
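
    To check what a header or footer string will look like before sending it, the placeholder expansion can be approximated locally. The function below is a hypothetical preview helper written for this illustration; it mimics the documented {{pageNumber}}, {{pageCount}}, and zero-padding syntax and is not the server's implementation.

```python
import re

# Matches {{pageNumber}}, {{pageCount}}, and the padded forms {{pageNumber,n}}.
_PLACEHOLDER = re.compile(r"\{\{(pageNumber|pageCount)(?:,(\d+))?\}\}")


def preview(text, page_number, page_count):
    """Expand page placeholders in `text` for one page, left-padding with zeroes."""
    def expand(match):
        value = page_number if match.group(1) == "pageNumber" else page_count
        width = int(match.group(2) or 0)
        return str(value).zfill(width)

    return _PLACEHOLDER.sub(expand, text)


print(preview("Page {{pageNumber}} of {{pageCount}}", 1, 12))  # Page 1 of 12
print(preview("Jones{{pageNumber,6}}", 97, 120))                # Jones000097
```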
    
    

    Bates Numbering Across Multiple Output Documents

    You can apply Bates numbering to multiple output documents, continuing the numbering across the documents. You can do this by calculating the count of pages in already converted documents and then passing this count as a page number offset for the next conversion. Specify the offset using the syntax {{pageNumber+c}} where c is an integer constant.

    The total number of pages in a converted document can be obtained from the output.results[n].pageCount field of the response body returned for a successfully completed conversion. Here is an example response where the page count of the converted document is 15:

    {
      "input": {
        "dest": {
          "format": "pdf"
        },
        "sources": [
          {
            "fileId": "px4x3scw_8OqzZlM24tmnQ",
            "pages": "1-15"
          }
        ]
      },
      "expirationDateTime": "2017-03-24T15:22:02.532Z",
      "processId": "kQVvYfCtmatmWzigemW8Xw",
      "state": "complete",
      "percentComplete": 100,
      "output": {
        "results": [
          {
            "fileId": "ZLa9F-Jg7M5gq1Wgx82ejg",
            "sources": [
              {
                "fileId": "px4x3scw_8OqzZlM24tmnQ",
                "pages": "1-15"
              }
            ],
            "pageCount": 15
          }
        ]
      }
    }
    
    

    See the Content Converter API for more details about this response.

    You can optionally pad the result with zeroes using the syntax {{pageNumber+c,n}}, where n is the minimum character width. If the actual page number value does not meet the minimum character width, it will be left-padded with zeroes.

    For example, if you have already converted a document containing 15 pages, and want to continue the numbering in the next conversion, using 8-digit padding, you can define a footer with lines like this:

    "input": {
        "sources": [
            {
                "fileId": "EYsfBhL0JbYgNk80sbnxEg"
            }
        ],
        "dest": {
            "format": "pdf",
            "footer": {
                "lines": [
                    [ "",  "{{pageNumber+15,8}}", "" ]
                ],
                "fontFamily": "Courier",
                "fontSize": "12pt",
                "color": "#F57B20"
            }
        }
    }
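
    Deriving the offset from an earlier conversion and turning it into a placeholder is simple bookkeeping. The sketch below assumes you have the completed status JSON from the previous conversion; the function names are hypothetical, and first_bates_number only reproduces the documented arithmetic locally.

```python
def total_page_count(status):
    """Sum pageCount over every result of a completed conversion response."""
    return sum(result["pageCount"] for result in status["output"]["results"])


def bates_placeholder(offset, width):
    """Return the placeholder that continues numbering after `offset` pages."""
    if offset == 0:
        # No earlier pages: the plain padded form is enough.
        return "{{pageNumber,%d}}" % width
    return "{{pageNumber+%d,%d}}" % (offset, width)


def first_bates_number(offset, width):
    """Preview the label expected on page 1 of the next document."""
    return str(1 + offset).zfill(width)


previous = {"output": {"results": [{"fileId": "ZLa9F-Jg7M5gq1Wgx82ejg", "pageCount": 15}]}}
offset = total_page_count(previous)
print(bates_placeholder(offset, 8))   # {{pageNumber+15,8}}
print(first_bates_number(offset, 8))  # 00000016
```

    When numbering spans more than two documents, the offset for each conversion is the running total of page counts from all earlier conversions.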
    
    

    Raster Document to a Searchable PDF

    This example will convert a raster file to a searchable PDF document. The resulting PDF document will contain the original image and the recognized text in a separate invisible layer, with each text character position matching its image counterpart. This will allow you to search, select and copy the text in the resulting PDF document.

    NOTE: To make a searchable PDF from an existing PDF document, the source PDF file should be an image-only PDF. PrizmDoc will not create a searchable file from already-existing vector content.

    "input": {
        "sources": [
            {
                "fileId": "LtrN8HwBiQOaKXvCcn9J8Q"
            }
        ],
        "dest": {
            "format": "pdf",
            "pdfOptions": {
                "ocr": {
                    "language": "english"
                }
            }
        }
    }