Interface OCRDataProviderInterface

    • Method Detail

      • getOCRDataForDocument

        ContentHandlerResult getOCRDataForDocument(ContentHandlerInput input)
                                            throws VirtualViewerAPIException

        Returns existing OCR text data for a document as JSON, either as a byte array or as a string. VirtualViewer calls this method when the document itself is retrieved. The method should not run any OCR operations, because they are slow and would significantly delay document loading in the viewer; it should return only OCR data that already exists. The JSON schema of the OCR data is available with the VirtualViewer documentation.

        Parameters:
        input - ContentHandlerInput containing the following values:
        Every row is an expected value in the ContentHandlerInput. The first column is the string key for the value. The second column is the type of the value. The third column is the detailed description of the value.
        Key | Type | Description
        KEY_DOCUMENT_ID | java.lang.String | The key representing the document. Can be retrieved with String documentId = input.getDocumentId();
        KEY_CLIENT_INSTANCE_ID | java.lang.String | Custom configurable value used to pass data from the client to the content handler. If not set, it defaults to the session ID. Can be retrieved with String clientInstanceId = input.getClientInstanceId();
        KEY_HTTP_SERVLET_REQUEST | javax.servlet.http.HttpServletRequest | The request that called this method. Can be retrieved with HttpServletRequest request = input.getHttpServletRequest();
        Returns:
        null, or ContentHandlerResult with the following values:
        Every row is an expected value in the ContentHandlerResult. The first column is the string key for the value. The second column is the type of the value. The third column is the detailed description of the value.
        Key | Type | Description
        KEY_OCR_DATA_JSON | byte[] | A byte array of the OCR JSON data. If both this value and KEY_OCR_DATA_JSON_STRING are set, the string representation is used and this value is ignored.
        KEY_OCR_DATA_JSON_STRING | java.lang.String | A JSON string representation of the OCR JSON data. If both this value and KEY_OCR_DATA_JSON are set, this value is used and the byte array representation is ignored.
        Throws:
        VirtualViewerAPIException - if the content handler throws an exception
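
        The "return existing OCR data only" rule above can be sketched as a plain cache lookup. This is a minimal illustration, not the VirtualViewer implementation: the class CachedOcrLookup, the cache directory, and the ".ocr.json" file naming are all hypothetical. It only shows the shape of a lookup that never triggers OCR and yields null when nothing has been cached.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper: finds OCR JSON that was produced ahead of time. */
final class CachedOcrLookup {
    private final Path cacheDir;

    CachedOcrLookup(Path cacheDir) {
        this.cacheDir = cacheDir;
    }

    /**
     * Returns the cached OCR JSON bytes for the document, or null if none exist.
     * This method never runs OCR, so it is cheap enough to call on document load.
     */
    byte[] findCachedOcrJson(String documentId) {
        // Assumed naming scheme: sanitized document ID plus ".ocr.json".
        String fileName = documentId.replaceAll("[^A-Za-z0-9._-]", "_") + ".ocr.json";
        Path file = cacheDir.resolve(fileName);
        if (!Files.isRegularFile(file)) {
            return null; // no existing OCR data for this document
        }
        try {
            return Files.readAllBytes(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

        In an implementation of getOCRDataForDocument, the document ID would come from input.getDocumentId(), and a non-null byte array would be returned under KEY_OCR_DATA_JSON. How the ContentHandlerResult is populated is not shown in this reference, so that step is left out of the sketch.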
      • getOCRDataOnPerformOCR

        ContentHandlerResult getOCRDataOnPerformOCR(ContentHandlerInput input)
                                             throws VirtualViewerAPIException

        Returns OCR text data for a document as JSON, either as a byte array or as a string. VirtualViewer calls this method when the user asks it to perform OCR, and only if OCR is enabled. The method may return existing OCR data: because the OCR cache and the document cache can get out of sync, returning existing data re-inserts it into the cache. If OCR is run on demand, it should be run in this method rather than in getOCRDataForDocument. The JSON schema of the OCR data is available with the VirtualViewer documentation.

        Parameters:
        input - ContentHandlerInput containing the following values:
        Every row is an expected value in the ContentHandlerInput. The first column is the string key for the value. The second column is the type of the value. The third column is the detailed description of the value.
        Key | Type | Description
        KEY_DOCUMENT_ID | java.lang.String | The key representing the document. Can be retrieved with String documentId = input.getDocumentId();
        KEY_CLIENT_INSTANCE_ID | java.lang.String | Custom configurable value used to pass data from the client to the content handler. If not set, it defaults to the session ID. Can be retrieved with String clientInstanceId = input.getClientInstanceId();
        KEY_DOCUMENT_OCR_LANGUAGE | java.lang.String | Optional three-letter language code, specified on the client, for the language to assume when performing OCR.
        KEY_HTTP_SERVLET_REQUEST | javax.servlet.http.HttpServletRequest | The request that called this method. Can be retrieved with HttpServletRequest request = input.getHttpServletRequest();
        Returns:
        null, or ContentHandlerResult with the following values:
        Every row is an expected value in the ContentHandlerResult. The first column is the string key for the value. The second column is the type of the value. The third column is the detailed description of the value.
        Key | Type | Description
        KEY_OCR_DATA_JSON | byte[] | A byte array of the OCR JSON data. If both this value and KEY_OCR_DATA_JSON_STRING are set, the string representation is used and this value is ignored.
        KEY_OCR_DATA_JSON_STRING | java.lang.String | A JSON string representation of the OCR JSON data. If both this value and KEY_OCR_DATA_JSON are set, this value is used and the byte array representation is ignored.
        Throws:
        VirtualViewerAPIException - if the content handler throws an exception
        See Also:
        getOCRDataForDocument(com.snowbound.contenthandler.ContentHandlerInput)
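
        The behavior described for getOCRDataOnPerformOCR (return existing data when available, otherwise run OCR on demand) can be sketched as a get-or-compute lookup. This is an illustration under stated assumptions, not the VirtualViewer implementation: the class OnDemandOcr, the in-memory cache, the engine callback, and the default language are all hypothetical. Keying the cache by language as well as document ID is a design choice of this sketch, not something the interface requires.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BiFunction;

/** Hypothetical helper: returns cached OCR JSON, running OCR only when needed. */
final class OnDemandOcr {
    // Cache of OCR JSON strings, keyed by language and document ID.
    private final Map<String, String> ocrCache = new ConcurrentHashMap<>();
    // Assumed OCR engine callback: (documentId, languageCode) -> OCR JSON, or null.
    private final BiFunction<String, String, String> engine;
    // Language to assume when the client did not specify one.
    private final String defaultLanguage;

    OnDemandOcr(BiFunction<String, String, String> engine, String defaultLanguage) {
        this.engine = engine;
        this.defaultLanguage = defaultLanguage;
    }

    /**
     * Returns OCR JSON for the document, reusing cached data when present.
     * The language code is optional; null or empty falls back to the default.
     * Returns null if the engine produces no data.
     */
    String getOrRunOcr(String documentId, String language) {
        String lang = (language == null || language.isEmpty()) ? defaultLanguage : language;
        String cacheKey = lang + "/" + documentId;
        // The slow OCR call runs only on a cache miss.
        return ocrCache.computeIfAbsent(cacheKey, key -> engine.apply(documentId, lang));
    }
}
```

        In an implementation of getOCRDataOnPerformOCR, the returned string would be passed back under KEY_OCR_DATA_JSON_STRING, with the language taken from the optional KEY_DOCUMENT_OCR_LANGUAGE value.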