OP_PDF2D: Read a DIB from PDF
OP_PDF2D supports reading a PDF file and returning a given page's embedded image. (See the opcode-specific data structure.)
General Notes
- OP_PDF2D supports reading:
- 1-bit indexed, 8-bit indexed, and 24-bit uncompressed images,
- 1-bit CCITT G3 1-Dimensional encoded images,
- 1-bit CCITT G3 2-Dimensional encoded images,
- 1-bit CCITT G4 encoded images,
- 1-bit JBIG2 encoded images,
- 8-bit gray scale sequential JPEG encoded images,
- 24-bit sequential JPEG encoded images,
- 32-bit CMYK sequential JPEG encoded images,
- 8-bit JPEG2000 encoded images, and
- 24-bit JPEG2000 encoded images.
- The output image type is determined by the bit depth of the image being read (for example, a 1-bit compressed image results in a 1-bit DIB and an 8-bit compressed image results in an 8-bit DIB).
- All Head and ColorTable members of PIC_PARM as well as the Compression, StripSize, NumPages, and WidthPad members of the PDF_UNION will be initialized during REQ_INIT to enable Put queue allocation.
- PDF files to be read must conform to a version of the PDF specification from 1.0 through 1.5.
- ImageNumber of the PIC_PARM structure specifies the page to be read.
- Setting the PF2_SwapBW PicFlags2 flag in the PDF union will swap the color indexes of black and white in the color table and in the resulting image if the image is encoded as a 1-bit image.
- After REQ_INIT, StripSize holds the minimum Put buffer allocation that will be accepted.
- If the compression is PDFCompress_SequentialJPEG, the LumFactor, ChromFactor, and SubSampling members are set in PDF_UNION upon completion of REQ_EXEC.
- If the compression is PDFCompress_SequentialJPEG, the image is always decompressed with cross-block smoothing and full deblocking enabled.
- PF_SwapRB may be set to indicate the image data was encoded in RGB order instead of BGR.
- Q_REVERSE is not supported on the Get queue.
- Cropping is not supported.
- See the PDF structure for more information.
- For decompression, this opcode uses other opcodes, depending on the method of compression. The decompressor opcode registration data is passed to this opcode in the PicParm.PIC2List P2PktRegistration packets. See the PIC2List Functions section for more information.
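To make the PF2_SwapBW note above more concrete, the following is a minimal sketch of one reading of it: the two color table entries of a 1-bit image trade places and every pixel bit is inverted, so each pixel still refers to the same color through the other index. The ColorEntry type and the swap_bw function are hypothetical stand-ins written for this illustration; they are not part of the library, and the opcode's actual internal behavior is not documented here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one 4-byte DIB color table entry. */
typedef struct {
    uint8_t blue;
    uint8_t green;
    uint8_t red;
    uint8_t reserved;
} ColorEntry;

/* Illustration only: swap the two entries of a 1-bit color table and
   invert every pixel bit. Padding bits at the end of each row are
   inverted as well, which does not affect the visible image. */
void swap_bw(ColorEntry table[2], uint8_t *pixels, size_t num_bytes)
{
    assert(table != NULL);
    assert(pixels != NULL || num_bytes == 0);

    ColorEntry tmp = table[0];
    table[0] = table[1];
    table[1] = tmp;

    for (size_t i = 0; i < num_bytes; i++) {
        pixels[i] = (uint8_t)~pixels[i];
    }
}
```

Under this reading, an image whose index 0 was black and index 1 was white ends up with index 0 as white and index 1 as black, while the picture itself looks the same.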
Reading an Image
To read an image from a page in a given PDF file, place the PDF file in the Get queue before calling REQ_INIT. Either the entire file or a partial file may be placed in the queue, but if a partial file is used, seeking must be supported. For more information, see the OP_D2PDF discussion regarding seek operations.
Prior to REQ_INIT, the page to be read must be specified using the ImageNumber member of the PIC_PARM structure. During REQ_INIT, information about the image to be extracted is read and returned via the PIC_PARM structure. In particular, the Head and ColorTable members as well as the PDF-specific StripSize and WidthPad members are initialized. Using this information, a Put queue of at least StripSize must be allocated. The decoded DIB will be placed in the Put queue during processing.
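The Put queue sizing described above can be sketched as follows. This is a minimal illustration, assuming the application wants the whole decoded DIB in a single buffer, that StripSize is a byte count, and that rows follow the standard DIB rule of padding to a multiple of 4 bytes. The function names dib_row_bytes and put_buffer_size are hypothetical and not part of the library; whether the computed row size matches the WidthPad value reported by REQ_INIT is also an assumption, so the reported values should take precedence in real code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes per row of a standard DIB: each row is padded to a multiple
   of 4 bytes (32 bits). */
size_t dib_row_bytes(uint32_t width, uint32_t bit_count)
{
    return (((size_t)width * bit_count + 31u) / 32u) * 4u;
}

/* Hypothetical helper: size of a Put buffer large enough for the whole
   DIB, but never smaller than the minimum reported in StripSize
   (assumed here to be in bytes). */
size_t put_buffer_size(size_t strip_size, uint32_t width,
                       uint32_t height, uint32_t bit_count)
{
    assert(bit_count > 0);

    size_t whole_image = dib_row_bytes(width, bit_count) * height;
    return whole_image > strip_size ? whole_image : strip_size;
}
```

An application could pass the width, height, and bit depth taken from the Head member after REQ_INIT, together with StripSize, and allocate the returned number of bytes for the Put queue. A smaller buffer of exactly StripSize is also permitted by the text above; the larger size is simply a convenience for holding the entire image at once.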