MULTIMEDIA

Full Name	Multimedia, Moving Picture Experts Group (MPEG)
Format ID	IG_FORMAT_MUL = 67
File Extension(s)	.avi, .mpeg, .mpg
Data Type	Raster Image
Data Encoding	Binary
Color Profile Support	No
Multi-Page Support	Yes
Alpha Channel Support	No
ImageGear Platforms Support	WIN32, WIN64

To support the MULTIMEDIA format, attach the ImageGear Multimedia Component to Core ImageGear. See Attaching Components.

ImageGear Supported Versions:

AVI
MPEG-1 (Standard)
MPEG-2
MPEG-3
MPEG-4

ImageGear Supported Features:

IG_FLTR_DETECTSUPPORT - autodetection
IG_FLTR_PAGEREADSUPPORT - single page file reading
IG_FLTR_MPAGEREADPSUPPORT - multi-page file reading

ImageGear Read Support:

IG_COMPRESSION_NONE - RGB: 24 bpp

ImageGear Write Support:

ImageGear Filter Control Parameters:

Filter Control Parameter	Type	Default Value	Available Values	Description
FILENAME	LPSTR	""	Any string	Name of the input MPEG or AVI file.

See the Global Control Parameter, MULT.PREFER_DIRECTSHOW_FOR_AVI_MPEG.

Comments:

MPEG-1, the first version of the MPEG format, is optimized for CD-ROM. It uses the discrete cosine transform (DCT) and Huffman coding to remove spatially redundant data within a frame, and block-based motion-compensated prediction (MCP) to remove data that is temporally redundant between frames. Audio is compressed using subband coding.

MPEG-2 is a variant of the MPEG video and audio compression algorithm and file format, optimized for broadcast-quality video for digital storage media at up to 4.0 Mbit/s. The file extension is .MP2. MPEG-2 has been approved as International Standard ISO/IEC 13818.

MPEG-3 was a proposed variant of the MPEG video and audio compression algorithm and file format. It no longer exists as a separate variant and has been merged into MPEG-2. The .MP3 file extension refers to MPEG audio Layer III rather than to MPEG-3.

MPEG-4 is a variant of the MPEG video and audio compression algorithm and file format used for low-bandwidth video telephony. The file extension is .MP4. The International Organization for Standardization (ISO) has adopted the QuickTime 3 file format as the starting point for a unified digital media storage format for the MPEG-4 specification.

No structured MPEG file format has been defined. Everything required to play back MPEG data is encoded directly in the data stream, so no header or other type of wrapper is necessary. If one is needed, a multimedia standards committee, perhaps MHEG or the DSM (Digital Storage Medium) MPEG subgroup, may one day define an MPEG file format.

MPEG starts with a relatively low-resolution video sequence (possibly decimated from the original) of about 352 by 240 pixels at 30 frames per second in the US (the numbers are different in Europe), but with original high (CD) quality audio. The images are in color, but converted to YUV space, and the two chrominance channels (U and V) are decimated further to 176 by 120 pixels. You can get away with much less resolution in those channels and not notice it, at least in natural (not computer-generated) images.
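The chroma decimation described above can be sketched in a few lines of Python. This is an illustration only, not ImageGear code: the function names `rgb_to_yuv` and `decimate_2x2` are hypothetical, the conversion weights are the common BT.601-style values (an assumption, since the text does not name a matrix), and the decimation is a plain 2x2 average, whereas a real encoder may filter differently.

```python
def rgb_to_yuv(r, g, b):
    """Convert one RGB pixel (0-255) to YUV with BT.601-style weights (assumed)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = 0.492 * (b - y)
    v = 0.877 * (r - y)
    return y, u, v


def decimate_2x2(plane):
    """Halve a plane in both directions by averaging each 2x2 block."""
    height, width = len(plane), len(plane[0])
    return [
        [
            (plane[row][col] + plane[row][col + 1]
             + plane[row + 1][col] + plane[row + 1][col + 1]) / 4.0
            for col in range(0, width - 1, 2)
        ]
        for row in range(0, height - 1, 2)
    ]


# A synthetic 352x240 chroma plane becomes 176x120 after decimation.
u_plane = [[float((x + y) % 256) for x in range(352)] for y in range(240)]
u_small = decimate_2x2(u_plane)
print(len(u_small[0]), len(u_small))  # 176 120
```

The luminance plane is left at full resolution; only U and V would be passed through `decimate_2x2`.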

The basic scheme is to predict motion from frame to frame in the temporal direction, and then to use DCTs (discrete cosine transforms) to organize the redundancy in the spatial directions. The DCTs are performed on 8x8 blocks. The motion prediction is performed in the luminance (Y) channel on 16x16 blocks. In other words, given the 16x16 block in the current frame that you are trying to compress, you look for a close match to that block in a previous or future frame (there are backward prediction modes where later frames are sent first to allow interpolating between frames).

The DCT coefficients (of either the actual data, or the difference between this block and the close match) are quantized, which means that you divide them by some value to drop bits off the bottom end. Ideally, many of the coefficients then end up being zero. The quantization can change for every macroblock (a macroblock is a 16x16 block of Y and the corresponding 8x8 blocks in both U and V).

The results of all of this, which include the DCT coefficients, the motion vectors, and the quantization parameters (among other data), are Huffman compressed using fixed tables. The DCT coefficients have a special Huffman table that is two-dimensional, in that one code specifies both a run-length of zeros and the non-zero value that ends the run. Also, the motion vectors and the DC components of the DCT are DPCM compressed (each value is subtracted from the previous one).
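As a rough illustration of the quantization, run-length pairing, and DPCM steps, the following Python sketch works on short lists of numbers. It is a simplification under stated assumptions: the function names are hypothetical, a single uniform step size stands in for the real quantizer, the coefficient values are made up, and the Huffman stage is omitted.

```python
def quantize(coefficients, step):
    """Divide each coefficient by the step and drop the fraction (toward zero)."""
    return [int(c / step) for c in coefficients]


def run_level_pairs(quantized):
    """Encode a coefficient list as (zero_run, nonzero_value) pairs."""
    pairs = []
    run = 0
    for value in quantized:
        if value == 0:
            run += 1
        else:
            pairs.append((run, value))
            run = 0
    # Trailing zeros are dropped; a real stream would signal end of block.
    return pairs


def dpcm(values):
    """Replace each value with its difference from the previous one."""
    previous = 0
    deltas = []
    for value in values:
        deltas.append(value - previous)
        previous = value
    return deltas


# Made-up coefficients of one block, already in scan order.
coeffs = [-415, 30, 6, -61, 0, 2, 27, -3, 0, 0, 20, 0]
q = quantize(coeffs, 16)
print(q)                       # [-25, 1, 0, -3, 0, 0, 1, 0, 0, 0, 1, 0]
print(run_level_pairs(q))      # [(0, -25), (0, 1), (1, -3), (2, 1), (3, 1)]

# Made-up values for successive blocks, e.g. their DC terms.
print(dpcm([64, 66, 66, 61]))  # [64, 2, 0, -5]
```

Each `(zero_run, nonzero_value)` pair corresponds to one entry in the two-dimensional Huffman table mentioned above; small DPCM differences are cheaper to code than the raw values.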

There are three types of compressed frames.

I or intra frames are simply frames compressed as still images, without using any past history.
P or predicted frames are predicted from the most recently reconstructed I or P frame. (This is described from the decompressor's point of view.) Each macroblock in a P frame can either come with a vector and difference DCT coefficients for a close match in the last I or P frame, or it can simply be intra compressed (as in the I frames) if there is no good match.
B or bidirectional frames are predicted from the closest two I or P frames, one in the past and one in the future. (This is described from the compressor's point of view.) You search for matching blocks in those frames and try three different predictions to see which works best: the block given by the forward vector, the block given by the backward vector, and the average of the two blocks from the future and past frames. The chosen prediction is subtracted from the block being compressed. If none of these methods works well, you can intra-compress the block.
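The choice among the B-frame prediction modes can be sketched as follows. This is a minimal illustration, not how any particular encoder decides: the function names and the `intra_threshold` parameter are hypothetical, the error measure is a sum of absolute differences (an assumption), and 2x2 blocks stand in for 16x16 macroblocks to keep the example short.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(
        abs(a - b)
        for row_a, row_b in zip(block_a, block_b)
        for a, b in zip(row_a, row_b)
    )


def average(block_a, block_b):
    """Pixel-wise average of two blocks (the interpolated prediction)."""
    return [
        [(a + b) / 2.0 for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(block_a, block_b)
    ]


def choose_b_mode(current, past_match, future_match, intra_threshold):
    """Pick the prediction with the smallest error, or fall back to intra."""
    candidates = {
        "forward": past_match,
        "backward": future_match,
        "interpolated": average(past_match, future_match),
    }
    errors = {mode: sad(current, block) for mode, block in candidates.items()}
    best = min(errors, key=errors.get)
    if errors[best] > intra_threshold:
        return "intra", None  # no good match: compress the block on its own
    residual = [
        [c - p for c, p in zip(row_c, row_p)]
        for row_c, row_p in zip(current, candidates[best])
    ]
    return best, residual


past = [[10, 10], [10, 10]]
future = [[30, 30], [30, 30]]
current = [[20, 21], [20, 20]]
mode, residual = choose_b_mode(current, past, future, intra_threshold=50)
print(mode, residual)  # interpolated [[0.0, 1.0], [0.0, 0.0]]
```

The residual is what would then be transformed with the DCT and quantized; here the interpolated prediction leaves almost nothing to code.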

The sequence of decompressed frames is as follows:

IBBPBBPBBPBBIBBPBBPB...

Here there are 12 frames from I to I (for the US and Japan). This spacing comes from a random-access requirement: a decoder needs a starting point at least once every 0.4 seconds or so (12 frames at 30 frames per second). The ratio of P frames to B frames is based on experience. For the decompressor to work, the first P frame must be sent before the first two B frames, so the compressed data stream appears as follows:

0xx312645...

where the digits are frame numbers. "xx" may be nothing (if this is the true starting point), or it may be the B frames numbered -2 and -1 if this is the middle of the stream.
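The reordering from display order to stream order can be reproduced with a short Python sketch. The function name `coded_order` is hypothetical, and the rule it applies (hold each B frame until its future reference has been sent) is a simplified reading of the description above.

```python
def coded_order(frame_types):
    """Return display-order frame numbers rearranged into stream order.

    Each B frame is held back until the next I or P frame (its future
    reference) has been emitted.
    """
    order = []
    pending_b = []
    for index, frame_type in enumerate(frame_types):
        if frame_type == "B":
            pending_b.append(index)
        else:  # I or P: send the reference first, then the waiting B frames
            order.append(index)
            order.extend(pending_b)
            pending_b = []
    # In a continuous stream, leftover B frames would wait for the next I or P.
    order.extend(pending_b)
    return order


print(coded_order("IBBPBBP"))  # [0, 3, 1, 2, 6, 4, 5]
```

The printed list matches the frame numbers shown above for a stream that begins at a true starting point.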

Decompress the I, then the P. Keep both in memory, and then decompress the two B's. You can display the I while you're decompressing the P, and display the B's as you're decompressing them. Display the P as you are decompressing the next P, and so on.
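The display schedule described above can be simulated with a small Python sketch. It only tracks frame numbers, not pixel data, and the function name `display_order` is hypothetical: a B frame is shown as soon as it is decoded, while an I or P frame is held until the next I or P frame arrives.

```python
def display_order(stream):
    """Return frame numbers in display order from (type, number) stream pairs."""
    shown = []
    held_reference = None
    for frame_type, number in stream:
        if frame_type == "B":
            shown.append(number)  # B frames are displayed immediately
        else:
            if held_reference is not None:
                shown.append(held_reference)  # show the older reference now
            held_reference = number
    if held_reference is not None:
        shown.append(held_reference)  # flush the last reference at end of stream
    return shown


stream = [("I", 0), ("P", 3), ("B", 1), ("B", 2), ("P", 6), ("B", 4), ("B", 5)]
print(display_order(stream))  # [0, 1, 2, 3, 4, 5, 6]
```

Only two reference frames need to be kept in memory at any time, which is why the decoder can display one reference while decoding the next.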