The meaning of «zpeg»

ZPEG is a motion video technology that applies a human visual acuity model to a decorrelated transform-domain space, thereby optimally reducing the redundancy in motion video by removing content that is subjectively imperceptible. This technology is applicable to a wide range of video processing problems such as video optimization, real-time motion video compression, subjective quality monitoring, and format conversion.

The ZPEG company produces modified versions of x264, x265, AV1, and FFmpeg under the name ZPEG Engine (see § Video optimization).

Pixel distributions are well-modeled as a stochastic process, and a transformation to their ideal decorrelated representation is accomplished by the Karhunen–Loève transform (KLT) defined by the Karhunen–Loève theorem. The Discrete Cosine Transform (DCT) is often used as a computationally efficient transform that closely approximates the Karhunen–Loève transform for video data, due to the strong correlation in pixel space typical of video frames.[1] As the correlation in the temporal direction is just as high as that of the spatial directions, a three-dimensional DCT may be used to decorrelate motion video.[2]
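The decorrelating effect of a three-dimensional DCT can be illustrated with a small sketch. The code below is not taken from ZPEG; it is a plain separable DCT-II in pure Python, and the names `dct_matrix`, `dct3`, `energy` and `smooth_block` are chosen here for illustration. The sample block is a hypothetical 4×4×4 cube of smoothly varying pixel values, standing in for strongly correlated video data.

```python
import math

def dct_matrix(n):
    """Return the orthonormal DCT-II matrix; row k is the k-th basis vector."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows

def dct3(block):
    """Separable 3D DCT-II of a cubic block indexed as block[t][y][x]."""
    n = len(block)
    r = range(n)
    c = dct_matrix(n)
    # Transform along x, then y, then t (the order does not matter).
    a = [[[sum(c[k][i] * block[t][y][i] for i in r) for k in r] for y in r] for t in r]
    b = [[[sum(c[k][i] * a[t][i][x] for i in r) for x in r] for k in r] for t in r]
    return [[[sum(c[k][i] * b[i][y][x] for i in r) for x in r] for y in r] for k in r]

def energy(cube):
    """Sum of squared values over the whole cube."""
    return sum(v * v for plane in cube for row in plane for v in row)

def smooth_block(n):
    """Hypothetical test data: a gentle ramp in x, y and t."""
    r = range(n)
    return [[[100.0 + 2 * x + 3 * y + t for x in r] for y in r] for t in r]

block = smooth_block(4)
coeffs = dct3(block)
dc_share = coeffs[0][0][0] ** 2 / energy(coeffs)
print(f"share of energy in the DC coefficient: {dc_share:.4f}")
```

Because the transform is orthonormal, the total energy of the block is preserved, and for smooth input nearly all of it lands in the single lowest-frequency coefficient. That concentration is what makes the remaining coefficients cheap to quantize.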

A Human Visual Model may be formulated based on the contrast sensitivity of the visual perception system.[3] A time-varying Contrast Sensitivity model may be specified, and is applicable to the three-dimensional Discrete Cosine Transform (DCT).[4] A three-dimensional Contrast Sensitivity model is used to generate quantizers for each of the three-dimensional basis vectors, resulting in a near-optimal, visually lossless removal of imperceptible motion video artifacts.[5]

The perceptual strength of the Human Visual Model quantizer generation process is calibrated in visiBels (vB), a logarithmic scale roughly corresponding to perceptibility as measured in screen heights. As the eye moves further from the screen, it becomes less able to perceive details in the image. The ZPEG model also includes a temporal component, and thus is not fully described by viewing distance. In terms of viewing distance, the visiBel value changes by six each time the screen distance halves. The standard viewing distance for Standard Definition television (about 7 screen heights) is defined as 0 vB. The normal viewing distance for High-definition video, about 4 screen heights, corresponds to roughly −6 vB (exactly −6 vB at 3.5 screen heights).
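The relationship between viewing distance and visiBels can be sketched numerically. The formula below is an assumption inferred from the two anchor points given in the text (7 screen heights at 0 vB, 3.5 screen heights at −6 vB) and the six-per-halving rule; it is not a published ZPEG definition, and it ignores the temporal component of the model. The function names are chosen here for illustration.

```python
import math

SD_REFERENCE_HEIGHTS = 7.0  # viewing distance defined as 0 vB in the text
VB_PER_HALVING = 6.0        # change in vB for each halving of the distance

def visibels_from_distance(screen_heights):
    """Assumed mapping: 6 vB per factor of two in distance, 0 vB at 7 heights."""
    if screen_heights <= 0:
        raise ValueError("viewing distance must be positive")
    return VB_PER_HALVING * math.log2(screen_heights / SD_REFERENCE_HEIGHTS)

def distance_from_visibels(vb):
    """Inverse of the assumed mapping, returning screen heights."""
    return SD_REFERENCE_HEIGHTS * 2.0 ** (vb / VB_PER_HALVING)

for heights in (7.0, 4.0, 3.5):
    print(f"{heights} screen heights: {visibels_from_distance(heights):.2f} vB")
```

Under this assumed mapping, 4 screen heights falls between −5 and −4.5 vB, which is why the text treats it as only approximately −6 vB.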

The ZPEG pre-processor optimizes motion video sequences for compression by existing motion estimation-based video compressors, such as Advanced Video Coding (AVC) (H.264) and High Efficiency Video Coding (HEVC) (H.265). The human visual acuity model is converted into quantizers for direct application to a three-dimensional transformed block of the motion video sequence, followed by an inverse quantization step using the same quantizers. The motion video sequence returned from this process is then used as input to the existing compressor.
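The quantize-then-dequantize round trip described above can be sketched as follows. This is a minimal illustration and not the ZPEG implementation: the quantizer table from `placeholder_quantizers` is a made-up ramp that simply grows with frequency index, standing in for values that a real contrast sensitivity model would supply, and all function names are hypothetical.

```python
import math

def dct_matrix(n):
    """Return the orthonormal DCT-II matrix; row k is the k-th basis vector."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows

def transform3(block, m):
    """Apply the square matrix m along x, then y, then t of block[t][y][x]."""
    r = range(len(block))
    a = [[[sum(m[k][i] * block[t][y][i] for i in r) for k in r] for y in r] for t in r]
    b = [[[sum(m[k][i] * a[t][i][x] for i in r) for x in r] for k in r] for t in r]
    return [[[sum(m[k][i] * b[i][y][x] for i in r) for x in r] for y in r] for k in r]

def preprocess(block, quantizers):
    """Forward 3D DCT, quantize, dequantize with the same table, inverse DCT."""
    r = range(len(block))
    c = dct_matrix(len(block))
    c_inv = [list(col) for col in zip(*c)]  # transpose inverts an orthonormal matrix
    coeffs = transform3(block, c)
    restored = [[[round(coeffs[t][y][x] / quantizers[t][y][x]) * quantizers[t][y][x]
                  for x in r] for y in r] for t in r]
    return transform3(restored, c_inv)

def placeholder_quantizers(n):
    """Hypothetical table: coarser steps at higher frequencies (not a visual model)."""
    r = range(n)
    return [[[2.0 * (1 + x + y + t) for x in r] for y in r] for t in r]

def sample_block(n):
    """Hypothetical pixel data: a ramp plus a small amount of texture."""
    r = range(n)
    return [[[100.0 + 2 * x + 3 * y + t + ((x * y + t) % 3) for x in r]
             for y in r] for t in r]

block = sample_block(4)
quantizers = placeholder_quantizers(4)
filtered = preprocess(block, quantizers)
```

The output `filtered` has the same shape as the input and can be handed to a conventional encoder. Coefficients smaller than half their quantizer step are zeroed, so the encoder no longer spends bits on them, and running the block through `preprocess` a second time with the same table leaves it essentially unchanged.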

