T6: Compression Techniques (Ch 11,12,13,14)

Progressive Mode

"Starts with nothing -> Blurry -> More and more refined" -Amari Lewis, 2017

JPEG Header

(Frame and Scan Header) Contains info about the frame (i.e., width and height of the image, precision, number of components, unique ID, sampling factors, Q table). Also contains the Huffman table and user application data.

Image Compression: Image Preparation

Analog-to-Digital Conversion

Compression Technique: Source

Considers data semantics and prioritizes information by importance. Uses prediction, transformation, and layered approaches, with quantization applied to less important information.

Compression Environments: Retrieval

Data is coded only once. User only receives data, so only needs decoding.

Huffman Coding

Huffman coding aims to derive more efficient codes by using the occurrence probabilities of the symbols. Unlike fixed-length coding, it assigns shorter codes to more frequent symbols. KNOW HOW TO ENCODE HUFFMAN (T6 page 30); also know how to calculate the average code length.
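
As a study aid, here is a minimal sketch of Huffman encoding and of the average code length calculation. The symbol probabilities are made up for illustration (they are not the ones on T6 page 30), and the function names are hypothetical.

```python
import heapq

def huffman_code(probabilities):
    """Build a Huffman code for a {symbol: probability} mapping."""
    # Heap entries: (probability, tie_breaker, {symbol: code_so_far}).
    heap = [(p, i, {sym: ""}) for i, (sym, p) in enumerate(probabilities.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        # Merge the two least probable subtrees.
        p0, _, codes0 = heapq.heappop(heap)
        p1, _, codes1 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in codes0.items()}
        merged.update({s: "1" + c for s, c in codes1.items()})
        heapq.heappush(heap, (p0 + p1, counter, merged))
        counter += 1
    return heap[0][2]

def average_length(probabilities, codes):
    """Average code length = sum of p(symbol) * len(code(symbol))."""
    return sum(p * len(codes[s]) for s, p in probabilities.items())

# Illustrative probabilities (an assumption, not taken from the slides).
probs = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
codes = huffman_code(probs)
avg = average_length(probs, codes)
print(codes, avg)
```

With these probabilities the code lengths come out as 1, 2, 3, and 3 bits, for an average of 1.75 bits per symbol, compared with 2 bits for a fixed-length code over four symbols.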

Image Redundancy Types

Human perceptual sensitivity to different color channels: use lower-resolution coding in the chrominance (UV, or CrCb) components. Spatial redundancy: reduce by intra-frame coding. Perceptual redundancy: reduce by quantization. Coding redundancy: reduce by run-length and Huffman coding.

IDCT

I = Inverse Used to convert frequency domain to spatial domain in an image

MCU

Minimum Coded Unit: used to define the image region level an operation is applied on

Why Compression?

Motivation for Compression: Reduction in storage space, reduction in transmission time for large amount of information. Cost for Compression: Lossy and Lossless.

MPEG

Moving Picture Experts Group. Mission: to develop standards for coded representation of motion pictures and audio at a bit rate of up to 1.5 Mb/s. MPEG-1: 1.5 Mb/s (VCD). MPEG-2: higher quality, 2-10 Mb/s (DVD).

Bottleneck for Digital Video

Much greater than audio. (1000x greater) Compression is required. Cannot send video data RAW.

Why 8 x 8 DCT Block?

N=8 gives the best compromise between error and complexity. Coding error: larger N --> smaller error. Computational complexity: larger N --> more complex.

Captured Image Format

Spatial resolution: pixels x pixels. Color encoding: bits/pixel. Both depend on hardware and software.

JPEG: Color Image Preparation

Takes R, G, and B into consideration. However, RGB is designed for hardware, so we convert to YUV (luminance (Y) and chrominance (U, V)).
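
A minimal sketch of the color conversion follows. The card only says "convert to YUV"; the constants below are the full-range YCbCr ones used by JFIF, which is an assumption about the exact variant intended, and the function name is hypothetical.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to full-range YCbCr (JFIF-style constants)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b               # luminance
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b    # blue-difference chrominance
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b    # red-difference chrominance
    return y, cb, cr

# Neutral colors carry no chrominance: Cb and Cr sit at the midpoint 128.
white = rgb_to_ycbcr(255, 255, 255)
gray = rgb_to_ycbcr(128, 128, 128)
print(white, gray)
```

A real encoder would also round and clamp the results to 0-255; that step is left out here.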

MPEG Frame Header

temporal references, frame type, frame structure...

Matching Criterion for Macroblock

MSE (Mean Squared Error) or MAD (Mean Absolute Difference). The best match is the candidate block that minimizes the chosen measure.
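
The two measures can be sketched as follows for small blocks stored as lists of rows. The 2 x 2 blocks are illustrative values only; real macroblocks are 16 x 16.

```python
def mse(block_a, block_b):
    """Mean squared error between two equally sized 2-D blocks."""
    diffs = [(a - b) ** 2
             for row_a, row_b in zip(block_a, block_b)
             for a, b in zip(row_a, row_b)]
    return sum(diffs) / len(diffs)

def mad(block_a, block_b):
    """Mean absolute difference between two equally sized 2-D blocks."""
    diffs = [abs(a - b)
             for row_a, row_b in zip(block_a, block_b)
             for a, b in zip(row_a, row_b)]
    return sum(diffs) / len(diffs)

current = [[10, 12], [14, 16]]
candidate = [[11, 12], [14, 13]]
print(mse(current, candidate))  # 2.5
print(mad(current, candidate))  # 1.0
```

MAD avoids the multiplications that MSE needs, which is one reason it is a common choice in block matching.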

General Compression Phases

1. Data Preparation 2. Data Processing 3. Quantization 4. Entropy Encoding

How does Entropy Encoding Work

1. Order the coefficients in zig-zag sequence (after quantization, most trailing coefficients will be zero, since the higher-frequency terms become 0). 2. Use run-length encoding to encode the resulting sequence. 3. Use Huffman or arithmetic coding to minimize the coded sequence (apply Huffman after RLE).
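
Steps 1 and 2 can be sketched as below. This is a simplification under stated assumptions: a 4 x 4 block is used to keep the example short (JPEG uses 8 x 8), all coefficients are run-length coded together (JPEG codes the DC term separately), and the "EOB" end-of-block marker is an illustrative stand-in.

```python
def zigzag_order(n=8):
    """Return the (row, col) positions of an n x n block in zig-zag order."""
    order = []
    for s in range(2 * n - 1):  # s = row + col, one anti-diagonal at a time
        diag = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        # Even diagonals are walked bottom-left to top-right, odd ones the reverse.
        order.extend(reversed(diag) if s % 2 == 0 else diag)
    return order

def run_length_encode(seq):
    """Encode as (zero_run, value) pairs; trailing zeros collapse into 'EOB'."""
    pairs, run = [], 0
    for v in seq:
        if v == 0:
            run += 1
        else:
            pairs.append((run, v))
            run = 0
    pairs.append("EOB")
    return pairs

# Quantized coefficients (illustrative): energy sits in the top-left corner.
block = [[12, 5, 2, 0],
         [3, 0, 0, 0],
         [0, 0, 0, 0],
         [0, 0, 0, 0]]
scanned = [block[r][c] for r, c in zigzag_order(4)]
encoded = run_length_encode(scanned)
print(scanned)
print(encoded)  # [(0, 12), (0, 5), (0, 3), (2, 2), 'EOB']
```

The zig-zag scan pushes the zeros to the end of the sequence, so sixteen values shrink to four pairs plus the end marker before Huffman coding is applied.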

JPEG Requirements

1. Should be independent of image size. 2. Image content may be of any complexity. 3. Should have a good compression ratio without sacrificing image quality. 4. Should be able to run on most standard platforms. 5. Should support both sequential and progressive decoding.

Macroblock

16 x 16 pixel blocks (a 2 x 2 group of 8 x 8 DCT blocks), used as the motion compensation units.

Image Sizes

1995: 307.2K 2013: 15MP

Data Processing

2nd phase in general compression phase. First step of compression

Intensity Value: Color Image

3 8-bit integers are used. (total 24 bits per pixel, 0-255 for each integer) RGB channel

Basic Compression Techniques

3 Techniques: Entropy, Source, and Hybrid

Quantization

3rd phase. Processing results of previous step: mapping real numbers into integers resulting in loss of precision. STAGE FOR LOSSY COMPRESSION. THERE IS NO LOSS IN OTHER 3 STAGES

Entropy Encoding

4th/Last Phase. Compress sequential data (integers from previous step) without loss. (e.g., run-length coding or Huffman coding)

Intensity Value: Gray level Images

8-bit integer used (0-255, from black to white)

Video Compression Redundancy

Additional: Temporal Redundancy (reduce by inter-frame coding)

MPEG Macroblock

Address, Type, Q Scale, Motion Vectors, Blocks....

Encoding Algorithm:

Algorithm: 1. Down-sample the original image by a factor of (multiples of) 2 in each dimension. 2. Encode this reduced image using standard JPEG compression. 3. Decode this compressed image and up-sample it. 4. Use this up-sampled image as a prediction of the original image and encode the difference using standard JPEG compression.
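
The four steps can be sketched on a tiny image. To keep the example self-contained, the JPEG encode/decode of steps 2 to 4 is replaced by an identity (an assumption made only for illustration), so the reconstruction here is exact; with real JPEG stages it would be approximate. The helper names are hypothetical.

```python
def downsample(img):
    """Keep every second pixel in each dimension (factor of 2)."""
    return [row[::2] for row in img[::2]]

def upsample(img):
    """Double each dimension by pixel replication."""
    out = []
    for row in img:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out

def subtract(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

original = [[10, 12, 20, 22],
            [11, 13, 21, 23],
            [30, 32, 40, 42],
            [31, 33, 41, 43]]

low = downsample(original)                 # step 1 (step 2 would JPEG-encode this)
prediction = upsample(low)                 # step 3
residual = subtract(original, prediction)  # step 4: the difference image
reconstructed = add(prediction, residual)  # decoder side
print(low)
print(residual)
```

The residual values are small (0 to 3 here) compared with the original pixel values, which is what makes the difference image cheap to encode.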

Types of Video Compression (MPEG)

Asymmetric compression: the compression process is done only once, at the time of storage. Examples: Video-on-Demand and News-on-Demand servers. MPEG is an asymmetric standard. Symmetric compression: equal use of the compression and decompression processes. Examples: video conferencing, video telephony, and desktop video publishing.

Advantages of Interpolative Coding

Better compression ratio. Deals with uncovered areas properly. Better statistical properties and noise reduction. Can decouple prediction from coding.

Motion Compensation Predictors:

Causal predictors: pure predictive coding; P-frames = prediction frames. Noncausal predictors: interpolative coding; the contents of a frame are generated from both a previous and a successive frame. Interpolative frame = B-frame.

Compression Technique: Hybrid

Combination of Entropy and Source. Most standards use different schemes at different stages of compression.

JPEG 2000

Complements JPEG. Better quality at lower bit-rates. Lossy and Lossless compression with Discrete Wavelet Transform (DWT) Progressive Coding Increased robustness to errors Content-based description Protective image security (Watermarking) Interfacing with MPEG-4

DCT Transformation

Done in the Image Processing stage. An 8 x 8 block of pixels (DCT: N=8) maps to the frequency domain: 1 DC value and 63 AC values, i.e., 64 orthogonal basis functions. Refer to slide T6.35 for an example.
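
For reference, a direct (unoptimized) sketch of the 8 x 8 forward and inverse DCT is given below. It uses the orthonormal DCT-II scaling, which is an assumption about the convention used on the slides; the function names `fdct` and `idct` are chosen to match the card titles.

```python
import math

N = 8

def _scale(k):
    """Normalization factor for frequency index k."""
    return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

def _basis(x, u):
    return math.cos((2 * x + 1) * u * math.pi / (2 * N))

def fdct(block):
    """Forward DCT: 8 x 8 spatial block -> 8 x 8 frequency coefficients."""
    return [[_scale(u) * _scale(v) * sum(block[x][y] * _basis(x, u) * _basis(y, v)
                                         for x in range(N) for y in range(N))
             for v in range(N)] for u in range(N)]

def idct(coeffs):
    """Inverse DCT: 8 x 8 frequency coefficients -> 8 x 8 spatial block."""
    return [[sum(_scale(u) * _scale(v) * coeffs[u][v] * _basis(x, u) * _basis(y, v)
                 for u in range(N) for v in range(N))
             for y in range(N)] for x in range(N)]

# A uniform block has only a DC term; all 63 AC terms are (numerically) zero.
flat = [[100] * N for _ in range(N)]
flat_coeffs = fdct(flat)

# A ramp block survives the FDCT -> IDCT round trip up to floating-point error.
ramp = [[x * 8 + y for y in range(N)] for x in range(N)]
restored = idct(fdct(ramp))
print(round(flat_coeffs[0][0], 6))
```

The round trip shows that the transform by itself loses nothing beyond rounding; the loss in JPEG comes from the quantization step that follows it.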

Predictor Encoding

Each pixel is encoded as a pair of 8-bit groups (one group of 8 bits for the prediction defines 8 possible prediction values). The number of the chosen predictor and the difference between the prediction and the actual value are entropy encoded.

MPEG Intraframe Coding

Encoding of a single picture. Discrete Cosine Transform: converts spatial to frequency domain. Quantization of spectral coefficients. DPCM to encode DC terms. Zigzag scan to group zeros into long sequences, followed by run-length coding. Lossless variable-length coding to encode AC coefficients. SIMILAR TO IMAGE COMPRESSION (JPEG). *Difference is DPCT instead of FDCT

Hierarchical Mode II

Encoding of an image is done at successively lower resolutions. Decoding can start at the lowest resolution and be repeated until the highest resolution is reached. Example: showing a lower resolution first, then the high resolution later.

Dialog Requirements

End-to-End delay should not exceed 150ms (compress, network, protocol processing, data transfer delays)

EXIF

Exchangeable Image File Format. Used by digital cameras IMPORTANT METADATA FOR IMAGE RETRIEVAL

Retrieval Requirements

Fast Forward, Fast Rewind, Random Access, Decompression from a random standing point in data stream

Data Preparation

First phase of compression. Analog - Digital conversion (compression has not begun)

Image Compression: Image Processing

First step for compression (e.g., applying transform coding to convert representation form pixel to frequency)

General Requirements

Frame size/rate independency, Supporting various rates for different types of compression (audio/video) Synchronization Economical Implementation Portability

Three step search

Coarse-to-fine (logarithmic) strategy. Idea: halve the matching distance each time to obtain finer-resolution estimates. Step 1: search at nine points, the center marked 0 (no motion) and the eight surrounding points marked 1, and select the point with the best MSE or MAD. Step 2: halve the distance and evaluate the points around the position chosen in step 1. Step 3: continue the process to choose the best match position.
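
A minimal sketch of the search follows, assuming MAD as the matching criterion and step distances of 4, 2, and 1 (the usual choice, but an assumption here). The frames, block position, and function names are made up for illustration; the motion vector (dx, dy) means the current block at (bx, by) matches the reference block at (bx + dx, by + dy).

```python
def block_mad(cur, ref, bx, by, dx, dy, n):
    """MAD between the n x n block of cur at (bx, by) and ref at (bx+dx, by+dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total / (n * n)

def three_step_search(cur, ref, bx, by, n=16, step=4):
    """Return the motion vector (dx, dy) found by a three step search."""
    height, width = len(ref), len(ref[0])
    best = (0, 0)
    while step >= 1:
        cx, cy = best
        best_cost = float("inf")
        for dy in (-step, 0, step):          # nine candidates around the center
            for dx in (-step, 0, step):
                mx, my = cx + dx, cy + dy
                inside = (0 <= bx + mx and bx + mx + n <= width and
                          0 <= by + my and by + my + n <= height)
                if not inside:
                    continue
                cost = block_mad(cur, ref, bx, by, mx, my, n)
                if cost < best_cost:
                    best_cost, best = cost, (mx, my)
        step //= 2                            # halve the matching distance
    return best

def make_frame(size, top, left, side=8):
    """Black frame with one bright square (illustrative test content)."""
    frame = [[0] * size for _ in range(size)]
    for y in range(top, top + side):
        for x in range(left, left + side):
            frame[y][x] = 255
    return frame

cur = make_frame(40, top=16, left=16)
ref = make_frame(40, top=12, left=22)   # same square, displaced by (6, -4)
motion_vector = three_step_search(cur, ref, bx=12, by=12, n=16)
print(motion_vector)  # (6, -4)
```

Only 27 candidate positions are evaluated, instead of the 225 an exhaustive search over displacements of -7 to +7 would need. The price is that the search can settle on a local minimum when the content is less well behaved than this single square.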

MPEG Sequence

GoP Header, Frame Header, Frame....

MPEG1 Characteristics for I,P,B GoP

I only: Compression: Low; Random Access: Highest (can point to exact frame); Coding Delay: Low. I and P: Compression: Medium; Random Access: Medium; Coding Delay: Medium. I, P, B: Compression: High; Random Access: Medium; Coding Delay: High. Tradeoff between coding delay and compression.

P(E): probability of occurrences of a random event E

I(E) = -log P(E), where I(E) is the amount of information (self-information) carried by E. If P(E) = 1 (the event occurs with 100% certainty), then I(E) = 0 and E carries no information. Use fewer bits to code more frequent occurrences.

average self-information generated by the production of a single source symbol

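I(a_j) = -log p(a_j) is the self-information of a single symbol a_j. Its average over all source symbols is the entropy: H = -sum_j p(a_j) log p(a_j).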
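
A short numeric sketch of self-information and its average, using base-2 logarithms so the results are in bits. The probabilities are illustrative.

```python
import math

def self_information(p):
    """I = -log2(p): bits of information in an event of probability p."""
    return -math.log2(p)

def entropy(probs):
    """Average self-information per symbol: H = sum of p * I(p)."""
    return sum(p * self_information(p) for p in probs if p > 0)

certain = self_information(1.0)   # a certain event carries no information
coin = self_information(0.5)      # a fair coin flip carries 1 bit
source_entropy = entropy([0.5, 0.25, 0.125, 0.125])
print(certain, coin, source_entropy)
```

For this four-symbol source the entropy is 1.75 bits per symbol, which is the lower bound on the average code length of any lossless code for it.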

MPEG Coding Frames

I, P, B frames. I = Intra-coded, P = Predicted, B = Bidirectional (Interpolative).

Epitome of Compression

Identify and remove redundancy. Final quality is acceptable if lossy. Price-performance tradeoff

Image Process

Image -> Scanner -> Captured Format -> Stored Format -> Database

Phases of Image Compression

Image Preparation -> Image Processing -> Quantization -> Entropy Encoding

Purpose of DCT Transform

In the frequency domain, we can remove more of the high-frequency components, which contribute to the fine details of an image. DCT = LOSSLESS. Example: zig-zag sequence. Compression is also achieved by removing high-frequency terms among the AC coefficients.

How is intensity in a pixel of an image measured?

Intensity at each pixel is represented by an integer. The integer value is obtained from the analog (continuous) image by averaging over a small neighborhood around the pixel location. There are 2^P possible values for each pixel, where P = number of bits per pixel.

Video Frame Coding for Compression

Intra-coded frames - good for random access Inter-coded frames - higher compression rate

JPEG

JPEG: Joint Photographic Experts Group. Established: 1986, Adopted: 1991. Applies to gray-scale and color images. Files have a .jpg or .jpeg extension.

Image Compression: Entropy Encoding

Last step -- compress sequential data (integers from previous step) without loss. (e.g., runlength coding or Huffman coding)

Compression Types

Lossless (maintain original quality). Lossy (inferior to original quality by controllable amount)

MPEG Encoding Characteristics:

Lossy compression Trade off image quality with bit rate according to objective or subjective criteria Intraframe coding Interframe coding Group of Picture (GOP) - one group of I, P, and B-frames

MPEG 4 Motivations

Object Oriented Concepts New Compression Techniques

Hierarchical Mode

Obtain I_0 from I_-1 + Diff(I_0 - I_-1), where I_0 = original image and I_-1 = new image from sampling. In other words, the original image can be recovered from the sampled image plus the difference between the original image and the sampled image.

Analog Video

One or more analog signals that contain time-varying 2-D intensity (monochrome or color) patterns and timing info to align the pictures. Examples: Component Analog Video (CAV) = RGB, YUV, YCrCb video; Composite Video = NTSC; S-Video.

Lossless Encoding

Operates at the pixel level instead of on an 8 x 8 block. Uses predictive techniques instead of the DCT to remove redundancy in the data. Image -> Predictor -> Entropy Encoder (w/ table) -> Compressed Image
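
The predictor stage can be sketched with the simplest possible predictor, "the pixel to the left", applied to one image row. The choice of predictor and the sample row are assumptions for illustration; the entropy coder that would follow is omitted.

```python
def predict_encode(row):
    """Replace each pixel by its difference from the previous (left) pixel."""
    residuals, previous = [], 0
    for value in row:
        residuals.append(value - previous)
        previous = value
    return residuals

def predict_decode(residuals):
    """Invert predict_encode by accumulating the differences."""
    row, previous = [], 0
    for diff in residuals:
        previous += diff
        row.append(previous)
    return row

row = [100, 101, 103, 103, 102]
residuals = predict_encode(row)
print(residuals)                   # [100, 1, 2, 0, -1]
print(predict_decode(residuals))   # [100, 101, 103, 103, 102]
```

Apart from the first value, the residuals cluster around zero, so an entropy coder can give them short codes, and decoding restores the row exactly.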

Frame Sequence

Organize the frame sequence in terms of Groups of Pictures (GoP). Example: [ I B B P B B P] [ I B B P B B P] .... The GoP needs to be uniform. Each GoP is an independent entity. The I-frame provides a random access point and a re-synchronization point. P and B frames allow for greater compression efficiency.

DCT Transform Process

Original Image -> FDCT -> DCT Image Storage -> IDCT -> Display Image

MPEG Picture

Picture Header, Slice Header, Macroblocks

MPEG Sequence Header

Picture Width, Height, Aspect Ratio, Bit Rate, Picture Rate

Block-based Motion Compensation

Principle: predict the contents of the current frame (at block level) from previous or subsequent frames. Motion information comprises the amplitude and direction of displacement of the contents. Advantages: low overhead (needs only one motion vector per block); availability of low-cost VLSI implementations. Disadvantages: fails for zoom, rotation, and local deformation; discontinuity at block boundaries; serious blocking artifacts, especially at low bit rates.

Image Compression: Quantization

Processing results of previous step: mapping real numbers into integers resulting in loss of precision

Common Image Formats

RIFF: Resource Interchange File Format. GIF: Graphics Interchange Format. TIFF: Tagged Image File Format. JPEG: Joint Photographic Experts Group.

Desired Features of Video

Random access. Fast forward and fast rewind (think YOUTUBE, NETFLIX). Reverse playback. Audio-visual synchronization. Robustness to errors. Coding/decoding delay: total system delay < 150 ms. Editability. Format flexibility.

Requirements of MPEG Encoding

Random access requirements --> pure intra-frame coding Higher compression rates ---> inter-frame coding

Principles of Compression

Redundancy and Matching user Expectations

MPEG Interframe Coding

Remove temporal redundancies between frames Use extensively in MPEG-1 and MPEG-2 Based on estimation of motion between video frames Use of motion vectors to describe displacement of pixels from one frame to the next One motion vector can represent the motion of a block of pixels.

How does Quantization Work?

Round each value to the nearest multiple of some reference step (e.g., 10), then reverse the process when decoding. A smaller quantization step gives greater accuracy and less error. Use smaller quantization steps for lower-frequency components; psycho-visual analysis decides how much to quantize at each position in the 8x8 DCT array.
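
A minimal sketch of quantizing and dequantizing a single coefficient. The coefficient value and the step sizes are illustrative; in JPEG the step for each position comes from the Q table.

```python
def quantize(coefficient, step):
    """Map a coefficient to an integer level (this is where precision is lost)."""
    return round(coefficient / step)

def dequantize(level, step):
    """Reverse the process: recover an approximation of the coefficient."""
    return level * step

coefficient = 47
coarse = dequantize(quantize(coefficient, 10), 10)  # step 10 -> 50, error 3
fine = dequantize(quantize(coefficient, 4), 4)      # step 4  -> 48, error 1
small_high_freq = quantize(3, 10)                   # small values collapse to 0
print(coarse, fine, small_high_freq)
```

The smaller step reproduces the value more closely, and a small coefficient quantized with a large step becomes 0, which is what creates the long zero runs exploited by the later entropy coding stage.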

MPEG1 Encoding Scheme

STEPS: 1. Partition images into macroblocks (MB) of size 16x16. 2. Apply intraframe coding to one out of every K images (GOP size = K). 3. Perform motion estimation on the MBs. 4. Generate (K-1) predicted frames. 5. Encode the residual error images.

Imaging and Video Requirements

Simple Image (307.2KByte) Color Image (921.6 KByte) Video (27.648MBytes/sec) ~30 frames/second
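
The figures above can be reproduced with a few multiplications, assuming a 640 x 480 frame, 8 bits per pixel for the simple image, and 24 bits per pixel for color. The frame size is an assumption inferred from the 307.2 KByte figure.

```python
WIDTH, HEIGHT = 640, 480   # assumed frame size
FRAMES_PER_SECOND = 30

simple_image_bytes = WIDTH * HEIGHT * 1                    # 8 bits = 1 byte per pixel
color_image_bytes = WIDTH * HEIGHT * 3                     # 24 bits = 3 bytes per pixel
video_bytes_per_second = color_image_bytes * FRAMES_PER_SECOND

print(simple_image_bytes / 1e3, "KByte")          # 307.2
print(color_image_bytes / 1e3, "KByte")           # 921.6
print(video_bytes_per_second / 1e6, "MByte/s")    # 27.648
```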

Conditional Replenishment

Skipped MB: zero motion vector; the MB is neither encoded nor transmitted. Inter MB: motion prediction is valid; the MB type and address, the motion vector, and the coded DCT coefficients are transmitted. Intra MB: the encoded DCT coefficients of the MB are transmitted; no motion compensation is used.

Stages of Quantization

Stage 1: Use Human Visual System (Psychovisual Features) Stage 2: Use FDCT Coefficients and obtain Q Matrix Stage 3: Scale each coefficient by the Q factor Stage 4: Most entries become 0 after applying Q matrix

Stored Image Format

Storage options: 2D array of values (each value represents data from an image pixel). Bitmap: binary digit values. Color image: numbers or a color lookup table. If there is enough space, store RGB triplets (for a color image); otherwise compress the info. Other necessary information: width, height, image depth, creator, etc. (EXIF info).

Compression Audio Requirements

Telephone speech: sampled at 8 kHz, 8 bits/sample (64 Kbits/second). Audio CD quality: sampled at 44.1 kHz, 16 bits/sample, stereo (176.4 KBytes/second).

Temporal Redundancy

Temporal redundancy: subsequent frames carry similar but slightly varying content

Compression Requirements

Text (Lowest common size = 640 x 480 screen size, 9.6KBytes) Vector Graphics (~500 lines, 2,875Bytes)

Lossless Coding Theorem

The minimum bit rate that can be achieved by lossless coding of discrete memoryless source X is given by: min {R} = H(X) + ε bits per symbol where R is the transmission rate, H is the entropy of the source and ε is a positive quantity which can be made arbitrarily close to zero

Search Strategies in Motion Compensation

Three step search Cross search

Image Decoding

Work backwards: Decode Huffman/RLE Dequantize Data with same Q matrix Apply IDCT to transform back to Spatial Domain (some loss here because of limitations)

Intensity Value: Black and White Image

Two Values (0 and 1). This is a binary-value image

Video: Scanning

Two types. Progressive scanning: each frame is rendered completely in each display cycle. Interlaced scanning: each frame is split into two fields, and only one field (half of the frame's lines) is rendered in each cycle. Tradeoffs between speed and quality. NEW SCAN METHODS AVAILABLE.

Run-length Coding

Type: ENTROPY. Replaces repeated byte sequences with the byte and the number of occurrences. The number of occurrences is indicated by a special flag like "!". E.g.: BBBBBBBBB --> B!9. This technique is useful for images that have large regions of uniform color (or gray value). GIF (whose actual compression is LZW, a dictionary coder) benefits from the same kind of repetition, hence it is good for cartoon images but not for natural scenery images.
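
The "B!9" scheme can be sketched as follows. The minimum run length of 4 (below which a run is left as-is, since the flagged form would not be shorter) is an assumption, and the sketch assumes the flag character and digits do not occur in the data.

```python
def rle_encode(text, flag="!", min_run=4):
    """Replace runs of at least min_run equal characters by <char><flag><count>."""
    out = []
    i = 0
    while i < len(text):
        j = i
        while j < len(text) and text[j] == text[i]:
            j += 1                      # extend the run of identical characters
        run = j - i
        if run >= min_run:
            out.append(f"{text[i]}{flag}{run}")
        else:
            out.append(text[i] * run)   # short runs are cheaper left alone
        i = j
    return "".join(out)

print(rle_encode("BBBBBBBBB"))      # B!9
print(rle_encode("AABBBBBBBBBC"))   # AAB!9C
```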

FDCT

Used to convert spatial domain to frequency domain in an image

Compression Environments: Dialog Mode

User both sends/receives data. (Both transmitter and receiver)

Compression Technique: Entropy

Uses the statistics of the data itself to reduce redundancy. Ignores media semantics and human perceptual characteristics. (e.g., run-length coding, Huffman coding)

Structure of MPEG (Syntax Layers)

Video -> Sequence (group of pictures) -> Picture (or frame) -> Slice (e.g., a row of macroblocks in the picture) -> Macroblock (2 x 2 blocks) in slice -> Block (one block in the macroblock)

MJPEG

Video JPEG where each frame is coded using JPEG

YUV, YCrCb

Y = luminance (grayscale component); UV = color (chrominance) components. Chrominance = high frequency; luminance = low frequency. Humans are more sensitive to luminance, so we can afford to lose much more information in the chrominance components than in the luminance component. CUT CHROMINANCE FIRST. Typical choice: leave the luminance component alone at full resolution; reduce the chrominance components 2:1 horizontally and either 2:1 or 1:1 (no change) vertically. These alternatives are called 2h2v and 2h1v sampling. This immediately reduces the data size by 1/2 (2h2v) or 1/3 (2h1v).
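
The 1/2 and 1/3 savings follow from counting samples: Y stays at full resolution while each of the two chrominance planes shrinks by the horizontal and vertical factors. A small sketch (the function name is hypothetical):

```python
from fractions import Fraction

def relative_size(h_factor, v_factor):
    """Data size after chroma subsampling, as a fraction of the original.

    The original has three full-resolution planes (Y, U, V). After
    subsampling, Y is unchanged and U and V each keep
    1 / (h_factor * v_factor) of their samples.
    """
    chroma_plane = Fraction(1, h_factor * v_factor)
    return (1 + 2 * chroma_plane) / 3

size_2h2v = relative_size(2, 2)   # 1/2 of the original, i.e. reduced by 1/2
size_2h1v = relative_size(2, 1)   # 2/3 of the original, i.e. reduced by 1/3
print(size_2h2v, size_2h1v)
```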

Evolution of Computer Standards

from 16K color graphics to up to 786.4K with 256 colors.

What does an 8 x 8 DCT Block look like

The top-left corner is the DC value (uniform; called the DC coefficient). First row = vertical variations; first column = horizontal variations. Remember: 1 DC, 63 AC.
