BANDWIDTH COMPRESSION

DIGITAL S COMPONENT DIGITAL VIDEO FORMAT

Yoshimichi Nagaoka, Victor Co. of Japan
Kiyoshi Honma, Victor Co. of Japan
David Gifford, Victor Co. of Japan
Neil Neubert, JVC Professional Products Co., USA

DIGITAL S VIDEO TAPE RECORDING FORMAT AND PROCESSING

Introduction

The JVC Digital S video tape recording format combines a unique dual digital video compression processing system with a one-half-inch video tape transport adapted and highly upgraded from the S-VHS video tape transport. The upgraded transport provides rugged and reliable service when used with metal particle videotape and running at higher tape and drum rotation speeds. The processing and transport combination yields a high performance but low cost digital video tape recording system. The Digital S video tape recording format has been standardized by the Society of Motion Picture and Television Engineers and carries the SMPTE digital VCR designation D-9.

A Digital S (D-9) compressed digital video interface has been developed to permit the connection of D-9 signals between VCRs and other video equipment without the need to decode and encode between non-compressed digital video formats (ITU-R BT.601).

D-9 VIDEO TAPE TRACKS

D-9 Video Tape Tracks

D-9 employs a two-track, parallel recording system with a pair of heads located on each side of the drum, and aligned exactly opposite each other (180°). A single frame of video is recorded on 10 tracks (5 track pairs) for 525/60, and 12 tracks (6 track pairs) for 625/50 television systems. A track drawing is shown below.

D-9 VIDEO TAPE TRACKS

D-9 HELICAL VIDEO TAPE TRACK SECTORS

The detailed D-9 videotape track pattern is shown in the figure below. The compressed digital video data, non-compressed digital audio data, and subcode (including system data) are written on individual sectors along the helical tracks by the rotating heads. Guard band gaps are provided between all of the sectors to permit independent editing of each sector.

D-9 employs ½ inch metal particle videotape housed in a videocassette that is very similar to an S-VHS videocassette. The D-9 tape transport is an upgrade and adaptation of the S-VHS tape transport. Adaptation of the video cassette and tape transport for D-9 results in economy, time proven quality and reliability, and playback compatibility for analog S-VHS videotapes. Certain D-9 players will accept and play back S-VHS, as well as D-9 videotapes.

D-9 Video Tape

D-9 uses modern, high performance dual coat metal particle videotape. The tape uses ultra-fine metal particles for reliable, high output, low error, high-density recording. Specifications include coercivity of approximately 1830 Oe. The D-9 videocassette is very similar to an S-VHS videocassette. It uses a high precision cassette shell, a static electricity free lid, and a shield that prevents entry of dust while the cassette is not in use. Maximum recording time with 14.4 µm thick tape is 104 minutes.

D-9 VIDEO COMPRESSION AND SIGNAL PROCESSING

D-9 Processing

D-9 applies dual compression sections in a form of parallel compression processing to simultaneously encode (compress) equal halves of the input video signal. A block diagram illustrating one example of D-9 encoding is shown below.

D-9 - POSSIBLE IMPLEMENTATION

D-9 utilizes the DCT (Discrete Cosine Transform) process to transform an 8H × 8V pixel arrangement to an 8 × 8 matrix of DCT coefficients. DCT blocks of an 8 × 8 dimension are used for luminance (Y) and for each color component, (R-Y) and (B-Y). D-9 DCT blocks are combined to form macro blocks. One D-9 macro block is made up of four DCT blocks: two luminance (Y) DCT blocks, and one DCT block each for the (R-Y) and (B-Y) color components. One macro block is the basic element of the error concealment process in D-9. Macro blocks are combined to form super blocks. Twenty-seven adjacent macro blocks, 9 horizontal × 3 vertical, form a single super block. D-9 DCT, macro and super blocks are illustrated in the drawing below.

D-9 - PIXELS TO DCT BLOCKS TO MACRO BLOCKS TO SUPER BLOCKS TO TAPE TRACKS
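
The block hierarchy described above can be checked with a little arithmetic. The sketch below is only an illustration: the active picture sizes (720 × 480 for 525/60, 720 × 576 for 625/50) and the 16 pixel × 8 line macro block footprint are assumptions inferred from the two side-by-side luminance blocks, not figures stated in this paper.

```python
# Assumptions (not stated in the text above): a 720-pixel-wide active picture
# with 480 lines (525/60) or 576 lines (625/50), and a macro block footprint of
# 16 pixels x 8 lines (two side-by-side 8 x 8 luminance blocks).
DCT_SIZE = 8
Y_BLOCKS_PER_MACRO_BLOCK = 2
MACRO_BLOCKS_PER_SUPER_BLOCK = 9 * 3   # 9 horizontal x 3 vertical, from the text


def block_hierarchy(width, height, tracks_per_frame):
    """Count macro blocks and super blocks for one frame."""
    mb_width = DCT_SIZE * Y_BLOCKS_PER_MACRO_BLOCK   # 16 pixels
    mb_height = DCT_SIZE                             # 8 lines
    macro_blocks = (width // mb_width) * (height // mb_height)
    super_blocks = macro_blocks // MACRO_BLOCKS_PER_SUPER_BLOCK
    return {
        "macro_blocks": macro_blocks,
        "super_blocks": super_blocks,
        "super_blocks_per_track": super_blocks // tracks_per_frame,
    }


ntsc = block_hierarchy(720, 480, tracks_per_frame=10)
pal = block_hierarchy(720, 576, tracks_per_frame=12)
print(ntsc)   # 2700 macro blocks, 100 super blocks, 10 per track
print(pal)    # 3240 macro blocks, 120 super blocks, 10 per track
```

Under these assumptions both television systems come out to the same number of super blocks per track, which is consistent with the 10-track and 12-track frame layouts given earlier.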

Super blocks are arranged in the video frame so that linear sequences of them, representing coherent and continuous horizontal rows in the video frame, can be recorded on the video tape tracks to permit good visible picture display in shuttle, search, and slow motion playback modes. D-9 provides good picture display at search speeds up to 32 times normal playback speed in forward and reverse, and high quality slow motion playback between ± ¹/₃ times normal playback speed.

D-9 writes six track pairs to record a 625/50 video frame on tape. Track pairs are written by two head pairs, located 180 degrees apart on the head drum. The heads in each pair are of opposite azimuth angles. The first track of a pair possesses a positive azimuth angle, and the second, a negative azimuth angle. Super blocks representing the same portion of every video frame would, therefore, be recorded on, and played back from the same tracks, by the same head pairs, for every video frame since there are two head pairs, and six is an even number of track pairs. Consequently, odd and even rows of super blocks are alternately recorded on the first (positive) and second (negative) azimuth tracks of succeeding frames so that the played back picture may be renewed every two frames in the event of the failure of a single record or playback head.

Five track pairs, an odd number, are written by the same two head pairs, an even number, for 525/60 systems. Thus, each track is recorded and played back by the opposite head pair for each succeeding video frame, and alternate odd-even super block recording is not required.
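
The odd/even argument above can be made concrete with a short sketch. It assumes the two head pairs simply take turns writing consecutive track pairs, and it uses the nominal frame rates 30000/1001 and 25 frames per second; the function names are hypothetical.

```python
from fractions import Fraction

HEAD_PAIRS = 2  # two head pairs, 180 degrees apart on the drum


def head_pair_for(frame, track_pair, pairs_per_frame):
    """Index (0 or 1) of the head pair that writes a given track pair,
    assuming the head pairs alternate on consecutive track pairs."""
    absolute_pair = frame * pairs_per_frame + track_pair
    return absolute_pair % HEAD_PAIRS


def drum_revs_per_second(pairs_per_frame, frame_rate):
    """Each revolution lets every head pair write one track pair."""
    return Fraction(pairs_per_frame, HEAD_PAIRS) * frame_rate


for name, pairs, rate in (("525/60", 5, Fraction(30000, 1001)),
                          ("625/50", 6, Fraction(25))):
    # Which head pair writes the first track pair of frames 0..3?
    first_pair_heads = [head_pair_for(f, 0, pairs) for f in range(4)]
    rpm = float(drum_revs_per_second(pairs, rate) * 60)
    print(name, first_pair_heads, round(rpm, 1))
# 525/60 [0, 1, 0, 1] 4495.5
# 625/50 [0, 0, 0, 0] 4500.0
```

With five track pairs per frame the head pair alternates from frame to frame on its own; with six it never changes, which is why the 625/50 system swaps odd and even super block rows instead.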

D-9 Shuffling

D-9 SHUFFLE PATTERN AND VIDEO SEGMENT FORMATION

D-9 creates a sequential stream of digital video data "segments" as input to the compression process. A segment is made up of five macro blocks that have been selected from all over the video frame by the shuffling process. Shuffling strives to assure that the amount of data contained in all of the video segments is as equal as possible. The assembly of "shuffled" macro blocks into video segments for input to the encoder is illustrated above.
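
To give a feel for what "selected from all over the video frame" means, here is a purely illustrative shuffle. It is not the D-9 shuffle pattern, which is defined in the figure and the standard; the 45 × 60 macro block grid, the five vertical strips, and the row rotation are all assumptions made for the example.

```python
MB_COLUMNS, MB_ROWS = 45, 60          # assumed 525/60 macro block grid (720/16 x 480/8)
STRIPS = 5                            # one macro block per strip goes into each segment
STRIP_COLUMNS = MB_COLUMNS // STRIPS  # 9 macro block columns per strip
MBS_PER_STRIP = STRIP_COLUMNS * MB_ROWS


def illustrative_segment(index):
    """Return five (column, row) macro block positions for segment `index`.

    Hypothetical pattern: one macro block from each vertical strip, with a
    different starting row in every strip so the picks are spread out.
    """
    segment = []
    for strip in range(STRIPS):
        local = (index + strip * (MBS_PER_STRIP // STRIPS)) % MBS_PER_STRIP
        row, col_in_strip = divmod(local, STRIP_COLUMNS)
        segment.append((strip * STRIP_COLUMNS + col_in_strip, row))
    return segment


segments = [illustrative_segment(i) for i in range(MBS_PER_STRIP)]
print(len(segments))   # 540 segments of five macro blocks each
print(segments[0])     # [(0, 0), (9, 12), (18, 24), (27, 36), (36, 48)]
```

Because every segment mixes macro blocks from distant parts of the picture, detailed and flat areas tend to average out, which is the goal the paragraph above describes.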

D-9 Compressed Video Segment

The D-9 dual compression system can process macro blocks containing six DCT blocks, four luminance (Y), and two color component (R-Y) and (B-Y) in each of the parallel compression processes. D-9 macro blocks, however, contain four DCT blocks, possessing only two, instead of the four, luminance (Y) DCT blocks that can be processed. D-9 adds two new DCT blocks to each of its macro blocks so that six DCT blocks are contained in the macro block that is to be processed.

AC coefficient data samples for each DCT block are stored in the areas reserved for them within their associated DCT blocks. AC coefficient data that exceeds the storage capacity of its DCT block can be stored in vacant AC coefficient capacity in other DCT blocks within the same macro block, or even within the same video segment. The two new DCT blocks that D-9 adds to each of its macro blocks prior to processing are used to supply additional AC coefficient data storage capacity, thus permitting more high frequency luminance and color components of the image to be recorded. The capacity of the additional new DCT blocks can be used to store excess AC coefficient data from any of the original four DCT blocks in the same macro block, and excess data from other DCT blocks contained in the same video segment. The following illustration shows the added DCT blocks and how five macro blocks are combined to form one compressed video segment.
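
The spill-over rule described above can be sketched as a three-pass allocation: own area first, then the rest of the macro block, then the rest of the video segment. The byte capacities used below are placeholder values chosen for the example, and `pack_segment` is a hypothetical helper, not part of any D-9 implementation.

```python
def pack_segment(needed, capacity):
    """needed[m][b] / capacity[m][b]: bytes wanted / available for DCT block b
    of macro block m. Returns the number of bytes that could not be stored."""
    free = [[0] * len(mb) for mb in capacity]
    overflow = [0] * len(needed)           # leftover bytes per macro block
    # Pass 1: each DCT block fills its own area first.
    for m, blocks in enumerate(needed):
        for b, want in enumerate(blocks):
            stored = min(want, capacity[m][b])
            free[m][b] = capacity[m][b] - stored
            overflow[m] += want - stored
    # Pass 2: spill into free space elsewhere in the same macro block.
    for m in range(len(needed)):
        for b in range(len(free[m])):
            moved = min(overflow[m], free[m][b])
            free[m][b] -= moved
            overflow[m] -= moved
    # Pass 3: spill what is left into free space anywhere in the segment.
    remaining = sum(overflow)
    for m in range(len(free)):
        for b in range(len(free[m])):
            moved = min(remaining, free[m][b])
            free[m][b] -= moved
            remaining -= moved
    return remaining


# Block order per macro block: Y, Y, added, added, R-Y, B-Y (placeholder sizes).
CAPACITY = [14, 14, 14, 14, 10, 10]
busy = [30, 30, 0, 0, 15, 15]    # detailed picture area
flat = [5, 5, 0, 0, 3, 3]        # smooth picture area

print(pack_segment([busy] + [flat] * 4, [CAPACITY] * 5))   # 0: everything fits
print(pack_segment([busy] * 5, [CAPACITY] * 5))            # 70: bytes left over
```

The first call shows one busy macro block borrowing room from its two added DCT blocks and then from its flat neighbours; the second shows what remains unstored when every macro block in the segment is busy.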

D-9 Compression Processing

D-9 employs conventional DCT transformation of the time based digital video data to a stream of frequency based DCT coefficients. DCT coefficient weighting is applied before further processing. Initial scaling transforms the AC coefficients from 10 to 9 bits by rounding them off. DCT blocks are classified into one of four groups determined by the maximum absolute value of the AC coefficients each contains.
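
As a minimal sketch of the transform and classification steps, the code below computes an orthonormal 8 × 8 DCT in plain Python and sorts a block into one of four classes by its largest absolute AC coefficient. The weighting and 10-to-9-bit scaling steps are omitted, and the class thresholds are illustrative placeholders, since the paper does not give the actual limits.

```python
import math

N = 8


def dct_8x8(block):
    """Orthonormal 2-D DCT-II of an 8 x 8 list of pixel values."""
    def scale(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

    out = [[0.0] * N for _ in range(N)]
    for v in range(N):
        for u in range(N):
            total = 0.0
            for y in range(N):
                for x in range(N):
                    total += (block[y][x]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[v][u] = scale(u) * scale(v) * total
    return out


def max_abs_ac(coeffs):
    """Largest absolute coefficient, excluding the DC term at [0][0]."""
    return max(abs(coeffs[v][u])
               for v in range(N) for u in range(N) if (u, v) != (0, 0))


def classify(coeffs, thresholds):
    """Class 0..3: how many of the three ascending thresholds the peak exceeds."""
    peak = max_abs_ac(coeffs)
    return sum(peak > t for t in thresholds)


EXAMPLE_THRESHOLDS = (11, 23, 35)   # placeholders, not taken from the paper

flat_block = [[100] * N for _ in range(N)]
edge_block = [[0] * 4 + [255] * 4 for _ in range(N)]

print(classify(dct_8x8(flat_block), EXAMPLE_THRESHOLDS))   # 0
print(classify(dct_8x8(edge_block), EXAMPLE_THRESHOLDS))   # 3
```

A flat block puts all of its energy into the DC coefficient and lands in the lowest class, while a sharp vertical edge produces large AC coefficients and lands in the highest.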

Estimation is a process of selecting the quantization factor to be applied to each D-9 video segment (5 macro block sequence). Output data cannot exceed a maximum size of 385 bytes for each video segment. The quantization factor is chosen by a process of estimating the output data from the DCT AC coefficient data present at the quantizer input for each video segment. Factors are chosen to achieve the least compression and, therefore, the best possible picture quality within the output limit of 385 bytes.
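
The estimation idea can be sketched as a search for the finest quantization that still fits the 385-byte limit. Everything except that limit is an assumption here: the candidate step sizes, the toy size estimator (a stand-in for the real variable length code), and the 2700 macro blocks per 525/60 frame used for the rate check.

```python
from fractions import Fraction

SEGMENT_LIMIT_BYTES = 385          # from the text
MACRO_BLOCKS_PER_SEGMENT = 5       # from the text


def toy_estimate_bytes(coefficients, step):
    """Hypothetical size estimate: short codes for zeros, longer for big values."""
    bits = 0
    for c in coefficients:
        q = abs(c) // step
        bits += 1 if q == 0 else q.bit_length() + 2
    return (bits + 7) // 8


def choose_step(coefficients, candidate_steps, limit=SEGMENT_LIMIT_BYTES):
    """Return the smallest step (least compression) whose estimate fits the limit."""
    for step in sorted(candidate_steps):
        if toy_estimate_bytes(coefficients, step) <= limit:
            return step
    return max(candidate_steps)


segment = [200, -150, 90, 40, 12, 5, 3, 1] * 60
print(toy_estimate_bytes(segment, 1))        # 413, too large
print(choose_step(segment, (1, 2, 4, 8)))    # 2

# Rate check, assuming 2700 macro blocks per 525/60 frame.
segments_per_frame = 2700 // MACRO_BLOCKS_PER_SEGMENT
bits_per_second = segments_per_frame * SEGMENT_LIMIT_BYTES * 8 * Fraction(30000, 1001)
print(round(float(bits_per_second) / 1e6, 2))   # 49.85 (Mbit/s of compressed video)
```

Note that 385 bytes is exactly 5 × 77, that is, 77 bytes for each of the five macro blocks in a segment, which matches the 77 data bytes carried by one sync block.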

The applied quantization factor, that is, the actual number that the DCT AC coefficient data is divided by in the quantizer, is determined by the estimation process. Four defined frequency dependent areas of the 8 × 8 DCT blocks can be quantized by different divisors, and there are nine possible divisor combinations for these four areas. The divisors are determined by the class number of the DCT data in each video segment and the QNO, or quantization number from 0 to 15, that assures the output data will be 385 bytes or less. One figure below defines the areas of the 8 × 8 DCT coefficient matrix. The other figure below shows the nine quantization (QTZ) divisor combinations and the areas within the 8 × 8 DCT coefficient matrix to which they are applied.

D-9 DCT BLOCK AREAS

D-9 QUANTIZE (QTZ) TABLES

Variable length coding utilizes a two-dimensional Huffman code. Code word length is determined by the relationship between the "run", the length of consecutively recurring zeroes, and the "amp", the amplitude of any data value other than zero that immediately follows a "run" of zeroes in each quantized DCT block.
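
The (run, amp) pairing that feeds the Huffman code can be sketched as follows. The Huffman table itself is not reproduced in this paper, so the example stops at the pairs; the function name and the handling of trailing zeroes are assumptions.

```python
def run_amp_pairs(ac_coefficients):
    """Turn a scanned sequence of quantized AC values into (run, amp) pairs.

    Returns the pairs plus the count of trailing zeroes, which a real coder
    would typically signal with an end-of-block code (an assumption here).
    """
    pairs, run = [], 0
    for value in ac_coefficients:
        if value == 0:
            run += 1
        else:
            pairs.append((run, value))
            run = 0
    return pairs, run


pairs, trailing = run_amp_pairs([5, 0, 0, -3, 1, 0, 0, 0, 2, 0, 0])
print(pairs)      # [(0, 5), (2, -3), (0, 1), (3, 2)]
print(trailing)   # 2
```

Each pair would then be looked up in the two-dimensional code table, with frequent combinations (short runs, small amplitudes) receiving the shortest code words.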

D-9 Video/Audio Sector

Data structure of the video and audio sectors is illustrated in the figures that follow. Digital audio is not compressed in the D-9 system. Sync block structure is the same for video and audio and consists of a sync area of two bytes, ID code of three bytes, data of 77 bytes and inner parity of 8 bytes. Total is 90 bytes. Reed-Solomon code (85, 77) is used for the inner code and is the same for both video and audio. Outer parity is Reed-Solomon (149, 138) for video, and Reed-Solomon (14, 9) for audio.
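
The sync block and error correction figures above can be cross-checked with a few lines of arithmetic. The sketch relies on the standard Reed-Solomon property that a code with n − k parity symbols can correct up to (n − k)/2 symbol errors or n − k erasures; `rs_summary` is a hypothetical helper.

```python
SYNC, ID, DATA, INNER_PARITY = 2, 3, 77, 8   # bytes, from the text


def rs_summary(n, k):
    """Basic properties of a Reed-Solomon (n, k) code."""
    parity = n - k
    return {
        "parity": parity,
        "max_errors": parity // 2,     # correctable symbol errors
        "max_erasures": parity,        # correctable erasures (known positions)
        "code_rate": round(k / n, 3),
    }


sync_block_bytes = SYNC + ID + DATA + INNER_PARITY
print(sync_block_bytes)          # 90
print(DATA + INNER_PARITY)       # 85, the inner code word length

for name, (n, k) in {"inner": (85, 77),
                     "outer video": (149, 138),
                     "outer audio": (14, 9)}.items():
    print(name, rs_summary(n, k))
```

The inner code word of 85 bytes is the 77 data bytes plus the 8 inner parity bytes, so the inner code can correct up to four byte errors in each sync block.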

D-9 Modulation

The D-9 modulation scheme employs a combination of randomized bit streams and interleaved NRZI coding. These offer less influence on low frequencies than do NRZI methods that utilize PR4 coding. The modulation process is completed by applying the 24-25 transform. The result is a high performance modulation circuit built with less circuitry than that required for other block modulation methods such as 8-14, for instance.

Data is randomized using the M-sequence (maximal-length sequence) polynomial X⁷+X³+1. 24-25 modulation inserts an extra bit at the beginning of three consecutive randomized bytes, resulting in a 25 bit code word. The extra bit can be a "1" or "0" and is chosen to prevent long consecutive sequences of like bits, that is, to maintain the AC characteristic of the modulation signal.
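
The following is a simplified sketch of these modulation steps, not the method defined by the standard. It assumes one common reading of the polynomial as a 7-stage shift register with taps at stages 3 and 7, an interleaved NRZI precoder of the form y[n] = x[n] XOR y[n-2], and a selection rule that picks the extra bit giving the smaller running digital sum. The seed, the processing order, and the selection rule are all assumptions.

```python
from itertools import islice


def m_sequence(seed=0x7F):
    """Endless bit stream from a 7-stage LFSR: s[n] = s[n-3] XOR s[n-7]."""
    state = [(seed >> i) & 1 for i in range(7)]   # state[0] is the newest bit
    while True:
        bit = state[2] ^ state[6]
        yield bit
        state = [bit] + state[:-1]


def interleaved_nrzi(bits, history=(0, 0)):
    """Precoder y[n] = x[n] XOR y[n-2], starting from two history bits."""
    out = list(history)
    for b in bits:
        out.append(b ^ out[-2])
    return out[2:]


def digital_sum(bits):
    """+1 for every one, -1 for every zero; zero means no DC content."""
    return sum(1 if b else -1 for b in bits)


def encode_24_25(three_bytes, running_sum=0, history=(0, 0)):
    """Prefix 24 data bits with the extra bit that keeps the digital sum smaller."""
    data_bits = [(byte >> (7 - i)) & 1 for byte in three_bytes for i in range(8)]
    candidates = []
    for extra in (0, 1):
        recorded = interleaved_nrzi([extra] + data_bits, history)
        candidates.append((abs(running_sum + digital_sum(recorded)), extra, recorded))
    _, extra, recorded = min(candidates)
    return extra, recorded


sequence = list(islice(m_sequence(), 254))
print(sequence[:127] == sequence[127:])    # True: the sequence repeats every 127 bits

extra, recorded = encode_24_25(b"\x00\x00\x00")
print(extra, digital_sum(recorded))        # 1 1
```

In the all-zero example, an extra bit of "0" would record 25 identical bits, while an extra bit of "1" turns the word into an alternating pattern, which illustrates why the choice of the extra bit matters for the low-frequency content of the recorded signal.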

D-9 INTERFACE

D-9 Interface

D-9 provides a compressed digital data interface element known as the DIF packet. The D-9 DIF packet is specified by the SMPTE 314M Standard. A stream of DIF packets can be used to connect compressed D-9 signals between D-9 VCRs without the need to decode to, and encode from ITU-R BT.601. The DIF packet stream can be mapped using the SMPTE 321M Standard, to the SMPTE 305M SDTI (Serial Data Transport Interface) and transported over the conventional serial digital interface (SMPTE 259M / ITU-R BT.656) that is in common use. The DIF packet stream can be formatted to other digital interfaces such as Fibre Channel and ATM (Asynchronous Transfer Mode), as well.

D-9 COMPRESSED DIGITAL VIDEO AND AUDIO (NON-COMPRESSED) INTERFACE

The data in one video frame is divided into 10 DIF sequences for 525/60 systems and 12 DIF sequences for 625/50 systems. Each DIF sequence contains a header, subcode, video auxiliary (VAUX), and audio/video data section and is made up of 300 DIF blocks. The SMPTE 314M Standard specifies this data structure in detail. The transmission rate is (300 ÷ 1.001) DIF sequences per second for 525/60 systems, and 300 DIF sequences per second for 625/50 systems.
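
The DIF sequence figures above imply the frame rate and the interface data rate. The sketch below assumes an 80-byte DIF block, as defined in SMPTE 314M; that size is not stated in this paper, and `dif_rates` is a hypothetical helper.

```python
from fractions import Fraction

DIF_BLOCKS_PER_SEQUENCE = 300   # from the text
DIF_BLOCK_BYTES = 80            # assumption: block size per SMPTE 314M


def dif_rates(sequences_per_frame, sequences_per_second):
    """Return (frames per second, Mbit/s) for a DIF packet stream."""
    frames_per_second = sequences_per_second / sequences_per_frame
    blocks_per_second = sequences_per_second * DIF_BLOCKS_PER_SEQUENCE
    mbit_per_second = blocks_per_second * DIF_BLOCK_BYTES * 8 / 1_000_000
    return float(frames_per_second), float(mbit_per_second)


print(dif_rates(10, Fraction(300_000, 1001)))   # 525/60: about 29.97 frames/s, 57.54 Mbit/s
print(dif_rates(12, Fraction(300)))             # 625/50: (25.0, 57.6)
```

Both systems arrive at nearly the same interface rate, because the longer 625/50 frame (12 sequences) is sent proportionally less often than the 525/60 frame (10 sequences).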

D-9 SUMMARY

D-9 Summary Specifications

D-9 SPECIFICATION TABLE

The table above provides a summary of the most important D-9 specifications. The D-9 videotape recording format was designed to satisfy the requirements of many video recording applications. D-9 possesses exceptional features, performance, and specifications that will satisfy the video recording needs and requirements of tele-producers, broadcasters, and video professionals alike. In addition, D-9 promises to satisfy all such requirements at a significant new level of economy, unknown to digital videotape recording before now. Finally, the D-9 compressed digital video interface offers a much needed method of transferring compressed D-9 video signals between VCRs and other video equipment without the need to decode and encode between non-compressed digital video formats.