# IMPLEMENTATION AND DEVELOPMENT OF HIGH ACCURACY H.265/HEVC STANDARD USING 45nm TECHNOLOGY

D. Raja Ramesh, Dr. H C Hadimani, Dr. Srinivasarao Udara

Research Scholar, Dept. of E&CE, G M Institute of Technology, Davangere, India

Professor & HOD, Dept. of E&CE, G M Institute of Technology, Davangere, India

Professor, Dept. of E&CE, S T J Institute of Technology, Ranebennur, India

rajaramesh09@gmail.com, hchadimani2017@gmail.com, srinivasarao\_udara@yahoo.com

## Abstract

H.265/HEVC is one of the most widely adopted video compression standards today, used in many applications because of its high speed, optimized area and good throughput. Major applications include communications, networking, image processing, digital signal processing and real-time video. The current design uses a modified inverse discrete cosine transform to perform video decompression. A detailed implementation is described to analyze the behavior of the encoding. In the encoder, the transform takes residual data and transforms it from the spatial domain to the frequency domain. This frequency domain representation can then be compressed, using quantization, while still retaining a high level of video quality when decoded and presented to the user. The encoder and decoder are implemented in Verilog and synthesized using the Cadence Genus tool. A hardware implementation of the inverse quantization and inverse transform, compliant with the HEVC standard, is developed. The current design targets the 4x4 inverse quantization and inverse transform, with synthesis and place and route using the NanGate 45 nm Open Cell Library. The digital implementation uses the Innovus tool, and the speed, area and throughput are analyzed for the target applications and compared to similar ASIC designs. The required operational frequency of this design supports 4K video at up to 40 frames/sec. The core area of this digital design is 10224 µm² and it can operate at a maximum frequency of 400 MHz.

Keywords: H.265/HEVC video compression, 45nm technology, speed, area, throughput.

## 1. Introduction

Video compression is a technique that reduces the size of video data by representing a large number of bits with fewer bits. Because it decreases the size of videos and frames, this process is also called encoding. As the process is reversible, its inverse is called decoding or decompression. The tools used to compress and decompress video are called encoders and decoders. They can be implemented in hardware or software, and the combined name for the pair is CODEC, a term that is often confused with data containers and video compression algorithms. Data containers are the packed, coded video files that can be played by codec software. A compression algorithm compresses the video into a specific coded format so that a codec can play it. If the file obtained after decompression is 100% identical to the original file, the compression is called lossless; most compression is lossy, so the video quality must be measured. An uncompressed video contains a huge amount of data, on the order of MBs, GBs or TBs. It is difficult to compress such a huge amount of data even with powerful computer systems. However, video compresses relatively well compared to other types of data, such as audio and text, because of the redundancy it contains. It is important to consider the relation between computational time and quality. If the computational time is low, the quality will be low as well; this happens only when the compression ratio is high. Different video compression standards exist for different compression ratios or bit rates.

The transmission medium, or communication channel, is the most important part of digital transmission. When a large amount of digital data passes through the channel, it may suffer from problems such as data loss, data redundancy, data corruption and noise. To avoid these issues, the data (numbers, letters, special characters and symbols) is arranged in a sequence such that it can be stored and transmitted efficiently. This technique is called encoding. As discussed above, this process is reversible and its counterpart is called decoding. There are different encoders and decoders depending on the type of data, such as audio, video, image and text. The literature survey in this research studies video encoders and decoders. The current H.265/HEVC video compression design was developed for real-time video applications and aims to improve area, speed and throughput for these applications.

## 2. Literature survey

The design of Ma [9] is purely combinational logic and thus has no clock. The gate count of Ma only includes one 1D transform and does not include the transpose memory. If we estimate expected memory size using the design of this work, a register based 8x8 memory would require approximately 9.5k gates and would bump the total gate count to roughly 25k gates for a serially processed design, i.e. the transpose memory in between two IDCT modules. In this work, the design re-uses the 1D transform to perform the 2D transform and thus achieves some area savings.

The design of Porto [10] operates at a very high throughput due to the discarding of columns of data and only performing one 1D inverse transform. It is unclear how effective this technique would be when paired with lower QP values as the proposed design only discusses high QP values. As mentioned previously, it is also difficult to assess human perceived quality from objective metrics. It is also unclear how effective the discarding of columns of data would be with larger transform sizes.

The proposal of Ziyou [11] does implement the full range of transform sizes but does so at a hefty gate count. Also, the design processes one coefficient at a time and so relies on a higher clock speed than the other designs. This high clock speed would likely come with higher power dissipation and possible EMC issues.

## 3. Proposed architecture

The implementation of the current design, with 4x4 inverse quantization and inverse transform, has three distinct design phases. The first is the RTL design, written in Verilog and verified with the Icarus tool and MATLAB. The second phase uses Incisive Compiler to synthesize the design and produce files for the last phase, digital implementation. Finally, placement and routing uses the files from the synthesis step to produce a placed and routed design. The latest Cadence Innovus was used for placement and routing. Figure 1 shows a block diagram of the main functional blocks of the complete design.



Figure 1: Block diagram of proposed design

### 3.1 RTL Design

The current design consists of two related but independent decoder functions, the inverse quantization module and the inverse discrete cosine transform (IDCT), which are examined separately. The discussion begins with the IDCT, then proceeds to the inverse quantization module and then the control unit. The IDCT takes inverse-quantized data, performs the 2D transform as two separate 1D transforms and outputs residual data. The decoder then adds the residual data to the predicted block, the same one originally used by the encoder, to reconstruct the pixel block. This process is repeated block by block to form the video frame.
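The first and last steps of this data path (inverse quantization, and adding the residual to the prediction) can be sketched in software. The Python sketch below follows the scaling process of the HEVC standard under stated assumptions: a flat scaling list (scaling factor 16), a 4x4 block and 8-bit samples. It is not the authors' Verilog, and the names `dequantize`, `reconstruct` and `clip` are illustrative.

```python
# Sketch only: flat scaling list (m = 16), 4x4 blocks, 8-bit samples assumed.
LEVEL_SCALE = [40, 45, 51, 57, 64, 72]  # HEVC levelScale table, indexed by qP % 6


def clip(value, low, high):
    """Clamp value to the inclusive range [low, high]."""
    return max(low, min(high, value))


def dequantize(levels, qp, bit_depth=8, log2_size=2):
    """Scale quantized levels back to 16-bit transform coefficients."""
    bd_shift = bit_depth + log2_size - 5          # 5 for a 4x4 block at 8 bits
    scale = (16 * LEVEL_SCALE[qp % 6]) << (qp // 6)
    rnd = 1 << (bd_shift - 1)                     # rounding offset
    return [
        [clip((lvl * scale + rnd) >> bd_shift, -32768, 32767) for lvl in row]
        for row in levels
    ]


def reconstruct(prediction, residual, bit_depth=8):
    """Add the residual to the predicted block and clip to the sample range."""
    max_val = (1 << bit_depth) - 1
    return [
        [clip(p + r, 0, max_val) for p, r in zip(p_row, r_row)]
        for p_row, r_row in zip(prediction, residual)
    ]
```

Because each increase of 6 in the quantization parameter adds one left shift, the step size doubles every 6 QP values, which is why the table needs only six entries.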

IDCT/IDST: the main function of this module is to inverse transform the data provided at its inputs. This module undoes the complementary function implemented in the encoder. The input data is in the frequency domain and the final output is in the spatial domain. Consider the DCT transform matrix shown in Figure 1 and defined by the HEVC standard. This matrix and its inverse are used in the calculation of the 4x4 DCT and IDCT respectively.
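As a software reference for the two 1D passes, the sketch below applies the 4x4 core transform matrix of the HEVC standard first to the columns and then to the rows, with the stage shifts of 7 and 20 minus the bit depth. It is a minimal model under stated assumptions (8-bit video by default, the 16-bit clipping between stages omitted), not the authors' RTL; the names `DCT4`, `inverse_1d` and `inverse_2d` are illustrative.

```python
# HEVC 4x4 core (DCT-like) transform matrix; row k is the k-th basis vector.
DCT4 = [
    [64,  64,  64,  64],
    [83,  36, -36, -83],
    [64, -64, -64,  64],
    [36, -83,  83, -36],
]


def inverse_1d(coeffs, shift):
    """One 1D inverse pass: weight the basis vectors, round and shift."""
    rnd = 1 << (shift - 1)
    return [
        (sum(DCT4[k][n] * coeffs[k] for k in range(4)) + rnd) >> shift
        for n in range(4)
    ]


def inverse_2d(block, bit_depth=8):
    """2D inverse transform of a 4x4 coefficient block as two 1D passes."""
    # First pass over the columns; cols[c][r] is the intermediate at (r, c).
    cols = [inverse_1d([block[r][c] for r in range(4)], 7) for c in range(4)]
    # Second pass over the rows of the intermediate result.
    shift2 = 20 - bit_depth
    return [inverse_1d([cols[c][r] for c in range(4)], shift2) for r in range(4)]
```

For example, a block whose only nonzero coefficient is a DC value of 64 yields a flat residual of 1 in every position, which is a convenient sanity check when comparing against a hardware simulation.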

## 4. Physical Layout and Fabricated Digital Decoder Design Results

To highlight the results of the proposed H.265/HEVC design, this section presents its synthesis and physical layout, which gave better results for some important applications. The synthesis and power reports of the proposed design are followed by the physical layout results. Existing methodologies are only simulated with basic simulation tools and analyzed as discussed above, whereas the current design shows efficient area, high speed and good throughput for real-time video applications.



Figure 2: Complete physical layout design

Design and verification: once the design is fully placed and routed and all timing violations are cleared, the next step is to perform verification on the design. These verification steps ensure that the design has no errors in placement or routing. All three verification tests passed with no violations. The tests are briefly described below.

DRC: This verification step checks basic rules such as the required widths of shapes on every layer, the spacing between objects on a layer and the enclosure requirements of features such as vias.

Geometry: Verifies the internal geometries of wires and objects. Nodes are wired together at the geometry level, which controls how geometry flows through the nodes, and nodes that modify the geometry generate new geometry.

Connectivity: Checks for antennas, opens, compatibility of input and output connections, loops and unconnected pins.

The comparisons highlight significant differences between the proposed solutions. For example, although the IDST is closely tied to the IDCT, most of the designs do not support this transform in their proposed architecture. In Table 1, the designs are compared to provide an overview of both performance and architecture. These metrics are meant to compare the supported functions and also each design's performance.

In the current design, the 4x4 hardware is re-used to perform an 8x8 inverse transform, so that two 4x4 IDCTs must be used to compute the 8x8. This re-use of hardware extends also to the 16x16 and 32x32 transforms. The 4K speed metric shows the clock frequency at which each design would have to operate in order to process 4K video. The maximum throughput metric is the pixel throughput of each design when operated at its maximum frequency.
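The relation between the two metrics can be illustrated with a short calculation. The sketch below assumes a 3840x2160 frame and counts luma samples only; `pixels_per_cycle` is a hypothetical parameter standing for the average number of pixels a design outputs per clock cycle, which the compared papers do not all state.

```python
def pixel_rate(width, height, fps):
    """Pixels per second that must be processed for a given video format."""
    return width * height * fps


def required_clock_hz(width, height, fps, pixels_per_cycle):
    """Clock frequency needed to sustain the pixel rate (the '4K speed')."""
    return pixel_rate(width, height, fps) / pixels_per_cycle


def max_throughput(max_clock_hz, pixels_per_cycle):
    """Pixel throughput when the design runs at its maximum frequency."""
    return max_clock_hz * pixels_per_cycle


# 4K video at 40 frames/sec, as targeted in the abstract (luma only).
rate_4k40 = pixel_rate(3840, 2160, 40)  # 331,776,000 pixels/s
```

A design that outputs one pixel per cycle would therefore need a clock of roughly 332 MHz for this format, and proportionally less if it processes several pixels in parallel.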

| Design       | Tech. | IDCT         | IDST | Inverse Quant. | Size (gate count) | Max. speed (MHz) | 4K speed (MHz) | Max. throughput (pixels/s) |
|--------------|-------|--------------|------|----------------|-------------------|------------------|----------------|----------------------------|
| [9]          | 130nm | 8×8          | no   | no             | 8.2k              | Comb.            | NA             | unknown                    |
| [10]         | 28nm  | 4×4          | no   | no             | NA                | 78.31            | 23.3           | 835M                       |
| [11]         | 65nm  | 4×4 to 32×32 | yes  | no             | 145.4k            | 500              | 412            | 410M                       |
| Current work | 45nm  | 4×4          | yes  | yes            | 18.4k             | 489              | 250            | 720M                       |

Table 1: Architecture and performance comparisons

## 5. Conclusion

The H.265/HEVC video compression standard in real-time digital video applications requires accuracy and throughput. This paper investigated the accuracy and area of two different implementations of encoders and decoders, including the modified transpose memory, control and inverse quantizer, as an approach to improve the video compression rate without using other existing techniques. The physical layout was designed in a standard cell design environment and its performance was measured in terms of area, speed and throughput. The proposed design has been implemented in 45 nm technology, with the physical layout of the H.265/HEVC design produced in the Cadence Innovus environment. It increases throughput by a minimum of 35%, reduces power by 22% and improves the accuracy function by 15%.

## References

- 1. J. G. Apostolopoulos, "Video Compression," Streaming Media System Group (MPEG), Feb. 2006.
- 2. J.-R. Ohm, G. J. Sullivan, H. Schwarz, T. K. Tan, and T. Wiegand, "Comparison of the Coding Efficiency of Video Coding Standards," Circuits and Systems for Video Technology, vol. 22, no. 12, pp. 1669-1684, Dec. 2012.
- 3. J. Ohm, W. Han, T. Wiegand and G. J. Sullivan, "Overview of the High Efficiency Video Coding (HEVC) Standard," IEEE Transactions on Circuits and Systems for Video Technology, vol. 22, no. 12, pp. 1649-1668, 2012.
- 4. ITU-T and ISO/IEC, "ITU-T Rec. H.264 and ISO/IEC 14496-10: Advanced Video Coding for Generic Audio-Visual Services," ITU-T and ISO/IEC, 2003.
- 5. T. Wiegand, G. Sullivan, G. Bjontegaard and A. Luthra, "Overview of the H.264/AVC Video Coding Standard," IEEE Transactions on Circuits and Systems for Video Technology, vol. 13, no. 7, pp. 560-576, 2003.
- 6. ITU-T and ISO/IEC, "ITU-T Rec. H.265 and ISO/IEC 23008-2: High Efficiency Video Coding," ITU-T and ISO/IEC, 2013.
- 7. B. Juurlink, M. Alvarez-Mesa and C. Chi, "Understanding the Application: An Overview of the H.264 Standard," in Scalable Parallel Programming Applied to H.264/AVC Decoding, Springer, 2012, pp. 5-15.
- 8. G. Pastuszak and A. Abramowski, "Algorithm and Architecture Design of the H.265/HEVC Intra Encoder," IEEE Transactions on Circuits and Systems for Video Technology, vol. 26, no. 1, pp. 210-222, 2016.
- 9. T. Ma, C. Liu, Y. Fan and X. Zeng, "A Fast 8×8 IDCT Algorithm for HEVC," in 2013 IEEE 10th International Conference on ASIC, Shenzhen, 2013.
- 10. M. Porto et al., "Hardware Design of Fast HEVC 2-D IDCT Targeting Real-Time UHD 4K Applications," in 2015 IEEE 6th Latin American Symposium on Circuits & Systems (LASCAS), Montevideo, 2015.
- 11. Y. Ziyou et al., "Area and Throughput Efficient IDCT/IDST Architecture for HEVC Standard," in 2014 IEEE International Symposium on Circuits and Systems, Melbourne, 2014.