Abstract:
In network publishing and printing, enormous numbers of TIFF (CMYK) images must be stored and transmitted, which demands large amounts of storage space and bandwidth. The need to handle this volume of data raises the problem of fast lossless compression. A 2D integer wavelet transform can be used for lossless compression of still images; for example, the 5/3 lifting wavelet is the lossless transform of JPEG2000. Multi-core CPUs (two, four, or eight cores) help accelerate the wavelet transform, but the speedup available from current multi-core CPUs is limited. This article presents the acceleration of the 2D integer wavelet transform with CUDA. On an NVIDIA graphics processing unit (GPU), multi-threaded parallelization offers advantages over traditional CPU computation. Using a dual-core CPU and a CUDA device, the article accelerates the Haar and 5/3 lifting wavelets on TIFF images. For the Haar wavelet, the original image matrix is compared with the matrix of the transform result. The comparison shows that four adjacent pixels of the original image directly determine the corresponding four pixels of the transform result, and that these four input pixels have no effect on any other pixels of the result. A parallel Haar wavelet transform can therefore be implemented in CUDA with a kernel whose unit of work is a group of four pixels. For the 5/3 lifting wavelet, there are four groups of experiments, each run with two CUDA memory types (global memory and texture memory), giving eight experiments in total. In the first group, the kernel performs only row transforms, combined with a transpose, and processes one row per unit of work. In the second, there is no transpose; the kernel performs both row and column transforms, again with one row per unit of work. The third also uses row transforms with a transpose, but the unit of work is a single pixel. The last performs row and column transforms without a transpose, with a single pixel as the unit of work.
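The four-pixel independence described for the Haar wavelet can be illustrated with a small CPU reference sketch. This is not the article's CUDA kernel. It assumes the integer Haar variant known as the S-transform (floor average and difference), a quadrant output layout, and an image with even width and height; all function names are hypothetical. Each call to `haar_block` reads only its own 2x2 block, which is the property that would let one GPU thread handle one block.

```python
def haar_pair(a, b):
    """Integer Haar (S-transform) of one pair: returns (average, difference)."""
    d = a - b
    s = b + (d >> 1)  # equals floor((a + b) / 2)
    return s, d


def haar_pair_inv(s, d):
    """Exact inverse of haar_pair."""
    b = s - (d >> 1)
    a = d + b
    return a, b


def haar_block(p00, p01, p10, p11):
    """Transform one 2x2 block using only its own four pixels."""
    s0, d0 = haar_pair(p00, p01)   # top row
    s1, d1 = haar_pair(p10, p11)   # bottom row
    avg, vert = haar_pair(s0, s1)  # column step on the row averages
    horz, diag = haar_pair(d0, d1) # column step on the row differences
    return avg, horz, vert, diag


def haar_block_inv(avg, horz, vert, diag):
    """Recover the four original pixels from the four coefficients."""
    s0, s1 = haar_pair_inv(avg, vert)
    d0, d1 = haar_pair_inv(horz, diag)
    p00, p01 = haar_pair_inv(s0, d0)
    p10, p11 = haar_pair_inv(s1, d1)
    return p00, p01, p10, p11


def haar_2d(img):
    """One-level transform; every loop iteration is independent of the others."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for by in range(0, h, 2):
        for bx in range(0, w, 2):
            avg, horz, vert, diag = haar_block(
                img[by][bx], img[by][bx + 1],
                img[by + 1][bx], img[by + 1][bx + 1])
            y, x = by // 2, bx // 2
            out[y][x] = avg                      # top-left quadrant
            out[y][x + w // 2] = horz            # top-right quadrant
            out[y + h // 2][x] = vert            # bottom-left quadrant
            out[y + h // 2][x + w // 2] = diag   # bottom-right quadrant
    return out
```

Because no block reads or writes data belonging to another block, the two nested loops in `haar_2d` could in principle be replaced by a grid of threads, one per block, with no synchronization between them.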
Experimental results on an NVIDIA GeF...
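The "row transform plus transpose" scheme mentioned for the 5/3 lifting wavelet can be sketched on the CPU as follows. This is a reference sketch only, not the article's kernels: it uses the standard reversible 5/3 lifting steps of JPEG2000 with symmetric boundary extension, assumes even width and height, and all function names are hypothetical.

```python
def lift53_1d(x):
    """Forward reversible 5/3 lifting on an even-length list: (low, high)."""
    n = len(x)
    half = n // 2
    high = []
    for i in range(half):
        left = x[2 * i]
        # Symmetric extension: past the end, mirror back to the last even sample.
        right = x[2 * i + 2] if 2 * i + 2 < n else x[2 * i]
        high.append(x[2 * i + 1] - ((left + right) >> 1))    # predict step
    low = []
    for i in range(half):
        prev = high[i - 1] if i > 0 else high[0]             # mirrored at the start
        low.append(x[2 * i] + ((prev + high[i] + 2) >> 2))   # update step
    return low, high


def unlift53_1d(low, high):
    """Exact inverse of lift53_1d."""
    half = len(low)
    x = [0] * (2 * half)
    for i in range(half):
        prev = high[i - 1] if i > 0 else high[0]
        x[2 * i] = low[i] - ((prev + high[i] + 2) >> 2)      # undo update
    for i in range(half):
        left = x[2 * i]
        right = x[2 * i + 2] if 2 * i + 2 < 2 * half else x[2 * i]
        x[2 * i + 1] = high[i] + ((left + right) >> 1)       # undo predict
    return x


def transpose(m):
    return [list(row) for row in zip(*m)]


def rows53(m):
    """Transform every row; each row is independent of the others."""
    result = []
    for row in m:
        low, high = lift53_1d(row)
        result.append(low + high)
    return result


def unrows53(m):
    half = len(m[0]) // 2
    return [unlift53_1d(row[:half], row[half:]) for row in m]


def lift53_2d(img):
    """Rows, transpose, rows, transpose: columns are handled as rows."""
    return transpose(rows53(transpose(rows53(img))))


def unlift53_2d(coeffs):
    return unrows53(transpose(unrows53(transpose(coeffs))))
```

The alternative without a transpose would apply the same lifting steps directly along columns. Which variant is faster on a GPU depends on memory access patterns, which this CPU sketch does not model.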
Published in: 2010 3rd International Symposium on Parallel Architectures, Algorithms and Programming
Date of Conference: 18-20 December 2010
Date Added to IEEE Xplore: 17 February 2011
Print ISBN: 978-1-4244-9482-8