论文部分内容阅读
This paper introduced a novel high performance algorithm and VLSI architectures for achieving bit plane coding (BPC) in word level sequential and parallel mode. The proposed BPC algorithm adopts the techniques of coding pass prediction and par- allel & pipeline to reduce the number of accessing memory and to increase the ability of concurrently processing of the system, where all the coefficient bits of a code block could be coded by only one scan. A new parallel bit plane architecture (PA) was proposed to achieve word-level sequential coding. Moreover, an efficient high-speed architecture (HA) was presented to achieve multi-word parallel coding. Compared to the state of the art, the proposed PA could reduce the hardware cost more efficiently, though the throughput retains one coefficient coded per clock. While the proposed HA could perform coding for 4 coefficients belonging to a stripe column at one intra-clock cycle, so that coding for an N×N code-block could be completed in approximate N2/4 intra-clock cycles. Theoretical analysis and ex- perimental results demonstrate that the proposed designs have high throughput rate with good performance in terms of speedup to cost, which can be good alternatives for low power applications.
This paper introduced a novel high performance algorithm and VLSI architectures for achieving bit plane coding (BPC) in word level sequential and parallel mode. The proposed BPC algorithm adopts the techniques of coding pass prediction and par- allel & pipeline to reduce the number of accessing memory and to increase the ability of concurrently processing of the system, where all the coefficient bits could be coded by only one scan. A new parallel bit plane architecture (PA) was proposed to achieve word-level sequential coding. Compared to the state of the art, the proposed PA could reduce the hardware cost more efficiently, though the throughput retains one coefficient coded per clock. While the proposed HA could perform coding for 4 coefficients belonging to a stripe column at one intra-clock cycle, so that coding for an N × N code-block could be completed in approx Theoretical analysis and ex-perimental results demonstrate that the proposed designs have high throughput rate with good performance in terms of speedup to cost, which can be good alternatives for low power applications.