Hybrid System for Speech Coding Based on Contourlet Transformation


ABSTRACT
The main objective of speech coding is to transmit speech over a digital channel with the highest possible quality at the lowest possible bit rate, in addition to serving security purposes.
In this paper, the speech signal, which is usually one-dimensional, is first converted into a two-dimensional array so that the contourlet transform can be applied to it. The EZC (Embedded Zerotree Contourlet) algorithm is then applied to the transform coefficients, Huffman coding is applied to the EZC output, and finally RLE (Run Length Encoding) is used.
This approach provides coding and compression with accurate recovery of the signal. Several quality measures were computed for the reconstructed signal, and the results show high similarity between the original and reconstructed signals.

1-Introduction
A digital signal is a sequence of binary bits, and it is desirable to represent it by a code that uses as few bits as possible. Fewer bits mean less memory for storage, and less bandwidth and a lower bit rate for transmission. Although bandwidth is plentiful in wired communication systems, it is still limited in wireless systems, so the signal must be transmitted with a small number of bits per sample. Cost, delay, and complexity are further reasons. If less memory is used, the cost falls, and at lower bit rates the required channel capacity is also reduced [1]. In this work, the contourlet transform is used for coding.
Although speech is one-dimensional, the contourlet transform is applied to it after conversion to two dimensions. EZC is then applied, Huffman coding is applied to the EZC output, and finally RLE is used. The results obtained are good.

2-Principles of EZC
A new compression method based on the Contourlet Transform is used. The method is EZC, which relies on the Embedded Zerotree for Contourlet (ETC). The relationship between parent and child Contourlet Transform coefficients is similar to that between wavelet coefficients. In the Contourlet Transform, two different parent-child relationships can be assumed, depending on the number of directional decompositions in the contourlet subbands. In one of these cases, the four children of a parent coefficient lie in two adjacent directional subbands, as shown in Figure (1) [2].

3-ETC or EZC algorithm
The following steps summarize the ETC algorithm.
First step: The Contourlet Transform coefficients of the entire 2-D image are computed.
Second step: ETC is applied progressively to the coefficients. All the coefficients are scanned from top to bottom:
-If the leaves are less than the root, they are set to zero.
-If the leaves are greater than the root, they are left unchanged.
Third step: Huffman coding is applied [3].
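The second step above can be sketched as follows. This is only an illustration under stated assumptions, not the implementation used in the paper: the comparison is made on magnitudes, leaves equal to the root are kept, and the names `ezc_prune`, `coeffs`, and `children` are hypothetical.

```python
def ezc_prune(coeffs, children):
    """Zero every leaf that is smaller in magnitude than its root.

    coeffs   : dict mapping a node label to its coefficient value
    children : dict mapping a root label to the labels of its leaves
    Assumption: "less than" is read as a comparison of magnitudes.
    """
    pruned = dict(coeffs)  # work on a copy, keep the input intact
    for root, leaves in children.items():
        for leaf in leaves:
            # Compare against the original root value, not a pruned one.
            if abs(coeffs[leaf]) < abs(coeffs[root]):
                pruned[leaf] = 0
    return pruned


coeffs = {"p": 5, "a": 7, "b": -2, "c": 5, "d": -9}
children = {"p": ["a", "b", "c", "d"]}
result = ezc_prune(coeffs, children)
```

In this toy tree only the leaf `b` is smaller than the root `p`, so it is the only coefficient set to zero.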

4-Contourlet Transform
The contourlet transform is one of the new geometrical image transforms, which can efficiently represent images containing contours and textures. This transform uses a structure similar to that of curvelets, that is, a stage of subband decomposition followed by a directional transform. In the contourlet transform, a Laplacian pyramid (LP) is employed for the first stage, while directional filter banks (DFBs) are used in the angular decomposition stage.
The contourlet transform is thus constructed as a combination of the Laplacian pyramid and the directional filter banks. Conceptually, the flow of operation is shown in Figure (2): the Laplacian pyramid iteratively decomposes a 2-D image into low pass and high pass subbands, and the DFBs are applied to the high pass subbands to further decompose the frequency spectrum [4].

6-Iterated Directional Filter Banks
Bamberger and Smith constructed a 2-D directional filter bank (DFB) that can be maximally decimated while achieving perfect reconstruction [5]. The DFB is efficiently implemented via an l-level binary tree decomposition that leads to $2^l$ subbands with wedge-shaped frequency partitioning, as in Figure (4). The original construction of the DFB involves modulating the input image and using quincunx filter banks with diamond-shaped filters [6].
To simplify the analysis of the iterated DFB, a new formulation has been proposed that is based only on quincunx filter banks (QFBs) with fan filters. The new DFB avoids the modulation of the input image and has a simpler rule for expanding the decomposition tree [7].
This simplified DFB is intuitively constructed from two building blocks. The first building block is a two-channel quincunx filter bank with fan filters, as in Figure (5), that divides a 2-D spectrum into two directions: horizontal and vertical. The second building block of the DFB is a shearing operator. Figure (6) shows an application of a shearing operator, where a −45° direction edge becomes a vertical edge. By adding a shearing operator and its inverse ("unshearing") before and after, respectively, a two-channel filter bank as in Figure (5), a different directional frequency partition is obtained while perfect reconstruction is maintained. Thus, the key in the DFB is to use an appropriate combination of shearing operators together with the two-direction partition of quincunx filter banks at each node in a binary tree-structured filter bank, to obtain the desired 2-D spectrum division as shown in Figure (6). The following four basic unimodular matrices are used in the DFB in order to provide the equivalence of the rotation operations:
$$R_0 = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix},\quad R_1 = \begin{pmatrix} 1 & -1 \\ 0 & 1 \end{pmatrix},\quad R_2 = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix},\quad R_3 = \begin{pmatrix} 1 & 0 \\ -1 & 1 \end{pmatrix}$$
Note that $R_0 R_1 = R_2 R_3 = I_2$ (here $I_2$ denotes the 2×2 identity matrix). So, for example, up sampling by $R_0$ is equivalent to down sampling by $R_1$.
A useful tool in analyzing multi-dimensional multirate operations is the Smith form, which can diagonalize any integer matrix M into a product UDV, where U and V are unimodular integer matrices and D is an integer diagonal matrix. The quincunx matrices
$$Q_0 = \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix},\quad Q_1 = \begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix}$$
can be expressed in the Smith form as:
$$Q_0 = R_1 D_0 R_2 = R_2 D_1 R_1,\qquad Q_1 = R_0 D_0 R_3 = R_3 D_1 R_0$$
where
$$D_0 = \begin{pmatrix} 2 & 0 \\ 0 & 1 \end{pmatrix},\quad D_1 = \begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix}$$
are two 2-D diagonal matrices that correspond to dyadic sampling in each dimension [7,8].
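The matrix identities in this section can be checked numerically. The sketch below assumes the entries commonly given for this DFB formulation: the unimodular matrices R0 to R3, the quincunx matrices Q0 and Q1, and the diagonal matrices D0 = diag(2, 1) and D1 = diag(1, 2). These names and values are assumptions wherever the text does not state them, and `matmul` is a hypothetical helper.

```python
def matmul(a, b):
    """Product of two 2x2 integer matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


I2 = [[1, 0], [0, 1]]
# Assumed entries of the four unimodular (determinant +1 or -1) matrices.
R0 = [[1, 1], [0, 1]]
R1 = [[1, -1], [0, 1]]
R2 = [[1, 0], [1, 1]]
R3 = [[1, 0], [-1, 1]]
# Assumed quincunx sampling matrices and dyadic diagonal matrices.
Q0 = [[1, -1], [1, 1]]
Q1 = [[1, 1], [-1, 1]]
D0 = [[2, 0], [0, 1]]
D1 = [[1, 0], [0, 2]]

# R0 and R1 (and R2 and R3) are inverses of each other.
inverse_pairs_ok = matmul(R0, R1) == I2 and matmul(R2, R3) == I2

# Smith form: each quincunx matrix factors as unimodular * diagonal * unimodular.
smith_ok = (
    matmul(matmul(R1, D0), R2) == Q0
    and matmul(matmul(R2, D1), R1) == Q0
    and matmul(matmul(R0, D0), R3) == Q1
    and matmul(matmul(R3, D1), R0) == Q1
)
```

Because R0 and R1 are inverses, up sampling by one rearranges samples in the same way as down sampling by the other, which is the equivalence stated above.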

7-Multiresolution Analysis
As with the wavelet filter bank, the iterated Pyramid Directional Filter Bank (PDFB) can be associated with a continuous-domain system, called the contourlet. This connection is made precise by studying the embedded grids of approximation, as in the multiresolution analysis for wavelets. The new elements are the multiple directions and their combination with multiple scales, as in Figure (7) [7,9]. From the upper row to the lower row, the scale is reduced by four while the number of directions is doubled.

8-Multiscale
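Suppose that the LP in the PDFB uses orthogonal filters and that down sampling by two is taken in each dimension. Under certain conditions, the low pass filter G in the LP uniquely defines an orthogonal scaling function $\phi(t) \in L^2(\mathbb{R}^2)$ via the two-scale equation:
$$\phi(t) = 2 \sum_{n \in \mathbb{Z}^2} g[n]\, \phi(2t - n)$$
where g[n] is the impulse response of the low pass synthesis filter G. Let:
$$\phi_{j,n}(t) = 2^{-j}\, \phi\!\left(\frac{t - 2^j n}{2^j}\right), \quad j \in \mathbb{Z},\ n \in \mathbb{Z}^2$$
Then the family $\{\phi_{j,n}\}_{n \in \mathbb{Z}^2}$ is an orthogonal basis of $V_j$ for all $j \in \mathbb{Z}$. The sequence of nested subspaces $\{V_j\}_{j \in \mathbb{Z}}$ satisfies the following invariance properties:
Shift invariance: $f(t) \in V_j \Leftrightarrow f(t - 2^j k) \in V_j$, for all $j \in \mathbb{Z}$, $k \in \mathbb{Z}^2$
Scale invariance: $f(t) \in V_j \Leftrightarrow f(2^{-1} t) \in V_{j+1}$, for all $j \in \mathbb{Z}$
In other words, $V_j$ is a subspace defined on a uniform grid with intervals $2^j \times 2^j$, which characterizes the image approximation at the resolution $2^{-j}$. The difference image in the LP carries the details necessary to increase the resolution of an image approximation. Let $W_j$ be the orthogonal complement of $V_j$ in $V_{j-1}$:
$$V_{j-1} = V_j \oplus W_j$$
Also see Figure (8) [7]. In the LP, each polyphase component of the difference image, together with the coarse image c[n], can be regarded as coming from a separate filter bank channel. Let $F_i(z)$, $0 \le i \le 3$, be the synthesis filters for these polyphase components. Note that the $F_i$ are high pass filters. As in the wavelet filter bank, each of these filters is associated with a continuous function $\psi^{(i)}(t)$:
$$\psi^{(i)}(t) = 2 \sum_{n \in \mathbb{Z}^2} f_i[n]\, \phi(2t - n)$$
where $f_i[n]$ is the impulse response of the high pass synthesis filter $F_i(z)$ [7,10].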

9-Multidirectional
By using multirate identities, it is instructive to view an l-level tree-structured DFB equivalently as a $2^l$ parallel channel filter bank, as shown in Figure (9), with equivalent analysis filters, synthesis filters and overall sampling matrices. In Figure (9), the equivalent directional analysis filters are denoted as $E_k^{(l)}$, $0 \le k < 2^l$, and the directional synthesis filters as $D_k^{(l)}$, $0 \le k < 2^l$, which correspond to the subbands indexed as in Figure (10). From the multiscale analysis of the previous LP stage, the DFB in the contourlet filter bank utilizes orthogonal filters, and when such a DFB is applied to the difference image (detail) subspaces $W_j$, the resulting detail directional subspaces $W_{j,k}^{(l_j)}$ in the frequency domain are as illustrated in Figure (11). The indexes j, k, and n specify the scale, direction, and location, respectively. Note that the number of DFB decomposition levels l is different at different scales j, and is denoted by $l_j$. Recall that $W_j$ is not a shift-invariant subspace. However, its subspaces

10-Huffman Coding
Huffman encoding is an important part of many data compression algorithms and is effective for compressing numerical data [11]. Its advantage here is that it converts the numerical data to a binary code. The Huffman coding algorithm for L symbols is as follows:
1. Arrange the symbol probabilities Pi in descending order and consider them as the leaf nodes of a tree.
2. While there is more than one node:
-Merge the two nodes with the smallest probabilities to form a new node whose probability is the sum of the two merged nodes.
-Arbitrarily assign 1 or 0 to each pair of branches merging into a node (e.g. 0 to the left branch, 1 to the right branch).
3. Read the code of each symbol sequentially from the root node to the leaf where the symbol is located.
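The merging procedure described in this section can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name `huffman_code` is hypothetical, a heap replaces the explicit sorting of probabilities, and ties are broken by insertion order, which is an arbitrary choice.

```python
import heapq
from itertools import count


def huffman_code(probabilities):
    """Return a dict mapping each symbol to its Huffman bit string."""
    tie = count()  # unique tie-breaker so the dicts are never compared
    heap = [(p, next(tie), {sym: ""}) for sym, p in probabilities.items()]
    heapq.heapify(heap)
    if len(heap) == 1:
        # A single symbol still needs one bit.
        return {sym: "0" for sym in heap[0][2]}
    while len(heap) > 1:
        # Merge the two nodes with the smallest probabilities.
        p0, _, codes0 = heapq.heappop(heap)
        p1, _, codes1 = heapq.heappop(heap)
        # Assign 0 to one branch and 1 to the other (arbitrary choice).
        merged = {s: "0" + c for s, c in codes0.items()}
        merged.update({s: "1" + c for s, c in codes1.items()})
        heapq.heappush(heap, (p0 + p1, next(tie), merged))
    return heap[0][2]


codes = huffman_code({"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125})
```

Prepending a bit at every merge builds each code word from the leaf up to the root, so the finished string reads from root to leaf as in step 3. More probable symbols receive shorter code words.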

11-Run-Length Encoding
Run Length Encoding (RLE) is a simple lossless compression method [12, 13]. The idea is that if a data item d occurs n consecutive times in the input stream, the n occurrences are replaced with the single pair (d, n). The n consecutive occurrences of a data item are called a run length of n, which gives this approach to data compression its name [14].
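The replacement of runs by pairs can be sketched as follows. The function names `rle_encode` and `rle_decode` are hypothetical, and the pair is stored as (item, run length), which is one possible layout rather than necessarily the one used in the paper.

```python
from itertools import groupby


def rle_encode(data):
    """Replace each run of equal items by a (item, run_length) pair."""
    return [(value, len(list(run))) for value, run in groupby(data)]


def rle_decode(pairs):
    """Expand (item, run_length) pairs back into the original sequence."""
    return [value for value, n in pairs for _ in range(n)]


stream = [0, 0, 0, 5, 5, 1]
encoded = rle_encode(stream)
decoded = rle_decode(encoded)
```

RLE pays off when the input has long runs, which is the case after many coefficients have been set to zero.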

12-Performance Measures
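A number of quantitative parameters can be used to evaluate the performance of the speech coder, in terms of both the quality of the reconstructed signal after decoding and the compression achieved. The following parameters are compared:
-Signal to Noise Ratio (SNR), given by the following formula:
$$\mathrm{SNR} = 10 \log_{10}\!\left(\frac{\sigma_x^2}{\sigma_e^2}\right)$$
where $\sigma_x^2$ is the mean square of the speech signal and $\sigma_e^2$ is the mean square difference between the original and reconstructed signals.
-Normalized Root Mean Square Error (NRMSE):
$$\mathrm{NRMSE} = \sqrt{\frac{\sum_n \left(x(n) - r(n)\right)^2}{\sum_n \left(x(n) - \mu_x\right)^2}}$$
where x(n) is the speech signal, r(n) is the reconstructed signal, and $\mu_x$ is the mean of the speech signal [1,15].
-Correlation between the original signal and the reconstructed signal, which shows how close the recovered signal is to the original one, and is given by [16]:
$$\mathrm{Corr} = \frac{\sum_m \sum_n \left(A_{mn} - \bar{A}\right)\left(B_{mn} - \bar{B}\right)}{\sqrt{\left(\sum_m \sum_n \left(A_{mn} - \bar{A}\right)^2\right)\left(\sum_m \sum_n \left(B_{mn} - \bar{B}\right)^2\right)}}$$
where $\bar{A}$ = mean2(A) and $\bar{B}$ = mean2(B).
-Compression ratio (CR): the ratio of the size of the original signal to the size of the compressed signal.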
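The four measures can be sketched for 1-D signals as follows. This is an illustration under assumptions: the function names are hypothetical, the SNR is expressed in decibels, and the compression ratio is computed from sizes in bits. The SNR is undefined when the reconstruction is exact, because the noise power is then zero.

```python
import math


def snr_db(x, r):
    """SNR in dB: mean square of x over mean square of the error x - r."""
    signal = sum(v * v for v in x) / len(x)
    noise = sum((a - b) ** 2 for a, b in zip(x, r)) / len(x)
    return 10 * math.log10(signal / noise)  # fails if noise == 0


def nrmse(x, r):
    """Error energy normalized by the energy of x around its mean."""
    mean_x = sum(x) / len(x)
    num = sum((a - b) ** 2 for a, b in zip(x, r))
    den = sum((a - mean_x) ** 2 for a in x)
    return math.sqrt(num / den)


def correlation(x, r):
    """Correlation coefficient between original and reconstruction."""
    mx, mr = sum(x) / len(x), sum(r) / len(r)
    num = sum((a - mx) * (b - mr) for a, b in zip(x, r))
    den = math.sqrt(sum((a - mx) ** 2 for a in x)
                    * sum((b - mr) ** 2 for b in r))
    return num / den


def compression_ratio(original_bits, compressed_bits):
    """Size of the original divided by size of the compressed signal."""
    return original_bits / compressed_bits


x = [1.0, 2.0, 3.0, 4.0]
r = [1.0, 2.0, 3.0, 5.0]
metrics = (snr_db(x, r), nrmse(x, r), correlation(x, r))
```

A good reconstruction gives a high SNR, an NRMSE near zero, and a correlation near one.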

13-The Proposed Algorithm
The CT is normally applied to images; in this research it is adapted for speech compression. Since speech is one-dimensional, it is converted to 2-D before the CT is applied. The EZC algorithm is then applied to the transform coefficients, the result is fed to the Huffman coder, and RLE is applied last. The coded result is decoded by applying the inverse process to reconstruct the speech signal. Figure (12) represents the proposed algorithm. The results are given in Table (1). They are good, especially the compression ratio: the highest CR, obtained for S4, is 9.5855. In addition, Figure (13) shows high similarity between the original and reconstructed signals.
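The conversion of the speech signal to 2-D and back can be sketched as follows. The paper does not state how the samples are arranged, so this sketch assumes row-by-row filling with zero padding of the last row; the names `to_2d`, `to_1d`, and `width` are hypothetical.

```python
import math


def to_2d(samples, width):
    """Arrange 1-D samples row by row into a 2-D array of given width.

    Assumption: the last row is padded with zeros when needed.
    """
    rows = math.ceil(len(samples) / width)
    padded = list(samples) + [0] * (rows * width - len(samples))
    return [padded[r * width:(r + 1) * width] for r in range(rows)]


def to_1d(matrix, length):
    """Flatten the 2-D array and drop the padding samples."""
    flat = [v for row in matrix for v in row]
    return flat[:length]


speech = [1, 2, 3, 4, 5]
image = to_2d(speech, 2)
restored = to_1d(image, len(speech))
```

The original length has to be stored with the coded data so that the decoder can remove the padding.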

14-Conclusion
From these results it can be concluded that the Contourlet Transform can be applied to speech signals, and that it gives good results and high compression, as it does with images. Using EZC on the contourlet coefficients, followed by Huffman coding and RLE, also gives good results and high compression.

Figure (2). The Original Contourlet Transform. (A) Block Diagram. (B) Resulting Frequency Division.

5-Pyramid frames
One way to obtain a multiscale decomposition is to use the Laplacian pyramid. The LP decomposition at each level generates a down sampled low pass version of the original and the difference between the original and the prediction, resulting in a band pass image. Figure (3) depicts this decomposition process, where H and G are the low pass analysis and synthesis filters, respectively, and M is the sampling matrix. The process can be iterated on the coarse (down sampled) signal, which holds the low frequencies. Note that in multidimensional filter banks, sampling is represented by sampling matrices; for example, down sampling x[n] by M yields xd[n] = x[Mn], where M is an integer matrix [4].
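One level of the Laplacian pyramid can be sketched as follows. This is a simplified illustration and not the filters used in the paper: the analysis filter H is assumed to be a 2×2 block average, the synthesis filter G is assumed to be sample replication, the sampling matrix is M = diag(2, 2), and the input is assumed to have even dimensions. The names `lp_analysis` and `lp_synthesis` are hypothetical.

```python
def lp_analysis(x):
    """Split x into a coarse (low pass) image and a difference image."""
    rows, cols = len(x), len(x[0])
    # Low pass filtering and down sampling by 2 in each dimension.
    coarse = [[(x[i][j] + x[i][j + 1] + x[i + 1][j] + x[i + 1][j + 1]) / 4
               for j in range(0, cols, 2)]
              for i in range(0, rows, 2)]
    # Difference between the original and the prediction from the coarse image.
    detail = [[x[i][j] - coarse[i // 2][j // 2] for j in range(cols)]
              for i in range(rows)]
    return coarse, detail


def lp_synthesis(coarse, detail):
    """Rebuild the image: up sampled coarse image plus the difference."""
    return [[coarse[i // 2][j // 2] + detail[i][j]
             for j in range(len(detail[0]))]
            for i in range(len(detail))]


x = [[1, 2], [3, 4]]
coarse, detail = lp_analysis(x)
rebuilt = lp_synthesis(coarse, detail)
```

Calling `lp_analysis` again on `coarse` gives the next level of the pyramid, which is the iteration on the low frequencies described above.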

Figure (7). Illustration of the contourlet basis images that satisfy the curve scaling relation. From the upper row to the lower row, the scale is reduced by four while the number of directions is doubled.

$W_{j,k}^{(l_j)}$ are regenerative, since they are generated by a single function and its translations [7, 10].

Figure (11). Multiscale and multidirectional subspaces generated by the transform, illustrated on a 2-D spectrum decomposition.