Voice Security Using Hybrid Algorithm

Alyaa Moufaq Abdul Majeed Haleem
AlyaaHaleem@uomosul.edu.iq
College of Computers Sciences and Mathematics, University of Mosul, Iraq
Received on: 04/11/2009 Accepted on: 16/05/2010

ABSTRACT
This research deals with constructing and implementing a new digital voice security algorithm based on hiding a large amount of data (a sound file) in a 24-bit host color image (RGB image). The proposed method starts with speech compression, which converts human speech into an efficiently encoded representation that can later be decoded to produce a close approximation of the original signal. Compression is achieved by first computing the Discrete Wavelet Transform (DWT), truncating the small-valued coefficients and then efficiently encoding the result. The bit stream output by the coder is encrypted using a Linear Feedback Shift Register (LFSR) algorithm. These enciphered bits are then embedded into the image blocks. A binary key matrix and a weight matrix are used as a secret key to protect the hidden information. The algorithm can hide as many as $\lfloor \log_2(12 \times M \times N + 1) \rfloor$ bits of data in each image block of size $(M \times N)$ by changing one bit in that block. High security is achieved by using three layers, which make the algorithm difficult for an attacker to break. The algorithm has been implemented using MATLAB.


Introduction
Digital speech communication has been applied in many fields, but communication between two parties over long distances has always been subject to interception. This led to the development of cryptography schemes. Cryptography schemes achieve security mainly by making the speech unintelligible, so that those who do not possess the necessary keys cannot recover it. Although cryptography can hide the content of the speech, it cannot hide from a third party the fact that a cryptographic communication is in progress. If the third party discovers the cryptographic communication, he or she might be able to decipher the speech, so a latent danger exists in cryptography schemes. The need to avoid this led to the development of steganography schemes, which complement cryptography by hiding the existence of a secret communication [4]. Secure speech transmission based on steganography, in which a secret speech file is embedded in a cover medium, has been gaining importance in the field of information technology [2].
Generally, steganography is the art and science of writing hidden messages in such a way that no one apart from the intended recipient knows of the existence of the message. This is in contrast to cryptography, where the existence of the message itself is not disguised but its content is obscured. Cryptography hides the contents of a secret message from an attacker, whereas steganography conceals even the existence of the message. The definition of breaking the system is therefore different. In cryptography, the system is broken when the attacker can read the secret message. Breaking a steganography system has two stages: first the attacker detects that steganography has been used, and then he or she reads the embedded message [6] [8].
Cryptographic techniques scramble a message so that, if it is intercepted, it cannot be understood. This process is encryption, and the encrypted message is sometimes referred to as cipher text. Steganography, in essence, "camouflages" a message to hide its existence and make it seem "invisible", thus concealing the fact that a message is being sent at all. A cipher text message may draw suspicion, while an invisible message will not [7] [5]. In this research both sciences are combined for better protection of information.

Hiding a message inside color images
Hiding information inside images is a popular technique nowadays. An image with a secret message inside may be easily spread over the World Wide Web or in newsgroups. The most widely used technique to hide data is the use of the least significant bit (LSB). Although there are several disadvantages to this approach, the relative ease of implementing it makes it a popular method. To hide a secret message inside an image, a proper cover image is needed. Because this method changes the LSB of each pixel in the image, it is necessary to use a lossless compression format; otherwise the hidden information will be lost in the transformations of a lossy compression algorithm [6].
There are several disadvantages to an LSB strategy. The first is that it is easily recognizable by image analysis: the signature of the embedded text can be recognized, so the method does not provide a safe cover for sensitive data or copyright marks. Another disadvantage of embedding information inside other data is that lossy compression algorithms and formats such as JPEG will destroy the structure required to recover the embedded information [9].
Another common method uses a 24-bit color image: one bit of each of the red, green and blue color components can be used, so a total of 3 bits can be stored in each pixel. Thus, an 800×600 pixel image can contain a total of 1,440,000 bits (180,000 bytes) of secret data. However, this method may lead to changes that are noticeable when statistical analysis is applied to the different areas of the image, and it may distort the image [7].
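As a rough illustration of the three-bits-per-pixel approach, the Python sketch below writes one message bit into the least significant bit of each color component. The function names `embed_lsb` and `extract_lsb`, and the representation of pixels as (R, G, B) tuples, are assumptions made for this example only.

```python
def embed_lsb(pixels, bits):
    """Write one message bit into the LSB of each R, G, B component."""
    if len(bits) > 3 * len(pixels):
        raise ValueError("message does not fit in the cover image")
    flat = [c for px in pixels for c in px]        # R, G, B, R, G, B, ...
    for i, b in enumerate(bits):
        flat[i] = (flat[i] & ~1) | b               # clear the LSB, then set it
    return [tuple(flat[i:i + 3]) for i in range(0, len(flat), 3)]


def extract_lsb(pixels, n_bits):
    """Read the first n_bits least significant bits back out."""
    flat = [c for px in pixels for c in px]
    return [c & 1 for c in flat[:n_bits]]


cover = [(200, 17, 96), (45, 45, 45)]              # two hypothetical pixels
message = [1, 0, 1, 1]
stego = embed_lsb(cover, message)
recovered = extract_lsb(stego, len(message))

# Capacity of an 800x600 image at 3 bits per pixel.
capacity_bits = 800 * 600 * 3
capacity_bytes = capacity_bits // 8
```

Each component changes by at most 1, which is why the method is visually unobtrusive yet still detectable by statistical analysis.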
In this research, to improve both the image quality after hiding and the hiding capacity, our modified algorithm is capable of hiding a large amount of data by changing a small number of bits in the cover image. The modified method uses a 24-bit color image, partitions it into blocks of size $(M \times N)$ and uses the lowest 4 bits (the low nibble) of each of the red, green and blue color components, so that the total size used per block is $(M \times N \times 12)$ bits. A block of this size can conceal as many as $\lfloor \log_2(12 \times M \times N + 1) \rfloor$ bits of data by changing only one bit of the block. This algorithm is more effective than the traditional LSB method, which hides one bit for each bit changed.
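The capacity bound can be checked with a short Python sketch. The helper names `bits_per_block` and `image_capacity` and the block sizes tried are hypothetical; the only thing taken from the text is the bound of $\log_2(12 \times M \times N + 1)$ bits per block, rounded down.

```python
def bits_per_block(m, n, bits_per_pixel=12):
    """Largest r with 2**r - 1 <= bits_per_pixel * m * n (floor of log2)."""
    return (bits_per_pixel * m * n + 1).bit_length() - 1


def image_capacity(width, height, m, n):
    """Total embeddable bits when the image is cut into whole m x n blocks."""
    blocks = (height // m) * (width // n)
    return blocks * bits_per_block(m, n)


for m, n in [(1, 1), (2, 2), (4, 4)]:
    print(m, n, bits_per_block(m, n), image_capacity(512, 512, m, n))
```

Smaller blocks carry fewer bits each but give a larger total capacity, at the cost of more modified bits across the image.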

Layers of Hybrid algorithm
A new steganography algorithm using three layers of security has been constructed. These layers are developed to achieve high security, and they work independently to make the system difficult to break, as shown in the figure. The first layer of security is the compression mechanism, which uses the Discrete Wavelet Transform (DWT) technique and the run-length method. The second layer encrypts the coded speech signal using a stream cipher algorithm. These two layers are applied before the input speech signal is hidden.

Compression Layer
The idea behind speech compression is to encode audio data so that it takes up less storage space and less bandwidth for transmission. To meet this goal we used the Fast Wavelet Transform (FWT). Wavelets concentrate speech information (energy and perception) into a few neighboring coefficients [3]. As a result of taking the wavelet transform of a signal, many coefficients will therefore either be zero or have negligible magnitudes. Data compression is then achieved by treating the small-valued coefficients as insignificant data and discarding them.

The Fast Wavelet Transform (FWT) Algorithm
The Discrete Wavelet Transform (DWT) coefficients can be computed using the Fast Wavelet Transform algorithm, which considers the following equations:
$\phi(x) = \sum_{k=0}^{2N-1} c_k \, \phi(2x - k)$ ...(1)
$\psi(x) = \sum_{k=0}^{2N-1} (-1)^k c_k \, \phi(2x + k - 2N + 1)$ ...(2)
The high-pass filter is obtained from the low-pass filter using the relationship $g_k = (-1)^k c_{1-k}$, where k varies over the range $(1 - (2N - 1))$ to 1. Equation (1) shows that the scaling function is essentially a low-pass filter and is used to define the approximations. The wavelet function defined by equation (2) is a high-pass filter and defines the details.
Given an input speech signal s of length N, as shown in Figure (2), the DWT consists of at most $\log_2 N$ stages. The first step produces, starting from s, two sets of coefficients: the approximation coefficients cA1 and the detail coefficients cD1. These vectors are obtained by convolving s with the low-pass filter Lo_D for the approximation and with the high-pass filter Hi_D for the detail, followed by dyadic decimation (downsampling by a factor of 2), as shown in the figure.
The length of each filter is equal to 2N (here N denotes half the filter length, not the signal length). If n = length(s), the filtered signals F and G are of length $(n + 2N - 1)$, and the coefficients cA1 and cD1 are of length $\lfloor (n - 1)/2 \rfloor + N$. The next step splits the approximation coefficients cA1 into two parts using the same scheme, replacing s by cA1 and producing cA2 and cD2, and so on. The wavelet decomposition of the signal s analyzed at level j therefore has the structure shown in Figure (3).
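The filter-and-decimate step can be sketched in plain Python. This is only an illustration under stated assumptions: it uses the two-tap Haar filters (so 2N = 2) because they are short, and it zero-pads the signal through a full convolution, whereas a toolbox implementation may extend the signal differently at its borders. The names `convolve` and `dwt_step` are hypothetical.

```python
import math


def convolve(signal, taps):
    """Full linear convolution; output length is len(signal) + len(taps) - 1."""
    out = [0.0] * (len(signal) + len(taps) - 1)
    for i, s in enumerate(signal):
        for j, t in enumerate(taps):
            out[i + j] += s * t
    return out


def dwt_step(signal, lo_d, hi_d):
    """One decomposition level: filter, then keep every second sample."""
    approx = convolve(signal, lo_d)[1::2]
    detail = convolve(signal, hi_d)[1::2]
    return approx, detail


r = 1 / math.sqrt(2)
lo_d, hi_d = [r, r], [-r, r]                 # Haar decomposition filters

s = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
cA1, cD1 = dwt_step(s, lo_d, hi_d)           # level 1
cA2, cD2 = dwt_step(cA1, lo_d, hi_d)         # level 2, applied to cA1 only

half_len = len(lo_d) // 2                    # the N of "filter length 2N"
expected_len = (len(s) - 1) // 2 + half_len
```

For this input, `len(cA1)` matches `expected_len`, and the detail coefficients are small wherever neighboring samples are close in value, which is what the later thresholding step exploits.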

Figure 3: 3-level Decomposition of Signal S
After calculating the wavelet transform of the speech signal, compression is achieved in two steps. First, wavelet coefficients below a threshold are truncated. An experiment conducted on a sentence spoken by a male speaker shows that most of the coefficients have small magnitudes: more than 90% of the wavelet coefficients have magnitudes below 5% of the maximum value. This means that most of the speech energy is in the high-valued coefficients [3], so the small-valued coefficients can be truncated or zeroed. Second, consecutive zero-valued coefficients are encoded with two bytes using the run-length encoding method: one byte indicates a sequence of zeros in the wavelet transform vector, and the second byte gives the number of consecutive zeros. Figure (4) shows the flowchart of the compression algorithm.
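The two compression steps can be sketched in Python as follows. This is a minimal illustration, not the paper's coder: the 5% threshold is borrowed from the experiment described above, the marker `ZERO_RUN` stands in for the marker byte, and the output is a list of tokens rather than packed bytes because the quantization of the nonzero coefficients is not described here.

```python
ZERO_RUN = "Z"   # hypothetical stand-in for the "sequence of zeros" marker byte


def threshold(coeffs, fraction=0.05):
    """Zero every coefficient whose magnitude is below fraction * max magnitude."""
    limit = fraction * max(abs(c) for c in coeffs)
    return [c if abs(c) >= limit else 0 for c in coeffs]


def rle_zeros(coeffs):
    """Replace each run of zeros by the pair (marker, count), count <= 255."""
    out, run = [], 0
    for c in coeffs:
        if c == 0 and run < 255:
            run += 1
            continue
        if run:                       # flush the pending run of zeros
            out += [ZERO_RUN, run]
            run = 0
        if c == 0:                    # a run longer than one byte can count
            run = 1
        else:
            out.append(c)
    if run:
        out += [ZERO_RUN, run]
    return out


def unrle_zeros(tokens):
    """Inverse of rle_zeros."""
    out, i = [], 0
    while i < len(tokens):
        if tokens[i] == ZERO_RUN:
            out += [0] * tokens[i + 1]
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out


coeffs = [10.0, 0.2, -0.1, 0.3, -7.5, 0.0, 0.4, 0.1, 0.2, 3.0]
kept = threshold(coeffs)
packed = rle_zeros(kept)
restored = unrle_zeros(packed)
```

Thresholding is the lossy part of the scheme; the run-length stage is lossless, so `restored` equals `kept` rather than the original `coeffs`.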

Encryption Layer
The Linear Feedback Shift Register (LFSR) has been one of the most popular encryption techniques in speech communication. The LFSR is suitable for speech because speech is continuous streaming data. Stream ciphers encrypt the individual characters (usually binary digits) of a plaintext message one at a time, using an encryption transformation that varies with time. A stream cipher based on an LFSR encrypts the plaintext one bit at a time [10]. The key stream generator outputs a stream of bits k1, k2, ..., kn, and the cipher text is obtained by XORing these key stream bits with the plaintext bits p1, p2, ..., pn. Generally, the length of the sequence before repetition occurs depends on two things: the feedback taps and the initial state. An LFSR of any given size m (number of registers) is capable of producing every possible nonzero state during the period $N = 2^m - 1$, but will do so only if proper feedback taps, or terms, have been chosen. Such a sequence is called a maximal length sequence, maximal sequence or, less commonly, maximum length sequence. It is often abbreviated as m-sequence. In certain industries m-sequences are referred to as pseudonoise (PN) or pseudorandom sequences.

M-Sequence Properties
The properties of m-sequences include the following:
1. An m-bit register produces an m-sequence of period $2^m - 1$.
3. The modulo-2 sum of an m-sequence and another phase of the same sequence yields yet a third phase of the sequence.
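Properties 1 and 3 can be checked numerically with a small Python sketch. The register below is an assumption chosen for illustration: m = 4 with taps `[1, 1, 0, 0]`, which realizes the recurrence of the primitive polynomial $x^4 + x + 1$. The function name `lfsr_bits` is hypothetical.

```python
def lfsr_bits(taps, state, count):
    """Emit count bits; each new bit is the XOR of the tapped stages."""
    state = list(state)
    out = []
    for _ in range(count):
        feedback = 0
        for c, s in zip(taps, state):
            feedback ^= c & s
        out.append(feedback)
        state = state[1:] + [feedback]       # shift by one stage
    return out


m = 4
period = 2 ** m - 1                          # property 1: 15 bits for m = 4
bits = lfsr_bits([1, 1, 0, 0], [1, 0, 0, 0], 2 * period + 1)
one_period = bits[:period]

# The output repeats after 2**m - 1 bits.
repeats = bits[period:2 * period] == one_period

# Property 3: XOR with a shifted copy is another cyclic shift (phase).
summed = [a ^ b for a, b in zip(bits[:period], bits[1:period + 1])]
rotations = [one_period[k:] + one_period[:k] for k in range(period)]
is_another_phase = summed in rotations
```

With poorly chosen taps the register falls into a shorter cycle and neither check holds, which is why the tap selection matters.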

Algorithm of Linear Feedback Shift Register
Step1: Input the coefficients (Ci), the initial state (Si) (a random value) and the plain text (P). A linear feedback shift register of length L (the length of the initial value) consists of L stages numbered 0, 1, …, L-1.
Step2: Perform the AND operation between the coefficients and the initial value.
Step3: Apply the XOR operation to the bits of the result from Step2: Function = S0C0 ⊕ S1C1 ⊕ … ⊕ S(L-1)C(L-1).
Step4: Put the bit resulting from (S0C0 ⊕ S1C1 ⊕ …) into the sequence and shift the initial value by one.
Step5: The maximum length of the sequence is $(2^m - 1)$, where m is the length of the coefficient vector.
Step6: Apply the m-sequence condition to the sequence generated by the previous steps: if it is an m-sequence, go to Step7; otherwise go to Step1.
Step7: Convert the samples of P to binary. Repeat the m-sequence until its length equals the length of the plain text in binary.
Step8: Apply the XOR operation between the plain text (samples) in binary and the sequence.
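The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the paper's MATLAB code: samples are taken to be 8-bit integers, the m-sequence test is reduced to a period check, and the names `lfsr_cycle`, `to_bits`, `from_bits` and `lfsr_xor` are hypothetical.

```python
def lfsr_cycle(coeffs, state):
    """Steps 2-5: collect feedback bits until the register returns to its start."""
    start, state, bits = list(state), list(state), []
    for _ in range(2 ** len(start)):           # safety bound on the loop
        feedback = 0
        for c, s in zip(coeffs, state):        # AND each stage, XOR the results
            feedback ^= c & s
        bits.append(feedback)
        state = state[1:] + [feedback]         # shift by one
        if state == start:
            break
    return bits


def to_bits(samples):
    """8-bit samples to a flat bit list, most significant bit first."""
    return [(v >> k) & 1 for v in samples for k in range(7, -1, -1)]


def from_bits(bits):
    """Flat bit list back to 8-bit samples."""
    return [int("".join(map(str, bits[i:i + 8])), 2)
            for i in range(0, len(bits), 8)]


def lfsr_xor(samples, coeffs, state):
    key = lfsr_cycle(coeffs, state)
    if len(key) != 2 ** len(state) - 1:        # Step 6: accept m-sequences only
        raise ValueError("coefficients/state do not give an m-sequence")
    plain = to_bits(samples)                   # Step 7: binary form of P
    stream = [key[i % len(key)] for i in range(len(plain))]
    return from_bits([p ^ k for p, k in zip(plain, stream)])   # Step 8


coeffs, state = [1, 1, 0, 0], [1, 0, 0, 0]     # assumed taps and initial state
speech = [0x12, 0xAB, 0xFF, 0x00]              # hypothetical 8-bit samples
cipher = lfsr_xor(speech, coeffs, state)
recovered = lfsr_xor(cipher, coeffs, state)    # XOR is its own inverse
```

Because encryption and decryption are the same XOR, the receiver only needs the coefficients and the initial state to regenerate the key stream.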

Steganography Layer
This layer takes the cipher speech signal produced by the previous two algorithms (compression and encryption) and embeds the bits of this signal into the selected image.

Sound Hiding Algorithm
Step1: Read cover image of type (RGB) and save in X.
Step2: Read the bits of the cipher speech signal (com_enc_sig) to be embedded.
Step3: Convert each component byte of (R G B) of each pixel to binary.
Step4: Divide X into blocks of size M×N, and from each pixel of a block take only the low nibble bits of the (R G B) bytes.
Step5: Generate the key matrix (a secret key shared by the sender and the receiver). The elements of the key are randomly selected binary values, and the key is of size M×N×12.
Step6: Read the number of bits (no_bit) to be embedded in each block of X. The value of no_bit should be tested as follows:
If no_bit ≤ $\lfloor \log_2(12 \times M \times N + 1) \rfloor$ then
Go to Step7
Else
Enter another value of no_bit
Step7: Find the maximum size of message that is accepted by the cover image X. Let the variable total_bits represent the total bits to be embedded.
If total_bits ≤ no_block×no_bit then
Go to the steps of embedding the message (Step8)
Else
Select another speech message file
Step8: Generate a weight matrix shared by the sender and the receiver.
Step9: Perform the XOR operation between the block bits and the key matrix. [11]
Step10: Perform the component-wise multiplication of the XOR result and the weight matrix. [11]
Step11: Let a variable represent the no_bit bits to be embedded into the block. [11]
If the sum of the elements of the product, modulo $2^{\text{no\_bit}}$, equals the value of these bits then
The block does not need to change
Else
A bit in the block should be modified
The steps to modify the block:
If the XOR result at a position is 0, then complementing the block bit at that position will increase the modular sum by the weight at that position.
If the XOR result at a position is 1, then complementing the block bit at that position will decrease the modular sum by the weight at that position.
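The XOR, component-wise multiplication and modular-sum steps can be sketched in Python for a single block. Several points are assumptions: the block is handled as a flat list of bits, the weights are taken to contain every value from 1 to $2^r - 1$ (with r standing for no_bit), and the function names are hypothetical. The text describes changing one bit per block; since a single suitable bit does not exist for every combination of key, weights and message, the sketch falls back to flipping two bits, which is an addition made here and not a step taken from the source.

```python
from itertools import combinations


def low_nibble_bits(block_pixels):
    """12 bits per pixel: the low 4 bits of R, G and B, high bit first."""
    return [(c >> k) & 1 for px in block_pixels for c in px for k in (3, 2, 1, 0)]


def weighted_sum(bits, key, weights, r):
    """SUM((bits XOR key) * weights) mod 2**r."""
    return sum((b ^ k) * w for b, k, w in zip(bits, key, weights)) % (2 ** r)


def embed(bits, key, weights, r, message):
    """Return a copy of bits whose weighted sum equals message (0 <= message < 2**r)."""
    if set(range(1, 2 ** r)) - set(weights):
        raise ValueError("weights must contain every value 1 .. 2**r - 1")
    if weighted_sum(bits, key, weights, r) == message:
        return list(bits)                      # block already carries the message
    positions = range(len(bits))
    candidates = list(combinations(positions, 1)) + list(combinations(positions, 2))
    for flips in candidates:                   # single flips are tried first
        trial = list(bits)
        for i in flips:
            trial[i] ^= 1
        if weighted_sum(trial, key, weights, r) == message:
            return trial
    raise ValueError("no one- or two-bit change embeds this message")


block = low_nibble_bits([(0xA7, 0x3C, 0x50)])  # one hypothetical pixel, 12 bits
key = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1]     # assumed secret key bits
weights = [1, 2, 3, 4, 5, 6, 7, 1, 2, 3, 4, 5] # assumed weights, r = 3
stego = embed(block, key, weights, 3, 0b101)
changed = sum(a != b for a, b in zip(block, stego))
```

A real implementation would write the modified nibble bits back into the pixel bytes and repeat this for every block of the cover image.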

Sound Recovery Algorithm
Step1: Divide X (Stego_image) into blocks of size (M×N).
Step2: Calculate the number of embedded bits in each block from the weight matrix.
Step3: For each block, perform Step4 and Step5.
Step4: Perform the component-wise multiplication operation.
Step5: Find the bits embedded in this block by taking the sum of the elements of the product modulo $2^{\text{no\_bit}}$.
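Recovery for one block can be sketched in Python as the modular sum of the weighted XOR of the block bits and the key. The sketch assumes that the largest weight is $2^r - 1$, so that the number of embedded bits r can be read off the weight matrix; the bit, key and weight values and the function name `extract` are hypothetical.

```python
def extract(bits, key, weights):
    """Return the embedded bits of one block as a binary string."""
    r = max(weights).bit_length()              # largest weight is 2**r - 1
    total = sum((b ^ k) * w for b, k, w in zip(bits, key, weights))
    return format(total % (2 ** r), "0{}b".format(r))


stego_bits = [0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]   # low nibbles of one pixel
key = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1]
weights = [1, 2, 3, 4, 5, 6, 7, 1, 2, 3, 4, 5]
hidden = extract(stego_bits, key, weights)
```

The receiver never needs the original cover image: the stego block, the key matrix and the weight matrix are sufficient.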

Measure for image and recovered sound quality
Image quality after hiding the message and the quality of the recovered sound are usually judged by objective measures such as the correlation coefficient. Correlation is a measure of the strength of the relationship between random variables. The population correlation between two variables X and Y is defined as:
$\rho_{X,Y} = \frac{\operatorname{cov}(X, Y)}{\sigma_X \sigma_Y} = \frac{E[(X - \mu_X)(Y - \mu_Y)]}{\sigma_X \sigma_Y}$ ...(4)
ρ is called the Product Moment Correlation Coefficient or simply the Correlation Coefficient. It is a number that summarizes the direction and closeness of the linear relation between two variables. The correlation coefficient can take values from -1 through 0 to +1. The sign (+ or -) of the correlation defines the direction of the relationship. Table (1) shows guidelines for describing the strength and direction of a correlation.
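As an illustration, the sample version of the correlation coefficient can be computed in Python as below. The function name `correlation` and the short sample lists are hypothetical; in practice the inputs would be the pixel values of the two images or the samples of the two speech signals.

```python
import math


def correlation(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


original = [12, 40, 200, 90, 33, 150]
modified = [13, 40, 201, 90, 32, 150]       # a few values changed by 1
rho = correlation(original, modified)
```

Values of `rho` close to +1 indicate that the modified data follow the original almost linearly, which is the sense in which the measure is used here.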

Experimental Result
The program of the proposed method was tested on a speech message signal of size 37.1 Kb with a sampling rate of 8 kHz and a sample size of 8 bits. Figure (5) shows a sample speech message to be hidden inside the image, before and after applying the compression algorithm. The size of the message after compression is 14.2 Kb, and the compression ratio is 38.2%. Figure (6) shows the 512×512 RGB image used to hide the speech message, before and after embedding. As an objective test of the image quality after hiding the message, the correlation coefficient between the original image and the stego-image was calculated according to formula (4). The result is 0.998, which indicates a very strong linear relationship according to Table (1). The quality of the recovered speech was measured in the same way: the correlation coefficient between the original message and the recovered message is 0.9838, which again indicates a very strong linear relationship according to Table (1), as shown in Figure (7).