Cryptanalysis Knapsack Cipher Using Artificial Immune System

Eman Th. Al-Obaidy
Veterinary Medicine College, University of Mosul, Iraq
Received on: 16/07/2008 Accepted on: 14/10/2008

ABSTRACT
In this work, the use of an artificial immune system (AIS) in cryptanalysis is explored. The AIS uses the clonal selection principle for the cryptanalysis of the knapsack cipher. Results show that the proposed approach performs well, especially when the effect of the control parameters on the performance of clonal selection is taken into consideration. The program is written in Turbo C.


Introduction
The vertebrate immune system is a rich source of theories and acts as an inspiration for computer-based solutions. Over the last few years there has been increasing interest in the area of artificial immune systems (AIS) [1]. Most AIS aim at solving complex computational or engineering problems, such as pattern recognition, elimination and optimization [2]. This research investigates the use of AIS in the cryptanalysis of the knapsack cipher.

The Immune System
The human immune system is a complex system of cells. Its basic components are the lymphocytes, or white blood cells, which exist in two forms: B cells and T cells. These two types of cells are rather similar, but differ in how they recognize antigens (Ag's) and in their functional roles. The antigen receptors of B cells are called BCRs and those of T cells are called TCRs. B cells are capable of recognizing antigens free in solution, while T cells require antigens to be presented by other accessory cells. Each B cell has a distinct chemical structure and produces many Y-shaped antibodies (Ab's) from its surface to kill antigens. Ab's are molecules attached primarily to the surface of B cells whose aim is to recognize and bind to Ag's, as shown in figure 1 [2]. The cells that originally belong to our body and are harmless to its functioning are termed self (or self antigens), while the disease-causing elements are named nonself (or nonself antigens). The immune system thus has to be capable of distinguishing self from nonself. Binding is highly specific, so each detector recognizes only a limited set of structurally related antigens. A striking feature of the immune system is that the processes by which it generates detectors, identifies and eliminates foreign material, and remembers the patterns of previous infections are all highly parallel and distributed. This is one reason why immune system mechanisms are so complicated, but it also makes them highly robust to the failure of individual components and to attacks on the immune system itself.
Antigenic recognition is the first prerequisite for the immune system to be activated and to mount an immune response. The recognition has to satisfy some criteria. First, the cell receptor recognizes an antigen with a certain affinity, and binding between the receptor and the antigen occurs with a strength proportional to this affinity. If the affinity is greater than a given threshold, named the affinity threshold, then the immune system is activated.
The human immune system contains an organ called the thymus, located behind the breastbone, which plays a crucial role in the maturation of T cells. After T cells are generated, they migrate into the thymus, where they mature. During maturation, all T cells that recognize self antigens are excluded from the population of T cells, a process termed negative selection. If a B cell encounters a nonself antigen with sufficient affinity, it proliferates and differentiates into memory and effector cells, a process named clonal selection. In contrast, if a B cell recognizes a self antigen, this might result in suppression, as proposed by the immune network theory [1].

Artificial Immune System
An artificial immune system (AIS) is a computational system based upon metaphors of the natural immune system. Put another way, artificial immune systems are intelligent methodologies inspired by the immune system and directed toward real-world problem solving [8]. The most common principles used by AIS are negative selection, clonal selection and immune network theory.

Clonal Selection
Ab's are molecules attached primarily to the surface of B cells whose aim is to recognize and bind to Ag's. Each B cell secretes a single type of Ab, which is relatively specific for the Ag. When the Ag binds to these Ab's, and with a second signal from accessory cells such as the T-helper cell, it stimulates the B cell to proliferate (divide) and mature into terminal (non-dividing) Ab-secreting cells, called plasma cells. The process of cell division (mitosis) generates a clone, i.e. a cell or set of cells that are the progeny of a single cell.
B cells, in addition to proliferating and differentiating into plasma cells, can differentiate into long-lived B memory cells. Memory cells circulate through the blood, lymph, and tissues and, when exposed to a second antigenic stimulus, commence to differentiate into plasma cells capable of producing high-affinity Ab's, preselected for the specific Ag that had stimulated the primary response. The clonal selection process is shown in figure 2 [9].
De Castro [9] presented an algorithm called CLONALG, which is based on natural clonal selection. The following notation is used to describe the algorithm:
Ab: available antibody repertoire;
Ab{m}: memory antibody repertoire;
Ab{r}: remaining antibody repertoire;
Ag{m}: population of antigens to be recognized;
fj: vector containing the affinity of all antibodies in relation to the antigen Agj;
Abj{n}: the n antibodies from Ab with the highest affinities to Agj;
Cj: population of clones generated from Abj{n};
Cj*: population Cj after the affinity maturation process;
Ab{d}: set of d new molecules that will replace d low-affinity antibodies from Ab{r};
Abj*: candidate from Cj* to enter the pool of memory antibodies.
Using the notation above, the CLONALG algorithm can be described as follows:
1) Choose an antigen randomly from Ag{m} and present it to all antibodies in the repertoire Ab;
2) Determine the vector fj, which contains the affinity of the chosen antigen to all the antibodies in Ab;
3) Select from Ab the antibodies with the highest affinity to the chosen antigen, to compose a new set Abj{n} of high-affinity antibodies;
4) Clone the selected antibodies independently and proportionally to their affinities, generating a repertoire Cj of clones. The higher the affinity, the more clones are produced;
5) Submit the repertoire Cj to an affinity maturation process inversely proportional to the antigenic affinity, generating another repertoire Cj* of matured clones. The higher the affinity, the smaller the mutation rate;
6) Determine the vector fj*, which contains the affinity of the matured clones Cj* in relation to the antigen chosen in step 1;
7) From Cj*, re-select the clone with the highest affinity to the antigen chosen in step 1 as a candidate to enter the set of memory antibodies Ab{m}. If Ab{m} already contains an antibody for this antigen whose affinity is lower, it is replaced by the new one;
8) Replace the d lowest-affinity antibodies (with respect to the antigen chosen in step 1) from Ab{r} with new individuals. [9][10]

The Knapsack Cipher:
The knapsack problem is formulated as follows. Assume that the values $M_1, M_2, \dots, M_n$ and the sum $S$ are given. It is required to compute values $b_1, b_2, \dots, b_n$ such that $S = M_1 b_1 + M_2 b_2 + \dots + M_n b_n$. Each coefficient $b_i$ is equal to 0 or 1: a value of 1 indicates that object $i$ is placed in the knapsack, and a value of 0 that it is not.
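To make the size of the search space concrete, the formulation above can be solved by exhaustive search over all $2^n$ bit patterns. The following is a minimal sketch only; the function names and the example values are hypothetical and are not taken from the paper.

```python
from itertools import product

def knapsack_sum(values, bits):
    """Compute S = M1*b1 + M2*b2 + ... + Mn*bn for a 0/1 vector bits."""
    return sum(m * b for m, b in zip(values, bits))

def brute_force_solve(values, target):
    """Try all 2**n bit patterns and return the first one that hits the target."""
    for bits in product((0, 1), repeat=len(values)):
        if knapsack_sum(values, bits) == target:
            return list(bits)
    return None  # no subset of the values adds up to the target

# Hypothetical example values (not from the paper).
values = [5, 9, 2, 11, 7]
solution = brute_force_solve(values, 18)
print(solution, knapsack_sum(values, solution))
```

For a general (non-superincreasing) sequence this search grows exponentially in n, which is what heuristic attacks try to avoid.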
The Merkle-Hellman knapsack cipher encrypts a message as a knapsack problem. The plaintext block is transformed into a binary string (the length of the block is equal to the number of elements in the knapsack sequence). A 1 bit indicates that the corresponding element is included in the target sum. This sum is the ciphered message. Table 1 shows an example of solving the knapsack problem for the entry number sequence 1, 3, 6, 13, 27 and 52. A sequence is superincreasing if each element is greater than the sum of all preceding elements, that is, $a_i > \sum_{j=1}^{i-1} a_j$ for $i = 2, \dots, n$ (where $a_i$ is the i-th element of the sequence). For example, {1, 3, 6, 13, 27, 52} is a superincreasing sequence but {1, 3, 4, 9, 15, 25} is not. The superincreasing knapsack is easy to decode, which means that it does not protect the data. Anyone who knows the elements of a superincreasing knapsack can recover the bit pattern from the target sum.
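The reason a superincreasing knapsack is easy to decode can be illustrated with the usual greedy procedure: scan the sequence from the largest element down and take every element that still fits. This is a sketch under that assumption; the function names are hypothetical, and the target sum 69 is chosen only for illustration.

```python
def is_superincreasing(seq):
    """True if every element exceeds the sum of all elements before it."""
    total = 0
    for a in seq:
        if a <= total:
            return False
        total += a
    return True

def solve_superincreasing(seq, target):
    """Greedy decode: walk from the largest element down, taking each one that fits."""
    bits = [0] * len(seq)
    remaining = target
    for i in range(len(seq) - 1, -1, -1):
        if seq[i] <= remaining:
            bits[i] = 1
            remaining -= seq[i]
    return bits if remaining == 0 else None  # None: target is not reachable

seq = [1, 3, 6, 13, 27, 52]
print(is_superincreasing(seq))                   # True
print(is_superincreasing([1, 3, 4, 9, 15, 25]))  # False, because 4 is not > 1 + 3
print(solve_superincreasing(seq, 69))            # [1, 1, 0, 1, 0, 1]
```

The decode takes one pass over the sequence, in contrast to the $2^n$ candidates of an exhaustive search.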
Merkle and Hellman suggested that such a simple knapsack be converted into a trapdoor knapsack, which is difficult to break. The algorithm works as follows:
1. Select a simple knapsack sequence whose elements form a superincreasing sequence $A' = (a'_1, a'_2, \dots, a'_n)$.
2. Select an integer $m$ greater than the sum of all elements of the superincreasing sequence.
3. Select another integer $w$ such that $\gcd(m, w) = 1$, that is, $m$ and $w$ are relatively prime.
4. Find $w^{-1}$, the inverse of $w$ modulo $m$.
5. Construct the hard knapsack sequence $A = wA' \bmod m$, i.e. $a_i = w a'_i \bmod m$.
The trapdoor sequence $A$ can be published as the public key (encryption key). The private (secret) key for this cipher consists of the simple knapsack sequence $A'$ and the so-called trapdoor values $m$, $w$ and $w^{-1}$.
The encoding is done as follows: the message is divided into n-bit blocks (each block contains as many bits as the simple knapsack sequence has elements). A 1 bit in the message block indicates that the corresponding element of the public sequence is included in the target sum. The target sum of each block is the ciphertext.
The decoding consists of the following: each number of the ciphered message is multiplied by $w^{-1}$ modulo $m$. The result is a target sum over the simple knapsack $A'$, which is easily solved to recover the plaintext bits. [11][12][13]
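The key generation, encoding and decoding steps described above can be sketched as follows. This is an illustrative sketch, not the paper's Turbo C program: the function names are hypothetical, and the values m = 105 and w = 31 are assumptions chosen only to satisfy the stated conditions for the example sequence 1, 3, 6, 13, 27, 52. The three-argument `pow` with a negative exponent requires Python 3.8 or later.

```python
from math import gcd

def make_keys(simple, m, w):
    """Build the public (hard) knapsack and the private key from a simple knapsack."""
    if m <= sum(simple):
        raise ValueError("m must exceed the sum of the superincreasing sequence")
    if gcd(m, w) != 1:
        raise ValueError("m and w must be relatively prime")
    w_inv = pow(w, -1, m)                      # inverse of w modulo m
    public = [(w * a) % m for a in simple]     # a_i = w * a'_i mod m
    return public, (simple, m, w_inv)

def encrypt_block(bits, public):
    """Ciphertext = sum of the public elements selected by the 1 bits."""
    return sum(a for a, b in zip(public, bits) if b)

def decrypt_block(cipher, private):
    """Map the ciphertext back to the simple knapsack and decode it greedily."""
    simple, m, w_inv = private
    remaining = (cipher * w_inv) % m
    bits = [0] * len(simple)
    for i in range(len(simple) - 1, -1, -1):
        if simple[i] <= remaining:
            bits[i] = 1
            remaining -= simple[i]
    return bits

# Hypothetical trapdoor values (assumptions, not from the paper).
public, private = make_keys([1, 3, 6, 13, 27, 52], m=105, w=31)
block = [0, 1, 1, 0, 0, 0]
cipher = encrypt_block(block, public)
print(public, cipher, decrypt_block(cipher, private))
```

Decoding is unambiguous because m exceeds the sum of the simple sequence, so the transformed target sum never wraps around modulo m.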

Related work:
Many researchers have used genetic algorithms to attack the knapsack cipher, among them Spillman [14]. Clark, J. A. [15] used a knapsack of the same size as in Spillman's algorithm but with a modified fitness function:
$\text{Fitness} = 1 - \left( |\text{Sum} - \text{Target}| / \text{MaxDiff} \right)^{1/2}$
He found that this modified fitness function finds the solution more quickly, since solutions whose sum is greater than the target are not penalized.
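The modified fitness function quoted above can be evaluated as in the following sketch. The text does not define MaxDiff, so the definition used here, the larger of Target and (sum of all knapsack elements minus Target), is an assumption labeled as such in the code; the absolute value of the difference is also assumed, since the square root would otherwise be undefined for sums below the target. The function name and example values are hypothetical.

```python
def modified_fitness(bits, knapsack, target):
    """Fitness = 1 - (|Sum - Target| / MaxDiff) ** (1/2); equals 1.0 for an exact solution."""
    total = sum(a for a, b in zip(knapsack, bits) if b)
    full_sum = sum(knapsack)
    # Assumption: MaxDiff is the largest possible distance from the target.
    max_diff = max(target, full_sum - target)
    return 1.0 - (abs(total - target) / max_diff) ** 0.5

# Hypothetical public knapsack and target sum, for illustration only.
knapsack = [31, 93, 81, 88, 102, 37]
target = 174
print(modified_fitness([0, 1, 1, 0, 0, 0], knapsack, target))  # exact hit
print(modified_fitness([0, 0, 0, 0, 0, 0], knapsack, target))  # far from the target
```

Under this assumption the fitness always lies between 0 and 1, and it treats overshooting and undershooting the target symmetrically.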
In this research the clonal selection algorithm is applied in attacking the knapsack cipher.

The Proposed Algorithm:
The clonal selection principle is used in the proposed algorithm for attacking the knapsack cipher.
The optimization version of CLONALG from [9] is used in this research; it is illustrated in figure 3. Some notes about the optimization version of the algorithm:
• In step 1 there is no need for an explicit Ag population, just an objective function g(.) to be optimized (maximized or minimized). Here the affinity of an Ab corresponds to the evaluation of the objective function for that Ab.
The following restrictions have been made for encoding:
1. Only ASCII codes are encrypted.
2. The superincreasing sequence has 8 elements. This number of elements guarantees that each character has a unique encoding (there are 256 ASCII codes, and a length of 8 elements allows $2^8$ characters to be encrypted).
A random population of antibodies (binary strings of 0's and 1's) is used for initialization. Equations 1 to 5 are used in the algorithm.
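A clonal selection attack of this kind might be sketched as follows. This is not the paper's Turbo C program, and equations 1 to 5 are not reproduced in this text, so several details are assumptions: the affinity function (borrowed from the modified fitness in the related work, with MaxDiff assumed), the rank-based clone count with a hypothetical factor `beta`, the fixed per-bit mutation probability Pm, and the example key (an 8-element superincreasing sequence with m = 419 and w = 31). Only N = 100, n, d = N - n and Pm = 0.12 come from the reported parameters.

```python
import random

def knapsack_sum(knapsack, bits):
    """Sum of the knapsack elements selected by the 1 bits."""
    return sum(a for a, b in zip(knapsack, bits) if b)

def affinity(bits, knapsack, target):
    """Assumed objective g(.): 1.0 for an exact hit, lower further from the target."""
    max_diff = max(target, sum(knapsack) - target)  # assumed MaxDiff definition
    return 1.0 - (abs(knapsack_sum(knapsack, bits) - target) / max_diff) ** 0.5

def random_antibody(length, rng):
    return [rng.randint(0, 1) for _ in range(length)]

def mutate(bits, pm, rng):
    """Flip each bit independently with probability pm (assumed mutation scheme)."""
    return [b ^ 1 if rng.random() < pm else b for b in bits]

def clonalg_attack(knapsack, target, N=100, n=10, pm=0.12, beta=0.1,
                   max_generations=500, seed=0):
    """Search for the bit pattern whose knapsack sum equals the target ciphertext."""
    rng = random.Random(seed)
    length = len(knapsack)

    def score(antibody):
        return affinity(antibody, knapsack, target)

    population = [random_antibody(length, rng) for _ in range(N)]
    for generation in range(1, max_generations + 1):
        population.sort(key=score, reverse=True)
        if knapsack_sum(knapsack, population[0]) == target:
            return population[0], generation
        selected = population[:n]                    # the n best antibodies
        clones = []
        for rank, antibody in enumerate(selected, start=1):
            n_clones = max(1, round(beta * N / rank))  # better rank, more clones
            clones.extend(mutate(antibody, pm, rng) for _ in range(n_clones))
        # Keep the n best of parents and matured clones; replace the other
        # d = N - n antibodies with new random individuals.
        survivors = sorted(selected + clones, key=score, reverse=True)[:n]
        newcomers = [random_antibody(length, rng) for _ in range(N - n)]
        population = survivors + newcomers
    return None, max_generations

# Hypothetical 8-element key, used only to produce a target sum to attack.
simple = [1, 3, 6, 13, 27, 52, 105, 210]
m, w = 419, 31
public = [(w * a) % m for a in simple]
plain_bits = [int(c) for c in format(ord("K"), "08b")]
target = knapsack_sum(public, plain_bits)
recovered, generations = clonalg_attack(public, target)
print(recovered == plain_bits, generations)
```

With d = N - n, most of the population is refreshed with random antibodies every generation, so in a space of only $2^8$ patterns much of the search is effectively random sampling; the balance between n and d is one of the control parameters studied in the results.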

Results:
In the proposed algorithm, the target sum (ciphertext) of each character in table 2 is attacked 7 times with the following entry parameters: N = 100, n = {10, 20, 30, 40, 50, 60, 70}, d = N - n, and mutation rate Pm = 0.12. The results are then averaged, as shown in table 3. The experiments indicate that the proposed algorithm performs well in finding the correct solution when its results are compared with those of Spillman's algorithm in [14], as shown in table 4. Spillman's algorithm searches on average less than 2% of the space. This divergence in the results is due to the size of the search space: in Spillman's work the space is $2^{15}$, i.e. 32768, whereas in this work it is $2^8$. The algorithms in [12] and [13] search on average 47.9% and 48.4% of the space ($2^8$), respectively, so the proposed algorithm compares favourably because it searches on average 12.3% of the space ($2^8$).

Conclusion:
An algorithm based on the clonal selection principle is used to attack the knapsack cipher. This paper indicates that clonal selection offers a powerful tool for the cryptanalysis of the knapsack cipher, especially if the parameters of the algorithm are carefully set. These parameters are: the size of the antibody repertoire (N), the number of best antibodies selected for cloning (n), the number of new antibodies that replace low-affinity antibodies (d) and the mutation probability (Pm). From a number of trial runs it can be concluded that starting the clonal selection with N = 100, n = 10, 20, …, 70, d = N - n and Pm = 0.12 can increase the efficiency and performance of the clonal selection. The algorithm gives the correct result by searching 12.3% of the search space ($2^8$), while the genetic algorithms in [12] and [13] search on average 47.9% and 48.4%, respectively, of the same space to find the correct result.