Parallel Direct Search Methods

Minimizing or maximizing a function is often very expensive, because each evaluation of the objective function can require considerable time. Our objective in this work is therefore the development of parallel algorithms for minimizing objective functions whose evaluation takes a long computing time. The basis of the developed parallel algorithms is the evaluation of the objective function at several points simultaneously. We consider the parallelization of direct search methods, as these methods are insensitive to noise and globally convergent. We have developed two algorithms, both based mainly on the Hooke and Jeeves method for unconstrained optimization. The developed parallel algorithms are suitable for running on MIMD machines, which consist of several processors operating independently, each with its own memory, and communicating with each other through a suitable network.
Key-Words: nonlinear optimization, unconstrained optimization, multidirectional search, parallel direct search, parallel computing, MIMD computers.
Bashir M. Khalaf & Mohammed W. Al-Neama, College of Education and College of Education for Girls, University of Mosul. Received: 19/8/2010. Accepted: 10/11/2010.


Introduction:
Optimization is a mathematical discipline that appears in many fields, such as engineering, economics, operations research, and management science. Typical examples are maximizing the production of rice, reducing the cost of a car, or getting the best performance out of a battery. Optimization can be described as a method of getting the best out of any situation. Formally, optimization is the minimization or maximization of a function subject to certain constraints. Mathematically, we represent an optimization problem as:
$$\min_{x \in D} f(x)$$
The function $f: \mathbb{R}^n \to \mathbb{R}$ is called the objective function, and the set $D \subset \mathbb{R}^n$ the constraint set. $x = [x_1, x_2, \ldots, x_n]^T$ is the vector representing the n independent variables. Very often an optimization problem is stated as a minimization problem. An optimization problem is unconstrained if the constraints do not have any effect at the optimum (1).
Today optimization is a well understood discipline with rigorous analysis methods. In the early 1960s, however, the tools and techniques of analysis were yet to be developed and proven. In 1961, Robert Hooke and T.A. Jeeves developed a method for optimization and coined the phrase "direct search" [5], [13]. They provided the following description of direct search methods in the introduction of their paper:
"We use the phrase "direct search" to describe sequential examination of trial solutions involving comparison of each trial solution with the "best" obtained up to that time together with a strategy for determining (as a function of earlier results) what the next trial solution will be. The phrase implies our preference, based on experience, for straightforward search strategies which employ no techniques of classical analysis except where there is a demonstrable advantage in doing so."
Hooke and Jeeves's paper appeared before the "techniques of classical analysis" that use a Taylor series expansion of the objective function had become widely available. The objective function can be expanded in a Taylor series as:
$$f(x + \Delta x) = f(x) + \nabla f(x)^T \Delta x + \tfrac{1}{2}\, \Delta x^T H \, \Delta x + \cdots$$
where $\Delta x$ is a vector of variable increments, $\nabla f(x)$ is the gradient vector containing the first partial derivatives, and $H$ is the matrix of second partial derivatives, the Hessian matrix. Direct search methods neither require nor estimate derivatives. As a consequence, while they are usually slower to converge than derivative-based methods, they are usually much more robust in situations where the function values are subject to noise, analytic derivatives are unavailable, or finite difference approximations to the gradient are unreliable. Furthermore, the direct search schemes given here parallelize very well, although they can certainly be used as sequential methods [7,10]. Hence the objective of this research is the development of parallel direct search methods that are suitable for running on a MIMD (Multiple Instruction streams, Multiple Data streams) computer [6,8,9,11].
(1) Though the term "unconstrained" is standard, it is somewhat misleading and does not mean a lack of constraints. It refers to a situation in which one can move a small distance away from the optimum point in any direction without leaving the feasible region [12].
A MIMD computer consists of several processors, each with its own memory and processing unit. These processors communicate through a suitable communication network (for more detail, see [1,2,8,11]). In a MIMD computer each processor can carry out its own set of instructions, often on its own set of data, independently of all the other processors. Such computers usually number their (more complex) processors in tens, rather than the thousands that may be found in SIMD (Single Instruction stream, Multiple Data streams) computers. MIMD computers are well suited to algorithmic parallelism, in which problems can be separated into concurrent independent processes [4].

Direct Search methods:
The direct search method as described by Hooke and Jeeves [13] requires a space of points $P \in \mathbb{R}^n$ (henceforth referred to as the design space, with $P = [x_1, x_2, \ldots, x_n]$) which represent possible candidates in the optimization problem, together with a means of saying that $P_1$ is a "better" candidate than $P_2$ (written $P_1 \prec P_2$) for any two points in the space. There is presumably a single point $P^*$, the solution, with the property $P^* \prec P$ for all $P \neq P^*$. Algorithm 1 explains the direct search method.

Algorithm 1 Direct Search Method [3]
Each class of direct search methods defines a basic idea or strategy for finding the new point in the space. The following methods are discussed with minimization of the objective function as the optimization problem to be solved.

One-at-a-time search:
The one-at-a-time search method is also known as the alternating variable method, from its form in two dimensions [15]. This is the simplest strategy, and it consists of minimizing with respect to each independent variable in turn. As shown in figure (1) for the two-dimensional case, first one variable is varied until no further improvement can be obtained, then the next one, and this sequence is repeated with ever-decreasing steps.
One of the drawbacks of this method is that in most practical cases, where the direction of the optimum is not along any coordinate axis, progress is slow, and the method becomes very inefficient as the number of variables increases [3].
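The one-at-a-time strategy described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the function name `one_at_a_time_search`, the initial `step`, the halving rule for the "ever-decreasing steps", and the tolerance `tol` are all hypothetical choices, not taken from the paper.

```python
from typing import Callable, List, Sequence, Tuple


def one_at_a_time_search(
    f: Callable[[List[float]], float],
    x0: Sequence[float],
    step: float = 1.0,
    tol: float = 1e-6,
    max_sweeps: int = 10_000,
) -> Tuple[List[float], float]:
    """Minimize f by varying one coordinate at a time (illustrative sketch)."""
    x = list(x0)
    fx = f(x)
    for _ in range(max_sweeps):
        if step < tol:
            break
        improved = False
        for i in range(len(x)):
            for direction in (1.0, -1.0):
                # Keep moving along +/- axis i while the function decreases.
                # Assumes f is bounded below along each axis.
                while True:
                    trial = x.copy()
                    trial[i] += direction * step
                    f_trial = f(trial)
                    if f_trial >= fx:
                        break
                    x, fx = trial, f_trial
                    improved = True
        if not improved:
            # No variable could be improved: repeat with a smaller step
            # (halving is an assumed choice).
            step /= 2.0
    return x, fx
```

For a function whose minimum lies along a direction oblique to the axes, this sketch needs many short sweeps, which is the inefficiency noted above.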

Pattern search methods
Pattern search methods try to find a "better" search direction than the simple directions along the coordinate axes used by the one-at-a-time search. This better search direction is found by exploration in the design space. The procedure of going from one point to a new point in the design space is called a move. A move is termed a success if the value of f(Pi+1) is less than f(Pi); otherwise, it is a failure [3].

Hooke and Jeeves pattern search
The pattern search method as described by Hooke and Jeeves (mostly referred to in the literature simply as the pattern search method) makes use of a sequence of exploratory moves and pattern moves.

Exploratory move:
In an exploratory move, each coordinate direction is examined in turn in the following way. A single step is taken along direction i (by adding an increment $\Delta$ to the variable xi). If the move is successful, the new value of the variable is retained. If the step fails, a step is taken in the opposite direction (by subtracting $\Delta$ from the variable xi). If this move is successful, that value of the variable is retained; otherwise the original value of xi is kept. When all n coordinate directions have been investigated, the exploratory move is complete. The point arrived at as a result of this procedure, which may or may not be distinct from the point from which the move originated, is called the base point.
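A single exploratory move as described above can be sketched as follows. The function name `exploratory_move` and the argument name `delta` (the increment) are illustrative assumptions; the logic follows the description directly: try the forward step on each coordinate, then the backward step, and keep whichever first succeeds.

```python
from typing import Callable, List, Sequence, Tuple


def exploratory_move(
    f: Callable[[List[float]], float],
    start: Sequence[float],
    delta: float,
) -> Tuple[List[float], float]:
    """Examine each coordinate direction once and return the resulting point."""
    x = list(start)
    fx = f(x)
    for i in range(len(x)):
        # Try the forward step first, then the opposite direction.
        for signed_step in (delta, -delta):
            trial = x.copy()
            trial[i] += signed_step
            f_trial = f(trial)
            if f_trial < fx:
                # Success: retain the new value of variable i.
                x, fx = trial, f_trial
                break
        # If both steps fail, the original value of x[i] is kept.
    return x, fx
```

The returned point equals the starting point when every step fails, which is the case the surrounding text treats as a failed exploration.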

Pattern move:
The initial base point and the base point obtained by the exploratory move define the "pattern", or search direction. A pattern move takes a single step from the present base point in the direction specified by the pattern. This becomes the new starting point for the next exploratory move [3].
When a pattern move and the subsequent exploratory move fail, the algorithm returns to the previous base point. If the exploratory move around this base point also fails, the pattern is destroyed and the increment $\Delta$ is reduced. The whole algorithm is repeated starting from this point. The search is terminated when the increment falls below a prescribed limit.
As shown in figure (2), point P1 (marked 1) is the first base point B0. The first exploratory move from B0 begins by incrementing x1, resulting in P2. Since f(P2) < f(P1), P2 is retained and exploration is continued by incrementing x2. Since f(P3) < f(P2), P3 is retained in place of P2. The exploratory move is complete and P3 becomes the second base point B1. A pattern move is made in the direction B1 − B0 from P3 to P4 (B1 − B0 = P4 − P3). Now f(P4) is not computed, but an exploratory move is performed to improve on the pattern direction. The best point found along the x1 coordinate is P5. Since the second move along x2 fails, as the points obtained (P6 and P7) are not better than P5, the exploratory move is complete and P5 is retained. As f(P5) < f(B1) = f(P3), it becomes the new base point B2.
Similarly, the next base point B3 is obtained as P10. Now a pattern move is made to point P11. The subsequent exploratory move tries points P12, P13, P14, and P15 and fails. Since f(P11) > f(P10), the pattern move itself has failed and we return to the previous base point at P10. A fresh set of explorations to points P16, P17, and P18 also fails, causing the pattern to be destroyed and the increment $\Delta$ to be reduced. The whole procedure is restarted at point P10.
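The interplay of exploratory and pattern moves walked through above can be sketched as a short sequential routine. This is a simplified illustration, not the authors' implementation: the names `hooke_jeeves` and `explore`, the reduction factor `shrink`, and the tolerance `tol` are assumptions, and for simplicity the sketch evaluates f at the pattern point before exploring around it.

```python
from typing import Callable, List, Sequence, Tuple

Objective = Callable[[List[float]], float]


def explore(f: Objective, start: List[float], f_start: float,
            delta: float) -> Tuple[List[float], float]:
    """Exploratory move around `start` with increment `delta`."""
    x, fx = list(start), f_start
    for i in range(len(x)):
        for signed_step in (delta, -delta):
            trial = x.copy()
            trial[i] += signed_step
            f_trial = f(trial)
            if f_trial < fx:
                x, fx = trial, f_trial
                break
    return x, fx


def hooke_jeeves(f: Objective, x0: Sequence[float], delta: float = 1.0,
                 tol: float = 1e-6, shrink: float = 0.5,
                 max_iter: int = 100_000) -> Tuple[List[float], float]:
    """Pattern search sketch; assumes f is bounded below."""
    base = list(x0)
    f_base = f(base)
    for _ in range(max_iter):
        if delta < tol:          # stop when the increment is small enough
            break
        new, f_new = explore(f, base, f_base, delta)
        if f_new >= f_base:
            # Exploration around the base point failed: reduce the increment.
            delta *= shrink
            continue
        while f_new < f_base:
            # Accept the new base point, then take a pattern move in the
            # direction (new base - previous base) and explore around it.
            prev, base, f_base = base, new, f_new
            pattern = [2.0 * b - p for b, p in zip(base, prev)]
            new, f_new = explore(f, pattern, f(pattern), delta)
        # The last pattern move failed, so `base` is the point returned to.
    return base, f_base
```

Called as `hooke_jeeves(f, [0.0, 0.0])`, the routine returns the final base point and its function value once the increment has dropped below `tol`.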

Parallel direct search methods:
The parallel direct search methods are designed to solve the unconstrained minimization problem:
$$\min_{x \in \mathbb{R}^n} f(x)$$
where $f: \mathbb{R}^n \to \mathbb{R}$. What distinguishes the direct search methods from other optimization methods is that they require only that the function f be continuous.
Direct search methods neither require nor estimate derivatives. As a consequence, while they are usually slower to converge than derivative-based methods, they are usually much more robust in situations where the function values are subject to noise, analytic derivatives are unavailable, or finite difference approximations to the gradient are unreliable. Furthermore, the direct search schemes given here parallelize very well, although they can certainly be used as sequential methods [14].

The first parallel algorithm:
It is clear from the steps of the direct search algorithm that the function evaluations are independent processes. Hence each function evaluation can be carried out on a separate processor of a MIMD computer. The number of processors used is 2n+1, where n is the number of variables.
For simplicity, we can assume P = {x1, x2, …, xn}, with the trial values xi + h and xi − h for each variable xi, and the first parallel algorithm becomes as follows: (P*).
5. Otherwise, try again with a step half as long (h = h/2).
6. As P* approaches the solution, the algorithm reduces the length of the steps.
7. Stopping criterion: the step length falls below a certain tolerance.
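The idea behind the first parallel algorithm, evaluating the objective function at all 2n trial points at the same time, can be sketched as follows. This is an illustration under explicit assumptions rather than the authors' algorithm: it assumes that the best of the 2n points xi ± h replaces the current point when it improves on it, and it uses Python's `concurrent.futures.ThreadPoolExecutor` with 2n worker threads as a stand-in for the 2n evaluating processors of a MIMD machine (the calling thread plays the role of the remaining processor). The name `parallel_search_2n` is hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence, Tuple


def parallel_search_2n(
    f: Callable[[List[float]], float],
    x0: Sequence[float],
    h: float = 1.0,
    tol: float = 1e-6,
    max_iter: int = 100_000,
) -> Tuple[List[float], float]:
    """Direct search with 2n simultaneous evaluations per iteration (sketch)."""
    best = list(x0)
    f_best = f(best)
    n = len(best)
    with ThreadPoolExecutor(max_workers=2 * n) as pool:
        for _ in range(max_iter):
            if h < tol:                      # stopping criterion on the step
                break
            # Build the 2n trial points x_i + h and x_i - h.
            trials = []
            for i in range(n):
                for signed_step in (h, -h):
                    point = best.copy()
                    point[i] += signed_step
                    trials.append(point)
            # One evaluation per worker, all submitted at once.
            values = list(pool.map(f, trials))
            k = min(range(len(values)), key=values.__getitem__)
            if values[k] < f_best:
                best, f_best = trials[k], values[k]
            else:
                h /= 2.0                     # otherwise halve the step
    return best, f_best
```

In CPython, threads only overlap evaluations that release the interpreter lock (for example calls into external solvers); for pure-Python objective functions a process pool or a message-passing library would be the closer analogue of separate processors.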
[Figure: scheme of the first parallel method.]

The number of CPUs used in the first method can be decreased, either to reduce the cost or when the number of variables (x1, x2, …, xn) is small. In this second algorithm, each function evaluation process is assigned to one of the processors of a MIMD computer that consists of n+1 processors.
In the first step, CPU1 calculates $f(P_1^{+})$ and compares it with $f(P_1^{-})$, where $P_1^{+}$ and $P_1^{-}$ denote the point P with x1 replaced by x1 + h and x1 − h respectively. If $f(P_1^{+}) < f(P_1^{-})$, then x1 = x1 + h is considered; otherwise x1 = x1 − h. The same procedure is applied on CPU2, ..., CPUn. In the next step, CPUn+1 receives the new values from CPU1, ..., CPUn. If there is no new value, we try again with a step half as long (h = h/2) and send h to CPU1, ..., CPUn; otherwise the new values are sent to CPU1, ..., CPUn. This is repeated until no further reduction in the function value is found.

[Figure: scheme of the second parallel method.]

The results of the study showed that there are several differences between the two methods. These differences are summarized in the following table:

Table (3) Differences between the suggested parallel algorithms

| | First Parallel Method | Second Parallel Method |
|---|---|---|
| 1) | 2n+1 processors are needed, whatever the number of variables n. | n+1 processors are needed, where n is the number of variables. |
| 2) | The program takes less time than the second algorithm as the number of variables of the function increases. | The program takes more time than the first algorithm as the number of variables of the function increases. |
| 3) | It is costly for solving high-dimensional problems; it is preferable to use this method for simple problems. | It is not costly for complex problems, because every problem needs only n+1 processors. |
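The second parallel method described earlier, with one worker per variable plus a coordinating processor, can be sketched in the same style. Several details are assumptions made only for this illustration: each worker evaluates both xi + h and xi − h and reports the better of the two, the coordinator adopts the single best proposal when it improves on the current point (the text does not specify how the n proposals are combined), and worker threads stand in for CPU1, ..., CPUn. The names `worker` and `parallel_search_n` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence, Tuple

Objective = Callable[[List[float]], float]


def worker(f: Objective, point: List[float], i: int,
           h: float) -> Tuple[float, List[float]]:
    """Role of CPU_i: evaluate x_i + h and x_i - h, return the better one."""
    plus, minus = point.copy(), point.copy()
    plus[i] += h
    minus[i] -= h
    f_plus, f_minus = f(plus), f(minus)   # two evaluations, done in sequence
    return (f_plus, plus) if f_plus < f_minus else (f_minus, minus)


def parallel_search_n(f: Objective, x0: Sequence[float], h: float = 1.0,
                      tol: float = 1e-6,
                      max_iter: int = 100_000) -> Tuple[List[float], float]:
    """Direct search with n workers and one coordinator (sketch)."""
    best = list(x0)
    f_best = f(best)
    n = len(best)
    with ThreadPoolExecutor(max_workers=n) as pool:
        for _ in range(max_iter):
            if h < tol:
                break
            # The coordinator (role of CPU_{n+1}) collects one proposal
            # per variable.
            proposals = list(pool.map(lambda i: worker(f, best, i, h), range(n)))
            f_new, new = min(proposals, key=lambda item: item[0])
            if f_new < f_best:
                best, f_best = new, f_new   # send the new values to the workers
            else:
                h /= 2.0                    # no new value: halve the step
    return best, f_best
```

Each worker here performs two evaluations one after the other, so an iteration costs roughly two evaluation times instead of one, which is consistent with row 2 of the table, while only n+1 processors are occupied, as in row 1.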