
Society of Petroleum Engineers

SPE 26418

Parallel Simulated Annealing for Stochastic Reservoir Modeling

M.N. Panda and L.W. Lake, U. of Texas

SPE Members


Copyright 1993, Society of Petroleum Engineers, Inc. This paper was prepared for presentation at the 68th Annual Technical Conference and Exhibition of the Society of Petroleum Engineers held in Houston, Texas, 3-6 October 1993.

This paper was selected for presentation by an SPE Program Committee following review of information contained in an abstract submitted by the author(s). Contents of the paper, as presented, have not been reviewed by the Society of Petroleum Engineers and are subject to correction by the author(s). The material, as presented, does not necessarily reflect any position of the Society of Petroleum Engineers, its officers, or members. Papers presented at SPE meetings are subject to publication review by Editorial Committees of the Society of Petroleum Engineers. Permission to copy is restricted to an abstract of not more than 300 words. Illustrations may not be copied. The abstract should contain conspicuous acknowledgment of where and by whom the paper is presented. Write Librarian, SPE, P.O. Box 833836, Richardson, TX 75083-3836, U.S.A. Telex, 163245 SPEUT.

ABSTRACT

Simulated annealing (SA) techniques have shown great potential to generate geologically realistic permeability fields by combining data from many sources, such as well logs, cores, and tracer tests. However, the application of SA in reservoir description and simulation is limited owing to its prohibitively large computational time requirement, even on modern supercomputers. This paper introduces an implementation of a parallel SA algorithm for stochastic reservoir modeling on a Hypercube processor network (Intel iPSC 860). The corresponding sequential code, which incorporates univariate and bivariate statistics to generate the permeability field, is optimized to gain maximum advantage from the parallel application. In this particular parallel implementation, each processor runs the same source code asynchronously using the single-instruction multiple-data (SIMD) approach, with synchronization linked to an optimality test. By porting the SA algorithm to a parallel computer we can generate permeability fields that represent non-Gaussian bivariate statistics and complex flow geometry at a finer scale than was previously feasible.

INTRODUCTION

Inverse modeling, a technology which includes the simulated annealing (SA) algorithms, is a systematic procedure to estimate the constitutive properties of a physical system based on a few

References and illustrations at end of paper.

experimental observations. A physical system can be thought of as a portion of the universe delineated by physical or mathematical boundaries, such as the earth for a geophysicist, an underground reservoir for a petroleum engineer, or a quantum particle for a quantum physicist. The set of physical parameters describing such systems depends on the specific models used. For instance, a geophysicist characterizing the earth's mantle might use the elastic properties of the solids as the parameters, whereas a petroleum engineer requires permeability and porosity distributions to characterize the reservoir rocks. To investigate such physical systems with inverse modeling we use experimental observations to infer the actual values of the model parameters. The most general way to accomplish this is by assigning probabilities to all the possible values of the model parameters. It follows then that the measurement of data, the a priori information, and the physical correlation between experimental observations and model parameters can be described by using probability densities. Simulated annealing algorithms offer solutions to large, often complex, inverse problems by determining such probability densities in a fully nonlinear way.

Simulated annealing (SA) algorithms, which belong to a subclass of the general inverse problems, have seen a number of engineering applications: statistical mechanics, image analysis,1 artificial intelligence,2 optimization in groundwater management,3 seismic inversion,4,5 and reservoir modeling.6-10 SA techniques have shown great promise in obtaining integrated reservoir descriptions because they can combine data from several sources, such as cores, well logs,


seismic traces, and interwell tracer flows. The advantage of SA over traditional stochastic techniques is its ability to incorporate effective properties derived from integrated measures. The disadvantage of SA lies in the large computational (CPU) time required for a reasonable convergence on single-processor machines. However, since SA is principally a variation of the Monte Carlo simulation technique, it is usually well suited to parallelization. In this paper we describe the parallelization of the so-called heat bath algorithm4 on the Intel iPSC 860 Hypercube computer. Our results show that, when the problem size is large, a considerable degree of speed-up is gained by using multiple processors. The efficiency of using multiple processors also increases with the size of the problem because of the reduction in the ratio of communication to computation time.

SIMULATED ANNEALING - APPLICATION TO INVERSE PROBLEMS

Consider the generation of a permeability field on a specific grid as an example of an inverse problem. This permeability field should match experimental observations, such as core data, variograms, tracer flow history, etc. Traditional methods used to solve such problems in petroleum engineering include type-curve matching,7 numerical simulation,8 spectral conditioning, and matrix decomposition methods.9 These methods suffer from numerous drawbacks: the type-curve matching techniques are based on very simple mathematical models and hence yield non-unique results. The stochastic simulation methods often assume that the distribution of permeability is stationary and Gaussian. In addition, generating a stochastic field that matches observations from a tracer test or a pressure transient test requires a large number of random realizations, sampled exhaustively, until a desired match is obtained. An exhaustive search of all the possible realizations is computationally redundant and prohibitive.

However, unlike the traditional methods, which tend to exhaustively search through all possible realizations (also called the configuration space, denoted by E), SA searches through only a portion of it (see Fig. 1). In addition, unlike the traditional methods, where the selection of any realization, which is also called a state or a configuration, is an independent event, in SA it depends on its immediate previous neighbor. Thus, SA can be thought of as a one-step Markov process in the configuration space.

The principle of SA involves moving between neighboring states within E. At each step, when a state is visited, an objective function, the weighted sum of squared differences between the experimental and computed attributes, is evaluated. Mathematically, the objective function is a mapping from E onto the real line, e: E -> R, and the sequential SA on E generates a random sequence (Xn) in E of configurations that marches toward the desired convergence as the number of selections n -> infinity. This is illustrated in Fig. 1. The transition probability between any two states, which is also termed the Gibbs probability function, depends strongly on the differences between the values of the objective functions of these states.

The transition probability between the states Xn and Xn+1 can be expressed as

    pn(de) = exp[ -(en+1 - en) / Tn ]                                        (1)

where Tn is a temperature-like function and the series (Tn) is a set of monotonically decreasing positive numbers called the cooling schedule. Following Eq. (1) it is easy to show that the sequence (Xn) attains global convergence as Tn -> 0. This is because (Xn) is a Markov process and the transition probability pn between neighboring configurations in (Xn) forms a Gaussian random variable ranging between 0 and 1 with a mean value of 0.5. Now, if we equate Eq. (1) to 0.5, then, in an average sense, the ratio (en+1 - en)/Tn will always be less than unity. As a result,
(en+1 - en) will approach 0 as Tn -> 0, forcing (Xn) toward the desirable convergence.

There are two broad classes of SA algorithms that are commonly used for reservoir characterization:6,10,11 (1) the Metropolis12 and (2) the heat bath algorithm.4 The Metropolis method for SA, although simple and effective, can be computationally prohibitive for large-scale reservoir engineering problems because of the huge number of rejection moves it makes, especially at low temperature. For this reason, in this paper we focus our attention on the alternative method, the heat bath algorithm (HBA), which is better suited for reservoir description. Ouenes et al.11 and Datta Gupta6 describe the application of the Metropolis algorithm for reservoir description, and Azencott13 gives an extensive review of parallel Metropolis algorithms. In the next section we present a brief description of the sequential HBA followed by two parallelization schemes. We also compare the efficiency of these schemes.
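For contrast with the heat bath draw used later, the following minimal sketch implements the Metropolis acceptance test just described; the function and its arguments are illustrative, not taken from the paper.

```python
import math
import random

def metropolis_accept(delta_e, temperature):
    """Metropolis test: always accept a move that lowers the objective
    function; accept an uphill move with probability exp(-delta_e/T).
    At low T nearly every uphill proposal is rejected, which is the
    wasted work the heat bath algorithm avoids."""
    if delta_e <= 0.0:
        return True
    return random.random() < math.exp(-delta_e / temperature)
```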

There are five important components in any SA algorithm:
- A concise representation of the data structure.
- A scalar objective function, e, which expresses the objectives of the optimization as a single number and also assigns weights among multiple objectives.
- A procedure for generating random changes.
- A control parameter T and an annealing schedule, (Tn).
- A convergence criterion.

What makes the HBA more attractive than the Metropolis algorithm, especially for reservoir characterization, is that annealing all the sites in every iteration tends to obtain the most general solution with the minimum number of moves or perturbations.

Algorithm - For the generation of stochastic fields using the HBA, the various steps can be summarized as follows:
1. Select an initial state. Usually this is a distribution of permeability, ki, on a desired grid, sampled randomly from a specified distribution based on experimental data.



2. Begin annealing. Start at the first gridblock and let the permeability of this block assume M possible values of ki, where i = 0, ..., M-1, randomly selected from the experimental distribution. Here M is a number conveniently chosen for computational efficiency. Calculate the corresponding objective functions, e(ki). Choose a new value of permeability for the block by drawing at random from the following distribution:

    P(ki) = exp( -e(ki)/T ) / sum(i=0..M-1) exp( -e(ki)/T )                  (2)

Permeability values that reduce the objective function e(ki) are generally chosen because they give rise to large probabilities in Eq. (2).

3. Visit sequentially all other gridblocks and update the permeability as in step 2. An iteration is complete when all the gridblocks have been visited.

4. On completion of an iteration, lower the temperature T according to a specified cooling schedule, for example, T = (0.8)^n T0, where n is the iteration number and T0 is the initial temperature.

5. Return to step 2 and continue until a suitably defined convergence criterion is satisfied.

Objective Function - Stochastic permeability fields can be generated using the HBA by posing this task as an optimization problem. We assume that permeability is a spatially related random variable. The average properties of such a field are generally obtained from core and log data, and the autocorrelation structure is defined by variograms. We assume that the variogram, gamma(h), depends on the separation distance h only,

    2 gamma(h) = E[ {z(x) - z(x+h)}^2 ]

where z(x) is a spatially related random variable, such as permeability here, and E is the expectation operator. In order to generate a random permeability field with a specified correlation structure, we minimize the error between the variogram computed from the generated field and the experimental variogram. Thus the objective function can be written as

    Minimize { sum(all i) w_i ( gamma_c(hi) - gamma_e(hi) )^2 }              (3)

where the subscripts c and e stand for computed and experimental attributes, respectively, and the w_i are weighting factors that sum to one. We use equal weights for all lags in Eq. (3). The choice of the w_i depends strongly on the experience and the judgment of the user. For instance, in Eq. (3) we would like to assign relatively large values to the w_i that correspond to small lag distances, to preserve the spatial covariance structure of the generated permeability field, while relatively smaller values are usually assigned to those corresponding to large lag distances. This is because the permeability value of any gridblock is influenced only by the blocks in its near vicinity.
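To make steps 2-5 and the objective of Eq. (3) concrete, here is a minimal sketch of one heat bath gridblock update. It is an illustration under stated assumptions, not the authors' code: it uses vertical unit lags only, hypothetical helper names, and a candidate list drawn beforehand from the experimental distribution; a full implementation would add horizontal lags and the cooling loop of step 4.

```python
import numpy as np

def variogram_objective(field, gamma_exp, weights):
    """Eq. (3): weighted squared error between computed and
    experimental variograms (vertical lags 1..len(weights) only)."""
    error = 0.0
    for lag, (w, g_exp) in enumerate(zip(weights, gamma_exp), start=1):
        diffs = field[lag:, :] - field[:-lag, :]   # z(x+h) - z(x)
        gamma_c = 0.5 * np.mean(diffs ** 2)        # Matheron estimate
        error += w * (gamma_c - g_exp) ** 2
    return error

def heat_bath_update(field, i, j, k_candidates, temperature,
                     gamma_exp, weights, rng):
    """Eq. (2): evaluate e(k) for all M candidate permeabilities at
    block (i, j) and draw the new value from the Gibbs distribution.
    No move is ever rejected, unlike the Metropolis test."""
    energies = np.empty(len(k_candidates))
    for m, k in enumerate(k_candidates):
        field[i, j] = k
        energies[m] = variogram_objective(field, gamma_exp, weights)
    # Shift by the minimum for numerical stability; this does not
    # change the normalized distribution.
    p = np.exp(-(energies - energies.min()) / temperature)
    p /= p.sum()
    field[i, j] = rng.choice(k_candidates, p=p)
    return field
```

Sweeping this update over every gridblock and then lowering the temperature, e.g., T = (0.8)^n T0 as in step 4, reproduces one iteration of the sequential HBA.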



PARALLEL SIMULATED ANNEALING

Simulated annealing is a computationally very intense algorithm. Even though it samples only a fraction of the entire configuration space, the absolute number of moves or perturbations often becomes prohibitively large even for modern supercomputers. In addition, computational time also increases as the number of terms in the objective function increases. For example, adding tracer flow data to Eq. (3) to compute the objective function may increase the computation time sharply. However, being primarily a Monte Carlo method, SA is well suited for application on parallel computers. Particularly for large, spatially correlated problems, like reservoir modeling, the HBA is more suitable for parallel application. This is because, unlike the Metropolis algorithm, in the HBA there is no rejection of moves at low temperatures; this is a particularly attractive feature for using computer resources efficiently. In this paper we focus our attention on parallelizing the HBA for reservoir characterization. Azencott13 gives a thorough review of the parallel Metropolis algorithms, which are suitable for generating uncorrelated random fields.

One of the challenges of parallel computing techniques is the assignment of computational sub-domains to individual processors such that the required communication between them is a minimum. Even though a large number of ad hoc approaches can be found in the literature, an efficient parallelization scheme appears to be problem-dependent. In this paper, we describe two schemes of parallel HBA, called schemes 1 and 2, and discuss their relative advantages and disadvantages.

Scheme 1. This scheme divides the computational grid into equal-sized subgrids, the number of which is equal to the number of processors being used. This is also called the systolic or the domain-division scheme.13 Each processor is assigned to a particular subgrid. The processors apply SA to their respective subgrids asynchronously. Synchronization is linked to an optimality test. Figure 2 presents an illustration of this scheme. Consider generating a permeability field on nbl gridblocks using n processors. Scheme 1 divides this grid into n equal-sized subgrids and assigns them to the processors. Figure 2 shows a schematic of an initial assignment of four processors on an arbitrary grid. For completing one iteration, the processors apply the HBA to all the gridblocks in their subgrids sequentially. After the optimal permeability values are determined on all the processors following a perturbation, these values are shared among the processors by a global send operation to update the permeability field. Since the update comes after the optimal values are determined, the optimality condition lags the computation by one iteration. The effect of this lag on the optimal condition, however, does not become critical as long as the size of the subgrids is larger than the measure of the autocorrelation of the field. In a later section we demonstrate the effect of integral measures on the performance of this scheme through an example. The time complexity per iteration, i.e., the distribution of CPU time among the various arithmetic computation and communication operations between the processors, can be expressed as

    Communication: nbl ( tgsend + (n-1) trecv )
    Computation:   (nbl/n) ( terr fn. ) ndiv + (nbl/n) ( topt perm. )        (4)

where
    n          = number of processors
    nbl        = total number of gridblocks
    ndiv       = number of divisions between the minimum and the maximum permeability
    tgsend     = CPU time taken by a global send operation
    trecv      = CPU time taken by a receive operation between two processors
    terr fn.   = CPU time required for computing the error function
    topt perm. = CPU time required to determine the optimal permeability from the Gibbs probability distribution

In a typical scheme 1 application the service processor or node (usually processor 0) reads the input data and sends the information to all other processors by a global send operation. As the processors sequentially visit the gridblocks in their respective subgrids, they evaluate the optimal permeability following a procedure that is identical to the sequential HBA. Even though each processor performs annealing within its subgrid, it uses the permeability values of the entire grid mesh to evaluate the objective function. Hence, after each gridblock is annealed, the new permeability value is shared by all the processors through a global exchange operation in order to update the permeability field. After an iteration is completed, the temperature is lowered, and annealing is continued until a pre-specified convergence criterion is satisfied.

Scheme 2. This scheme assigns all the processors to a single gridblock at a time, starting with gridblock 1. It works like the master-slave scheme.13 Each processor is assigned a permeability between the maximum and the minimum values for which it computes the Gibbs probability function. We illustrate this scheme in Fig. 3. In scheme 2 one of the processors, often called the master, does all the bookkeeping, and the others, called the slaves, carry out the computations and send the results to the master processor. The master determines the optimal permeability value for the gridblock from the computed Gibbs probability function and sends the optimal value to the slaves by a global send operation. Then the processors march to a neighboring gridblock. When all the gridblocks have been sequentially visited, one iteration is complete. At this time the master lowers the temperature and repeats the entire operation. The time complexity of this scheme is

    Communication: nbl ( (n-1) tsend + tgsend + 2 (n-1) trecv )
    Computation:   (nbl/n) ( terr fn. ) ndiv + ( topt perm. ) nbl            (5)

where tsend is the CPU time spent to send a message between two processors.

Scheme 2 includes the following steps (a code sketch follows at the end of this section):
1. Define the initial distribution of permeability on a grid mesh as in the sequential HBA.
2. Randomly select a specific number of permeabilities, ndiv, from the experimental permeability distribution.
3. Designate the master processor (usually processor 0). Read the input data on the master processor and send the information to all other processors by a global send operation.
4. Start with gridblock 1. Assign one permeability from the ndiv values to each processor. These processors compute the Gibbs probabilities (Eq. 2) and send them to the master. After all ndiv permeability values have been assigned, the master determines the optimal permeability from the entire Gibbs probability function and sends it to all the other processors by a global send operation.
5. Sequentially visit all the gridblocks to complete one iteration.
6. Lower the temperature and repeat steps 4 and 5.
7. Repeat steps 4 to 6 until a desired stopping criterion is satisfied.

Comparing Eqs. (4) and (5) we observe that in scheme 2 the communication overhead is larger than in scheme 1. Yet, for reservoir modeling applications, where permeability is often spatially correlated, scheme 2 is the more attractive option since the accuracy of this algorithm does not depend on the spatial structure of the permeability field. In the next section we present the results of an application of the parallel HBA to generate stochastic permeability fields that show the validity of the above claim. We also compare the merits of the two schemes there.
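The sketch below shows the per-gridblock exchange of steps 4-5 under stated assumptions: mpi4py primitives stand in for the iPSC 860 send and global-send operations (the paper predates MPI), the ndiv candidate values are striped across processors rather than handed out one at a time, and the helper energy_fn (the objective of Eq. (3)) is hypothetical.

```python
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()
MASTER = 0

def anneal_block(field, block, k_all, temperature, rng, energy_fn):
    """One scheme 2 gridblock update: slaves evaluate their share of
    the ndiv candidates, the master draws from Eq. (2), and a
    broadcast plays the role of the global send in step 4."""
    my_candidates = k_all[rank::size]             # this processor's share
    my_energies = np.array([energy_fn(field, block, k)
                            for k in my_candidates])
    gathered = comm.gather((my_candidates, my_energies), root=MASTER)
    if rank == MASTER:
        ks = np.concatenate([g[0] for g in gathered])
        es = np.concatenate([g[1] for g in gathered])
        p = np.exp(-(es - es.min()) / temperature)   # Gibbs weights, Eq. (2)
        p /= p.sum()
        k_new = rng.choice(ks, p=p)                  # heat bath draw
    else:
        k_new = None
    k_new = comm.bcast(k_new, root=MASTER)           # global send of result
    field[block] = k_new
    return field
```

Looping this update over all gridblocks and then lowering the temperature on the master reproduces steps 5-7.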

RESULTS AND DISCUSSION

Combinatorial optimization techniques, such as SA, are superior to the traditional stochastic simulation methods in generating stochastic permeability fields because (1) these methods are capable of incorporating information from a number of sources, and (2) the results obtained by these methods are robust because of the non-linear solution procedures. Parallel SA also has an additional advantage in terms of saving CPU time. In this section we apply the parallel HBA to simulate the distribution of permeability on a slab obtained from an actual sample by assuming that permeability is a spatially random variable with known variograms. We also assume that permeability is log-normally distributed with a second-order



stationarity structure in space. We compare the efficiency of both parallelization schemes in terms of their CPU time requirement and accuracy. We extend the application to study the effect of problem size on the efficiency of the algorithms. Our application also demonstrates that the choice of parallelizing schemes depends on the autocorrelation structure of a random field.

The Antolini Sandstone is an eolian outcrop from northern Arizona.14 A rectangular sample measuring 38 x 13 x 2 cm was characterized by minipermeameter measurements on each square centimeter and by a miscible tracer flow. Our study focuses on one of the faces of the slab, denoted as Face B. Figure 4(a) shows a contour map of the air permeability distribution on Face B. The permeability varies between 10 and 1480 md with an arithmetic average value of 477 md and a standard deviation of 314 md. We present Matheron estimates15 of the average vertical and horizontal variograms in Fig. 4(b). To apply the parallel HBA we define the objective function as the weighted sum of the squared differences between the computed and experimental average horizontal and vertical variograms. All the simulation runs are carried out on an Intel iPSC 860 Hypercube at The University of Texas at Austin. The results are presented below.

Scheme 1. Figure 5 shows the behavior of the objective function (i.e., the squared error in Eq. (3)) as a function of the number of iterations. The number of processors is varied parametrically from 1 to 8. When the number of processors is increased from 4 to 8 there is significant interference between the processors that forces the permeability configuration to a local minimum after 20 iterations. The interference is because of the large spatial correlation of the permeability field.

The cause of interference between the processors depends on the integral scales of the permeability field. The larger the integral scale, the stronger is the interference. For instance, consider the variograms of the Antolini sample. Its integral scales are 15 cm and 3 cm in the horizontal and vertical directions, respectively. This means that a perturbation in any gridblock permeability influences the permeability of all the gridblocks that fall within the elliptic area whose half major axis is the horizontal integral scale and whose half minor axis is the vertical integral scale. In scheme 1, when the number of processors is increased from 4 to 8, the horizontal distance between two consecutive processors becomes less than the horizontal integral scale, and this gives rise to processor interference. Figure 6 compares four realizations of Antolini core permeability generated using 1, 2, 4 and 8 processors. The local minimum that occurs when 8 processors are used is very apparent from the last figure.

In this section we also study the effect of the autocorrelation structure of a permeability field on the performance of scheme 1. To accomplish this we use a power law variogram (fractal) model in the horizontal direction. Figure 7(a) presents five cases with varying Hurst coefficient, H (Yang9). Figure 7(b) presents the variation in the

objective functions corresponding to these five cases. From these figures we infer that scheme 1 is less sensitive to fields that have very little spatial correlation, which corresponds to small values of H (<= 0.1). As H increases, the degree of interference between the processors increases, forcing the solution to a local minimum.

Scheme 2. Figure 8 shows the behavior of the objective function when scheme 2 is used. This figure varies the number of processors from 1 to 16. We observe that the objective function decreases monotonically with the number of iterations. Unlike scheme 1, the behavior of the objective function does not depend on the number of processors. The accuracy of the results is also apparent from Fig. 9, which presents four realizations of Antolini core permeability obtained using 1, 2, 8, and 16 processors. Scheme 2 of the parallel HBA is better suited for application to spatially correlated problems. This scheme is robust in terms of the number of processors and the autocorrelation structure of the permeability field. However, comparing Figs. 6 and 9, we find that the communication overhead in scheme 2 is larger than in scheme 1, which indicates that a better method would be a combination of these two schemes. In this paper, however, we restrict ourselves to the use of scheme 2 for parallelization of the HBA.

To verify the accuracy of the generated permeability fields in Fig. 9 we present results of a tracer flow across a two-dimensional vertical cross section using the chemical flooding simulator UTCHEM.16 Figure 10 compares the history of the tracer effluent concentration obtained from the simulation with experimental data. The match between the simulation and the experimental data shows the validity of scheme 2 in generating spatially correlated permeability fields.

We have so far studied the application of the parallel SA algorithm to generate spatially correlated permeability fields without regard to the effect of various parameters on the efficiency of the algorithm. Some of the important parameters of SA that affect the efficiency are the size of the problem, the cooling schedule, and the stopping criterion. The cooling schedule and the stopping criterion have identical effects on the sequential and the parallel SA algorithms. In this section we study the effect of problem size, i.e., the number of gridblocks, on the performance of the parallel SA algorithm. The objective here is to determine the robustness of the parallel algorithm, in terms of computational time and efficiency, as the problem size increases. The algorithm is robust and linearly scalable if the CPU time required to satisfy a convergence criterion increases linearly with the size of the problem. In the case of a linearly scalable algorithm the number of arithmetic operations increases linearly with the size of the problem. Therefore, the number of floating point operations per unit CPU time remains approximately constant as the size of the problem increases. We summarize the sensitivity of the parallel heat bath algorithm below.

(i) Computation time vs. problem size - Figure 11 is a plot of the CPU time required to obtain a suitable convergence of the HBA



for various numbers of processors. In this figure, the size of the permeability field is the parameter. The computation time for a given problem size decreases as the number of processors increases. The reduction of computation time is largest for large problems because of the increase in the computation-to-communication ratio. In other words, small problems incur a comparatively large communication overhead on multiprocessor configurations and, hence, are better suited for sequential operation compared to large problems. Figure 11 also indicates that there is a considerable decrease in CPU time as the number of processors is increased. A ten-fold speed-up is obtained in generating permeability fields on a 75 x 25 grid using 16 processors.
(ii) Efficiency vs. problem size - Figure 12 presents a plot of efficiency vs. the problem size for various numbers of processors, varying from 1 to 16. The efficiency of an algorithm on n processors is

    eta = [ t1 / (n tn) ] x 100                                              (6)

where
    eta    = efficiency, %
    t1, tn = CPU time taken to solve a problem on 1 and n processors, respectively
    nbl    = problem size (number of gridblocks here)
    n      = number of processors
The efficiency of multiprocessor systems is always less than 100 % because of the loss of CPU time in communication among the processors. Figure 12 also shows that there is a general trend of increasing efficiency of multiprocessor configurations for large problems. This is caused by the reduction in the ratio of communication to computation time. The average efficiency of scheme 2 in our applications is 85 % when 16 processors are used.
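As a worked instance of Eq. (6), with hypothetical timings rather than figures from the paper:

```python
def parallel_efficiency(t1, tn, n):
    """Eq. (6): eta = t1 / (n * tn) * 100, in percent."""
    return 100.0 * t1 / (n * tn)

# Hypothetical example: a ten-fold speed-up on 16 processors
print(parallel_efficiency(t1=8000.0, tn=800.0, n=16))  # -> 62.5
```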
(iii) Operation count vs. problem size - The number of floating point operations (FLOP) executed by an algorithm is another indication of its efficiency. Monte Carlo simulation algorithms, and especially SA algorithms, typically achieve large operation counts. Figure 13 presents the operation count for various problem sizes. A typical operation count is about 10 million floating point operations per second (MFLOP) per processor when the number of gridblocks is larger than 500. This means that a large fraction of the CPU time is invested in arithmetic computations. A large operation count also means that only a small fraction of CPU time is used during communications among the processors. Since the parallel HBA is only slightly sensitive to the problem size, it can easily be used for solving large problems without losing a large fraction of the computational efficiency.

CONCLUSIONS

This paper presents an application of simulated annealing (SA) on parallel computers to generate spatially correlated data. Generation of stochastic permeability fields is only a particular example. We consider two schemes of parallelization of the heat bath algorithm (HBA) and discuss their relative merits, with particular relevance to the stochastic generation of permeability fields. Generation of uncorrelated data on parallel computers is relatively straightforward, where a parallel Metropolis algorithm13 can easily be used. What makes the generation of correlated data on parallel machines particularly difficult is that the computations on individual gridblocks are not independent. Thus, a change in any gridblock influences the values of other gridblocks, affecting the accuracy of the results. This paper presents a new scheme to parallelize the HBA that can generate spatially correlated data without affecting the accuracy of the results. These generated fields honor the experimental variograms. In our applications, we also vary the size of the permeability fields generated and the number of processors used to study the sensitivity of these parameters. We arrive at the following conclusions from the analysis of the parallel application of the heat bath algorithm.

(1) The parallel heat bath algorithm, scheme 2, is well suited for stochastic generation of strongly spatially correlated data like permeability fields.
(2) A master-slave type approach to parallelizing the HBA yields more accurate and robust results compared to a domain-division type approach.
(3) Since relatively little synchronized communication is necessary, the application of the HBA on multiprocessors yields relatively large computational efficiency. A typical efficiency in this study is 85 %.
(4) A typical operation count of SA algorithms on an Intel iPSC 860 on a 75 x 25 grid mesh is 10 MFLOP per processor when 16 processors are used.
(5) In generating permeability fields on a 75 x 25 grid using 16 processors, we obtained a ten-fold speed-up.

ACKNOWLEDGMENTS

This work is supported by the Enhanced Oil and Gas Recovery Research Program of the Center for Petroleum and Geosystems Engineering at The University of Texas at Austin. Larry W. Lake held a Shell Distinguished Chair during the period of this work and currently holds the Moncrief Centennial Chair in Petroleum Engineering at The University of Texas at Austin.

NOMENCLATURE
en         = Energy or objective function at the nth move
e(ki)      = Energy or objective function for ki
h          = Lag distance
k          = Permeability of a gridblock, L2
k-bar      = Average permeability, L2
nbl        = Total number of gridblocks
ndiv       = Number of divisions between the minimum and the maximum permeability
pn         = Transition probability from the n to the n+1 configuration
T          = Temperature or control parameter for simulated annealing
t1         = Time taken to solve a problem on a single processor in Eq. (6)
tn         = Time taken to solve a problem on n processors in Eq. (6)
tsend      = CPU time spent to send a message between two processors
tgsend     = CPU time taken by a global send operation
trecv      = CPU time taken by a message receive operation between two processors
terr fn.   = CPU time required for computing the error function
topt perm. = CPU time required to determine the optimal permeability from the Gibbs probability distribution

Greek Symbols
gamma      = Variogram
omega      = Weighting factor, used to provide selective preference to data
eta        = Efficiency of a parallel algorithm as in Eq. (6)

REFERENCES

1. Geman, S. and Geman, D.: "Stochastic Relaxation, Gibbs Distributions, and the Bayesian Restoration of Images," IEEE Trans. on Pattern Analysis and Machine Intelligence, PAMI-6 (1984) 721-741.
2. Hinton, G. and Sejnowski, T.: "Optimal Perceptual Inference," Proceedings of the IEEE Computer Society Conf. on Computer Vision and Pattern Recognition (1983) 448-453.
3. Dougherty, D.E. and Marryott, R.A.: "Markov Chain Length Effects on Optimization in Groundwater Management by Simulated Annealing," Computational Methods in Geosciences, W.E. Fitzgibbon and M.F. Wheeler (eds.), SIAM, Philadelphia (1992) 53-65.
4. Rothman, D.H.: "Nonlinear Inversion, Statistical Mechanics, and Residual Statics Estimation," Geophysics (1985) 50, 12, 2784-2796.
5. Sen, M.K. and Stoffa, P.L.: "Nonlinear One-Dimensional Seismic Waveform Inversion Using Simulated Annealing," Geophysics (1991) 56, 1624-1638.
6. Datta Gupta, A.: "Stochastic Heterogeneity, Dispersion and Field Tracer Response," Ph.D. dissertation, The U. of Texas, Austin (1992).
7. Lee, W.J.: Well Testing, SPE Textbook Series, 1, SPE of AIME, New York (1982).
8. Allison, S.B., Pope, G.A. and Sepehrnoori, K.: "Analysis of Field Tracers for Reservoir Description," J. Pet. Sci. Eng. (1991) 5, 2, 173-186.
9. Yang, A.P.: "Stochastic Heterogeneity and Dispersion," Ph.D. dissertation, The U. of Texas, Austin (1990).
10. Farmer, C.L.: "The Mathematical Generation of Reservoir Geology," presented at Numerical Rocks, Joint IMA/SPE European Conference on the Mathematics of Oil Recovery, Robinson College, Cambridge University (July 1989).
11. Ouenes, A., Bahralolom, I., Gutjahr, A., and Lee, R.: "Conditioning Permeability Fields by Simulated Annealing," paper presented at the Third European Conference on the Mathematics of Oil Recovery, Delft, Netherlands (June 17-19, 1992).
12. Metropolis, N., Rosenbluth, A., Rosenbluth, M., Teller, A., and Teller, E.: "Equation of State Calculations by Fast Computing Machines," Journal of Chemical Physics (1953) 21, 6, 1087-1092.
13. Simulated Annealing: Parallelization Techniques, R. Azencott (ed.), John Wiley and Sons, Inc., New York (1992).
14. Ganapathy, S., Wreath, D.G., Lim, M.T., Rouse, B.A., Pope, G.A., and Sepehrnoori, K.: "Simulation of Heterogeneous Sandstone Experiments Characterized Using CT Scanning," paper SPE 21757 presented at the Western Regional Meeting, Long Beach, California, March 20-22, 1991.
15. Matheron, G.: Traité de Géostatistique Appliquée, Tome 1, Mémoires du Bureau de Recherches Géologiques et Minières, No. 14, Editions Technip, Paris (1962).
16. Datta Gupta, A., Pope, G.A., Sepehrnoori, K., and Thrasher, R.: "A Symmetric Positive Definite Formulation of a Three-Dimensional Micellar/Polymer Simulator," SPE Reservoir Engineering (Nov. 1986) 622-632.

Figure 1. Schematic of the simulated annealing algorithm, which samples only a fraction of the entire model space. Each move works like a Markov process.

Figure 2. Scheme 1 for the parallel HBA. The processors visit the gridblocks in their sub-domains sequentially. The optimal condition lags the computation by one iteration.

Figure 3. Scheme 2 for the parallel HBA. All the processors are assigned to the gridblocks sequentially. This scheme yields stable solutions.

Figure 4(a). Permeability distribution on Face B of the Antolini slab (38 cm horizontal).

Figure 4(b). Average horizontal and vertical experimental semivariograms computed from Face B Antolini Sandstone permeability measurements.

Figure 5. Change of objective function with the number of iterations for parallel scheme 1. The results converge to a local minimum when the number of processors used is larger than 4.

Figure 6. Stochastic permeability distribution on Face B of the Antolini slab. Permeability fields generated using scheme 1 of the parallel HBA on an iPSC 860 Hypercube on a 38 x 13 mesh.

Figure 7(a). Power law (fractal) variograms used to study the sensitivity of scheme 1 to the autocorrelation structure of the permeability field (H = 0.1, 0.4, 0.5, 0.6, and 0.9). The vertical variogram is the average experimental variogram computed from Face B Antolini Sandstone permeability measurements.

Figure 7(b). Effect of autocorrelation structure on the performance of scheme 1. For small Hurst coefficients scheme 1 is stable. However, when H becomes large the solution is forced toward local minima.

Figure 8. Change of objective function with the number of iterations for the parallel HBA, scheme 2. This plot shows that scheme 2 is a stable algorithm.

Figure 9. Stochastic permeability distribution on Face B of the Antolini slab. Permeability fields generated using scheme 2 of the parallel HBA on an iPSC 860 Hypercube on a 38 x 13 mesh (panels for 1, 4, 8, and 16 processors; CPU times 2015, 1209, and 826 sec for 4, 8, and 16 processors, respectively).

Figure 10. Comparison of experimental effluent tracer concentration with the results from tracer flow simulation across a vertical cross section of the Antolini slab generated using the parallel HBA, scheme 2 (tracer effluent concentration vs. pore volumes injected).

Figure 11. Computation time required for multiprocessor configurations for various problem sizes. CPU time reduces with the number of processors. All runs are made on an Intel iPSC 860 Hypercube.

Figure 12. Computational efficiency of various multiprocessor configurations. Efficiency increases with the problem size because of the reduction in communication overhead.

Figure 13. Operation count per processor for various multiprocessor configurations on the Intel iPSC 860 Hypercube. The average operation count is larger than 10 MFLOP (million floating point operations per second) per processor when 16 processors are used.
