Li-Pin Juan
Contents
System requirement for GPU accelerating with OpenACC standard
Linking Intel Math Kernel Library with Portland Visual Fortran
Debugging
Check list when your program shows an error message
Inheriting an Existing Fortran Project's Properties
Enabling/Disabling IEEE floating point arithmetic
Create and compile programs
The basic operation on PGI Visual Fortran
Compiling on the Microsoft command shell
Creating a new solution (project)
Functions and subroutines
Calling module-defined subroutines
Calling module-defined subroutines
Module-defined function misses its own type declaration
Part I Passing module-defined functions: an application on taking derivative
Part II Passing vector-valued module-defined functions
Part III Passing into a module an external function that is not invoked by the use command
Part IV Passing into a module a vector-valued function with dynamic size of image
Composition function: an application on solving for steady-state capital stock computation
(Obsolete) The use of INTERFACE
Array, arguments
Dimension declaration
The upper bound in the subscript of arrays
Assigning partial matrix to a new array: array indexing
Generating one-dimension array with fixed interval
Passing array to module-defined functions
Rule in accommodating the dimension for arrays
2. Go to Language in the navigation pane and enable OpenACC Directives by selecting Yes (acc).
3. Go to Target Processors and choose the model of your CPU, say, AMD Bulldozer.
4. Go to Target Accelerators, select Yes in the box next to Target NVIDIA Accelerator, and
choose Yes for NVIDIA: Enable Profiling to see the timing information from accelerator
kernel profiling. Set Target Host to Yes to generate a unified host+GPU library.
5. Go to Diagnostics and select Yes for Accelerator Information.
7. On the menu bar, select Tools|Options. In the navigation pane, under Projects
and Solutions, select PVF Directories. As shown in the following three figures, add a new
item to each of three directory lists. That is, add
a. C:\Program Files\PGI\win64\2012\cuda\4.2\bin to Executable files,
b. C:\Program Files\PGI\win64\2012\cuda\4.2\include to Include and module files,
c. C:\Program Files\PGI\win64\2012\cuda\4.2\lib64 to Library files.
8. Press F5 twice to compile the code. (Turn off your anti-virus software if its
automatic sandboxing is launched and terminates the execution of the application.)
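A small test program can help confirm that the OpenACC settings above take effect. The following is only a minimal sketch: the program name, array size, and values are illustrative assumptions, not part of the original setup. With Accelerator Information enabled under Diagnostics, the build output should report how the compiler handled the loop.

```fortran
! Minimal OpenACC sketch (illustrative names and sizes).
! Compilers without OpenACC support treat the !$acc lines as comments.
program saxpy_acc
  implicit none
  integer, parameter :: n = 100000
  integer :: i
  real :: a, x(n), y(n)

  a = 2.0
  x = 1.0
  y = 3.0

  ! Ask the compiler to offload this loop to the accelerator
  !$acc kernels
  do i = 1, n
    y(i) = a*x(i) + y(i)
  end do
  !$acc end kernels

  ! Every element should now equal 2.0*1.0 + 3.0
  print *, 'y(1) =', y(1), ' y(n) =', y(n)
end program saxpy_acc
```

The same source builds without OpenACC enabled, which makes it easy to compare host-only and accelerated runs.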
The supplementary source code for the example above can be downloaded from here. A
template with correct Intel Math Kernel Library and CUDA linking for my desktop can be
found here.
Based on the library list above, we need to complete the linking setup with the following
configuration:
1. Since this installation example uses PVF 2012 as the Fortran compiler, a preliminary
step before configuring the linking is to precompile a set of
Fortran 95 modules, such as lapack95 and blas95.
a. Launch PVF for VS 2010 cmd (64) from All Programs on the Start menu with
administrator privileges, by right-clicking the icon and selecting Run as
administrator.
b. Change to the MKL Fortran 95 interface directory: at the
command prompt, type cd <MKL interface directory>, for example, cd
C:\Program Files (x86)\Intel\Composer XE 2013\mkl\interfaces.
c. blas95 and lapack95 must each be precompiled, one after the other. The order
does not matter. Here I take only lapack95 as the example. The procedure is similar
for precompiling blas95:
i. You should see a makefile under <mkl dir>\interfaces\lapack95.
ii. At the command line, type nmake libintel64 install_dir=lapack95
interface=lp64 FC=pgf95, and then press Enter. (For IVF, use FC=ifort.)
iii. The precompiled modules are generated automatically under the new folder
lapack95.
d. Apply the same procedure to generate the modules of blas95, for example,
nmake libintel64 install_dir=blas95 interface=lp64 FC=pgf95. (Again, for IVF, use
FC=ifort.)
2. Right-click the solution icon in Solution Explorer and choose Properties to open
the Property Pages pop-up window.
3. In the navigation pane, under Configuration Properties, choose
Fortran|General. In the box next to Additional Include Directories, type the paths of the
header files and precompiled modules. In my PVF compiler, this item contains four
lines:
C:\Program Files (x86)\Intel\Composer XE 2013\mkl\include
C:\Program Files (x86)\Intel\Composer XE 2013\mkl\interfaces\blas95\blas95\include\intel64\lp64
C:\Program Files (x86)\Intel\Composer XE 2013\mkl\interfaces\lapack95\lapack95\include\intel64\lp64
C:\Program Files\Microsoft HPC Pack 2012\Inc
The first refers to the folder of MKL header files, such as lapack.f90 or mkl_blas.fi.
The second and third contain the modules precompiled from the Fortran 95 interface
libraries, such as blas95.mod and lapack95.mod. Note that if you do not include these
two module paths, then for a successful compile you need to add one line before the
main program unit, say, include 'lapack.f90', if you want to use the module lapack95 in the
application. Note also that use lapack95 can be replaced by use
mkl95_lapack.
4. In the navigation pane, select Fortran|Preprocessing and check that the setting next
to Additional Include Directories matches the one specified in step 3.
5. Do not activate the use of ACML, IMSL, and MKL if you want to specify the following
linking to the Intel MKL library. In the navigation pane, select Linker|General. In my
PVF 2012, I add the following paths next to Additional Library Directories:
C:\Program Files (x86)\Intel\Composer XE 2013\mkl\lib\intel64
C:\Program Files (x86)\Intel\Composer XE 2013\mkl\interfaces\blas95\blas95\lib\intel64
C:\Program Files (x86)\Intel\Composer XE 2013\mkl\interfaces\lapack95\lapack95\lib\intel64
C:\Program Files\Microsoft HPC Pack 2012\Lib\amd64
6. The final step is to add the libraries recommended by the Intel Advisor to the box next
to Additional Dependencies under Linker|Input. In my case, they are:
mkl_scalapack_lp64.lib mkl_cdft_core.lib mkl_intel_lp64.lib mkl_core.lib
mkl_pgi_thread.lib mkl_blacs_msmpi_lp64.lib msmpi.lib
C:\Program Files (x86)\Intel\Composer XE 2013\mkl\interfaces\blas95\lib95\lib\intel64\mkl_blas95_lp64.lib
C:\Program Files (x86)\Intel\Composer XE 2013\mkl\interfaces\lapack95\lapack95\lib\intel64\mkl_lapack95_lp64.lib
*Note that the last two paths cannot be omitted.
7. Everything has been done!
The version for my Visual Studio 2012 IDE contains the following libraries: mkl_blas95_lp64.lib
mkl_lapack95_lp64.lib mkl_scalapack_lp64.lib mkl_cdft_core.lib mkl_intel_lp64.lib mkl_core.lib
mkl_intel_thread.lib mkl_blacs_msmpi_lp64.lib msmpi.lib
Debugging
1. Compile the program in Debug rather than Release mode.
2. Go to Debug|Start Debugging, or simply press the F5 key.
3. Experiment with the placement of breakpoints. If the debugger cannot detect the
breakpoint set on a statement in one specific code block, moving the breakpoint up or
down a few lines from the current position can be a quick workaround.
is better than
7) Look out for where the loop counter is initialized. Placing the initialization in
the wrong location may keep the counter from increasing properly.
8) If the output file does not show up in the designated folder, or appears with blank
content, check whether the format statement is correct. For example, suppose we
want to write a series of three strings into the file connected to unit=2.
The write would fail if we did not specify the correct output format,
with exactly three a edit descriptors corresponding to the three strings.
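The format check in item 8 can be illustrated with a minimal sketch. The file name output.txt and the string values are hypothetical; the point is only that the repeat count in (3a) matches the three strings written to unit 2.

```fortran
! Minimal sketch: write three strings to unit 2 with a matching format.
! The file name and the string contents are illustrative assumptions.
program write_three_strings
  implicit none
  character(len=8) :: s1, s2, s3

  s1 = 'capital '
  s2 = 'labor   '
  s3 = 'output  '

  open(unit=2, file='output.txt', status='replace', action='write')
  ! (3a): three character edit descriptors, one per string, on one record
  write(2, '(3a)') s1, s2, s3
  close(2)
end program write_three_strings
```

Closing the unit explicitly also helps ensure the file content is flushed to disk before the program ends.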
This happens when you copy and paste an existing program unit across modules but
forget to delete the old one.
If you want to call a function such as isnan(x) or isinf(x) to detect not-a-number or
infinity, make sure this option is enabled.
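As a portable alternative to compiler-specific functions, the standard ieee_arithmetic module can perform the same checks. The sketch below assumes your compiler version provides this Fortran 2003 intrinsic module; the program and variable names are illustrative.

```fortran
! Minimal sketch: detect NaN and infinity with the standard
! ieee_arithmetic module (assumes the compiler provides it).
program check_nan_inf
  use, intrinsic :: ieee_arithmetic
  implicit none
  real(8) :: x, y, z

  x = ieee_value(x, ieee_quiet_nan)     ! a quiet NaN
  y = ieee_value(y, ieee_positive_inf)  ! +infinity
  z = 1.0d0                             ! an ordinary number

  print *, 'x is NaN:    ', ieee_is_nan(x)
  print *, 'y is finite: ', ieee_is_finite(y)
  print *, 'z is NaN:    ', ieee_is_nan(z)
end program check_nan_inf
```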
Let's walk through how to create a simple program that prints Hello World.
1. Select File|New|Project from the Visual Studio main menu.
2. The New Project dialog appears. In the Project types window located in the left pane of
the dialog box, expand PGI Visual Fortran, and select Win32. In the Templates window
located in the right pane of the dialog box, select Console Application (32-bit). In the
Name field located at the bottom of the dialog box, type any name you like. Then click
OK.
The following instructions are copied from the PGI Visual Fortran User's Guide (Release 2012).
4. To build the solution, from the main menu, select Build|Build Solution. The
View|Output window shows the results of the build.
5. To run the application, select Debug|Start Without Debugging.
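For reference, the source file built in the steps above can be as small as the following sketch. The program name hello is an arbitrary choice.

```fortran
! Minimal Hello World console program
program hello
  implicit none
  print *, 'Hello World'
end program hello
```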
and functions to be compiled and linked. Here is an example. Note that we need to pay
attention to the hierarchy among the subroutines. If A.f90 is called by B.f90, B.f90 is called
by C.f90, and C.f90 is called by the main program, then the files should be
compiled in the following order: A, B, C, main program. Suppose we are working in
the Microsoft command shell. The commands should be entered in the following order:
1) $pgf90 -c A.f90
2) $pgf90 -c B.f90
3) $pgf90 -c C.f90
4) $pgf90 A.obj B.obj C.obj main.f90 -o main
5) $main
Now we correct the statement by putting the type declaration back to the beginning of
FUNCTION:
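The original listing is not reproduced here, so the following is only a minimal sketch of the corrected form, with hypothetical module and function names. The type keyword REAL precedes FUNCTION, so the result type is declared as part of the function statement.

```fortran
! Minimal sketch (hypothetical names): the result type is declared
! at the beginning of the FUNCTION statement.
module geometry
  implicit none
contains
  real function square_area(side)
    real, intent(in) :: side
    square_area = side*side
  end function square_area
end module geometry

program test_type_declaration
  use geometry
  implicit none
  print *, 'Area of a square with side 3:', square_area(3.0)
end program test_type_declaration
```

An equivalent alternative is to leave the FUNCTION statement untyped and declare the result inside the body, for example real :: square_area.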
Again, a dimension declaration for a scalar-valued function is not needed and will trigger an
error message.
Part III Passing into a module an external function that is not invoked by the use command
Figure 1 Part A
Figure 2 Part B
2. The second way is to decompose this single file into independent parts: one for the MAIN
PROGRAM, one for the MODULE, and one for the FUNCTION. Each file is saved under your preferred
filename with the extension .f90. The details of the operation in PGI
Fortran are as follows:
a. In the Visual Studio (PGI Fortran) window, go to Solution Explorer|Source
Files|Add|New Item|Free-Format Fortran source file (.f90). In the blank next
to Name:, type Circle (or any name you prefer) for the MODULE. Copy the statements
between and including MODULE Circle and END MODULE Circle and paste them into
the newly created .f90 file.
b. Following the same procedure, create the .f90 files for the MAIN PROGRAM
and the FUNCTION block, respectively: main_program.f90 and Area_Circle.f90.
c. Again, remember to put REAL at the beginning of the line FUNCTION Area_Circle(r)
in the INTERFACE block.
d. Save all three files, and go to Debug|Start Without Debugging. A pop-up
window requesting input should then be shown.
3. Note that you can use any other combination between the two extreme layouts mentioned
in items 1 and 2: for example, you may separate only the MODULE block from the single
file mentioned in item 1, or separate only the FUNCTION block from it. Either
way is fine and yields the same outcome.
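The single-file layout can be sketched as follows. The names Circle, Area_Circle, and main_program come from the steps above; the body of each unit is an assumption, since the original listing appears only in the figures. Note that REAL appears at the beginning of the FUNCTION line in the INTERFACE block as well as in the function definition itself.

```fortran
! Minimal single-file sketch: MODULE, MAIN PROGRAM, and external FUNCTION.
! The unit bodies are illustrative assumptions.
module Circle
  implicit none
  ! The interface tells callers the result type and argument of the
  ! external function Area_Circle
  interface
    real function Area_Circle(r)
      real, intent(in) :: r
    end function Area_Circle
  end interface
end module Circle

program main_program
  use Circle
  implicit none
  real :: radius
  print *, 'Enter the radius:'
  read *, radius
  print *, 'Area of the circle:', Area_Circle(radius)
end program main_program

! External function, defined outside the module
real function Area_Circle(r)
  implicit none
  real, intent(in) :: r
  real, parameter :: pi = 3.1415927
  Area_Circle = pi*r**2
end function Area_Circle
```

To obtain the three-file layout, each of the three units can be moved unchanged into its own .f90 file.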
Array, arguments
Dimension declaration
Otherwise you would see the following error message: incorrect number of shape specifier
Data output
Printing the heading of your output file
Suppose that we want to print an array, say, of size 10-by-10. Normally, we use a DO loop
to do the job. Alternatively, we can condense the multiple-line loop into a single line with an implied DO loop.
For example,
PRINT '(f4.1)', (x(i), i = 1, 9)
PRINT *, (x(i,j), i = 1, n) (Note that j is controlled by an enclosing DO loop.)
PRINT '(2e12.4)', (x(i), y(i), i = 1, 20)
This last command prints, row by row, the elements of the one-dimensional arrays x and y
in scientific notation with a field width of 12 and 4 decimal places. Note that
the repeat count 2 is prefixed to e12.4 in order to provide two fields per row,
one for x(i) and one for y(i).
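The implied DO loops described above can be tried with the following self-contained sketch. The array size and the values are illustrative assumptions, chosen small so that the output is easy to inspect.

```fortran
! Minimal sketch of implied DO loops in PRINT statements
! (array size and values are illustrative).
program implied_do_demo
  implicit none
  integer, parameter :: n = 3
  integer :: i, j
  real :: x(n), y(n), a(n, n)

  do i = 1, n
    x(i) = real(i)
    y(i) = x(i)**2
    do j = 1, n
      a(i, j) = real(10*i + j)
    end do
  end do

  ! One value per record: the format is reused for each item
  print '(f4.1)', (x(i), i = 1, n)

  ! Two fields per record: x(i) and y(i) side by side
  print '(2e12.4)', (x(i), y(i), i = 1, n)

  ! A matrix printed row by row: the explicit loop controls i,
  ! the implied loop runs over the columns j
  do i = 1, n
    print '(3f6.1)', (a(i, j), j = 1, n)
  end do
end program implied_do_demo
```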
That is, if you save your data in .txt format without comma separators, make sure you add
the Table attribute to the IMPORT statement. By the way, if you want to specify the full path
of the target file, just go to Insert|File Path and look for your file.
PROGRAM Main
  USE ZREAL_INT
  REAL(8), DIMENSION(1) :: x, xGuess
  REAL(8) :: Fcn1
  REAL(8) :: a, b, c, above, below
  EXTERNAL :: Fcn1
  a = 1
  b = 1
  c = -2
  above = 7
  below = -1
  xGuess = 100.
  print*,'xGuess',xGuess
  !--------------------------------
  ! Fcn1 is an independent function
  !--------------------------------
  CALL D_ZREAL(Fcn1, x, xguess = xGuess)
  print*,'x zreal independent',x
  !--------------------------------
  ! Fcn2 is a contained function
  !--------------------------------
  CALL D_ZREAL(Fcn2, x, xguess = xGuess)
  print*,'x zreal contained',x
CONTAINS
  REAL(8) FUNCTION Fcn2(x)
    REAL(8) :: x
    Fcn2 = a*x**2 + b*x + c
  END FUNCTION
END PROGRAM
REAL(8) FUNCTION Fcn1(x)
  REAL(8) :: x
  Fcn1 = x**2 + x - 2
END FUNCTION
While IVF won't issue any error message for the same program code,
PGI will complain that "Your program does not comply with Fortran 2003 rules.
You cannot pass a contained subprogram (Fcn2) as an argument, in Fortran 2003."
You need to revise the declaration of the function Fcn2 by declaring it as a subprogram
in a module.
module mod2
CONTAINS
REAL(8) FUNCTION Fcn2(x)
REAL(8) :: x
Fcn2 = x**2 + x - 2
END FUNCTION
end module
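To show the module-based version in action without requiring IMSL, the sketch below passes the module function Fcn2 to a simple secant-method routine. The module solver_sketch and the subroutine secant_root are hypothetical stand-ins for D_ZREAL, introduced only for illustration; they are not part of IMSL or of the original program.

```fortran
! Minimal sketch: pass a module-defined function as an argument.
! secant_root is a hypothetical stand-in for D_ZREAL (not IMSL).
module mod2
  implicit none
contains
  real(8) function Fcn2(x)
    real(8), intent(in) :: x
    Fcn2 = x**2 + x - 2
  end function Fcn2
end module mod2

module solver_sketch
  implicit none
contains
  subroutine secant_root(f, x0, root)
    interface
      real(8) function f(x)
        real(8), intent(in) :: x
      end function f
    end interface
    real(8), intent(in)  :: x0
    real(8), intent(out) :: root
    real(8) :: xa, xb, xc, fa, fb
    integer :: it

    xa = x0
    xb = x0 + 1.0d0
    fa = f(xa)
    fb = f(xb)
    do it = 1, 100
      ! Stop on convergence or when the secant slope vanishes
      if (abs(fb) < 1.0d-12 .or. fb == fa) exit
      xc = xb - fb*(xb - xa)/(fb - fa)
      xa = xb
      fa = fb
      xb = xc
      fb = f(xb)
    end do
    root = xb
  end subroutine secant_root
end module solver_sketch

program main
  use mod2
  use solver_sketch
  implicit none
  real(8) :: x
  call secant_root(Fcn2, 100.0d0, x)
  print *, 'x from module function', x
end program main
```

Because Fcn2 lives in a module, its interface is explicit and it can be passed as an actual argument without being a contained subprogram of the main program.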
change line 156 to real(wp), dimension(:,:) :: py,pyh. The program then compiles without
any error message.
Missing arguments
So converting integers to reals does matter for getting the correct answer, as shown in the first
column of the last figure.
Another example is shown below: either to write = 2 ((1)) or = 2 (1. ),
that is, adding a period after any integer so that the computation is carried out in real type.
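The effect of integer versus real arithmetic can be seen in a minimal sketch. The expressions below are illustrative and are not the ones from the original figures.

```fortran
! Minimal sketch: integer division versus real division
! (illustrative expressions, not those of the original figures).
program int_vs_real
  implicit none
  real :: a, b, c

  a = 1/3         ! integer division gives 0, stored as 0.0
  b = real(1)/3   ! explicit conversion: real division
  c = 1./3        ! trailing period makes the literal real

  print *, 'a = 1/3       :', a
  print *, 'b = real(1)/3 :', b
  print *, 'c = 1./3      :', c
end program int_vs_real
```

Here a is zero because the division is carried out in integer arithmetic before the assignment, while b and c hold the expected real quotient.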
More examples:
Relational operators
Second, to invoke all the ancillary f90 source files of the dfpmin algorithm in the main program,
add the following source files to the same directory as the main program, so that at
compile time the ancillary files can be identified, compiled, and linked by the Windows-based PGI Fortran. The ancillary files vary with the algorithm you intend to use in your main
program. In the current context, we need to add the following source files alongside the main
program: nrtype.f90, nrutil.f90, nrerror.f90.
Therefore, throughout your code, variable declarations can use the types defined in
nrtype.f90:
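The original listing is not reproduced here. As a minimal sketch, the module nrtype_sketch below is a hypothetical stand-in that mimics a few of the kind parameters defined in nrtype.f90 (I4B, SP, DP), so that the example is self-contained. In a real project you would write USE nrtype instead.

```fortran
! Minimal sketch: nrtype_sketch is a hypothetical stand-in for the
! kind parameters in nrtype.f90; use the real nrtype module in practice.
module nrtype_sketch
  implicit none
  integer, parameter :: I4B = selected_int_kind(9)  ! 4-byte integer
  integer, parameter :: SP  = kind(1.0)             ! single precision
  integer, parameter :: DP  = kind(1.0d0)           ! double precision
end module nrtype_sketch

program use_kinds
  use nrtype_sketch
  implicit none
  integer(I4B) :: n
  real(SP) :: tol
  real(DP), dimension(2) :: p

  n = 2
  tol = 1.0e-6_SP
  p = (/ 1.0_DP, 2.0_DP /)

  print *, 'n   =', n
  print *, 'tol =', tol
  print *, 'p   =', p
end program use_kinds
```

Declaring every variable through these named kinds keeps the precision consistent between your code and the ancillary routines.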