
Seeds Development with Eclipse

Jeremy Villalobos University of North Carolina at Charlotte

Introduction
This tutorial teaches how to create a Seeds project using Eclipse. The tutorial goes over creating a Java project and adding jars as well as javadoc to the project. Then it shows how to create a simple workpool template from the PGAFramework and deploy it to the Grid.

Getting Eclipse
Eclipse is an open source project that you can get for free at http://www.eclipse.org. Go to the website, click on Download, and get "Eclipse IDE for Java Developers." The file is fairly large, so the download will take a few minutes depending on your network connection. Once you have the compressed file, unpack it with the archive program for your platform. The IDE does not have an installation wizard, so just put the folder "eclipse" somewhere on your file system, for example c:/Program Files/ in Windows or /usr/local in Linux. The IDE is started by calling the executable eclipse. In Linux, type /usr/local/eclipse/eclipse on the command line. In Windows, type "c:\Program Files\eclipse\eclipse" on the command line (with the quotes, because the path contains a space), or just double-click on the eclipse executable.

Getting the PGAF Libraries


Once Eclipse is loaded, select the workbench button (Figure 1). The button displays a tool tip saying "Workbench" when you hover your mouse over it. There is plenty of documentation on getting started at Eclipse's website. Refer to it for more information on particular Eclipse features. We'll move on to setting up Eclipse to help us code PGAF programs. Eclipse provides help through its javadoc information and by detecting syntax errors in real time. It also generates code automatically for common tasks such as getter and setter functions, and it fills out required interface functions.

Figure 1: The welcome screen on Eclipse.

Next, we need the PGAF libraries. Go to http://coit-grid01.uncc.edu/seeds/download.php to get the PGAF libraries. Select the compressed file that is appropriate for your platform. Place the folder in a location accessible only by the current user. This folder will be used as a "seed folder". The folder will be sent to the nodes in the Grid, and it is also used locally. From now on, we will use C:\lib\PGAF for Windows and /home/username/lib/PGAF for Linux and Mac as the example path for this folder.

Getting the Java Cog Libraries


In order to deploy the framework, we'll need the Java Cog libraries. Go to http://wiki.cogkit.org/wiki/Main_Page and click on the Download link. The registration is optional. Select the library's binary compressed file appropriate for your platform. Notice that Cog Kit also provides a javadoc URL. The javadoc URL is useful when adding the jars to Eclipse, as will be explained in the next section. Uncompress the file into some path; we will use C:\lib\COG (Windows) and /home/username/lib/COG (Linux) to refer to the folder as an example.

Those are all the files we'll need to build this development environment. In summary, we have:
1. The Eclipse IDE (Eclipse IDE for Java Developers)
2. The PGAF Libs (Version 1.0.3)
3. The Java Cog Kit (Binary version 4.1.5)

In the interest of full attribution, we also include JXSE's libraries and UPNPlib along with the PGAF libraries.

Creating a Project
In this section we'll go through creating a new Java project and adding the library jars and their javadoc to it. Create a new Java Project by going to File->New->Java Project. Figure 2 shows the button sequence on the screen.

Figure 2:

Name the project MySeedsProject and click the Finish button.

Figure 3:

You can now see the project's folder on the left panel. Eclipse shows the libraries used with the project as well as the source folder.

Figure 4:

Next, we need to add the PGAF jar libraries to the project. Right-click on the project folder and select the Properties option as shown in Figure 5.

Figure 5:

Click on Java Build Path on the dialog's left panel. The tabs show options to link the project to source code from other projects, to other Eclipse projects, and to Java libraries. Click on the Libraries tab. You should see something similar to Figure 6.

Figure 6:

Click on the Add Library... button (Figure 7). Click the User Library... button. The new dialog shows an empty list of User Libraries. Then click the New... button and give the name Seeds to the library; then click OK. Figure 7 shows the windows involved in this step; the numbers show the order of the actions.

Figure 7:

While still on the User Libraries dialog, click on the Add JARs... button. In the file manager dialog, navigate to the path where you placed the PGAF libraries. Select all the jars inside the lib folder (Ctrl+A).

Figure 8:

You should see something like the picture shown in Figure 8. Notice that each jar file has a list of attributes. Eclipse allows us to set the path for the source code of each jar file, and for its javadoc. We will set the javadoc path for the PGAF library so that Eclipse's real-time, in-line javadoc feature works for it. Locate the jar file called "seeds.jar". Double-click on the attribute Javadoc location. Figure 9 shows the dialog you should see. In the field "Javadoc location path:" enter the URL: http://coitgrid01.uncc.edu/seeds/javadoc/. You can click the Validate... button to have Eclipse check whether the URL indeed points to a javadoc location. The URL points to a human-readable Javadoc, so you can also go to the site and browse for information on the framework's classes.

Figure 9:

Click OK and Apply all the way back to the workbench, and you're done setting up the coding part of the development environment. We still need to adjust some settings in order to run a program, but at this point we are ready to code. Notice that you can add Javadoc for the Jxta library and other libraries in the same way. We showed how to set it up for PGAFramework because it significantly improves your ability to understand and code using Seeds. Add the Cog library by repeating the steps used to create the PGAF library. The next section goes over creating a simple workpool application.

Selecting a Template
At this point, the programmer should evaluate the type of problem to be solved with the framework. The programmer then decides on the specific framework interface that will provide the most benefit in implementing a solution. At the moment, the framework only offers a workpool solution; the generic template and other templates will be made available as needed. For information on how to create an Advanced Template, go here.

Coding
We will create three classes: a class called MyModule that implements the PGAF Workpool interface, a class called MyData that implements the Data interface, and a class called RunMyModule that executes the module. Right-click on the project's folder and select New->Class. The New Java Class dialog will open up.

Figure 10:

Type the class name MyModule in the Name field. To implement the Workpool interface, click on the Browse... button for the Superclass option. Provide the name of the package as edu.institution.

Figure 11:

Start typing Workpool. As you type, the class edu.uncc.grid.pgaf.interfaces.basic.Workpool will appear in the narrowed-down list. Select that class from the list.

Implementation note: Although the initial intention was to use Java interfaces for the PGAF Interface, some features of Java abstract classes were needed, which made the use of classes necessary for PGAF Interfaces. A PGAF Interface is therefore not a Java interface, and the user's module should use the extends keyword instead of implements.

Create another class named MyData by following the same steps used to create the MyModule class. It has to implement the edu.uncc.grid.pgaf.datamodules.Data interface, so use the Interfaces option instead of the Superclass option. This is done by clicking on the Add... button below the Browse... button. Then create an executable Java class named RunMyModule. Notice there is a check box in the "New Java Class" dialog that will create the main() function automatically; check it for the RunMyModule class. Alternatively, you can add the static main method by writing it yourself.

Double-click on the MyModule file. Notice that Eclipse has already filled in the functions that you should implement for the interface. You can hover your mouse over a PGAF class like "Workpool" and the in-line javadoc will provide you with some information on the purpose of the interface and good programming practices when using it. You can try the same trick on the functions. Hover the mouse over the Compute() function, for example.

Figure 12:

The files used for this tutorial are at http://coitgrid01.uncc.edu/seeds/download/ide_tutorial/hostname.zip. You can replace your files with the ones provided to continue, or you can copy and paste the code. The code in the sample just runs an algorithm in the compute() function to find out the hostname where compute() is running. It then puts the name in a MyData object and sends it back to the server node. The DiffuseData() function sends an empty data object. The GatherData() function receives the hostname in the MyData object and prints it to the standard output. getDataCount() requests the framework to run 10 jobs.

The code in RunMyModule uses the Seeds class. The Seeds class requires one important parameter: the path to the seed folder. This is the folder that will be sent to each of the grid nodes before submitting a job to run them and boot up the network. The folder contains the jar files for the libraries mentioned above, plus a couple of configuration files. The available_servers file lists the available hosts that you want to use to deploy Seeds. In a production environment, one can imagine this file being generated by a scheduler. The second file is the seed_file. This file holds the addresses of the rendezvous JXSE servers, which are used to get the network connected. At least one server per grid node must have an address written into the seed_file. A sample copy of both files is included in the PGAF folder.
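The flow described above can be tried out without the framework. The following is a minimal, framework-free sketch and not the code from hostname.zip: the classes HostnameData and HostnameWorkpoolSketch are hypothetical stand-ins, none of the PGAF classes are used, and the method shapes are assumptions made only for illustration. It looks up the host name the way the text says compute() does, and collects 10 results as GatherData() would receive them.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the tutorial's MyData class. The real class
// implements edu.uncc.grid.pgaf.datamodules.Data, which is not used here.
final class HostnameData {
    final String hostname;

    HostnameData(String hostname) {
        this.hostname = hostname;
    }
}

// Hypothetical local simulation of the workpool flow; no PGAF API is called.
final class HostnameWorkpoolSketch {
    // The tutorial's getDataCount() asks the framework for 10 jobs.
    static final int JOB_COUNT = 10;

    // Plays the role of compute(): find the host this code is running on.
    static HostnameData compute() {
        try {
            return new HostnameData(InetAddress.getLocalHost().getHostName());
        } catch (UnknownHostException e) {
            // Fall back to a marker value if the local name cannot be resolved.
            return new HostnameData("unknown-host");
        }
    }

    // Runs every job in-process and returns what the gather step would print.
    static List<String> runLocally() {
        List<String> gathered = new ArrayList<>();
        for (int segment = 0; segment < JOB_COUNT; segment++) {
            gathered.add(compute().hostname);
        }
        return gathered;
    }

    public static void main(String[] args) {
        for (String name : runLocally()) {
            System.out.println(name);
        }
    }
}
```

In the real module the framework decides where each job runs, so the gathered names can differ from job to job; here every job runs in the same JVM. Take the actual method signatures from the Workpool javadoc rather than from this sketch.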

Running the Program


In order to run the program, we need to create the seed folder. The seed folder is the framework's folder (C:/lib/Seeds or /home/username/lib/Seeds) that contains the library files plus the project we just created, which at the moment is at $HOME/workspace or C:/Documents and Settings/username/workspace. Copy the contents of the bin folder inside workspace/MySeedsProject to the seed folder. The copied contents must preserve the directory layout of Java's package naming convention; otherwise, the Java VM will not find the classes and a ClassNotFoundException will be thrown. Alternatively, you can create a jar for your project and add the jar file to the shuttle folder. To do this, right-click on the project's folder, select the Export... option, follow the wizard from there, and place the jar in the shuttle folder. Edit the file AvailableServer.txt so that the localhost line is uncommented (Figure 13).
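The copy step above can also be scripted. The following is a minimal sketch using only the standard java.nio.file API; the class name SeedFolderCopy is hypothetical and the two directories are passed in by the caller. It copies every file under the bin folder to the same relative location in the target folder, which is what keeps the package directories intact.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of Seeds.
final class SeedFolderCopy {
    // Copies everything under binDir into seedDir, keeping relative paths so
    // that a class in package edu.institution stays under edu/institution/.
    // Returns the number of regular files copied.
    static int copyTree(Path binDir, Path seedDir) throws IOException {
        List<Path> sources;
        try (Stream<Path> walk = Files.walk(binDir)) {
            sources = walk.collect(Collectors.toList());
        }
        int copiedFiles = 0;
        for (Path source : sources) {
            Path target = seedDir.resolve(binDir.relativize(source).toString());
            if (Files.isDirectory(source)) {
                Files.createDirectories(target);
            } else {
                Files.createDirectories(target.getParent());
                Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);
                copiedFiles++;
            }
        }
        return copiedFiles;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java SeedFolderCopy <bin folder> <seed folder>");
            return;
        }
        int copied = copyTree(Paths.get(args[0]), Paths.get(args[1]));
        System.out.println("Copied " + copied + " file(s)");
    }
}
```

Existing files with the same name are overwritten, so the helper can be rerun after each rebuild of the project.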

Figure 13:

We will first use the framework within the workstation only. Make sure there are no entries in the seed_file, which is inside the seed folder. Make a note of the path of the seed folder because it will be used as an argument to run the program. Create a new run configuration in Eclipse by clicking on the Run button's arrow. Figure 14 shows the Run button; the arrow appears as you hover the mouse over the button. Select Run Configurations....

Figure 14:

On the "Run Configurations" dialog, click the New Run Configuration button as shown in Figure 15. Name the new configuration MySeedsTest. The Project and Main Class fields should be filled in automatically. If not, select the project MySeedsProject and enter RunMyModule with its complete class name as the main class.

Figure 15:

Figure 16:

Switch to the Arguments tab, and add the path to the seed folder as an argument.

We are ready to test the program. We now need to get a grid proxy. To do this, from the command line type:
grid-proxy-init

C:/lib/COG/bin (Windows) or /home/username/lib/COG/bin (Linux) must be on your PATH variable for the previous command to work. You can do this in Linux by typing:
export PATH=$PATH:/home/username/lib/COG/bin

In Windows, type:
set PATH=%PATH%;C:\lib\COG\bin

The environment variable setting is temporary and lasts for the current session only. To make it permanent in Linux, add it to
/etc/profile

In Windows, you can click Start->Right-click My Computer->Properties. Then, you can navigate to Environmental Variables and edit Path.

A certificate must also be present on the workstation for the proxy command to work. To get it there, you can use the program WinSCP to copy the certificate files from the Grid organization you belong to. The files are usercert.pem and userkey.pem, which are located in the .globus directory on the server where you requested the certificate. Copy those files to /home/username/.globus.

Run the application by clicking on the Run button. Logging information from the JXSE library should be printed on Eclipse's Console panel. The output of the test application will be printed along with the log information. You can redirect the standard error to a separate file to clean up the output.
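A quick pre-flight check for the certificate files can save a failed proxy request. The sketch below is not part of the Cog Kit or Seeds; the class name GlobusCredentialCheck is hypothetical, and it assumes only the two file names and the .globus location given above.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the Cog Kit or Seeds.
final class GlobusCredentialCheck {
    // File names taken from the tutorial text.
    static final String[] REQUIRED_FILES = {"usercert.pem", "userkey.pem"};

    // Returns the names of the required files that are absent from globusDir.
    static List<String> missingFiles(Path globusDir) {
        List<String> missing = new ArrayList<>();
        for (String name : REQUIRED_FILES) {
            if (!Files.isRegularFile(globusDir.resolve(name))) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Looks in the .globus directory under the current user's home folder.
        Path globusDir = Paths.get(System.getProperty("user.home"), ".globus");
        List<String> missing = missingFiles(globusDir);
        if (missing.isEmpty()) {
            System.out.println("Certificate files found in " + globusDir);
        } else {
            System.out.println("Missing from " + globusDir + ": " + missing);
        }
    }
}
```

The check only confirms that the files exist. It does not validate the certificate or the key.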

Testing and Debugging

Program debugging
The framework is a prototype, so some unexpected errors can happen as you introduce creative uses for the presented material. The seed folder will keep a log at each of the machines. The seed folder name starts with the host name to prevent hosts that share a file system from conflicting. In that folder, you can find PGAF.log (PGAF being the initial name of the framework). It gives a verbose description of the framework's actions, but you should mostly focus on the exceptions thrown. To run a program for performance, reduce the logging by calling Node.getLog.setLevel(Level.INFO), or any other level, in the module's InitializeModule() method. Any exception related to your code will be routed to the standard output. You can also send information via this channel by using the RemoteLogger.printToObserver() function within your module.

Communication
Make sure the required ports are open before running this tutorial. Also, the NAT-circumventing feature is not fully tested, so first test the framework on nodes that can clearly reach each other. The framework uses UPnP to get access from your workstation; if your workstation is connected to a router with UPnP turned off, the program is likely to experience connection problems. If your client is using wireless, the hotspot provider may have closed some of the ports used by the framework. The error will produce a timeout exception:

java.net.SocketTimeoutException: connect timed out
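Before starting the framework, it can help to check whether a host and port are reachable at all. The following is a minimal sketch using only the standard java.net API; the class name PortProbe and the 3-second timeout are assumptions, and the default port 50040 is the peer port mentioned in this tutorial. A closed or filtered port makes the connect call fail with an IOException, of which SocketTimeoutException is one subtype.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper, not part of Seeds.
final class PortProbe {
    // Returns true if a TCP connection to host:port succeeds within the timeout.
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers refused connections, unresolved hosts, and timeouts.
            return false;
        }
    }

    public static void main(String[] args) {
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 50040;
        boolean reachable = canConnect(host, port, 3000);
        System.out.println(host + ":" + port + (reachable ? " is reachable" : " is NOT reachable"));
    }
}
```

A successful probe only shows that something accepts TCP connections on that port. It does not confirm that a Seeds peer is the one listening.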

Assumptions And Issues


The framework is a prototype. It makes some assumptions about the Grid nodes. It assumes each Grid node's submit host will also host a peer, and that the peer will listen on port 50040. It also assumes the Grid nodes will schedule the jobs to run almost immediately. Fork and Condor deployment are supported. No interface exists to communicate with meta-schedulers. The framework may not correctly identify the network interface on a virtualized system. Multicast should be enabled on the Grid nodes to let the network choose a leader. Alternatively, a shared file system can be used to select a leader.

Running the Program (Command Line)


If you are not using an IDE, you can run PGAF projects using the command line. Make sure all the pertinent class and jar files are in the CLASSPATH. This is done differently on each platform. Scripts to set up Cog and Seeds on the CLASSPATH are provided on the Seeds home page at:
http://coit-grid01.uncc.edu/seeds/download/scripts/set_pgaf_environment.sh
and
http://coit-grid01.uncc.edu/seeds/download/scripts/set_pgaf_environment.bat

The Linux script is run by typing:
source set_pgaf_environment.sh /path/to/cog/libraries/ path/to/shuttle/folder/

The script may need to be made executable with the command:
chmod +x set_pgaf_environment.sh

Running on the Local Machine (Command Line)


Lastly, we can run the program. To run the framework locally, make sure that only the local host is listed in the Available Server file. Then run the following command:
java -classpath $CLASSPATH my.institutions.RunMyModule AvailableServers_test.txt gpaf_shutle_folder_path

The framework will take a few seconds to load. The P2P log is written to the standard error. In Linux, stderr can be redirected using 2> error.file.txt:
java -classpath $CLASSPATH my.institutions.RunMyModule AvailableServers_test.txt gpaf_shutle_folder_path 2> error.file.txt

The output should contain the name of your local workstation 10 times.

Running on a Remote Machine
Next, we can run the program on a remote Grid node. For this, the user needs a certificate from a Grid node, and the certificate must be in the ~/.globus location. Getting a grid certificate and getting access to grid resources is beyond the scope of this tutorial. We refer the reader to the Globus website for that information. The bin folder from the Cog libraries downloaded at the beginning of this tutorial should be on the PATH. The Linux script provided here should set that for you. You can also set the path with the command:
export PATH=$PATH:/path/to/cog/lib/bin/folder

For Windows, click the Start button and right-click on My Computer. Select Properties, go to the Advanced tab, select Environmental Variables, and edit PATH to include the Cog lib bin folder. Make sure the path does not contain spaces, as this is known to cause errors. Now we need to get a proxy. Type:
grid-proxy-init

on the command line. When prompted, type the certificate's passphrase. This creates the proxy we need to run jobs on the external grid node. Add the grid node's job submit host to the Available Servers file following the rules specified in the file. Take out the local machine entry. We assume there will be a Rendezvous node at the remote node, so also add a host to the seed_file with the port 50040. Put at least two computers or CPU cores in the remote machine's description. Now we can repeat the command that was used for the local machine:
java -classpath $CLASSPATH my.institutions.domain.RunMyModule AvailableServers_test.txt gpaf_shutle_folder_path

This time, it should take longer. But after a few minutes, the result should come back. It is possible to run a local machine together with a list of remote hosts. For this example, because the amount of work is so small, the local machine would have done all the jobs before the grid nodes come on line.
