
IBM Platform HPC

Version 4.2

IBM Platform HPC, Version 4.2


Installation Guide



SC27-6107-02


Note
Before using this information and the product it supports, read the information in Notices on page 71.

First edition
This edition applies to version 4, release 2, of IBM Platform HPC (product number 5725-K71) and to all subsequent
releases and modifications until otherwise indicated in new editions.
© Copyright IBM Corporation 1994, 2014.
US Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.

Contents

Chapter 1. Installation planning . . . . . 1
  Preinstallation roadmap . . . . . 2
  Installation roadmap . . . . . 3

Chapter 2. Planning . . . . . 5
  Planning your system configuration . . . . . 5
  Planning a high availability environment . . . . . 7

Chapter 3. Preparing to install PHPC . . . . . 9
  PHPC requirements . . . . . 9
  High availability requirements . . . . . 10
  Prepare a shared file system . . . . . 12
  Configure and test switches . . . . . 12
  Plan your network configuration . . . . . 13
  Installing and configuring the operating system on the management node . . . . . 13
    Red Hat Enterprise Linux prerequisites . . . . . 15
    SUSE Linux Enterprise Server (SLES) 11.x prerequisites . . . . . 16

Chapter 4. Performing an installation . . . . . 17
  Comparing installation methods . . . . . 17
  Quick installation roadmap . . . . . 19
  Quick installation . . . . . 20
  Custom installation roadmap . . . . . 23
  Custom installation . . . . . 24

Chapter 5. Performing a silent installation . . . . . 29
  Response file for silent installation . . . . . 29

Chapter 6. Verifying the installation . . . . . 35

Chapter 7. Taking the first steps after installation . . . . . 37

Chapter 8. Troubleshooting installation problems . . . . . 39
  Configuring your browser . . . . . 40

Chapter 9. Setting up a high availability environment . . . . . 41
  Preparing high availability . . . . . 41
  Enable a high availability environment . . . . . 43
  Completing the high availability enablement . . . . . 44
    Configure IPMI as a fencing device . . . . . 44
    Create a failover notification . . . . . 45
    Setting up SMTP mail settings . . . . . 45
  Verifying a high availability environment . . . . . 46
  Troubleshooting a high availability environment enablement . . . . . 47

Chapter 10. Upgrading IBM Platform HPC . . . . . 49
  Upgrading to Platform HPC Version 4.2 . . . . . 49
    Upgrade planning . . . . . 49
    Upgrading checklist . . . . . 49
    Upgrading roadmap . . . . . 50
    Upgrading to Platform HPC 4.2 without OS reinstall . . . . . 50
    Preparing to upgrade . . . . . 50
    Backing up Platform HPC . . . . . 51
    Performing the Platform HPC upgrade . . . . . 52
    Completing the upgrade . . . . . 53
    Verifying the upgrade . . . . . 55
    Upgrading to Platform HPC 4.2 with OS reinstall . . . . . 55
    Preparing to upgrade . . . . . 55
    Backing up Platform HPC . . . . . 57
    Performing the Platform HPC upgrade . . . . . 57
    Completing the upgrade . . . . . 58
    Verifying the upgrade . . . . . 60
    Troubleshooting upgrade problems . . . . . 60
    Rollback to Platform HPC 4.1.1.1 . . . . . 61
  Upgrading entitlement . . . . . 63
    Upgrading LSF entitlement . . . . . 63
    Upgrading PAC entitlement . . . . . 63

Chapter 11. Applying fixes . . . . . 65

Chapter 12. References . . . . . 67
  Configuration files . . . . . 67
    High availability definition file . . . . . 67
  Commands . . . . . 68
    pcmhatool . . . . . 68

Notices . . . . . 71
  Trademarks . . . . . 73
  Privacy policy considerations . . . . . 73

Chapter 1. Installation planning


Installing and configuring IBM Platform HPC involves several steps that you
must complete in the appropriate sequence. Review the preinstallation and
installation roadmaps before you begin the installation process.
The Installation Guide contains information to help you prepare for your Platform
HPC installation, and includes steps for installing Platform HPC.
As part of the IBM Platform HPC installation, the following components are
installed:
v IBM Platform LSF
v IBM Platform MPI

Workload management with IBM Platform LSF


IBM Platform HPC includes a workload management component for load
balancing and resource allocation.
Platform HPC includes a Platform LSF workload management component. IBM
Platform LSF is enterprise-class software that distributes work across existing
heterogeneous IT resources, creating a shared, scalable, and fault-tolerant
infrastructure and delivering faster, more reliable workload performance. LSF balances
load and allocates resources, while providing access to those resources. LSF
provides a resource management framework that takes your job requirements,
finds the best resources to run the job, and monitors its progress. Jobs always run
according to host load and site policies.
This LSF workload management component is installed as part of the Platform HPC
installation, and the workload management master daemon is configured to run on
the same node as the Platform HPC management node.
For more information on IBM Platform LSF, refer to the IBM Platform LSF
Administration guide. You can find the IBM Platform LSF documentation here:
http://public-IP-address/install/kits/kit-phpc-4.2/docs/lsf/, where
public-IP-address is the public IP address of your Platform HPC management node.
To upgrade your product entitlement for LSF refer to Upgrading LSF entitlement
on page 63.
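After the installation is complete, you can confirm that the LSF component is running and accepts a test job from the management node. The following is a minimal sketch that uses standard LSF commands (lsid, bhosts, bsub, bjobs); the environment script path is the one described later in the installation results, and the sleep job is only a placeholder:
# source /opt/pcm/bin/pcmenv.sh
# lsid                # show the cluster name and LSF version
# bhosts              # list batch hosts and their status
# bsub sleep 60       # submit a trivial test job
# bjobs               # display the status of the submitted job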

IBM Platform MPI


By default, IBM Platform MPI is installed with IBM Platform HPC. For building
MPI applications, you must have one of the supported compilers installed. Refer to
the IBM Platform MPI release notes for a list of supported compilers. The IBM
Platform MPI release notes are in the /opt/ibm/platform_mpi/doc/ directory.
For more information on submitting and compiling MPI jobs, see the IBM Platform
MPI User's Guide 9.1 (SC27-5319-00).
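For example, the following is a minimal sketch of compiling and running an MPI program with IBM Platform MPI. The bin directory under /opt/ibm/platform_mpi is an assumption based on the documentation path given above, and hello.c is a placeholder source file; consult the IBM Platform MPI User's Guide for the supported options:
# /opt/ibm/platform_mpi/bin/mpicc -o hello hello.c    # compile with a supported compiler installed
# /opt/ibm/platform_mpi/bin/mpirun -np 4 ./hello      # run the program with 4 ranks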


Preinstallation roadmap
Before you begin your installation, ensure that the preinstallation tasks are
completed.
There are two cases to consider before you install Platform HPC:
v Installing Platform HPC on a bare metal management node.
v Installing Platform HPC on a management node that already has an operating
system installed.
If you are installing Platform HPC on a management node that already has an
operating system installed, you can omit the actions for obtaining and installing the
operating system (actions 5 and 6 in the following roadmap).
Table 1. Preinstallation roadmap

1. Plan your cluster
   Review and plan your cluster setup. Refer to Planning your system configuration on page 5.
2. Review Platform HPC requirements
   Make sure that the minimum hardware and software requirements are met, including:
   v Hardware requirements
   v Software requirements
   Refer to PHPC requirements on page 9.
3. Configure and test switches
   Ensure that the necessary switches are configured to work with Platform HPC. Refer to Configure and test switches on page 12.
4. Plan your network configuration
   Before proceeding with the installation, plan your network configuration, including:
   v Provision network information
   v Public network information
   v BMC network information
   Refer to Plan your network configuration on page 13.
5. Obtain a copy of your operating system
   If the operating system is not installed, you must obtain a copy of your operating system and install it.
6. Install and configure your operating system
   Ensure that you configure your operating system:
   v Decide on a partitioning layout
   v Meet the Red Hat Enterprise Linux 6.x prerequisites
   Refer to Installing and configuring the operating system on the management node on page 13.
7. Obtain a copy of IBM Platform HPC
   If you do not have a copy of IBM Platform HPC, you can download it from IBM Passport Advantage.

Installation roadmap
This roadmap helps you navigate your way through the PHPC installation.
Table 2. Installation roadmap

1. Select an installation method
   Choose an installation method from the following:
   v Installing PHPC by using the installer. Using the installer, you have the following choices: Quick installation or Custom installation.
   v Installing PHPC by using silent mode.
   Refer to Chapter 1, Installation planning, on page 1.
2. Perform the installation
   Follow your installation method to complete the PHPC installation.
3. Verify your installation
   Ensure that PHPC is successfully installed. Refer to Chapter 6, Verifying the installation, on page 35.
4. Troubleshoot problems that occurred during installation
   If an error occurs during installation, you can troubleshoot the error. Refer to Chapter 8, Troubleshooting installation problems, on page 39.
5. (Optional) Upgrade product entitlement
   Optionally, you can update your product entitlement for LSF. Refer to Upgrading LSF entitlement on page 63.
6. (Optional) Apply PHPC fixes
   After you install PHPC, you can check whether any fixes are available through IBM Fix Central. Refer to Chapter 11, Applying fixes, on page 65.


Chapter 2. Planning
Before you install IBM Platform HPC and deploy your system, you must decide on
your network topology and system configuration.

Planning your system configuration


Understand the role of the management node and plan your system settings and
configurations accordingly. IBM Platform HPC software is installed on the
management node after the management node meets all requirements.
The management node is responsible for the following functions:
v Administration, management, and monitoring of the system
v Installation of compute nodes
v Operating system distribution management and updates
v System configuration management
v Kit management
v Provisioning templates
v Stateless and stateful management
v User logon, compilation, and submission of jobs to the system
v Acting as a firewall to shield the system from external nodes and networks
v Acting as a server for many important services, such as DHCP, NFS, DNS, NTP,
  HTTP

The management node connects to both a provision and public network. Below,
the management node connects to the provision network through the Ethernet
interface that is mapped to eth1. It connects to the public network through the
Ethernet interface that is mapped to eth0. The public network refers to the main
network in your company or organization. A network switch connects the
installation and compute nodes together to form a provision network.


Figure 1. System with a BMC network

Each compute node can be connected to the provision network and the BMC
network. Multiple compute nodes are responsible for calculations. They are also
responsible for running batch or parallel jobs.
For networks where compute nodes have the same port for an Ethernet and BMC
connection, the provision and BMC network can be the same. Below is an example
of a system where compute nodes share a provisioning port.


Figure 2. System with a combined provision and BMC network

Note: For IPMI using a BMC network, you must use eth0 in order for the BMC
network to use the provision network.
Although other system configurations are possible, the two-Ethernet-interface
configuration is the most common. By default, eth0 is connected to the provision
interface and eth1 is connected to the public interface. Alternatively, eth0 can be the
public interface and eth1 the provision interface.
Note: You can also connect compute nodes to an InfiniBand network after the
installation.
The provision network, which connects the management node and compute nodes, is
typically a Gigabit or 100-Mbps Ethernet network. In this simple setup, the
provision network serves three purposes:
v System administration
v System monitoring
v Message passing
It is common practice, however, to perform message passing over a much faster
network using a high-speed interconnect such as InfiniBand. A fast interconnect
provides benefits such as higher throughput and lower latency. For more
information about a particular interconnect, contact the appropriate interconnect
vendor.

Planning a high availability environment


A high availability environment includes two locally installed PHPC management
nodes with the same software and network configuration (except for the host name
and IP address). High availability is configured on both management nodes to
control key services.


Chapter 3. Preparing to install PHPC


Before installing PHPC, steps must be taken to ensure that all prerequisites are met.
Before installing PHPC, you must complete the following steps:
v Check the PHPC requirements. You must make sure that the minimum hardware
and software requirements are met.
v Configure and test switches.
v Plan network configuration.
v Obtain a copy of the operating system. Refer to the PHPC requirements for a list
of supported operating systems.
v Install an operating system for the management node.
v Obtain a copy of the product.

PHPC requirements
You must make sure that the minimum hardware and software requirements are
met.

Hardware requirements
Before you install PHPC, you must make sure that minimum hardware
requirements are met.
Minimum hardware requirements for the management node:
v 100 GB free disk space
v 4 GB of physical memory (RAM)
v At least one static Ethernet configured interface
Note: For IBM PureFlex systems, the management node must be a node that is
not in the IBM Flex Chassis.
Minimum requirements for compute node for stateful package-based installations:
v 1 GB of physical memory (RAM) for compute nodes
v 40 GB of free disk space
v One static Ethernet interface
Minimum requirements for compute node for stateless image-based installations:
v 4 GB of physical memory (RAM)
v One static Ethernet interface
Optional hardware can be configured before the installation:
v Additional Ethernet interfaces for connecting to other networks
v Additional BMC interfaces
v Additional interconnects for high-performance message passing, such as
InfiniBand
Note: Platform HPC installation on an NFS server is not supported.


Software requirements
One of the following operating systems is required:
v Red Hat Enterprise Linux (RHEL) 6.5 x86 (64-bit)
v SUSE Linux Enterprise Server (SLES) 11.3 x86 (64-bit)

High availability requirements


You must make sure that these requirements are met before you set up high
availability.

Management node requirements


Requirements for the primary management node and the secondary management
node in a high availability environment:
v The management nodes must have the same or similar hardware configuration.
v The management nodes must have the same partition layout.
After you prepare the secondary management node, you can ensure that the
secondary node uses the same partition schema as the primary management
node. Use df -h and fdisk -l to check the partition layout. If the secondary
node has a different partition layout, reinstall the operating system with the
same partition layout.
v The management nodes must use the same network settings.
v The management nodes must use the same network interface to connect to the
provision and public networks.
Ensure that the same network interfaces are defined for the primary and
secondary management nodes. On each management node, issue the ifconfig
command to check that the network settings are the same. Additionally, ensure
that the IP address of same network interface is in the same subnet. If not,
reconfigure the network interfaces on the secondary management node
according to your network plan.
v The management nodes must be configured with the same time, time zone, and
current date.
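A minimal check sequence, based on the commands named above, that you can run on both management nodes and compare the output of:
# df -h         # compare the partition layout
# fdisk -l      # compare the disk partitioning
# ifconfig      # compare the network interface names, IP addresses, and subnets
# date          # compare the time, time zone, and current date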

Virtual network requirements


Virtual network information is needed to configure and enable high availability.
Collect the following high availability information:
v Virtual management node name
v Virtual IP address for public network
v Virtual IP address for provision network
v Shared directory for user home
v Shared directory for system work data
Note: In a high availability environment, all IP addresses (the management node IP
addresses and the virtual IP addresses) must be in the IP address range of your network.
To ensure that all IP addresses are in the IP address range of your network, you can
use sequential IP addresses. Sequential IP addresses can help avoid any issues. For
example:


Table 3. Example: Sequential IP addresses

Network   | IP address range          | Primary management node | Secondary management node | Virtual IP address
public    | 192.168.0.3-192.168.0.200 | 192.168.0.3             | 192.168.0.4               | 192.168.0.5
provision | 172.20.7.3-172.20.7.200   | 172.20.7.3              | 172.20.7.4                | 172.20.7.5

Shared file system requirements


Shared file systems are required to set up a high availability environment in
Platform HPC. By default, two shared directories are required in a high availability
environment; one to store user data and one to store system work data. In a high
availability environment, all shared file systems must be accessible by the
provision network for both the management nodes and compute nodes.
The following shared file systems must already be created on your shared storage
server before you set up and enable a high availability environment:
Shared directory for system work data
v The minimum available shared disk space that is required is 40 GB.
Required disk space varies based on the cluster usage.
v The read, write, and execute permissions must be enabled for the
operating system root user and the Platform HPC administrator. By
default, the Platform HPC administrator is phpcadmin.
Shared directory for user data (/home)
v Ensure that there is enough disk space for your data in your /home
directory. The minimum available shared disk space that is required is 4
GB, and it varies based on the disk space requirements for each user and
the total user number. If not provided, the user data is stored together
with system work data.
v The read and write permissions must be enabled for all users.
Additionally, the following shared file system requirements must be met:
v The shared file systems cannot be hosted on either of the management nodes.
v The shared file systems should be dedicated to, and used only for, the high
availability environment. This ensures that no single point of failure (SPOF)
errors occur.
v If the IP address of the shared storage server is in the network IP address range
that is managed by Platform HPC, it must be added as an unmanaged device to
the cluster to avoid any IP address errors. Refer to Unmanaged devices.
v If using an external NAS or NFS server to host the shared directories that are
needed for high availability, the following parameters must be specified in the
exports entries:
rw,sync,no_root_squash,fsid=num

where num is an integer and should be different for each shared directory.
For example, to create a shared data and a shared home directory on an external
NFS server, use the following commands:


mkdir -p /export/data
mkdir -p /export/home

Next, modify the /etc/exports file on the external NFS server.


/export/ 172.20.7.0/24(rw,sync,no_root_squash,fsid=0)

Note: If you are using two different file systems to create the directories, ensure
that the fsid parameter is set for each export entry. For example:
/export/data 172.20.7.0/24(rw,sync,no_root_squash,fsid=3)
/export/home 172.20.7.0/24(rw,sync,no_root_squash,fsid=4)
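The exact steps depend on your NFS server. On a typical Linux NFS server, after you edit /etc/exports you might apply and verify the new entries as follows (a sketch, not a required procedure):
# exportfs -ra              # re-export all directories listed in /etc/exports
# exportfs -v               # verify the export options, including the fsid values
# showmount -e localhost    # confirm that the directories are exported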

Prepare a shared file system


Before you enable high availability, prepare a shared file system. A shared file
system is used in high availability to store shared work and user settings.

Procedure
1. Confirm that the NFS server can be used for the high availability configuration
and that it is accessible from the Platform HPC management nodes. Run the
following command on both management nodes to ping the NFS server from
the provision network.
# ping -c 2 -I eth1 192.168.1.1
PING 192.168.1.1 (192.168.1.1) from 192.168.1.3 eth1: 56(84) bytes of data.
64 bytes from 192.168.1.1: icmp_seq=1 ttl=64 time=0.051 ms
64 bytes from 192.168.1.1: icmp_seq=2 ttl=64 time=0.036 ms

2. View the list of all NFS shared directories available on the NFS server.
# showmount -e 192.168.1.1
Export list for 192.168.1.1:
/export/data 192.168.1.0/255.255.255.0
/export/home 192.168.1.0/255.255.255.0

3. Add the NFS server as an IP pool to the Platform HPC system. This prevents
the IP address of the NFS server from being allocated to a compute node and
ensures that the NFS server name can be resolved consistently across the
cluster. On the primary management node, run the following commands.
#nodeaddunmged hostname=nfsserver ip=192.168.1.1
Created unmanaged node.
#plcclient.sh -p pcmnodeloader
Loaders startup successfully.
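Optionally, you can confirm that the management nodes can mount and write to the shared directories before you enable high availability. The following sketch reuses the NFS server address and export path from the examples above:
# mkdir -p /tmp/nfstest
# mount -t nfs 192.168.1.1:/export/data /tmp/nfstest
# touch /tmp/nfstest/ha-test && rm /tmp/nfstest/ha-test    # confirm that root can write
# umount /tmp/nfstest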

Configure and test switches


Before you install IBM Platform HPC, ensure that your Ethernet switches are
configured properly.
Some installation issues can be caused by misconfigured network switches. These
issues include nodes that cannot PXE boot, cannot download a kickstart file, or
cannot go into an interactive startup. To ensure that the Ethernet switches are
configured correctly, complete the following steps:
1. Disable the Spanning Tree on switched networks.
2. If currently disabled, enable PortFast on the switch.
Different switch manufacturers might use different names for PortFast. PortFast
refers to the forwarding scheme that the switch uses: for best installation
performance, the switch begins forwarding packets as it receives them, which
speeds up the PXE boot process. Enabling PortFast is recommended if it is
supported by the switch.


3. If currently disabled, enable multicasting on the switch. Certain switches might


need to be configured to allow multicast traffic on the private network.
4. Run diagnostics on the switch to ensure that the switch is connected properly,
and there are no bad ports or cables in the configuration.

Plan your network configuration


Before installing Platform HPC, ensure that you know the details of your network
configuration, including if you are setting up a high availability environment.
Information about your network is required during installation, including
information about the management nodes, and network details.
Note: If you are setting up a high availability environment, collect the information
for both management nodes, the primary management node and the secondary
management node.
The following information is needed to set up and configure your network.
Plan your network details, including:
v Provision network information:
Network subnet
Network domain name
Static IP address range
v Public network information:
Network subnet
Network domain name
Static IP address range
v BMC network information:
Network subnet
Network domain name
Static IP address range
v Management node information:
Node name (use a fully qualified domain name with a public domain suffix,
for example: management.domain.com)
Static IP address and subnet mask for public network
Static IP address and subnet mask for provision network
Default gateway address
External DNS server IP address
Note: For a high availability environment, the management node information is
required for both the primary management node and the secondary management
node.
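For example, a completed plan for a small cluster might look like the following. These are hypothetical values that reuse addresses shown elsewhere in this guide; the gateway and DNS addresses are placeholders for your own site values:
Provision network: subnet 172.20.7.0, subnet mask 255.255.255.0, domain private.dns.zone, static IP address range 172.20.7.3-172.20.7.200
Public network: subnet 192.168.0.0, subnet mask 255.255.255.0, domain public.com, static IP address range 192.168.0.3-192.168.0.200
BMC network: subnet 192.168.1.0, subnet mask 255.255.255.0, static IP address range 192.168.1.3-192.168.1.254
Management node: management.domain.com, public IP address 192.168.0.3, provision IP address 172.20.7.3, default gateway 192.168.0.1, external DNS server 192.168.1.40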

Installing and configuring the operating system on the management node
Before you can create the PHPC management node, you must install an operating
system on the management node.


Complete the following steps to install the operating system on the management
node:
1. Obtain a copy of the operating system.
2. Install and configure the operating system.
Before you install the operating system on the management node, ensure that the
following conditions are met:
v Decide on a partitioning layout. The suggested partitioning layout is as follows:
Ensure that the /opt partition has at least 4 GB
Ensure that the /var partition has at least 40 GB
Ensure that the /install partition has at least 40 GB
Note: After you install Platform HPC, you can customize the disk partitioning
on compute nodes by creating a custom script to configure Logical Volume
Manager (LVM) partitioning.
v Configure at least one static network interface.
v Use a fully qualified domain name (FQDN) for the management node.
v The /home directory must be writable.
If the /home directory is mounted by autofs, you must first disable the autofs
configuration:
# chkconfig autofs off
# service autofs stop

To make the /home directory writable, run the following command as root:
# chmod u+w /home
# ls -al / |grep home

v The package openais-devel must be removed manually if it is already installed (example commands are shown after this list).


v Before you install PHPC on the management node, make sure that shadow
password authentication is enabled. Run setup and make sure that Use Shadow
Passwords is checked.
v Ensure that IPv6 is enabled for remote power and console management. Do not
disable IPv6 during the operating system installation. To enable IPv6, do the
following:
For RHEL: If the disable-ipv6.conf file exists in the /etc/modprobe.d directory,
comment out the following line, which disables IPv6: install ipv6 /bin/true
For SLES: If the 50-ipv6.conf file exists in the /etc/modprobe.d directory, comment
out the following line, which disables IPv6: install ipv6 /bin/true
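For example, to check for and remove the openais-devel package (a sketch; use the package manager that matches your distribution):
# rpm -q openais-devel        # reports whether the package is installed
# yum remove openais-devel    # on RHEL; on SLES use: zypper remove openais-devel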
Note: After you install the operating system, ensure that the operating system time
is set to the current real time. Use the date command to check the date on the
operating system, and date -s command to set the date. For example: date -s
"20131017 04:57:00"
Important:
The management node does not support installing on an operating system that is
upgraded through yum or zypper update. Do not run a yum update (RHEL) or
zypper update (SLES) before installing PHPC. You can update the management
node's operating system after installation. If you do upgrade your operating
system through yum or zypper then you must roll back your changes before
proceeding with the PHPC installation.


If you are installing the Red Hat Enterprise Linux (RHEL) 6.x operating system,
see the additional RHEL prerequisites.
If you are installing the SUSE Linux Enterprise Server (SLES) 11.x operating
system, see the additional SLES prerequisites.
After all the conditions and prerequisites are met, install the operating system.
Refer to the operating system documentation for how to install the operating
system.

Red Hat Enterprise Linux prerequisites


Before you install Platform HPC on Red Hat Enterprise Linux (RHEL) 6.x, you
must ensure the following:
1. Ensure that the 70-persistent-net.rules file is created under /etc/udev/rules.d/ to
make the network interface names persistent across reboots.
2. Before installing PHPC, you must stop the NetworkManager service. To stop
the NetworkManager service, run the following command:
/etc/init.d/NetworkManager stop

3. Disable SELinux.
a. On the management node, edit the /etc/selinux/config file to set
SELINUX=disabled.
b. Reboot the management node.
4. Ensure that the traditional naming scheme ethN is used. If you have a system
that does not use the traditional naming scheme ethN, you must revert to the
traditional naming scheme ethN:
a. Rename all ifcfg-emN and ifcfg-p* configuration files and modify the
contents of the files accordingly. The content of these files is
distribution-specific (see /usr/share/doc/initscripts-version for details).
For example, ifcfg-ethN files in RHEL 6.x contain a DEVICE= field which is
assigned with the emN name. Modify it to suit the new naming scheme
such as DEVICE=eth0.
b. Comment the HWADDR variable in the ifcfg-eth* files if present as it is not
possible to predict here which of the network devices is named eth0, eth1
etc.
c. Reboot the system.
d. Log in to see the ethN names.
5. Check whether the package net-snmp-perl is installed on the management
node. If not, you must install it manually from the second RHEL 7 on POWER
ISO.
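Relating to step 4, the following is a minimal sketch of what a renamed interface configuration file, for example /etc/sysconfig/network-scripts/ifcfg-eth0, might contain; the addressing values are placeholders for your own network plan:
DEVICE=eth0
ONBOOT=yes
BOOTPROTO=static
IPADDR=172.20.7.3
NETMASK=255.255.255.0
#HWADDR=00:11:22:33:44:55  (commented out as described in step 4b)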


SUSE Linux Enterprise Server (SLES) 11.x prerequisites


Before you install Platform HPC on SUSE Linux Enterprise Server (SLES), you
must complete the following steps.
1. You must disable AppArmor. To disable AppArmor, complete the following
steps:
a. Start the YaST configuration and setup tool.
b. From the System menu, select the System Services (Runlevel) option.
c. Select the Expert Mode option.
d. Select the boot.apparmor service, go to the Set/Reset menu and select
Disable the service.
e. To save the options click OK.
f. Exit the YaST configuration and setup tool by clicking OK.
2. If the createrepo and perl-DBD-Pg packages are not installed, complete the
following steps:
a. To install the packages, prepare the following ISO images:
v Installation ISO image: SLES-11-SP3-DVD-x86_64-GM-DVD1.iso
v SDK ISO image: SLE-11-SP3-SDK-DVD-x86_64-GM-DVD1.iso
b. Create a software repository for each ISO image using the YaST
configuration and setup tool. You must create a software repository for both
the installation ISO image and the SDK ISO image. To create a software
repository, complete the following steps:
1) Start the YaST configuration and setup tool in a terminal.
2) From the Software menu, select the Software Repositories option and
click Add.
3) Select the Local ISO Image option and click Next.
4) Enter the Repository Name and select a Path to ISO Image. Click Next.
5) Click OK to save the options and exit the YaST configuration and setup
tool.
c. To install the createrepo and perl-DBD-Pg packages, run the following
command: zypper install createrepo perl-DBD-Pg
3. Reboot the management node.
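After the repositories from step 2 are added, you can confirm that the required packages are present. A minimal sketch:
# rpm -q createrepo perl-DBD-Pg           # check whether the packages are already installed
# zypper install createrepo perl-DBD-Pg   # install them from the repositories that you created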


Chapter 4. Performing an installation


Install PHPC using the installer. The installer enables you to specify your
installation options.
After the installation starts, the installer automatically checks the hardware and
software configurations. The installer displays the following based on the results:
v OK - if no problems are found for the checked item
v WARNING - if configuration of an item does not match the requirements;
installation continues despite the warnings
v FAILED - if the installer cannot recover from an error, the installation quits
The installer (phpc-installer) displays the corresponding error message for
problems that are detected and automatically ends the installation. If there are
errors, you must resolve the identified problems then rerun the phpc-installer
until all installation requirements are met.

Usage notes
v Do not use an NFS partition or a local /home partition for the depot (/install)
mount point.
v In the quick installation, the default values are used for values not specified
during installation.
v A valid installation path for the installer must be used. The installation path
cannot include special characters such as a colon (:), exclamation point (!) or
space, and the installation cannot begin until a valid path is used.

Comparing installation methods


IBM Platform HPC can be installed by using the interactive installer in one of two
methods: the quick installation method and the custom installation method. The
quick installation method quickly sets up basic options with default values. The
custom installation method provides additional installation options and enables the
administrator to specify additional system configurations.
Below is a complete comparison table of the two installation methods and the
default values provided by the installer.
Table 4. Installer option comparison

Options | Default value | Included in the Quick installation? | Included in the Custom installation?
Select a mount point for the depot (/install) directory. | - | Yes | Yes
Select the location that you want to install the operating system from. | CD/DVD drive | Yes | Yes
Specify a provision network interface. | eth0 | Yes | Yes
Specify a public network interface. | eth1 | Yes | Yes
Do you want to enable a public network connection? | Yes | Yes | Yes
Do you want to enable the public interface firewall? | Yes | No | Yes
Do you want to enable NAT forwarding on the management node? | Yes | No | Yes
Enable a BMC network that uses the default provisioning template? | No | Yes | Yes
Select a BMC network. Options include: Create a new network, Public network, Provision network. | Create a new network | Yes | Yes
If creating a new BMC network, specify a subnet for the BMC network. | N/A | Yes | Yes
If creating a new BMC network, specify a subnet mask for the BMC network. | 255.255.255.0 | Yes | Yes
If creating a new BMC network, specify a gateway IP address for the BMC network. | N/A | No | Yes
If creating a new BMC network, specify an IP address range for the BMC network. | 192.168.1.3-192.168.1.254 | Yes | Yes
Specify the hardware profile used by your BMC network. Options include: IPMI, IBM_Flex_System_x, IBM_System_x_M4, IBM_iDataPlex_M4, IBM_NeXtScale_M4. | IBM_System_x_M4 | Yes | Yes
Set the domain name for the provision network. | private.dns.zone | Yes | Yes
Set the domain name for the public network. | public.com | Yes | Yes
Specify the provisioning compute node IP address range. This is generated based on the management node interface. | 10.10.0.3-10.10.0.200 | No | Yes
Do you want to provision compute nodes with the node discovery method? | Yes | No | Yes
Specify the node discovery IP address range. This is generated based on the management node interface. | 10.10.0.201-10.10.0.254 | No | Yes
Set the IP addresses of the name servers. | 192.168.1.40,192.168.1.50 | No | Yes
Specify the NTP server. | pool.ntp.org | No | Yes
Do you want to export the /home directory? | Yes | No | Yes
Set the database administrator password. | pcmdbpass | No | Yes
Set the default root password for compute nodes. | PASSW0RD | No | Yes

Quick installation roadmap


Before you begin your quick installation, use the following roadmap to prepare
your values for each installation option. You can choose to use the default example
values for some or all of the options.
Table 5. Preparing for PHPC quick installation
(Record your own values next to the example values.)

1. Select a mount point for the depot (/install) directory.
2. Select the location that you want to install the operating system from. Example value: CD/DVD drive
3. Specify a provision network interface. Example value: eth0
4. Specify a public network interface. Example value: eth1
5. Enable a BMC network that uses the default provisioning template? Example value: Yes
6. Select a BMC network. Options include: Create a new network, Public network, Provision network. Example value: Create a new network
7. If creating a new BMC network, specify a subnet for the BMC network. Example value: 192.168.1.0
8. If creating a new BMC network, specify a subnet mask for the BMC network. Example value: 255.255.255.0
9. Specify the hardware profile used by your BMC network. Options include: IPMI, IBM_Flex_System_x, IBM_System_x_M4, IBM_iDataPlex_M4, IBM_NeXtScale_M4. Example value: IBM_System_x_M4
10. Set the provision network domain name. Example value: private.dns.zone
11. Set a domain name for the public network? (Yes/No) Example value: Yes
12. Set the public domain name. Example value: public.com or FQDN

Quick installation
You can configure the management node by using the quick installation option.

Before you begin


PHPC installation supports the Bash shell only.
v Before you start the PHPC installation, you must boot into the base kernel. The
Xen kernel is not supported.
v User accounts that are created before PHPC is installed are automatically
synchronized across compute nodes during node provisioning. User accounts
that are created after PHPC is installed are automatically synchronized across
compute nodes when the compute nodes are updated.
v You must be a root user to install.
v Installing PHPC requires you to provide the OS media. If you want to use the
DVD drive, ensure that no applications are actively using the drive (including
any command shell). If you started the PHPC installation in the DVD directory,
you can suspend the installation (Ctrl-z), change to another directory (cd ~),
and then resume the installation (fg). Alternatively, you can start the installation
from another directory (for example: cd ~; python mount_point/phpc-installer).


v The /home mount point must have writable permission. Ensure that you have the
correct permissions to add new users to the /home mount point.

About this task


The installer completes pre-checking processes and prompts you to answer
questions to complete the management node configuration. The following steps
summarize the installation of PHPC on your management node:
1. License Agreement
2. Management node pre-check
3. Specify installation settings
4. Installation

Complete the following installation steps:

Procedure
1. Choose one of the following installation methods:
v Download the PHPC ISO to the management node.
v Insert the PHPC DVD into the management node.
2. Mount the PHPC installation media:
v If you install PHPC from ISO file, mount the ISO into a directory such as
/mnt. For example:
# mount -o loop phpc-4.2.x64.iso /mnt

v If you install PHPC from DVD media, mount to a directory such as /mnt.
Tip: Normally, the DVD media is automatically mounted to
/media/PHPC-program_number. To start the installer, run
/media/PHPC-program_number/phpc-installer. If the DVD is mounted without execute
permission, you must add python in front of the command (python
/media/PHPC-program_number/phpc-installer).
3. Start the PHPC installer, issue the following command:
# /mnt/phpc-installer

4. Accept the license agreement and continue.


5. Management node pre-checking automatically starts.
6. Choose the Quick Installation option as your installation method.
7. Select a mount point for the depot (/install) directory. The depot (/install)
directory stores installation files for PHPC. The PHPC management node
checks for the required disk space.
8. Select the location that you want to install the operating system from. The
operating system version that you select must be the same as the operating
system version on the management node.
OS Distribution installation from the DVD drive:
Insert the correct OS DVD disk into the DVD drive. The disk is
verified and added to the depot (/install) directory after you confirm
the installation. If the PHPC disk is already inserted, make sure to
insert the OS disk after you copy the PHPC core packages.
OS Distribution installation from an ISO image or mount point:
Enter the path for the OS Distribution or mount point, for
example: /iso/rhel/6.x/x86_64/rhel-server-6.x-x86_64-dvd.iso. The
PHPC management node verifies that the operating system is a
supported distribution, architecture, and version.

Note: If the OS distribution is found on more than one ISO image,
use the first ISO image during the installation. After the PHPC
installation is completed, you can add the next ISO image from the
Web Portal.
If you choose to install from an ISO image or mount point, you must enter the
ISO image or mount point path.
9. Select a network interface for the provisioning network.
10. Select how the management node is connected to the public network. If the
management node is not connected to the public network, select: It is not
connected to the public network.
11. Enable a BMC network that uses the default provisioning template. If you
choose to enable a BMC network, you must specify the following options:
a. Select a BMC network. Options include:
v Public network
v Provision Network
v Create a new network. If you create a new BMC network, specify the
following options:
A subnet for the BMC network.
A subnet mask for the BMC network.
b. Select hardware profile for the BMC network.
12. Enter a domain name for the provisioning network.
13. Set a domain name for the public network.
14. Enter a domain name for the public network.
15. A summary of your selected installation settings is displayed. To change any
of these settings, enter 99 to reselect the settings, or enter 1 to begin the
installation.

Results
You successfully completed the PHPC installation. You can find the installation log
here: /opt/pcm/log/phpc-installer.log.
To configure PHPC environment variables, run the following command: source
/opt/pcm/bin/pcmenv.sh. Configuration is not required for new login sessions.

What to do next
After you complete the installation, verify that your PHPC environment is set up
correctly.
To get started with PHPC, using your web browser, you can access the Web Portal
at http://hostname:8080 or http://IPaddress:8080. Log in with the user account
root and default password Cluster on the management node.
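If you want to confirm from the command line that the Web Portal is responding before you open a browser, a quick check such as the following can be used (assuming curl is available on the management node; 8080 is the port given above). A 200 or a redirect response code indicates that the portal is up:
# curl -s -o /dev/null -w "%{http_code}\n" http://localhost:8080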


Custom installation roadmap


Before you begin your custom installation, use the following roadmap to prepare
your values for each installation option. You can choose to use the default example
values for some or all of the options.
Table 6. Preparing for PHPC custom installation
(Record your own values next to the example values.)

1. Select a mount point for the depot (/install) directory.
2. Select the location that you want to install the operating system from. Example value: CD/DVD drive
3. Specify a provision network interface. Example value: eth0
4. Specify a public network interface. Example value: eth1
5. Do you want to enable the public interface firewall? (Yes/No) Example value: Yes
6. Do you want to enable NAT forwarding on the management node? (Yes/No) Example value: Yes
7. Enable a BMC network that uses the default provisioning template? Example value: Yes
8. Select one of the following options for creating your BMC network:
   a. Create a new network and specify the following options. Example value: Yes
      i. Subnet. Example value: 192.168.1.0
      ii. Subnet mask. Example value: 255.255.255.0
      iii. Gateway IP address. Example value: 192.168.1.1
      iv. IP address range. Example value: 192.168.1.3-192.168.1.254
   b. Use the public network. Example value: N/A
   c. Use the provision network. Example value: N/A
9. Specify the hardware profile used by your BMC network. Options include: IPMI, IBM_Flex_System_x, IBM_System_x_M4, IBM_iDataPlex_M4, IBM_NeXtScale_M4. Example value: IBM_System_x_M4
10. Set the provision network domain name. Example value: private.dns.zone
11. Set a domain name for the public network? (Yes/No) Example value: Yes
12. Set the public domain name. Example value: public.com or FQDN
13. Specify the provisioning compute node IP address range. This is generated based on the management node interface. Example value: 10.10.0.3-10.10.0.200
14. Do you want to provision compute nodes with the node discovery method? (Yes/No) Example value: Yes
15. Specify the node discovery IP address range. This is generated based on the management node interface. Example value: 10.10.0.201-10.10.0.254
16. Set the IP addresses of the name servers. Example value: 192.168.1.40,192.168.1.50
17. Specify the NTP server. Example value: pool.ntp.org
18. Do you want to export the /home directory? (Yes/No) Example value: Yes
19. Set the database administrator password. Example value: pcmdbadm
20. Set the default root password for compute nodes. Example value: Cluster

Custom installation
You can configure the management node by using the custom installation option.

Before you begin


Note: PHPC installation supports the Bash shell only.
v Before you start the PHPC installation, you must boot into the base kernel. The
Xen kernel is not supported.
v User accounts that are created before PHPC is installed are automatically
synchronized across compute nodes during node provisioning. User accounts
that are created after PHPC is installed are automatically synchronized across
compute nodes when the compute nodes are updated.
v You must be a root user to install.
v Installing PHPC requires you to provide the OS media. If you want to use the
DVD drive, ensure that no applications are actively using the drive (including
any command shell). If you started the PHPC installation in the DVD directory,
you can suspend the installation (Ctrl-z), change to another directory (cd ~),
and then resume the installation (fg). Alternatively, you can start the installation
from another directory (for example: cd ~; python mount_point/phpc-installer).
v The /home mount point must have writable permission. Ensure that you have the
correct permissions to add new users to the /home mount point.

About this task


The installer completes pre-checking processes and prompts you to answer
questions to complete the management node configuration. The following steps
summarize the installation of PHPC on your management node:
1. License Agreement
2. Management node pre-check
3. Specify installation settings
4. Installation
Complete the following installation steps:


Procedure
1. Choose one of the following installation methods:
v Download the PHPC ISO to the management node.
v Insert the PHPC DVD into the management node.
2. Mount the PHPC installation media:
v If you install PHPC from ISO file, mount the ISO into a directory such as
/mnt. For example:
# mount -o loop phpc-4.2.x64.iso /mnt

v If you install PHPC from DVD media, mount to a directory such as /mnt.
Tip: Normally, the DVD media is automatically mounted to
/media/PHPC-program_number. To start the installer, run
/media/PHPC-program_number/phpc-installer. If the DVD is mounted without execute
permission, you must add python in front of the command (python
/media/PHPC-program_number/phpc-installer).
3. Start the PHPC installer, issue the following command:
# /mnt/phpc-installer

4. Accept the license agreement and continue.


5. Management node pre-checking automatically starts.
6. Select the Custom Installation option.
7. Select a mount point for the depot (/install) directory. The depot (/install)
directory stores installation files for PHPC. The PHPC management node
checks for the required disk space.
8. Select the location that you want to install the operating system from. The
operating system version that you select must be the same as the operating
system version on the management node.
OS Distribution installation from the DVD drive:
Insert the correct OS DVD disk into the DVD drive. The disk is
verified and added to the depot (/install) directory after you confirm
the installation. If the PHPC disk is already inserted, make sure to
insert the OS disk after you copy the PHPC core packages.
OS Distribution installation from an ISO image or mount point:
Enter the path for the OS Distribution or mount point, for
example: /iso/rhel/6.x/x86_64/rhel-server-6.x-x86_64-dvd.iso. The
PHPC management node verifies that the operating system is a
supported distribution, architecture, and version.
Note: If the OS distribution is found on more than one ISO image,
use the first ISO image during the installation. After the PHPC
installation is completed, you can add the next ISO image from the
Web Portal.
If you choose to install from an ISO image or mount point, you must enter the
ISO image or mount point path.
9. Select a network interface for the provisioning network.
10. Enter the IP address range that is used for provisioning compute nodes.
11. Choose whether to provision compute nodes automatically with the node
discovery method.
12. Enter a node discovery IP address range to be used for provisioning compute
nodes by node discovery. The node discovery IP address range is a temporary
IP address range that is used to automatically provision nodes by using the

auto node discovery method. This range cannot overlap the range that is
specified for the provisioning compute nodes.
13. Select how the management node is connected to the public network. If the
management node is not connected to the public network, select: It is not
connected to the public network. If your management node is connected to
a public network, optionally, you can enable the following settings:
a. Enable PHPC specific rules for the management node firewall that is
connected to the public interface.
b. Enable NAT forwarding on the management node for all compute nodes.
14. Enable a BMC network that uses the default provisioning template. If you
choose to enable a BMC network, you must specify the following options:
a. Select a BMC network.
v Public network
v Provision Network
v Create a new network. If you create a new BMC network, specify the
following options:
A subnet for the BMC network.
A subnet mask for the BMC network.
A gateway IP address for the BMC network.
An IP address range for the BMC network.
b. Specify a hardware profile for the BMC network.
Table 7. Available hardware profiles based on hardware type

Hardware                                  | Hardware profile
Any IPMI-based hardware                   | IPMI
IBM Flex System x220, x240, and x440      | IBM_Flex_System_x
IBM System x3550 M4, x3650 M4, x3750 M4   | IBM_System_x_M4
IBM System dx360 M4                       | IBM_iDataPlex_M4
IBM NeXtScale nx360 M4                    | IBM_NeXtScale_M4

15. Enter a domain name for the provisioning network.


16. Set a domain name for the public network.
17. Enter a domain name for the public network.
18. Enter the IP addresses of your name servers, separated by commas.
19. Set the NTP server.
20. Export the home directory on the management node and use it for all
compute nodes.
21. Enter the PHPC database administrator password.
22. Enter the root account password for all compute nodes.
23. A summary of your selected installation settings is displayed. To change any
of these settings, enter 99 to reselect the settings, or enter 1 to begin the
installation.

What to do next
After you complete the installation, verify that your PHPC environment is set up
correctly.


To get started with PHPC, using your web browser, you can access the Web Portal
at http://hostname:8080 or http://IPaddress:8080. Log in with the root user
account and password on the management node.


Chapter 5. Performing a silent installation


Silent installation installs IBM Platform HPC software using a silent response file.
You can specify all of your installation options in the silent installation file before
installation.
Before you complete the installation using silent mode, complete the following
actions:
v Install the operating system on the management node.
v Ensure that you have the correct permissions to add new users to the /home mount
point.
To complete the silent installation, complete the following steps:
1. Mount the PHPC installation media:
v If you install PHPC from ISO file, mount the ISO into a directory such as
/mnt. For example:
# mount -o loop phpc-4.2.x64.iso /mnt

v If you install PHPC from DVD media, mount to a directory such as /mnt.
Tip: Normally, the DVD media is automatically mounted to
/media/PHPC-program_number. To start the installer, run
/media/PHPC-program_number/phpc-installer. If the DVD is mounted without execute
permission, you must add python in front of the command (python
/media/PHPC-program_number/phpc-installer).
2. Prepare the response file with installation options. The silent response file
phpc-autoinstall.conf.example is located in the /docs directory in the
Platform HPC ISO.
Note: If the OS distribution is found on more than one ISO image, use the first
ISO image during the installation. After the PHPC installation is completed,
you can add the next ISO image from the Web Portal.
3. Run the silent installation:
mount_point/phpc-installer -f path_to_phpc-autoinstall.conf

where mount_point is your mount point and path_to_phpc-autoinstall.conf is the location
of your silent installation file.
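For example, assuming that the installation media is mounted at /mnt as in step 1, you might copy the sample response file, edit it, and then run the silent installation. The copy location /root/phpc-autoinstall.conf is only an example:
# cp /mnt/docs/phpc-autoinstall.conf.example /root/phpc-autoinstall.conf
# vi /root/phpc-autoinstall.conf       # set depot_path, os_path, and the network options
# /mnt/phpc-installer -f /root/phpc-autoinstall.conf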

Usage notes
v A valid installation path must be used. The installation path cannot include
special characters such as a colon (:), exclamation point (!) or space, and the
installation cannot begin until a valid path is used.

Response file for silent installation


Response file for IBM Platform HPC silent installation.
# IBM Platform HPC 4.2 Silent Installation Response File
# The silent installation response file includes all of the options that can
# be set during a Platform HPC silent installation.
# ******************************************************************** #
# NOTE: For any duplicated options, only the last value is used
#
# by the silent installation.
#
Copyright IBM Corp. 1994, 2014

29

# NOTE: Configuration options cannot start with a space or tab.


#
# ******************************************************************** #
[General]
#
# depot_path
#
#
# The depot_path option sets the path of the Platform HPC depot (/install) directory.
#
#Usage notes:
#
# 1. The Platform HPC installation requires a minimum available disk space of 40 GB.
#
# 2. If you specify depot_path = /usr/local/pcm/, the installer places all Platform
# HPC installation contents in the /usr/local/pcm/install directory and creates
# a symbol link named /install that points to the /usr/local/pcm/install directory.
#
# 3. If you specify depot_path = /install or depot_path = /, the installer places
# all Platform HPC installation content into the /install directory.
#
# 4. If you have an existing /install mount point, by default, the installation
# program places all installation contents into the /install directory regardless
# of the depot_path value.
depot_path = /
#
#
#
#
#
#

private_cluster_domain
The private_cluster_domain option sets the provisioning networks domain
name for the cluster. The domain must be a fully qualified domain name.
This is a mandatory option.

private_cluster_domain = private.dns.zone
#
#
#
#
#
#
#
#

provisioning_network_interface
The provisioning_network_interface option sets one network device on the
Platform HPC management node to be used for provisioning compute
nodes. An accepted value for this option is a valid NIC name that exists on
the management node. Values must use alphanumeric characters and cannot use
quotations ("). The value lo is not supported. This is a mandatory option.

provisioning_network_interface = eth0
#
# public_network_interface
#
# The public_network_interface option sets a network device on the Platform HPC
# management node that is used for accessing networks outside of the cluster.
The value
# must be a valid NIC name that exists on the management node. The value cannot be
# the same as the value specified for the provisioning_network_interface option.
# The value cannot be lo and cannot include quotations (").
# If this option is not defined, no public network interface is defined.
#public_network_interface = eth1
[Media]
#
# os_path
#
# The os_path option specifies the disc, ISO, or path of the first OS distribution

30

Installing IBM Platform HPC Version 4.2

used to
# install the
#
# The os_path
# - full path
# - full path
# - full path
#

Platform HPC node. The os_path is a mandatory option.


option must use one of the following options:
to CD-ROM device, for example: /dev/cdrom
to an ISO file, for example: /root/rhel-server-6.4-x86_64-dvd.iso
to a directory where an ISO is mounted, for example: /mnt/basekit

os_path = /root/rhel-server-<version>-x86_64-dvd.iso
[Advanced]
# NOTE: By default, advanced options use a default value if no value is specified.
#
# excluded_kits
#
# The excluded_kits option lists specific kits that do not get installed.
# This is a comma-separated list. The kit name must be the same as the name
# defined in the kit configuration file. If this option is not defined,
# by default, all kits are installed.
#
#excluded_kits = kit1,kit2
#
# static_ip_range
#
# The static_ip_range option sets the IP address range used for provisioning
# compute nodes. If this option is not defined, by default, the value is
# automatically determined based on the provision network.
#
#static_ip_range = 10.10.0.3-10.10.0.200
#
# discovery_ip_range
#
# The discovery_ip_range option sets the IP address range that is used for provisioning
# compute nodes by node discovery. This IP address range cannot overlap with the IP range
# used for provisioning compute nodes as specified by the static_ip_range option. You
# can set the discovery_ip_range value to none if you do not want to use node discovery.
# If this option is not defined, the default value is set to none.
#
#discovery_ip_range = 10.10.0.201-10.10.0.254
#
# enable_firewall
#
# The enable_firewall option enables Platform HPC-specific rules for the management
# node firewall on the public interface. This option is only available if the
# public_network_interface option is defined. If this option is not defined,
# by default, the value is set to yes.
#
#enable_firewall = yes
#
# enable_nat_forward
#
# The enable_nat_forward option enables NAT forwarding on the management node
# for all compute nodes. This option is only available if the enable_firewall
# option is set to yes. If this option is not defined, by default,
# the value is set to yes.
#
#enable_nat_forward = yes
#
# enable_bmcfsp
#
# The enable_bmcfsp option enables a BMC or FSP network with the default provisioning
# template. This option indicates which network is associated with the BMC or FSP
# network. If this option is not defined, by default, a BMC or FSP network is
# not enabled.
# Options include: new_network, public, provision
#   new_network option: Creates a new BMC or FSP network. The bmcfsp_subnet,
#                       bmcfsp_subnet_mask, bmcfsp_gateway, and bmcfsp_iprange
#                       options are applied to create the new network.
#   public option: Creates a BMC or FSP network that uses the public network.
#   provision option: Creates a BMC or FSP network that uses the provision network.
#enable_bmcfsp = new_network
#
# bmcfsp_subnet
#
# Specify the subnet for the BMC or FSP network. This value must be different from the
# value used by the public and provision networks. Otherwise, the BMC or FSP network
# setup fails. This option is required if enable_bmcfsp = new_network.
#bmcfsp_subnet = 192.168.1.0
#
# bmcfsp_subnet_mask
#
# Specify the subnet mask for the BMC or FSP network. This option is required if
# enable_bmcfsp = new_network.
#bmcfsp_subnet_mask = 255.255.255.0
#
# bmcfsp_gateway
#
# Specify the gateway IP address for the BMC or FSP network. This option is available if
# enable_bmcfsp = new_network.
#bmcfsp_gateway = 192.168.1.1
#
# bmcfsp_iprange
#
# Specify the IP address range for the BMC or FSP network. This option is required if
# enable_bmcfsp = new_network.
#bmcfsp_iprange = 192.168.1.3-192.168.1.254
#
# bmcfsp_hwprofile
#
# Specify a hardware profile to associate with the BMC or FSP network. This option is
# required if enable_bmcfsp = new_network.
#
# bmcfsp_hwprofile options:
#   For x86-based systems, the following hardware profile options are supported:
#     IBM_System_x_M4:   IBM System x3550 M4, x3650 M4, x3750 M4
#     IBM_Flex_System_x: IBM System x220, x240, x440
#     IBM_iDataPlex_M4:  IBM System dx360 M4
#     IPMI:              Any IPMI-based hardware
#   For POWER systems, the following hardware profile options are supported:
#     IBM_Flex_System_p: IBM System p260, p460
#bmcfsp_hwprofile = IBM_System_x_M4
#
# nameservers
#
# The nameservers option lists the IP addresses of your external name servers
# as a comma-separated list. If this option is not defined, by default,
# the value is set to none.
#
#nameservers = 192.168.1.40,192.168.1.50
#
# ntp_server
#
# The ntp_server option sets the NTP server. If this option is not defined,
# by default, this value is set to pool.ntp.org.
#ntp_server = pool.ntp.org
#
# enable_export_home
#
# The enable_export_home option specifies whether the /home mount point is exported
# from the management node. The exported home directory is used on all compute nodes.
# If this option is not defined, by default, this value is set to yes.
#
#enable_export_home = yes
#
# db_admin_password
#
# The db_admin_password option sets the Platform HPC database administrator password.
# If this option is not defined, by default, this value is set to pcmdbadm.
#db_admin_password = pcmdbadm
#
# compute_root_password
#
# The compute_root_password option sets the root account password for all compute nodes.
# If this option is not defined, by default, this value is set to Cluster.
#compute_root_password = Cluster
#
# cluster_name
#
# The cluster_name option sets the cluster name for the Platform HPC workload manager.
# The cluster name must be a string containing any of the following characters: a-z, A-Z,
# 0-9, or underscore (_). The string length cannot exceed 39 characters.
# If this option is not defined, by default, this value is set to phpc_cluster.
#
#cluster_name = phpc_cluster
#
# cluster_admin
#
# The cluster_admin option specifies the Platform HPC workload manager administrator.
# This can be a single user account name, or a comma-separated list of user account
# names. The first user account name in the list is the primary LSF administrator and
# it cannot be the root user account. For example: cluster_admin=user_name1,user_name2...
# If this option is not defined, by default, this value is set to phpcadmin.
#
#cluster_admin = phpcadmin
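
For reference, the following minimal response file pulls together the options that are
described above. It is an illustrative sketch only: the interface names, domain, IP
ranges, and ISO path are placeholder values that must be replaced with values that
match your cluster before you pass the file to the installer with phpc-installer -f.
[General]
depot_path = /install
private_cluster_domain = private.dns.zone
provisioning_network_interface = eth0
public_network_interface = eth1
[Media]
os_path = /root/rhel-server-6.4-x86_64-dvd.iso
[Advanced]
static_ip_range = 10.10.0.3-10.10.0.200
discovery_ip_range = 10.10.0.201-10.10.0.254
nameservers = 192.168.1.40
ntp_server = pool.ntp.org
cluster_name = phpc_cluster
cluster_admin = phpcadmin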

Chapter 6. Verifying the installation


Ensure that you have successfully installed PHPC.
Note: You can find the installation log file phpc-installer.log in the /opt/pcm/log
directory. This log file includes details and results about your PHPC installation.
To verify that your installation is working correctly, log in to the management node
as a root user and complete the following tasks:
1. Source PHPC environment variables.
# . /opt/pcm/bin/pcmenv.sh

2. Check that the PostgreSQL database server is running.
# service postgresql status
(pid 13269) is running...
3. Check that the Platform HPC services are running.
# service phpc status
Show status of the LSF subsystem
lim (pid 31774) is running...
res (pid 27663) is running...
sbatchd (pid 27667) is running...
SERVICE     STATUS    WSM_PID   PORT   HOST_NAME
WEBGUI      STARTED   16550     8080   hjc-ip200
SERVICE     STATUS    WSM_PID   HOST_NAME
jobdt       STARTED   5836      hjc-ip200
plc         STARTED   5877      hjc-ip200
plc_group2  STARTED   5917      hjc-ip200
purger      STARTED   5962      hjc-ip200
vdatam      STARTED   6018      hjc-ip200

4. Log in to the Web Portal.


a. Open a supported web browser. Refer to the Release Notes for a list of
supported web browsers.
b. Go to http://mgtnode-IP:8080, where mgtnode-IP is the real management
node IP address. If you are connected to a public network, you can also
navigate to http://mgtnode-hostname:8080, where mgtnode-hostname is the
real management node hostname.
c. Log in as an administrator or a user. An administrator has administrative
privileges that include managing cluster resources. A user account is not
able to manage cluster resources but can manage jobs.
By default, PHPC creates a default administrative account where both the
username and the password are phpcadmin. This default phpcadmin
administrator account has all administrative privileges.
d. After you log in, the Resource Dashboard is displayed in the Web Portal.


Chapter 7. Taking the first steps after installation


After your installation is complete, as an administrator you can get started with
managing your clusters.
The following tasks can be completed to get started with Platform HPC:
v Enabling LDAP support for user authentication
v Provision your nodes by adding the nodes to your cluster
v Modify your provisioning template settings:
  – Manage image profiles
  – Manage network profiles
v Set up the HTTPS connection
v Submit jobs
v Create resource reports
v Create application templates
For more information about IBM Platform HPC, see the Administering IBM Platform
HPC guide.
For the latest release information about Platform HPC 4.2, see Platform HPC on
IBM Knowledge Center at http://www.ibm.com/support/knowledgecenter/
SSDV85_4.2.0.


Chapter 8. Troubleshooting installation problems


Troubleshooting problems that occurred during the IBM Platform HPC installation.
To help troubleshoot your installation, you can view the phpc-installer.log file
that is found in the /opt/pcm/log directory. This file logs the installation steps, and
any warnings and errors that occurred during the installation.
Note: During the installation, the installation progress is logged in a temporary
directory that is found here: /tmp/phpc-installer.
To view detailed error messages, run the installer in DEBUG mode when
troubleshooting the installation. To run the installer in debug mode, set the
PCM_INSTALLER_DEBUG environment variable. When running in DEBUG mode, the
installer does not clean up all the files when an error occurs. The DEBUG mode
also generates extra log messages that can be used to trace the installer's execution.
Set the PCM_INSTALLER_DEBUG environment variable to run the installer in DEBUG
mode:
# PCM_INSTALLER_DEBUG=1 hpc-ISO-mount/phpc-installer

where hpc-ISO-mount is the mount point.


Note: Use the PCM_INSTALLER_DEBUG environment variable only to troubleshoot a
PHPC installation that uses the interactive installer. Do not use it when installing
PHPC with the silent installation.
Common installation issues include the following issues:
v The Platform HPC installer fails with the error message Cannot reinstall
Platform HPC. Platform HPC is already installed. To install a new Platform
HPC product, you must first uninstall the installed product.
v During management node pre-checking, one of the checks fails. Ensure that all
Platform HPC requirements are met and rerun the installer. For more
information about Platform HPC see the Release Notes.
v Setting up shared NFS export fails during installation. To resolve this issue,
complete the following steps:
1. Check the rpcbind status.
# service rpcbind status
2. If rpcbind is stopped, you must restart it and run the S03_base_nfs.rc.py
script.
# service rpcbind start
# cd /opt/pcm/rc.pcm.d/
# pcmconfig -i ./S03_base_nfs.rc.py

v Cannot log in to the Web Portal, or view the Resource Dashboard in the Web
Portal.
– Configure your web browser. Your web browser must be configured to accept
first-party and third-party cookies. In some cases, your browser default
settings can block these cookies. In this case, you need to manually change
this setting.
– Restart the Web Portal. In most cases, the services that are required to run the
Web Portal start automatically. However, if the Web Portal goes down, you
can restart services and daemons manually. From the command line, issue the
following command:
# pmcadmin stop ; pmcadmin start

Configuring your browser


To properly configure your browser, you must have the necessary plug-ins
installed.

About this task


If you are using Firefox as your browser, you are required to have the Flash and
JRE plug-ins installed. To install the Flash and JRE plug-ins, complete the following
steps:

Procedure
1. Install the appropriate Adobe Flash Player plug-in from the Adobe website
(http://get.adobe.com/flashplayer).
2. Check that the Flash plug-in is installed. Enter about:plugins into the Firefox
address field.
Shockwave Flash appears in the list.
3. Check that the Flash plug-in is enabled. Enter about:config into the Firefox
address field. Find dom.ipc.plugins.enabled in the list and ensure that it has a
value of true. If it is set to false, double-click it to enable.
4. Restart Firefox.
5. Download the appropriate JRE plug-in installer from the Oracle website
(http://www.oracle.com/technetwork/java/javase/downloads/index.html).
The 64-bit rpm installer (jre-7u2-linux-x64.rpm) is recommended.
6. Exit Firefox.
To run Java applets within the browser, you must install the JRE plug-in
manually. For more information about installing the JRE plug-in manually, go
to http://docs.oracle.com/javase/7/docs/webnotes/install/linux/linux-plugininstall.html.
7. In the package folder, run the command:
rpm -ivh jre-7u2-linux-x64.rpm
8. When the installation is finished, enter the following commands:
cd /usr/lib64/mozilla/plugins
ln -s /usr/java/jre1.7.0_02/lib/amd64/libnpjp2.so

9. Check that the JRE plug-in was installed correctly. Start Firefox and enter
about:plugins into the Firefox address field.
Java(TM) Plug-in 1.7.0_02 is displayed in the list.


Chapter 9. Setting up a high availability environment


Set up an IBM Platform HPC high availability environment.
To set up a high availability (HA) environment in Platform HPC, complete the
following steps.
Table 8. High availability environment roadmap
Actions                                       Description

Ensure that the high availability             Requirements for setting up a shared storage
requirements are met                          device and a secondary management node
                                              must be met.

Preparing high availability                   Set up the secondary management node
                                              with an operating system and Platform HPC
                                              installation.

Enable a Platform HPC high availability       Set up Platform HPC high availability on the
environment                                   primary and secondary management nodes.

Complete the high availability enablement     After high availability is enabled, set up
                                              the compute nodes.

Verify Platform HPC high availability         Ensure that Platform HPC high availability
                                              is running correctly on the primary and
                                              secondary management nodes.

Troubleshooting enablement problems           Troubleshoot problems that occurred
                                              during a Platform HPC high availability
                                              environment setup.

Preparing high availability


Preparing an IBM Platform HPC high availability environment.

Before you begin


Ensure that all high availability requirements are met and a shared file system is
created on a shared storage server.

About this task


To prepare a high availability environment, set up the secondary management node
with the same operating system and PHPC version as the primary management
node. After the secondary management node is set up, the necessary SSH
connections and configuration must be made between the primary management
node and the secondary management node.

Procedure
1. Install the operating system on the secondary node. The secondary
management node must use the same operating system and version as used on
the primary management node. Both management nodes must use the same
network and must be connected to the same network interface.
Refer to Installing and configuring the operating system on the management
node on page 13.

2. Ensure that the time and time zone is the same on the primary and secondary
management nodes.
a. To verify the current time zone, run the cat /etc/sysconfig/clock
command. To determine the correct time zone, refer to the information
found in the /usr/share/zoneinfo directory.
b. If the time zone is incorrect, update the time zone. To update the time
zone, set the correct time zone in the /etc/sysconfig/clock file.
For example:
For RHEL:
ZONE=US/Eastern

For SLES:
TIMEZONE=America/New_York

c. Set the local time in the /etc/localtime file, for example:
ln -s /usr/share/zoneinfo/US/Eastern /etc/localtime

d. Set the date on both management nodes. Issue the following command on
both management nodes.
date -s current_time

e. If the management nodes already have PHPC installed, run the following
command on both management nodes to get the system time zone.
lsdef -t site -o clustersite -i timezone

If the system time zones are different, update the system time zone on the
secondary node, run the following command:
chdef -t site -o clustersite timezone=US/Eastern

3. Install PHPC on the secondary node. You must use the same PHPC ISO file as
you used for the management node. You can complete the installation using the
installer or the silent installation.
The installer includes an interactive display where you can specify your
installation options; make sure to use the same installation options as the
primary management node. Installation options for the primary management
node are found in the installation log file (/opt/pcm/log/phpc-installer.log)
on the primary management node. Refer to Chapter 4, Performing an
installation, on page 17.
If you use the silent installation to install PHPC, you can use the same response
file for both management nodes. Refer to Chapter 5, Performing a silent
installation, on page 29
4. Verify that the management nodes can access the shared file systems. Issue the
showmount -e nfs-server-ip command, where nfs-server-ip is the IP address of
the NFS server that connects to the provision network.
5. Add the secondary management node entry to the /etc/hosts file on the
primary management node. Ensure that the failover node name can be resolved
to the secondary management node provision IP address. Run the command
below on the primary management node.
echo "secondary-node-provision-ip secondary-node-name" >> /etc/hosts
#ping secondary-node-name

where secondary-node-provision-ip is the provision IP address of the secondary


node and secondary-node-name is the name of the secondary node.
For example: #echo "192.168.1.4 backupmn" >> /etc/hosts
6. Backup and configure a passwordless SSH connection between the primary
management node and the secondary node.


# Back up the SSH key on the secondary node.
ssh secondary-node-name cp -rf /root/.ssh /root/.ssh.PCMHA
# Configure passwordless SSH between the management node and the secondary node.
cat /root/.ssh/id_rsa.pub > /root/.ssh/authorized_keys
scp -r /root/.ssh/* secondary-node-name:/root/.ssh

where secondary-node-name is the name of the secondary node.
7. Prepare the compute nodes. These steps are used for provisioned compute
nodes that you do not want to reprovision.
a. Shut down the LSF services on the compute nodes.
# xdsh __Managed service lsf stop
b. Unmount and remove the /home and /shared mount points on the compute
nodes.
# updatenode __Managed mountnfs del
# xdsh __Managed 'umount /home'
# xdsh __Managed 'umount /shared'

Enable a high availability environment


Enable an IBM Platform HPC high availability environment.

Before you begin


Ensure that the secondary management node is installed and set up correctly.
Ensure that SSH connections are configured and network settings are correct
between the primary management node and the secondary management node.

About this task


You can set up the high availability environment using the high availability
management tool (pcmhatool). The tool defines and sets up a high availability
environment between the management nodes using a predefined high availability
definition file.
Note: The high availability management tool (pcmhatool) supports Bash shell only.

Procedure
1. Define a high availability definition file according to your high availability
settings, including: virtual name, virtual IP address, and shared storage. The
high availability definition file example ha.info.example is in the
/opt/pcm/share/examples/HA directory. Refer to High availability definition
file on page 67.
2. Set up a high availability environment.
Setup can take several minutes to synchronize data to shared storage. Ensure
that the shared storage server is always available. Issue the following command
on the primary management node.
pcmhatool config -i ha-definition-file -s secondary-management-node

where ha-definition-file is the high availability definition file that you created in
step 1, and secondary-management-node is the name of the secondary
management node.


Usage notes
1. During a high availability enablement, some of the services start on the standby
management node instead of the active management node. After a few
minutes, they switch to the active management node.
2. If the management node crashes during the high availability environment
setup, rerun the pcmhatool command and specify the same options. Running
this command again cleans up the incomplete environment and starts the high
availability enablement again.
3. You can find the enablement log file (pcmhatool.log) in the /opt/pcm/log
directory. This log file includes details and results about the high availability
environment setup.
4. If you enable high availability, the pcmadmin command cannot be used to restart
the PERF loader.
In a high availability environment, use the following commands to restart the
PERF loader:
pcm-ha-support start --service PLC
pcm-ha-support start --service PLC2
pcm-ha-support start --service JOBDT
pcm-ha-support start --service PTC
pcm-ha-support start --service PURGER

What to do next
After the high availability enablement is complete, verify that the Platform HPC
high availability environment is set up correctly.

Completing the high availability enablement


After high availability is enabled, you can set up and configure additional options,
such as configuring an IPMI device as a fencing device to protect your high
availability cluster from malfunctioning nodes and services. You can also set up
email notification when a failover is triggered.

Configure IPMI as a fencing device


In a high availability cluster that has only two management nodes, it is important
to configure fencing on an IPMI device. Fencing is the process of isolating a node
or protecting shared resources from a malfunctioning node within a high
availability environment. The fencing process locates the malfunctioning node and
disables it.
Use remote hardware control to configure fencing on an IPMI device.

Before you begin


This fencing method requires both management nodes to be controlled remotely
using IPMI. If your management nodes are on a power system or using a different
remote power control method, you must create the corresponding fencing script
accordingly.

Procedure
1. Create an executable fencing script on the shared file system. For example, you
can use the example fencing script (fencing_ipmi.sh) that is found in the
/opt/pcm/share/examples/HA directory. Run the following commands to create
the script on a shared file system. Ensure that you modify fencing_ipmi.sh to match
your real environment settings (a sketch of such a script is shown after this procedure).


mkdir -p /install/failover
cp /opt/pcm/share/examples/HA/fencing_ipmi.sh /install/failover

2. Edit the HA controller service agent configuration file (ha_wsm.cfg) in the


/opt/pcm/etc/failover directory on the active management node. In the
[__Failover__] section, set the value for fencing_action parameter to the
absolute path of your custom script. For example:
fencing_action =/install/failover/fencing_ipmi.sh

3. Restart the PCMHA service agent.


pcm-ha-support start --service PCMHA
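
The following is a minimal sketch of what a custom IPMI fencing script, such as the
one referenced in step 1, might contain. It is not the shipped fencing_ipmi.sh. The
sketch assumes that the ipmitool command is installed, and the BMC address and
credentials shown are placeholders for the peer management node.
#!/bin/sh
# Hypothetical fencing sketch; the shipped fencing_ipmi.sh may differ.
# Power off the other management node through its BMC so that it can no
# longer access shared resources.
PEER_BMC_IP=192.168.1.10    # placeholder: BMC address of the peer management node
BMC_USER=USERID             # placeholder: BMC user name
BMC_PASSWORD=PASSW0RD       # placeholder: BMC password
ipmitool -I lanplus -H "$PEER_BMC_IP" -U "$BMC_USER" -P "$BMC_PASSWORD" chassis power off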

Create a failover notification


Create a notification, such as an email notification, for a triggered failover.

Before you begin


Note: Before you can send email for a triggered failover, you must configure your
mail parameters. Refer to Setting up SMTP mail settings.

Procedure
1. Create an executable script on the shared file system. For example, you can use
an executable script that sends an email when a failover is triggered. An
example send email script (send_mail.sh) is in the /opt/pcm/share/examples/HA
directory. Run the following commands to create the script on a shared file
system. Ensure that you modify send_mail.sh to match your real environment
settings (a sketch of such a script is shown after this procedure).
mkdir -p /install/failover
cp /opt/pcm/share/examples/HA/send_mail.sh /install/failover

2. Edit the high availability controller configuration file (ha_wsm.cfg) on the


management node in the /opt/pcm/etc/failover directory. In the[__Failover__]
section, set the failover_action parameter to the absolute path of your custom
script. For example:
failover_action=/install/failover/send_mail.sh

3. Restart the high availability environment.


pcm-ha-support start --service PCMHA
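
The following is a minimal sketch of a failover notification script like the one
referenced in step 1. It is not the shipped send_mail.sh. The sketch assumes that
the management node has a working mail command (for example, from mailx), and
the recipient address is a placeholder.
#!/bin/sh
# Hypothetical notification sketch; the shipped send_mail.sh may differ.
# Send a short email when a failover is triggered.
ADMIN_EMAIL=admin@example.com    # placeholder: administrator email address
echo "Platform HPC failover was triggered on $(hostname) at $(date)" | \
    mail -s "Platform HPC failover notification" "$ADMIN_EMAIL"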

Setting up SMTP mail settings


Specify SMTP mail settings in IBM Platform HPC.

Before you begin


To send email from Platform HPC, an SMTP server must already be installed and
configured.

Procedure
1. Log in to the Web Portal as the system administrator.
2. In the System & Settings tab, click General Settings.
3. Expand the Mail Settings heading.
a. Enter the mail server (SMTP) host.
b. Enter the mail server (SMTP) port.
c. Enter the user account. This field is only required by some servers.
d. Enter the user account password. This field is only required by some
servers.

4. Click Apply.

Results
SMTP server settings are configured. Platform HPC uses the configured SMTP
server to send email. Mail is sent from the user email account. If no user email
account is specified, the management node name is used as the sending email
address.

Verifying a high availability environment


Verify an IBM Platform HPC high availability environment.

Before you begin


You can find the enablement log file (pcmhatool.log) in the /opt/pcm/log directory.
This log file includes details and results about your PHPC enablement.

Procedure
1. Log on to the management node as a root user.
2. Source Platform HPC environment variables.
# . /opt/pcm/bin/pcmenv.sh

3. Check that Platform HPC high availability is configured.


# pcmhatool info
Configuring status: OK
================================================================
HA group members: master, failover
Virtual node name: virtualmn
Virtual IP for <eth0:0>: 192.168.0.100
Virtual IP for <eth1:0>: 172.20.7.100
Shared work directory on: 172.20.7.200:/export/data
Shared home directory on: 172.20.7.200:/export/home

4. Check that Platform HPC services are running. All services must be in state
STARTED, for example:
# service phpc status
Show status of the LSF subsystem
lim (pid 29003) is running...
res (pid 29006) is running...
sbatchd (pid 29008) is running...
SERVICE  STATE    ALLOC  CONSUMER  RGROUP  RESOURCE  SLOTS  SEQ_NO  INST_STATE  ACTI
PLC      STARTED  32     /Manage*  Manag*  master    1      1       RUN         9
PTC      STARTED  34     /Manage*  Manag*  master    1      1       RUN         8
PURGER   STARTED  35     /Manage*  Manag*  master    1      1       RUN         7
WEBGUI   STARTED  31     /Manage*  Manag*  master    1      1       RUN         4
JOBDT    STARTED  36     /Manage*  Manag*  master    1      1       RUN         6
PLC2     STARTED  33     /Manage*  Manag*  master    1      1       RUN         5
PCMHA    STARTED  28     /Manage*  Manag*  master    1      1       RUN         1
PCMDB    STARTED  29     /Manage*  Manag*  master    1      1       RUN         2
XCAT     STARTED  30     /Manage*  Manag*  master    1      1       RUN         3

5. Log in to the Web Portal.


a. Open a supported web browser. Refer to the Release Notes for a list of
supported web browsers.
b. Go to http://mgtnode-virtual-IP:8080, where mgtnode-virtual-IP is the
management node virtual IP address. If you are connected to a public
network, you can also navigate to http://mgtnode-virtual-hostname:8080,
where mgtnode-virtual-hostname is the virtual management node hostname.

If HTTPS is enabled, go to https://mgtnode-virtual-IP:8443 or
https://mgtnode-virtual-hostname:8443 to log in to the web portal.
c. Log in as an administrator or user. An administrator has administrative
privileges that include managing cluster resources. A user account is not
able to manage cluster resources but can manage jobs.
d. After you log in, the Resource Dashboard is displayed in the Web Portal.
Under the Cluster Health option, both management nodes are listed.

Troubleshooting a high availability environment enablement


Troubleshooting an IBM Platform HPC high availability environment.
To help troubleshoot your high availability enablement, you can view the log file that
is found at /opt/pcm/log/pcmhatool.log. This file logs the high availability
enablement steps, and any warnings and errors that occurred during the high
availability enablement.
Common high availability enablement issues include the following issues:
v When you run a command on the management node, the command stops
responding.
To resolve this issue, log in to the management node with a new session. Ensure
that the external NFS server is available and check that the network connection to
the NFS server is available. If you cannot log in to the management node, try to
reboot it.
v When you check the Platform HPC service status, one of the service agent
statuses is set to ERROR.
When the monitored service daemon is down, the service agent attempts to
restart it several times. If it continually fails, the service agent is set to ERROR.
To resolve this issue, check the service daemon log for more detail on how to
resolve this problem. If the service daemon can be started manually, restart the
service agent again, issue the following command:
pcm-ha-support start --service service_name

where service_name is the name of the service that is experiencing the problem.
v Services are running on the standby management node after an automatic
failover occurs due to a provision network failure.
Platform HPC high availability environment uses the provision network for
heartbeat communication. The provision network failure causes the management
nodes to lose the communication, and fencing to stop working. To resolve this
issue, stop the service agents manually, issue the following command:
pcm-ha-support stop --service all

v Parsing high availability settings fails.
To resolve this issue, ensure that the high availability definition file does not
have any formatting errors, uses the correct virtual name, and specifies an IP
address that does not conflict with an existing managed node.
Also, ensure that the xCAT daemon is running by issuing the command tabdump
site.
v During the pre-checking, one of the checks fails.
To resolve this issue, ensure that all Platform HPC high availability requirements
are met and rerun the high availability enablement tool.
v Syncing data to shared directory fails.


To resolve this issue, ensure that the network connection to the external shared
storage is stable during the high availability enablement. If a timeout occurs
during data synchronization, rerun the tool with the PCMHA_NO_CLEAN
environment variable set. This environment variable ensures that existing data on
the NFS server is unchanged.
# PCMHA_NO_CLEAN=1 pcmhatool config -i ha-definition-file -s secondary-management-node

where ha-definition-file is the high availability definition file and
secondary-management-node is the name of the secondary management node.
v Cannot log in to the Web Portal, or view the Resource Dashboard in the Web
Portal.
All Platform HPC services are started a few minutes after the high availability
enablement. Wait a few minutes and try again. If the issue persists, run the high
availability diagnostic tool to check the running status.
#pcmhatool check


Chapter 10. Upgrading IBM Platform HPC


Upgrade IBM Platform HPC from Version 4.1.1.1 to Version 4.2. Additionally, you
can upgrade the product entitlement files for Platform Application Center or LSF.

Upgrading to Platform HPC Version 4.2


Upgrade from Platform HPC Version 4.1.1.1 to Version 4.2. The upgrade procedure
ensures that the necessary files are backed up and necessary files are restored.
The following upgrade paths are available:
v Upgrading from Platform HPC 4.1.1.1 to 4.2 without OS reinstall
v Upgrading from Platform HPC 4.1.1.1 to 4.2 with OS reinstall
If any errors occur during the upgrade process, you can roll back to an earlier
version of Platform HPC.
For a list of all supported upgrade procedures, refer to the Release notes for Platform
HPC 4.2 guide.

Upgrade planning
Upgrading IBM Platform HPC involves several steps that you must complete in
the appropriate sequence. Review the upgrade checklist and upgrade roadmap
before you begin the upgrade process.

Upgrading checklist
Use the following checklist to review the necessary requirements before upgrading.
In order to upgrade to the newest release of IBM Platform HPC, ensure you meet
the following criteria before proceeding with the upgrade.
Table 9.
Requirements                                  Description

Hardware requirements                         Ensure that you meet the hardware
                                              requirements for Platform HPC.
                                              Refer to PHPC requirements on page 9.

Software requirements                         Ensure that you meet the software
                                              requirements for Platform HPC.
                                              Refer to PHPC requirements on page 9.

External storage device                       Obtain an external storage device to store the
                                              necessary backup files. Make sure that the
                                              external storage is larger than the size of
                                              your backup files.

Obtain a copy of the Platform HPC 4.2 ISO     Get a copy of Platform HPC 4.2.

(Optional) Obtain a copy of the latest        Optionally, you can upgrade your operating
supported version of the operating system    system to the latest supported version.


Upgrading roadmap
Overview of the upgrade procedure.
Table 10. Upgrading Platform HPC
Actions                                   Description

1. Upgrading checklist                    Ensure that you meet all of the requirements
                                          before upgrading Platform HPC.

2. Preparing to upgrade                   Before you can upgrade to the newest release
                                          of Platform HPC you must complete specific
                                          tasks.

3. Creating a Platform HPC 4.1.1.1        Create a backup of your current Platform HPC
   backup                                 4.1.1.1 settings and database. This backup is
                                          used to restore your existing settings to the
                                          newer version of Platform HPC.

4. Perform the Platform HPC upgrade       Perform the upgrade using your chosen path:
                                          v Upgrading to Platform HPC 4.2 without OS
                                            reinstall
                                          v Upgrading to Platform HPC 4.2 with OS
                                            reinstall

5. Completing the upgrade                 Ensure that data is restored and services are
                                          restarted.

6. Verifying the upgrade                  Ensure that PHPC is successfully upgraded.

7. (Optional) Applying fixes              After you upgrade PHPC, you can check if
                                          there are any fixes available through IBM
                                          Fix Central.

Upgrading to Platform HPC 4.2 without OS reinstall


Upgrade your existing installation of IBM Platform HPC to the most recent version
without reinstalling the operating system on the management node.
Note that if you are upgrading Platform HPC to Version 4.2 without reinstalling
the operating system, the PMPI kit version is not upgraded.

Preparing to upgrade
Before upgrading your IBM Platform HPC installation, there are some steps you
should follow to ensure your upgrade is successful.

Before you begin


To prepare for your upgrade, ensure that you have the following items:
v An external storage device to store the contents of your 4.1.1.1 backup.
v The Platform HPC 4.2 ISO file.
v If you are upgrading the operating system, make sure that you have the RHEL
ISO file, and that you have a corresponding OS distribution created.
For additional requirements, refer to Upgrading checklist on page 49.

About this task


Before you upgrade to the next release of Platform HPC, you must complete the
following steps:


Procedure
1. Mount the Platform HPC installation media:
mount -o loop phpc-4.2.x64.iso /mnt

2. Upgrade the pcm-upgrade-tool package.
For RHEL:
rpm -Uvh /mnt/packages/repos/kit-phpc-4.2-rhels-6-x86_64/pcm-upgrade-tool-*.rpm
For SLES:
rpm -Uvh /mnt/packages/repos/kit-phpc-4.2-sles-11-x86_64/pcm-upgrade-tool-*.rpm
3. Set up the upgrade environment.
export PATH=${PATH}:/opt/pcm/libexec/

4. Prepare an external storage device.
a. Ensure that the external storage has enough space for the backup files. To
check how much space you require for the backup, run the following
commands:
# du -sh /var/lib/pgsql/data
# du -sh /install/
Note: It is recommended that the size of your external storage is greater
than the combined size of the database and the /install directory.
b. On the external storage, create a directory for the database backup.
mkdir /external-storage-mnt/db-backup
where external-storage-mnt is the backup location on your external storage.
c. Create a directory for the configuration file backup.
mkdir /external-storage-mnt/config-backup
where external-storage-mnt is the backup location on your external storage.
5. Determine which custom metrics you are using, if any. The custom metrics are
lost in the upgrade process, and can manually be re-created after the upgrade is
completed.
6. If you created any new users after Platform HPC was installed, you must
include these new users in your backup.
/opt/xcat/bin/updatenode mn-host-name -F

where mn-host-name is the name of your management node.

Backing up Platform HPC


Create a backup of your current Platform HPC installation that includes a backup
of the database and settings before you upgrade to a newer version of Platform
HPC.
Note: The backup procedure does not back up any custom configurations. After
the upgrade procedure is completed, the following custom configurations can be
manually re-created:
v Customization to the PERF loader, including internal data collection and the
purger configuration files
v Customization to the Web Portal Help menu navigation
v Addition of custom metrics
v Alert policies
v LDAP packages and configurations

Before you begin


Platform HPC does not back up or restore LSF configuration files or data. Before
you upgrade, make sure to back up your LSF configuration files and data. After
the upgrade is complete, you can apply your backed up configuration files and
data.

Procedure
1. Stop Platform HPC services:
pcm-upgrade-tool.py services --stop

2. Create a database backup on the external storage. The database backup backs
up the database data and schema.
pcm-upgrade-tool.py backup --database -d /external-storage-mnt/db-backup/

where external-storage-mnt is the backup location on your external storage. The
backup includes database files and the backup configuration file pcm.conf.
3. Create a configuration file backup on the external storage.
pcm-upgrade-tool.py backup --files -d /external-storage-mnt/config-backup/

Performing the Platform HPC upgrade


Perform the upgrade without reinstalling the operating system and restore your
settings.

Before you begin


Ensure that a backup of your previous settings was created before you proceed
with the upgrade.

Procedure
1. Upgrade Platform HPC from 4.1.1.1 to 4.2, complete the following steps:
a. Upgrade the database schema.
pcm-upgrade-tool.py upgrade --schema

b. If you created custom metrics in Platform HPC 4.1.1.1, you can manually
re-create them. See more about Defining metrics in Platform HPC.
c. Start the HTTP daemon (HTTPd).
For RHEL:
# service httpd start

For SLES:
# service apache2 start

d. Start the xCAT daemon.


# service xcatd start

e. Upgrade Platform HPC.


pcm-upgrade-tool.py upgrade --packages -p /root/phpc-4.2.x64.iso

f. Copy the Platform HPC entitlement file to the /opt/pcm/entitlement
directory.
2. Restore settings and database data, complete the following steps:
a. Stop the xCAT daemon.
/etc/init.d/xcatd stop

b. Restore database data from a previous backup.


pcm-upgrade-tool.py restore --database -d /external-storage-mnt/db-backup/

where external-storage-mnt is the backup location on your external storage
and db-backup is the location of the database backup.
c. Restore configuration files from a previous backup.
pcm-upgrade-tool.py restore --files -f /external-storage-mnt/config-backup/
20130708-134535.tar.gz

where config-backup is the location of the configuration file backup.


3. Upgrade the LSF component from Version 9.1.1 to LSF 9.1.3.
a. Create an LSF installer configuration file (lsf.install.config) and add it to
the /install/kits/kit-phpc-4.2/other_files directory. Refer to the
lsf.install.config file in the /install/kits/kit-phpc-4.1.1.1/other_files
directory and modify the parameters as needed (an illustrative sketch is shown
after this procedure).
b. Replace the LSF postscripts in the /install/postscripts/ directory.
cp /install/kits/kit-phpc-4.2/other_files/KIT_phpc_lsf_setup /install/postscripts/
cp /install/kits/kit-phpc-4.2/other_files/KIT_phpc_lsf_config /install/postscripts/
cp /install/kits/kit-phpc-4.2/other_files/lsf.install.config /install/postscripts/phpc

c. Extract the LSF installer package to a temporary directory. The LSF installer
package is located in /install/kits/kit-phpc-4.2/other_files/. For example:
tar xvzf /install/kits/kit-phpc-4.2/other_files/lsf9.1.3_lsfinstall_linux_x86_64.tar.Z -C /tmp/lsf

d. Run the LSF installation.


1) Navigate to the LSF installer directory.
cd /tmp/lsf

2) Copy the lsf.install.config configuration file from
/install/kits/kit-phpc-4.2/other_files.
cp /install/kits/kit-phpc-4.2/other_files/lsf.install.config ./

3) Run the LSF installer.


./lsfinstall -f lsf.install.config
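
As a reference for step 3a, the following sketch shows the kind of parameters that an
lsf.install.config file typically sets. It is an illustration only; the template that is
shipped in the /install/kits/kit-phpc-4.1.1.1/other_files directory is the
authoritative starting point, and the master host name and tar file directory below
are placeholders that must be adjusted to your cluster.
# Sketch only -- start from the shipped template and adjust to your cluster.
LSF_TOP="/shared/ibm/platform_lsf"
LSF_ADMINS="phpcadmin"
LSF_CLUSTER_NAME="phpc_cluster"
# Placeholder: replace with your management node host name.
LSF_MASTER_LIST="mgtnode"
# Placeholder: directory that contains the LSF distribution tar files.
LSF_TARDIR="/tmp/lsf"
SILENT_INSTALL="Y"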

Completing the upgrade


To complete the upgrade to the next release of IBM Platform HPC, you must
restore your system settings, database settings, and update the compute nodes.

Procedure
1. Restart Platform HPC services.
pcm-upgrade-tool.py services --reconfig

2. Refresh the database and configurations:


pcm-upgrade-tool.py upgrade --postupdate

3. If you previously installed GMF and the related monitoring packages with
Platform HPC, you must manually reinstall these packages. To check which
monitoring packages are installed, run the following commands:
rpm -qa | grep chassis-monitoring
rpm -qa | grep switch-monitoring
rpm -qa | grep gpfs-monitoring
rpm -qa | grep gmf
a. Uninstall the GMF package and the monitoring packages.
rpm -e --nodeps pcm-chassis-monitoring-1.2.1-1.x86_64
rpm -e --nodeps pcm-switch-monitoring-1.2.1-1.x86_64
rpm -e --nodeps pcm-gpfs-monitoring-1.2.1-1.x86_64
rpm -e --nodeps pcm-gmf-1.2-1.x86_64
b. Install the GMF package that is found in the /install/kits/kit-pcm-4.2/
repos/kit-phpc-4.2-rhels-6-x86_64 directory.
rpm -ivh pcm-gmf-1.2-1.x86_64.rpm
c. Install the switch monitoring package that is found in the
/install/kits/kit-pcm-4.2/repos/kit-phpc-4.2-rhels-6-x86_64 directory.
rpm -ivh pcm-switch-monitoring-1.2.1-1.x86_64.rpm
d. Install the chassis monitoring package that is found in the
/install/kits/kit-pcm-4.2/repos/kit-phpc-4.2-rhels-6-x86_64 directory.
rpm -ivh pcm-chassis-monitoring-1.2.1-1.x86_64.rpm
e. If you have GPFS installed, run the following command to install the GPFS
monitoring package. The GPFS monitoring package is available in the
/install/kits/kit-pcm-4.2/repos/kit-phpc-4.2-rhels-6-x86_64 directory.
rpm -ivh pcm-gpfs-monitoring-1.2.1-1.x86_64.rpm
f. Restart Platform HPC services.
# pcmadmin service restart --group ALL

4. Upgrade compute nodes.


a. Check if the compute nodes are reachable. Compute node connections can
get lost during the upgrade process, so ping the compute nodes to ensure that
they are connected to the management node:
xdsh noderange "/bin/ls"

For any compute nodes that have lost connection and cannot be reached,
use the rpower command to reboot the node:
rpower noderange reset

where noderange is a comma-separated list of nodes or node groups


b. Update compute nodes to include the Platform HPC package.
updatenode noderange -S

where noderange is a comma-separated list of nodes or node groups.


c. Restart monitoring services.
xdsh noderange "source /shared/ibm/platform_lsf/conf/ego/phpc_cluster/kernel/profile.ego; egosh ego shutdown -f; egosh ego start -f"

where noderange is a comma-separated list of nodes or node groups.


5. Restart the LSF cluster. Run the following command on the management node.
lsfrestart -f

6. An SSL V3 security issue exists within the Tomcat server when HTTPS is
enabled. If you have not previously taken steps to fix this issue, you can skip
this step. Otherwise, if you have HTTPS enabled, complete the following steps
to fix this issue.
a. Edit the $GUI_CONFDIR/server.xml file. In the connector XML tag, set the
sslProtocol value from SSL to TLS, and save the file. For example:
<Connector port="${CATALINA_HTTPS_START_PORT}" maxHttpHeaderSize="8192"
maxThreads="${CATALINA_MAX_THREADS}" minSpareThreads="25" maxSpareThreads="75"
enableLookups="false" disableUploadTimeout="true"
acceptCount="100" scheme="https" secure="true"
clientAuth="want" sslProtocol="TLS" algorithm="ibmX509"
compression="on" compressionMinSize="2000"
compressableMimeType="text/html,text/xml,text/css,text/javascript,text/plain"
connectionTimeout="20000" URIEncoding="UTF-8"/>

b. Restart the Web Portal service.


pcmadmin service stop --service WEBGUI
pcmadmin service start --service WEBGUI


Verifying the upgrade


Ensure that the upgrade procedure is successful and that Platform HPC is working
correctly.
Note: A detailed log of the upgrade process can be found in the upgrade.log file
in the /opt/pcm/log directory.

Procedure
1. Log in to the management node as a root user.
2. Source Platform HPC environment variables.
# . /opt/pcm/bin/pcmenv.sh

3. Check that the PostgreSQL database server is running.


# service postgresql status
(pid 13269) is running...

4. Check that the Platform HPC services are running.
# service xcatd status
xCAT service is running
# service phpc status
Show status of the LSF subsystem                                  [ OK ]
lim (pid 15858) is running...
res (pid 15873) is running...
sbatchd (pid 15881) is running...
SERVICE   STATE    ALLOC  CONSUMER  RGROUP  RESOURCE  SLOTS  SEQ_NO  INST_STATE  ACTI
RULE-EN*  STARTED  18     /Manage*  Manag*  *         1      1       RUN         17
PCMD      STARTED  17     /Manage*  Manag*  *         1      1       RUN         16
JOBDT     STARTED  12     /Manage*  Manag*  *         1      1       RUN         11
PLC       STARTED  13     /Manage*  Manag*  *         1      1       RUN         12
PURGER    STARTED  11     /Manage*  Manag*  *         1      1       RUN         10
PTC       STARTED  14     /Manage*  Manag*  *         1      1       RUN         13
PLC2      STARTED  15     /Manage*  Manag*  *         1      1       RUN         14
WEBGUI    STARTED  19     /Manage*  Manag*  *         1      1       RUN         18
ACTIVEMQ  STARTED  16     /Manage*  Manag*  *         1      1       RUN         15

5. Check that the correct version of Platform HPC is running.


# cat /etc/phpc-release

6. Log in to the Web Portal.


a. Open a supported web browser. Refer to the Release Notes for a list of
supported web browsers.
b. Go to http://mgtnode-IP:8080, where mgtnode-IP is the real management
node IP address. If you are connected to a public network, you can also
navigate to http://mgtnode-hostname:8080, where mgtnode-hostname is the
real management node hostname.
c. Log in as a root user. The root user has administrative privileges and maps
to the operating system root user.
d. After you log in, the Resource Dashboard is displayed in the Web Portal.

Upgrading to Platform HPC 4.2 with OS reinstall


Upgrade your existing installation of IBM Platform HPC to the most recent
version, and reinstall or upgrade the operating system on the management node.

Preparing to upgrade
Before upgrading your IBM Platform HPC installation, there are some steps you
should follow to ensure your upgrade is successful.


Before you begin


To prepare for your upgrade, ensure that you have the following items:
v An external storage device to store the contents of your 4.1.1.1 backup.
v The Platform HPC 4.2 ISO file.
v If you are upgrading the operating system, make sure that you have the RHEL
ISO file, and that you have a corresponding OS distribution created.
For additional requirements, refer to Upgrading checklist on page 49.

About this task


Before you upgrade to the next release of Platform HPC, you must complete the
following steps:

Procedure
1. Mount the Platform HPC installation media:
mount -o loop phpc-4.2.x64.iso /mnt

2. Upgrade the pcm-upgrade-tool package.
For RHEL:
rpm -Uvh /mnt/packages/repos/kit-phpc-4.2-rhels-6-x86_64/pcm-upgrade-tool-*.rpm
For SLES:
rpm -Uvh /mnt/packages/repos/kit-phpc-4.2-sles-11-x86_64/pcm-upgrade-tool-*.rpm
3. Set up the upgrade environment.
export PATH=${PATH}:/opt/pcm/libexec/

4. Prepare an external storage device.
a. Ensure that the external storage has enough space for the backup files. To
check how much space you require for the backup, run the following
commands:
# du -sh /var/lib/pgsql/data
# du -sh /install/
Note: It is recommended that the size of your external storage is greater
than the combined size of the database and the /install directory.
b. On the external storage, create a directory for the database backup.
mkdir /external-storage-mnt/db-backup
where external-storage-mnt is the backup location on your external storage.
c. Create a directory for the configuration file backup.
mkdir /external-storage-mnt/config-backup
where external-storage-mnt is the backup location on your external storage.
5. Determine which custom metrics you are using, if any. The custom metrics are
lost in the upgrade process, and can manually be re-created after the upgrade is
completed.
6. If you created any new users after Platform HPC was installed, you must
include these new users in your backup.
/opt/xcat/bin/updatenode mn-host-name -F

where mn-host-name is the name of your management node.


Backing up Platform HPC


Create a backup of your current Platform HPC installation that includes a backup
of the database and settings before you upgrade to a newer version of Platform
HPC.
Note: The backup procedure does not back up any custom configurations. After
the upgrade procedure is completed, the following custom configurations can be
manually re-created:
v Customization to the PERF loader, including internal data collection and the
purger configuration files
v Customization to the Web Portal Help menu navigation
v Addition of custom metrics
v Alert policies
v LDAP packages and configurations

Before you begin


Platform HPC does not back up or restore LSF configuration files or data. Before
you upgrade, make sure to back up your LSF configuration files and data. After
the upgrade is complete, you can apply your backed up configuration files and
data.

Procedure
1. Stop Platform HPC services:
pcm-upgrade-tool.py services --stop

2. Create a database backup on the external storage. The database backup backs
up the database data and schema.
pcm-upgrade-tool.py backup --database -d /external-storage-mnt/db-backup/

where external-storage-mnt is the backup location on your external storage. The
backup includes database files and the backup configuration file pcm.conf.
3. Create a configuration file backup on the external storage.
pcm-upgrade-tool.py backup --files -d /external-storage-mnt/config-backup/

Performing the Platform HPC upgrade


Perform the upgrade, reinstall the operating system, and restore your settings.

Before you begin


Ensure that you have prepared for the upgrade and have an existing backup of
your previous settings.

Procedure
1. Reinstall the management node, complete the following steps:
a. Record the following management node network settings: hostname, IP
address, netmask, and default gateway.
b. If you are upgrading to a new machine, you must power off the old
management node before you power on the new management node.
c. Reinstall the RHEL 6.5 operating system on the management node. Ensure
you use the same network settings as the old management node, including:
hostname, IP address, netmask, and default gateway.

Refer to Installing and configuring the operating system on the
management node on page 13 for more information about installing the RHEL
operating system.
2. Install Platform HPC 4.2. In this step, the RHEL operating system is specified.
If you are using a different operating system, specify the operating system
accordingly.
a. Locate the default silent installation template phpc-autoinstall.conf.example in
the docs directory in the installation ISO.
mount -o loop phpc-4.2.x64.rhel.iso /mnt
cp /mnt/docs/phpc-autoinstall.conf.example ./phpc-autoinstall.conf

b. Edit the silent installation template and set the os_kit parameter to the
absolute path of the operating system ISO.
vi ./phpc-autoinstall.conf

c. Start the installation by running the silent installation.


/mnt/phpc-installer -f ./phpc-autoinstall.conf

3. Set up your environment.


export PATH=${PATH}:/opt/pcm/libexec/

4. Restore settings and database data, complete the following steps:


a. Stop Platform HPC services.
pcm-upgrade-tool.py services --stop

b. If you created custom metrics in Platform HPC 4.1.1.1, you can manually
re-create them. Refer to the "Defining metrics in Platform HPC" section in
the Administering Platform HPC guide for more information.
c. Restore database data from a previous backup.
pcm-upgrade-tool.py restore --database -d /external-storage-mnt/db-backup/

where external-storage-mnt is the backup location on your external storage
and db-backup is the location of the database backup.
d. Restore configuration files from a previous backup.
pcm-upgrade-tool.py restore --files -f /external-storage-mnt/config-backup/
20130708-134535.tar.gz

where config-backup is the location of the configuration file backup.


Related information:
Installing and configuring the operating system on the management node on
page 13

Completing the upgrade


To complete the upgrade to the next release of IBM Platform HPC and complete
the operating system reinstallation, you must restore your system settings,
database settings, and update the compute nodes.

Procedure
1. Restart Platform HPC services.
pcm-upgrade-tool.py services --reconfig

2. By default, the OS distribution files are not backed up or restored. The OS
distribution files can be manually created after the management node upgrade
is complete and before upgrading the compute nodes. To recreate an OS
distribution, run the following commands:
a. Mount the operating system.
# mount -o loop rhel-6.4-x86_64.iso /mnt


where rhel-6.4-x86_64.iso is the name of the OS distribution.


b. Create a new backup directory. The backup directory must be the same as
the OS distribution path. To determine the OS distribution path, use the
lsdef -t osdistro rhels6.4-x86_64 command to get the OS distribution
path.
# mkdir /install/rhels6.4/x86_64

c. Synchronize the new directory.


# rsync -a /mnt/* /install/rhels6.4/x86_64

3. Refresh the database and configurations:


pcm-upgrade-tool.py upgrade --postupdate

4. Update compute nodes. If you want to upgrade the compute nodes to a higher
OS version, you must reprovision them. Otherwise, complete this step.
a. Check if the compute nodes are reachable. Compute node connections can
get lost during the upgrade process, so ping the compute nodes to ensure that
they are connected to the management node:
xdsh noderange "/bin/ls"

For any compute nodes that have lost connection and cannot be reached,
use the rpower command to reboot the node:
rpower noderange reset

where noderange is a comma-separated list of nodes or node groups


b. Recover the SSH connection to the compute nodes.
xdsh noderange -K

where noderange is a comma-separated list of nodes or node groups.


c. Update compute nodes to include the Platform HPC 4.2 package.
updatenode noderange -S

where noderange is a comma-separated list of nodes or node groups.


d. Restart monitoring services.
xdsh noderange "source /opt/pcm/ego/profile.platform;
egosh ego shutdown -f;
egosh ego start -f"

where noderange is a comma-separated list of nodes or node groups.


5. By default, the LDAP configurations are not backed up or restored. If you want
to enable LDAP, refer to "LDAP user authentication" section in the
Administering Platform HPC guide.
6. An SSL V3 security issue exists within the Tomcat server when HTTPS is
enabled. If you have not previously taken steps to fix this issue, you can skip
this step. Otherwise, if you have HTTPS enabled, complete the following steps
to fix this issue.
a. Edit the $GUI_CONFDIR/server.xml file. In the connector XML tag, set the
sslProtocol value from SSL to TLS, and save the file. For example:
<Connector port="${CATALINA_HTTPS_START_PORT}" maxHttpHeaderSize="8192"
maxThreads="${CATALINA_MAX_THREADS}" minSpareThreads="25" maxSpareThreads="75"
enableLookups="false" disableUploadTimeout="true"
acceptCount="100" scheme="https" secure="true"
clientAuth="want" sslProtocol="TLS" algorithm="ibmX509"
compression="on" compressionMinSize="2000"
compressableMimeType="text/html,text/xml,text/css,text/javascript,text/plain"
connectionTimeout="20000" URIEncoding="UTF-8"/>

b. Restart the Web Portal service.


pcmadmin service stop --service WEBGUI
pcmadmin service start --service WEBGUI
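To confirm that SSLv3 connections are no longer accepted after the restart, you can attempt an SSLv3 handshake against the Web Portal HTTPS port. The port number 8443 used here is an assumption; use the value of CATALINA_HTTPS_START_PORT in your environment. The connection should fail with a handshake error:
# openssl s_client -connect localhost:8443 -ssl3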

Verifying the upgrade


Ensure that the upgrade procedure is successful and that Platform HPC is working
correctly.
Note: A detailed log of the upgrade process can be found in the upgrade.log file
in the /opt/pcm/log directory.

Procedure
1. Log in to the management node as a root user.
2. Source Platform HPC environment variables.
# . /opt/pcm/bin/pcmenv.sh

3. Check that the PostgreSQL database server is running.


# service postgresql status
(pid 13269) is running...

4. Check that the Platform HPC services are running.


# service xcatd status
xCAT service is running
# service phpc status
Show status of the LSF subsystem
lim (pid 15858) is running...
res (pid 15873) is running...
sbatchd (pid 15881) is running...
[  OK  ]
SERVICE   STATE    ALLOC  CONSUMER  RGROUP  RESOURCE  SLOTS  SEQ_NO  INST_STATE  ACTI
RULE-EN*  STARTED  18     /Manage*  Manag*  *         1      1       RUN         17
PCMD      STARTED  17     /Manage*  Manag*  *         1      1       RUN         16
JOBDT     STARTED  12     /Manage*  Manag*  *         1      1       RUN         11
PLC       STARTED  13     /Manage*  Manag*  *         1      1       RUN         12
PURGER    STARTED  11     /Manage*  Manag*  *         1      1       RUN         10
PTC       STARTED  14     /Manage*  Manag*  *         1      1       RUN         13
PLC2      STARTED  15     /Manage*  Manag*  *         1      1       RUN         14
WEBGUI    STARTED  19     /Manage*  Manag*  *         1      1       RUN         18
ACTIVEMQ  STARTED  16     /Manage*  Manag*  *         1      1       RUN         15

5. Check that the correct version of Platform HPC is running.


# cat /etc/phpc-release
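The exact contents of the file vary by build, but it should identify the product and report version 4.2. Illustrative output:
IBM Platform HPC 4.2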

6. Log in to the Web Portal.


a. Open a supported web browser. Refer to the Release Notes for a list of
supported web browsers.
b. Go to http://mgtnode-IP:8080, where mgtnode-IP is the real management
node IP address. If you are connected to a public network, you can also
navigate to http://mgtnode-hostname:8080, where mgtnode-hostname is the
real management node hostname.
c. Log in as a root user. The root user has administrative privileges and maps
to the operating system root user.
d. After you log in, the Resource Dashboard is displayed in the Web Portal.

Troubleshooting upgrade problems


Troubleshooting problems that occur when upgrading to the new release of IBM
Platform HPC.


To help troubleshoot your upgrade, you can view the upgrade.log file that is found
in the /opt/pcm/log directory. This file logs information about the upgrade
procedure, including any warnings or errors that occur during the upgrade process.
Common upgrade problems include the following issues:
v Cannot log in to the Web Portal after upgrading to Platform HPC Version 4.2. To
resolve this issue, try the following resolutions:
  – Restart the Web Portal. In most cases, the services that are required to run the
    Web Portal start automatically. However, if the Web Portal goes down, you
    can restart services and daemons manually. From the command line, issue the
    following command:
    # pcmadmin service restart --service WEBGUI
  – Run the following command from the management node:
    /opt/pcm/libexec/pcmmkcert.sh /root/.xcat/keystore_pcm
v After upgrading to Platform HPC Version 4.2, some pages in the Web Portal do
not display or display old data. To resolve this issue, clear your web browser
cache and log in to the Web Portal again.
v After upgrading to Platform HPC Version 4.2, some pages in the Web Portal do
not display. Run the following command from the management node to resolve
this issue:
/opt/pcm/libexec/pcmmkcert.sh /root/.xcat/keystore_pcm

v If any of the following errors are found in the upgrade.log file in the
/opt/pcm/log directory, they can be ignored and no further action is needed.
psql:/external-storage-mnt/db-backup/pmc_group_role.data.sql:25: ERROR:
permission denied: "RI_ConstraintTrigger_17314" is a system trigger
psql:/external-storage-mnt/db-backup/pmc_group_role.data.sql:29: ERROR:
permission denied: "RI_ConstraintTrigger_17314" is a system trigger
psql:/opt/pcm/etc/upgrade/postupdate/4.2/update-pcmgui-records.sql:7:
ERROR: duplicate key value violates unique constraint "ci_purge_register_pkey"
DETAIL: Key (table_name)=(pcm_node_status_history) already exists.
psql:/opt/pcm/etc/upgrade/postupdate/4.2/update-pcmgui-records.sql:11:
ERROR: duplicate key value violates unique constraint "ci_purge_register_pkey"
DETAIL: Key (table_name)=(lim_host_config_history) already exists.
psql:/opt/pcm/etc/upgrade/postupdate/4.2/update-pcmgui-records.sql:13:
ERROR: duplicate key value violates unique constraint "pmc_role_pkey"
DETAIL: Key (role_id)=(10005) already exists.
psql:/opt/pcm/etc/upgrade/postupdate/4.2/update-pcmgui-records.sql:15:
ERROR: duplicate key value violates unique constraint "pmc_resource_permission_pkey"
DETAIL: Key (resperm_id)=(11001-5) already exists.
psql:/opt/pcm/etc/upgrade/postupdate/4.2/update-pcmgui-records.sql:18:
ERROR: duplicate key value violates unique constraint "pmc_role_permission_pkey"
DETAIL: Key (role_permission_id)=(10009) already exists.

Rollback to Platform HPC 4.1.1.1


Revert to the earlier version of Platform HPC.

Before you begin


Before you roll back to Platform HPC 4.1.1.1, ensure that you have both the
Platform HPC 4.1.1.1 ISO and the original operating system ISO.

Procedure
1. To reinstall the management node, complete the following steps:
a. Record the following management node network settings: hostname, IP
address, netmask, and default gateway.
b. Reinstall the original operating system on the management node. Ensure
you use the same network settings as the old management node, including:
hostname, IP address, netmask, and default gateway.
Refer to "Installing and configuring the operating system on the
management node" on page 13 for more information about installing an
operating system.
2. To install Platform HPC 4.1.1.1, complete the following steps:
a. Locate the default silent installation template phpc-autoinstall.conf.example
in the docs directory in the installation ISO.
mount -o loop phpc-4.1.1.1.x86_64.iso /mnt
cp /mnt/docs/phpc-autoinstall.conf.example ./phpc-autoinstall.conf

b. Edit the silent installation template and set the os_kit parameter to the
absolute path for the operating system ISO.
vi ./phpc-autoinstall.conf
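For example, the os_kit entry might look like the following (the ISO path is illustrative; specify the absolute path of your operating system ISO):
os_kit=/root/isos/rhel-6.4-x86_64.iso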

c. Start the installation by running the installation program, specifying the
silent installation file.
/mnt/phpc-installer -f ./phpc-autoinstall.conf

3. To restore settings and database data, complete the following steps:


a. Set up the environment:
export PATH=${PATH}:/opt/pcm/libexec/

b. Stop Platform HPC services:


pcm-upgrade-tool.py services --stop

c. If you created custom metrics in Platform HPC 4.1.1.1, you can manually
re-create them. Refer to the "Defining metrics in Platform HPC" section in
the Administering Platform HPC guide for more information.
d. Restore database data from a previous backup.
pcm-upgrade-tool.py restore --database -d /external-storage-mnt/db-backup/

where external-storage-mnt is the backup location on your external storage and
db-backup is the location of the database backup.
e. Restore configuration files from a previous backup.
pcm-upgrade-tool.py restore --files -f /external-storage-mnt/config-backup/20130708-134535.tar.gz

where config-backup is the location of the configuration file backup.


4. Restart Platform HPC services:
pcm-upgrade-tool.py services --reconfig

5. Reinstall compute nodes, if needed.


v If the compute nodes have Platform HPC 4.1.1.1 installed, recover the SSH
connection for all compute nodes:
xdsh noderange -K

where noderange is a comma-separated list of nodes or node groups.


v If the compute nodes have Platform HPC 4.2 installed, they must be
reprovisioned to use Platform HPC 4.1.1.1.


Upgrading entitlement
In IBM Platform HPC, you can upgrade your LSF or PAC entitlement file from
Express to Standard.

Upgrading LSF entitlement


In IBM Platform HPC, you can upgrade your LSF entitlement file from Express to
Standard.

Before you begin


To upgrade your product entitlement for LSF, contact IBM client services for more
details and to obtain the entitlement file.

About this task


To upgrade your entitlement, as a root user, complete the following steps on the
Platform HPC management node:

Procedure
1. Copy the new entitlement file to the unified entitlement path
(/opt/pcm/entitlement/phpc.entitlement).
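For example, if the new entitlement file is named lsf_std_entitlement.dat (the file name is illustrative):
cp lsf_std_entitlement.dat /opt/pcm/entitlement/phpc.entitlement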
2. Restart LSF.
lsfrestart

3. Restart the Web Portal.


pmcadmin stop
pmcadmin start

Results
Your LSF entitlement is upgraded to the standard version.

Upgrading PAC entitlement


In IBM Platform HPC, after upgrading your Platform Application Center (PAC)
entitlement file from Express Edition to Standard Edition, ensure that you are able
to connect to the remote jobs console.

Before you begin


To upgrade your product entitlement for PAC, contact IBM client services for more
details and to obtain the entitlement file.

About this task


After you upgrade to PAC Standard, complete the following steps to connect to the
remote jobs console.

Procedure
1. Log in to the Web Portal.
2. From the command line, update the vnc_host_ip.map configuration file in the
$GUI_CONFDIR/application/vnc directory. The vnc_host_ip.map file must specify
the IP address that is mapped to the host name.


# cat vnc_host_ip.map
# This file defines which IP will be use for the host, for example
#hostname1=192.168.1.2
system3750=9.111.251.141

3. Kill any VNC server sessions if they exist.


vncserver -kill :${session_id}

4. Go to the /opt/pcm/web-portal/gui/work/.vnc/${USER}/ directory. If the VNC
session files, vnc.console and vnc.session, exist, then delete them.
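For example:
# cd /opt/pcm/web-portal/gui/work/.vnc/${USER}/
# rm -f vnc.console vnc.session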
5. Restart the VNC server.
# vncserver :1; vncserver :2

6. Restart the Web Portal.


7. Stop the iptables service on the management node.
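For example, steps 6 and 7 can be run from the command line as follows. The pmcadmin commands match the Web Portal restart used earlier in this chapter, and the iptables command assumes a RHEL 6.x management node:
# pmcadmin stop
# pmcadmin start
# service iptables stop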
8. Verify that the remote job console is running.
a. Go to the Jobs tab, and click Remote Job Consoles.
b. Click Open My Console.
c. If you get the following error, then you are missing the VncViewer.jar file.
Cannot find the required VNC jar file:
/opt/pcm/web-portal/gui/3.0/tomcat/webapps/platform/pac/vnc/lib/VncViewer.jar.
For details about configuring remote consoles, see "Remote Console".

To resolve this error, copy the VncViewer.jar file to the


/opt/pcm/web-portal/gui/3.0/tomcat/webapps/platform/pac/vnc/lib
directory. Issue the following command:
#cp /opt/pcm/web-portal/gui/3.0/tomcat/webapps/platform/viewgui/common/applet/VncViewer.jar
/opt/pcm/web-portal/gui/3.0/tomcat/webapps/platform/pac/vnc/lib/VncViewer.jar

Results
Using PAC Standard Edition, you are able to connect to the remote jobs console.


Chapter 11. Applying fixes


Check for any new fixes that can be applied to your Platform HPC installation.

About this task


Fixes are available for download from the IBM Fix Central website.
Note: In a high availability environment, ensure that the same fixes are applied on
the primary management node and the failover node.

Procedure
1. Go to IBM Fix Central.
2. Locate the product fixes, by selecting the following options:
a. Select Platform Computing as the product group.
b. Select Platform HPC as the product name.
c. Select 4.2 as the installed version.
d. Select your platform.
3. Download each individual fix.
4. Apply the fixes from the command line.
a. Extract the fix tar file.
b. From the directory where the fix files are extracted to, run the installation
script to install the fix.
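For example, where the fix package name phpc-4.2-fix-123456.tar.gz and the script name install.sh are illustrative (the actual file and script names are provided with each fix):
# tar -xvf phpc-4.2-fix-123456.tar.gz
# cd phpc-4.2-fix-123456
# ./install.sh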


Chapter 12. References


Configuration files
High availability definition file
The high availability definition file specifies values to configure high availability.

High availability definition file


The high availability definition file specifies values to configure a high availability
environment.
virtualmn-name:
nicips.eth0:0=eth0-IP-address
nicips.eth1:0=eth1-IP-address
sharefs_mntp.work=work-directory
sharefs_mntp.home=home-directory

virtualmn-name:
Specifies the virtual node name of the active management node, where
virtualmn-name is the name of the virtual node.
The virtual node name must be a valid node name. It cannot be a fully
qualified domain name, it must be the short name without the domain name.
This line must end with a colon (:).
nicips.eth0:0=eth0-IP-address
Specifies the virtual IP address of a virtual NIC connected to the management
node, where eth0-IP-address is an IP address.
For example: nicips.eth0:0=172.20.7.5
Note: A virtual NIC does not need to be created and the IP address does not
need to be configured. The pcmhatool command automatically creates the
needed configurations.
nicips.eth1:0=eth1-IP-address
Specifies the virtual IP address of a virtual NIC connected to the management
node, where eth1-IP-address is an IP address.
For example: nicips.eth1:0=192.168.1.5
Note: A virtual NIC does not need to be created and the IP address does not
need to be configured. The pcmhatool command automatically creates the
needed configurations.
sharefs_mntp.work=work-directory
Specifies the shared storage location for system work data, where work-directory
is the shared storage location. For example: 172.20.7.200:/export/data.
If the same shared directory is used for both user home data and system work
data, specify this parameter as the single shared directory.
Only NFS is supported.


sharefs_mntp.home=home-directory
Specifies the shared storage location for user home data, where home-directory
is the shared storage location. For example: 172.20.7.200:/export/home.
If the same shared directory is used for both user home data and system work
data, do not specify this parameter. The specified sharefs_mntp.work
parameter, is used as the location for both user home data and system work
data.
Only NFS is supported.

Example
The following is an example of a high availability definition file:
# A virtual node name
virtualmn:
# Virtual IP address of a virtual NIC connected to the management node.
nicips.eth0:0=192.168.0.100
nicips.eth1:0=172.20.7.100
# Shared storage for system work data
sharefs_mntp.work=172.20.7.200:/export/data
# Shared storage for user home data
sharefs_mntp.home=172.20.7.200:/export/home

Commands
pcmhatool
An administrative command interface to manage a high availability environment.

Synopsis
pcmhatool [-h | --help] | [-v | --version]
pcmhatool subcommand [options]

Subcommand List
pcmhatool config -i | --import HAINFO_FILENAME -s | --secondary SMN_NAME
[-q | --quiet] [-h | --help]
pcmhatool reconfig -s|--standby SMN_NAME [-q|--quiet] [-h|--help]
pcmhatool info [-h|--help]
pcmhatool failto -t|--target SMN_NAME [-q|--quiet] [-h|--help]
pcmhatool failmode -m|--mode FAILOVER_MODE [-h|--help]
pcmhatool status [-h|--help]
pcmhatool check [-h|--help]


Description
The pcmhatool command manages a high availability environment. It is used to
enable high availability, display settings, set the failover mode, trigger a failover,
and show high availability data and running status.

Options
-h | --help
Displays the pcmhatool command help information.
-v | --version
Displays the pcmhatool command version information.

Subcommand Options
config -i HAINFO_FILENAME -s SMN_NAME
Specifies high availability settings to be used to enable high availability
between the primary management node and the secondary management node,
where HAINFO_FILENAME is the high availability definition file and
SMN_NAME is the name of the secondary management node.
-i|--import HAINFO_FILENAME
Specifies the import file name of the high availability definition file, where
HAINFO_FILENAME is the name of the high availability definition file.
-s|--secondary SMN_NAME
Specifies the secondary management node name, where SMN_NAME is the
name of the secondary management node.
reconfig -s|--standby SMN_NAME
Enables high availability on the standby management node after the
management node is reinstalled, where SMN_NAME is the name of the
standby management node.
info
Displays high availability settings, including: the virtual IP address, the
management node name, and a list of shared directories.
failto -t|--target SMN_NAME
Sets the specified standby management node to an active management node,
where SMN_NAME is the current standby management node.
failmode -m|--mode FAILOVER_MODE
Sets the failover mode, where FAILOVER_MODE is set to auto for automatic
failover or manual for manual failover. In automatic mode, the standby node
takes over the cluster when it detects the active node has failed. In manual
mode, the standby node only takes over the cluster if the pcmhatool failto
command is issued.
status
Displays the current high availability status, including: state of the nodes,
failover mode and status of running services. Nodes that are in unavail state
are unavailable and indicate a node failure or lost network connection.
check
Displays high availability diagnostic information related to the high availability
environment, including current status data, failure and correction data.
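For example, to enable high availability with a definition file and then check the status (the file name ha.info and the node name mgtnode02 are illustrative):
pcmhatool config -i /root/ha.info -s mgtnode02
pcmhatool status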


Notices
This information was developed for products and services offered in the U.S.A.
IBM may not offer the products, services, or features discussed in this document in
other countries. Consult your local IBM representative for information on the
products and services currently available in your area. Any reference to an IBM
product, program, or service is not intended to state or imply that only that IBM
product, program, or service may be used. Any functionally equivalent product,
program, or service that does not infringe any IBM intellectual property right may
be used instead. However, it is the user's responsibility to evaluate and verify the
operation of any non-IBM product, program, or service.
IBM may have patents or pending patent applications covering subject matter
described in this document. The furnishing of this document does not grant you
any license to these patents. You can send license inquiries, in writing, to:
IBM Director of Licensing
IBM Corporation
North Castle Drive
Armonk, NY 10504-1785
U.S.A.
For license inquiries regarding double-byte character set (DBCS) information,
contact the IBM Intellectual Property Department in your country or send
inquiries, in writing, to:
Intellectual Property Licensing
Legal and Intellectual Property Law
IBM Japan Ltd.
1623-14, Shimotsuruma, Yamato-shi
Kanagawa 242-8502 Japan
The following paragraph does not apply to the United Kingdom or any other
country where such provisions are inconsistent with local law: INTERNATIONAL
BUSINESS MACHINES CORPORATION PROVIDES THIS PUBLICATION "AS IS"
WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED,
INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR
PURPOSE. Some states do not allow disclaimer of express or implied warranties in
certain transactions, therefore, this statement may not apply to you.
This information could include technical inaccuracies or typographical errors.
Changes are periodically made to the information herein; these changes will be
incorporated in new editions of the publication. IBM may make improvements
and/or changes in the product(s) and/or the program(s) described in this
publication at any time without notice.
Any references in this information to non-IBM Web sites are provided for
convenience only and do not in any manner serve as an endorsement of those Web
sites. The materials at those Web sites are not part of the materials for this IBM
product and use of those Web sites is at your own risk.


IBM may use or distribute any of the information you supply in any way it
believes appropriate without incurring any obligation to you.
Licensees of this program who wish to have information about it for the purpose
of enabling: (i) the exchange of information between independently created
programs and other programs (including this one) and (ii) the mutual use of the
information which has been exchanged, should contact:
IBM Corporation
Intellectual Property Law
Mail Station P300
2455 South Road,
Poughkeepsie, NY 12601-5400
USA
Such information may be available, subject to appropriate terms and conditions,
including in some cases, payment of a fee.
The licensed program described in this document and all licensed material
available for it are provided by IBM under terms of the IBM Customer Agreement,
IBM International Program License Agreement or any equivalent agreement
between us.
Any performance data contained herein was determined in a controlled
environment. Therefore, the results obtained in other operating environments may
vary significantly. Some measurements may have been made on development-level
systems and there is no guarantee that these measurements will be the same on
generally available systems. Furthermore, some measurement may have been
estimated through extrapolation. Actual results may vary. Users of this document
should verify the applicable data for their specific environment.
Information concerning non-IBM products was obtained from the suppliers of
those products, their published announcements or other publicly available sources.
IBM has not tested those products and cannot confirm the accuracy of
performance, compatibility or any other claims related to non-IBM products.
Questions on the capabilities of non-IBM products should be addressed to the
suppliers of those products.
All statements regarding IBM's future direction or intent are subject to change or
withdrawal without notice, and represent goals and objectives only.
This information contains examples of data and reports used in daily business
operations. To illustrate them as completely as possible, the examples include the
names of individuals, companies, brands, and products. All of these names are
fictitious and any similarity to the names and addresses used by an actual business
enterprise is entirely coincidental.
COPYRIGHT LICENSE:
This information contains sample application programs in source language, which
illustrates programming techniques on various operating platforms. You may copy,
modify, and distribute these sample programs in any form without payment to
IBM, for the purposes of developing, using, marketing or distributing application
programs conforming to the application programming interface for the operating
platform for which the sample programs are written. These examples have not
been thoroughly tested under all conditions. IBM, therefore, cannot guarantee or


imply reliability, serviceability, or function of these programs. The sample
programs are provided "AS IS", without warranty of any kind. IBM shall not be
liable for any damages arising out of your use of the sample programs.
Each copy or any portion of these sample programs or any derivative work, must
include a copyright notice as follows:
© (your company name) (year). Portions of this code are derived from IBM Corp.
Sample Programs. © Copyright IBM Corp. _enter the year or years_.
If you are viewing this information softcopy, the photographs and color
illustrations may not appear.

Trademarks
IBM, the IBM logo, and ibm.com are trademarks of International Business
Machines Corp., registered in many jurisdictions worldwide. Other product and
service names might be trademarks of IBM or other companies. A current list of
IBM trademarks is available on the Web at "Copyright and trademark information"
at http://www.ibm.com/legal/copytrade.shtml.
Intel, Intel logo, Intel Inside, Intel Inside logo, Intel Centrino, Intel Centrino logo,
Celeron, Intel Xeon, Intel SpeedStep, Itanium, and Pentium are trademarks or
registered trademarks of Intel Corporation or its subsidiaries in the United States
and other countries.
Java and all Java-based trademarks and logos are trademarks or registered
trademarks of Oracle and/or its affiliates.
Linux is a trademark of Linus Torvalds in the United States, other countries, or
both.
LSF, Platform, and Platform Computing are trademarks or registered trademarks of
International Business Machines Corp., registered in many jurisdictions worldwide.
Microsoft, Windows, Windows NT, and the Windows logo are trademarks of
Microsoft Corporation in the United States, other countries, or both.
Other company, product, or service names may be trademarks or service marks of
others.

Privacy policy considerations


IBM Software products, including software as a service solutions, ("Software
Offerings") may use cookies or other technologies to collect product usage
information, to help improve the end user experience, to tailor interactions with
the end user, or for other purposes. In many cases no personally identifiable
information is collected by the Software Offerings. Some of our Software Offerings
can help enable you to collect personally identifiable information. If this Software
Offering uses cookies to collect personally identifiable information, specific
information about this offering's use of cookies is set forth below.
Depending upon the configurations deployed, this Software Offering may use
session and persistent cookies that collect each user's user name, for purposes of
session management. These cookies cannot be disabled.


If the configurations deployed for this Software Offering provide you as customer
the ability to collect personally identifiable information from end users via cookies
and other technologies, you should seek your own legal advice about any laws
applicable to such data collection, including any requirements for notice and
consent.
For more information about the use of various technologies, including cookies, for
these purposes, see IBM's Privacy Policy at http://www.ibm.com/privacy and
IBM's Online Privacy Statement at http://www.ibm.com/privacy/details the
section entitled "Cookies, Web Beacons and Other Technologies" and the "IBM
Software Products and Software-as-a-Service Privacy Statement" at
http://www.ibm.com/software/info/product-privacy.




Printed in USA

SC27-6107-02