
Biorap N°75 - April 2013

CONTENTS
- ORAP Forums
- HPC@LR: 2 years and already 100 projects
- News from GENCI and PRACE
- SysFera-DS and EDF R&D
- Reading, training, taking part
- News in brief
- Agenda

ORAP Forums
The 31st Forum took place in Paris on 25 March 2013. The central themes of the forum were:
- Languages and runtimes
- Applications of high-performance computing in the field of turbulence
The slides of the presentations given at the forum are available on the ORAP website:
http://www.irisa.fr/orap/Forums/forums.html


HPC@LR: 2 years and already 100 projects
A young and dynamic mesocentre
On 6 December, the HPC@LR competence centre celebrated its 100th project. The day marked the opening of the centre towards regional companies, and took place at the heart of the Business and Innovation Centre (BIC) of Montpellier Agglomération.
A partnership agreement was signed between the Université Montpellier 2 (UM2) and Montpellier Agglomération, aiming to raise awareness of High Performance Computing and of the services offered by the HPC@LR centre among the companies and projects supported by the BIC of Montpellier Agglomération. The BIC thus commits to recommending the services of HPC@LR to the companies and projects it supports, and the HPC@LR centre commits to providing the equivalent of 5,000 hours of computing time to any innovative company or project supported by the BIC of Montpellier Agglomération.
A mesocentre funded by regional actors and run by regional experts
Funded by the Languedoc-Roussillon Region, Europe and the Université Montpellier 2 (UM2), the centre was created on the basis of a public-private partnership between ASA (a young innovative company specialising in scientific computing), CINES, IBM, SILKAN (formerly HPC Project, a major player in high-performance simulation and a pioneer of embedded simulation) and UM2, with the support of Transferts-LR (an association created in 2005 at the initiative of the Languedoc-Roussillon Region and the French State, which strengthens the competitiveness of companies by acting as a catalyst for innovation and technology transfer). This original model gives access to the best level of quality, both for the centre itself and for all of its users.
Beyond this partnership, the HPC@LR centre takes part in several collaborative projects (ANR, FUI, Equipex, agreements with private companies, etc.) and organises various events (seminars, training sessions, etc.).
A mesocentre open to all
The HPC@LR centre is dedicated to high-performance computing for entrepreneurs, researchers and engineers from the public and private sectors, to meet their needs in computing power and training. It supports its users in accessing the resources whatever their level of expertise, so that everyone can benefit from the latest advances in parallel computing.

[Figure: resource requests by thematic area]

The projects supported cover a wide range of topics and involve companies, training programmes (initial and continuing education, summer schools) as well as research laboratories.


[Figure: public/private breakdown of resource requests]

The HPC@LR centre aims to spread HPC culture, including in fields not historically associated with it, in particular to face the data deluge ("Big Data").
The hardware architecture was designed, on the one hand, to let researchers test various hardware architectures, size their needs upstream and study the impact of a hardware change on their computations, and, on the other hand, to pool their hardware investment so as to give access to a more powerful architecture. This architecture evolves according to the needs expressed by our users, for example with the recent arrival of two large-memory nodes (2 IBM X3850 X5 nodes, 80 cores/node, 1 TB RAM/node).
Key figures
- more than 100 projects supported
- more than 15 companies supported or partnered with
- more than 5 million computing hours
- more than 350 user accounts created
- more than 30 research centres supported by the HPC@LR centre in the Languedoc-Roussillon region
- more than €420k in collaborative projects and other actions of the centre
- more than 200 people trained in HPC
- more than 20.57 TFlops of peak performance (double precision)
A 100th project emblematic of the positioning of the HPC@LR centre and of the dynamism of the Languedoc-Roussillon Region on environment and water topics

The celebration of the 100th project supported by the HPC@LR centre honoured the company BRLi.
BRL Ingénierie (BRLi) is an engineering company specialising in the fields of water, the environment and land planning. With more than 160 employees, BRLi operates in France and in more than 80 countries, at the request of public bodies, private companies, local authorities and major international funding agencies.
The collaborations between the HPC@LR centre and BRLi take several forms. One is a partnership within the FUI project Litto-CMS, which aims to develop a software platform and innovative services for forecasting and managing marine submersion crises. Another is services provided on behalf of the company, which have allowed BRLi to position itself on highly competitive markets in the field of hydraulic flood modelling of large rivers such as the Loire, thanks to reduced computation times and the external expertise provided by the HPC@LR centre.
The links with the HPC@LR centre and the support provided (by the experts of ASA and IBM, notably for software installation and user training) have enabled BRLi's teams to build up their skills; they now launch their computations autonomously on the HPC@LR machine.
This hundredth project is emblematic of the strengths of the Languedoc-Roussillon Region in the fields related to the environment in general and to water in particular: links with the OSU OREME, a world-class competitiveness cluster, and an IBM centre of excellence.
Contact: anne.laurent@univ-montp2.fr - Director of the HPC@LR centre
http://www.hpc-lr.univ-montp2.fr


News from GENCI and PRACE
GENCI: 2013 campaign for the allocation of resources on national facilities
The first session of the 2013 campaign for allocating resources on the national facilities is closed. In total, 537 applications were submitted and validated: 108 new projects and 429 project renewals. At the end of the allocation process, 506 projects obtained hours on GENCI's resources.
The second session of the 2013 campaign will be open from Monday 2 April to Friday 3 May 2013, for an allocation of hours starting on 1 July 2013.
PRACE-3IP
PRACE has published on its website the details of its third implementation phase, which started on 7 July 2012 for a duration of 24 months with a budget of €26.8 million.
www.prace-ri.eu/PRACE-Third-Implementation-Phase,88?lang=en
PRACE: results of Call 6
PRACE selected 57 of the 88 projects submitted in the framework of its 6th call for projects.
http://www.prace-ri.eu/IMG/pdf/2013-02-28_call_6_allocations_final.pdf
PRACE: spring school
It will take place from 23 to 26 April in Umeå (Sweden). The central theme: "New and Emerging Technologies - Programming for Accelerators".
https://www.hpc2n.umu.se/prace2013/information
PRACE: summer school
The PRACE summer school will take place from 17 to 21 June in Ostrava (Czech Republic).
http://events.prace-ri.eu/conferenceDisplay.py?confId=140
PRACE Digest
Issue 1/2013 of the PRACE Digest is available:
http://www.prace-ri.eu/IMG/pdf/prace_digest_2013.pdf


SysFera-DS: seamless access to HPC resources for users and applications. Implementation at EDF R&D
Over the last 20 years, the distributed platforms (e.g., workstations, clusters, supercomputers, grids, clouds) available to users have become increasingly complex and diverse. As users primarily seek ease of use, many of these platforms are not used to their full potential, and users often cope with sub-optimal performance.
Engineers and researchers should not need to know the characteristics of the machines they use, be they static (e.g., performance, capacity) or dynamic (e.g., availability, load). They should launch their application and let the system take care of how and where it runs to ensure optimal performance. For a system administrator, managing such a set of heterogeneous machines can seem like a nightmare, and optimizing the use of all the available resources, at the lowest possible cost, remains a highly complex task. A standard set of tools accessible from a unique interface helps significantly to ease this management.
Due to the complexity of the infrastructure and the wide heterogeneity of users, applications and available tools, EDF R&D has to cope with daily questions such as: "How do I connect to this remote machine?", "How do I run a job on this specific batch scheduler?", "How do I manage my remote files?", "What were my account name and password for this particular resource?". In an ideal world, scientists would rather spend most of their time on their main field of expertise, such as CAD modeling, fluid dynamics or structural mechanics simulations, than on dealing with the complexity of the infrastructure.
To address this issue, EDF R&D and SysFera (http://www.sysfera.com) have co-developed a solution named SysFera-DS that provides end-users with a unified and simple view of the resources. It can be accessed through a Unix command line, through several APIs (C++, Java, Python), and, at a higher level, through a web portal (the SysFera-DS WebBoard), which provides a graphical view of the functionalities. Transparent access to remote computational resources is made possible by the main modules of SysFera-DS's middleware, Vishnu.
Used together, these modules provide easy access to remote computational resources. Vishnu is a non-intrusive solution (no particular rights are needed on the infrastructure to install or use it) and a non-exclusive one (it does not replace already deployed solutions).
Using SysFera-DS, applications can easily run on remote HPC resources, benefiting from a stable and homogeneous interface whatever software is installed on the infrastructure.
In this article, we present some uses of this framework that EDF R&D engineers are now testing. In particular, we focus on the interactions between SysFera-DS and SALOME, an open-source software package (LGPL license) co-developed by EDF and the CEA, which provides a generic platform for pre- and post-processing and code coupling for numerical simulation. We underline the ease of integration of SysFera-DS into this widely used simulation platform, stressing the expected benefits for SALOME in adopting this coupling strategy.
1. EDF R&D infrastructure
a. Resources


In October 2012, the R&D Division of Électricité de France (EDF) hosted five resources dedicated to intensive computation: a 37,680-core IBM BlueGene/P scheduled by LoadLeveler, a 65,536-core IBM BlueGene/Q and a 17,208-core x86 Intel Xeon X5670 Westmere cluster managed by SLURM, a 1,984-core Intel Xeon X7560 Nehalem-EX cluster under Torque/Maui, and a 384-core Intel Xeon X7560 Nehalem-EX machine run through LSF.
This infrastructure changes on a regular basis to increase the HPC resources made available to scientists, in terms of computing power, storage or network bandwidth.
All these machines, as well as the workstations used by EDF developers, run a version of a 64-bit Linux OS, have a dedicated filesystem, and implement their own authentication. Coping with such a diversity of schedulers and finely managing file transfers can be time-consuming for the users, especially when a new infrastructure appears and they have to learn how to address it.
b. Users and applications
The need here is to develop and deploy an access middleware on all the machines of this infrastructure, in order to provide the users with an environment that remains uniform and constant over time. This kind of HPC bus grants them access to a unique and perennial interface to manage jobs, move files, obtain information and authenticate themselves to the underlying machines.
The targeted population covers experienced developers as well as end-users who may have no knowledge of HPC environments at all. While virtualizing and simplifying the use of HPC resources for the latter, the proposed middleware also provides easy access to the details of a running application. For example, a developer might want to check some intermediate files produced during the execution, download a core file or examine live the memory footprint of the application.
Applications running on the infrastructure belong to one or more of the following families:
- Parallel MPI applications, which can use from tens to thousands of cores depending on their level of scalability.
- Bag-of-tasks applications, which consist of running a given application, either parallel or sequential, while sweeping all the possible combinations of a set of parameters. Each run is independent of the others, and their total number can reach several hundreds of thousands.
- Interactive applications, for which the user may want to enter specific parameters remotely during the execution, or visualize the computed results immediately.
In addition, some of these applications can be launched, monitored and controlled via a dedicated platform running on the user's workstation. In this article, we address the example of the SALOME platform.
Some end-users also require simplified web access to their applications. The idea consists in launching, monitoring and controlling the execution of a scientific application through a user-friendly web page accessible via a simple browser running on any operating system.
c. Constraints
Deploying such a middleware on the infrastructure of an industrial company such as EDF is not easy. Any proposed solution has to pass through a long and complex process of testing and qualification before being offered as an additional service on the common infrastructure. In particular, the middleware must take the following constraints into account to ease its deployment and acceptance:
- It should not require any administrator privileges on the client workstation or on the frontend of the cluster addressed. By using only regular user accounts, testing and deployment are kept simpler.
- It should be robust and scale to manage at least 1,000 different users and at least 100 simultaneous connections.
- It should not depend on a local reconfiguration of any scheduler. Interfacing with schedulers by simply parsing the output of their command-line tools might not be a good idea.
- It should be interfaced with all the authentication systems available on the targeted infrastructure. In particular, it should optionally connect to the company's LDAP directories.
- It should provide an emergency shutdown option, available to the infrastructure administrator, to immediately stop the middleware and prevent any side effect it may create on any of the machines.
- As a distributed client-server architecture, it should allow several different versions of the servers and clients to coexist on the same machines.
2. SysFera-DS: a solution for transparent access to HPC resources
SysFera develops a solution named SysFera-DS that provides seamless access to remote computational resources. It consists of a web frontend, the SysFera-DS WebBoard, and optionally a distributed backend called Vishnu. In this section, we present the functionalities provided by SysFera-DS, with an emphasis on Vishnu and a quick introduction to the WebBoard.
a. Vishnu: a distributed and scalable middleware
Since 2010, EDF R&D and SysFera have co-developed Vishnu, a distributed and open-source middleware that eases access to computational resources. Its first target was naturally EDF's internal HPC resources, on which it has been installed since September 2010. Vishnu is built around the following principles: stability, robustness, reliability, performance and modularity. Vishnu is open-source software, distributed under the CeCILL V2 license, and freely available on GitHub (http://github.com/sysfera/vishnu).
(i) Functionalities
Functionally, Vishnu is composed of the following four modules:
UMS (User Management Services): manages user and daemon authentication and authorization for all the other modules. It provides SSO (Single Sign-On) on the whole platform using Vishnu's internal database or LDAP, along with ssh.
IMS (Information Management Services): provides platform monitoring wherever a daemon is launched (process states, CPU, RAM, batch scheduler queues, etc.). It can also start and stop remote Vishnu processes in case of failure, or on the administrator's request.
FMS (File Management Services): provides remote file management. It offers services similar to the POSIX ones (ls, cat, tail, head, mv, cp, etc.), along with synchronous and asynchronous transfers (copy/move) between remote and/or local resources.
TMS (Task Management Services): handles the submission of generic scripts to any kind of batch scheduler (currently supported: SLURM, LSF, LoadLeveler, Torque, SGE, PBS Pro), as well as submission to machines that are not handled by a batch scheduler.
The commands provided by these four modules are divided into two categories: administrator commands and user commands. For a detailed list of the available functionalities, we encourage readers to refer to the Vishnu documentation.
Figure 1 presents an overview of the deployment of SysFera-DS at EDF R&D. It is deployed on four of EDF's clusters and supercomputers.


Figure 1: SysFera-DS deployed at EDF R&D
(ii) Functional architecture and design
Figure 2 shows the different functional parts of VISHNU and their interactions. On the client side (upper part of the diagram), we show the different interface components provided to the users: APIs and command-line interfaces. These interfaces give access to the different VISHNU modules through a client component that is itself connected to the corresponding server component through the communication bus. The server components handle data items that belong to different inter-related databases, each managing a specific kind of data. The TMS and FMS server components also handle specific interactions with external entities: the batch schedulers for the management of jobs on clusters or supercomputers, and the SCP and RSYNC protocols of the underlying Linux system for the transfer of files between two storage servers or between a client system and a storage server.
[Figure 2 depicts the client-side interfaces (C++ Vishnu API, Java and Python APIs, command-line interface), the UMS/IMS/FMS/TMS client and server components connected through the ZMQ communication bus, the internal databases (user and infrastructure, sessions, jobs, file transfers, monitoring), and the external components (batch schedulers, SCP/RSYNC).]

Figure 2. Functional architecture

From a deployment perspective, several configurations can be used. For the sake of simplicity, we always deploy a centralized database, which needs to be accessible by all the Vishnu servers (also called SeDs, for Server Daemons). The simplest deployment requires one UMS and one FMS SeD, and one TMS SeD per cluster (and optionally an IMS SeD per machine where another SeD is deployed, for monitoring). In this configuration, the clients (the users) need to know the address (URI and port) of all the SeDs to be able to address them. When only a few SeDs are deployed, and when there are no restrictions on the ports accessible by the users, this can be a good solution.
In more complicated infrastructures, with several clusters and with limitations on the ports the users can access, the administrators can deploy another daemon called the dispatcher. Basically, the dispatcher acts as a proxy between the clients and the SeDs. Maintaining a directory of all the SeDs and services available, it receives all client requests and forwards them to the relevant SeDs. Through this dispatcher, a client only needs to know the address of the dispatcher to be able to interact with the system. This also greatly eases the addition of a new cluster, as the users do not need to modify their configuration files: only the dispatcher needs to be aware of the modification of the underlying infrastructure. As the dispatcher can be seen as a single point of failure, several of them can be deployed in the platform to share the load of routing messages and to ensure higher availability. Note that if a dispatcher were to fail, the users could still contact the SeDs directly, provided they know their addresses.
With this approach, the well-designed interaction of the main components makes the deployment very flexible and easy to adapt to any infrastructure configuration.
(iii) Implementation details
Vishnu has been developed in C++ following a set of MISRA rules, and has been audited by an external company to guarantee code reliability and maintainability. The Vishnu development also followed a Model-Driven Development process: it uses emf4cpp for model design and source-code generation, and SWIG for generating the Python and Java APIs.
All communications within Vishnu are handled by ZeroMQ, an enhanced socket library designed for high throughput. It is able to send messages across various transports (in-process, inter-process, TCP, multicast) with various low-level communication patterns, but it is the programmer's responsibility to implement higher-level patterns for their needs. Every element of Vishnu needs only one open port to communicate, which makes it easy to integrate into an existing platform. Moreover, if a dispatcher is deployed, the clients only need to know the address and port of the dispatcher to be able to interact with Vishnu: the dispatcher manages all the messages within the infrastructure, so the client does not have to know the addresses of all the Vishnu elements that are deployed. Note that, in order to prevent the dispatcher from becoming a bottleneck, several dispatchers can be deployed, thus sharing the load of handling the clients' messages.
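As an illustration of this dispatcher-as-proxy idea (and not of Vishnu's actual wire protocol), the sketch below uses ZeroMQ's Python binding, pyzmq, to run a minimal proxy that forwards client requests arriving on one well-known port to whichever backend daemons are connected; the port numbers and socket roles are assumptions chosen for the example.

```python
# Minimal dispatcher-style proxy with pyzmq: clients talk to one well-known
# endpoint, and the proxy forwards requests to the backend daemons that
# connect to it. Illustrative only; ports and addresses are made up.
import zmq

def run_dispatcher(front_port: int = 5555, back_port: int = 5556) -> None:
    ctx = zmq.Context.instance()

    frontend = ctx.socket(zmq.ROUTER)   # clients connect here (single known port)
    frontend.bind(f"tcp://*:{front_port}")

    backend = ctx.socket(zmq.DEALER)    # SeD-like backend daemons connect here
    backend.bind(f"tcp://*:{back_port}")

    # zmq.proxy blocks and shuttles messages between the two sockets,
    # so clients never need to know the backends' addresses.
    zmq.proxy(frontend, backend)

if __name__ == "__main__":
    run_dispatcher()
```

A client would then connect a single REQ socket to the frontend port and never needs to know where the backends live, which mirrors the single-address property described above.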
In order to prevent parsing issues and bugs, TMS does not rely on the batch schedulers' command-line interfaces, but instead links to their client libraries. This has the advantage that APIs and data types are well defined, so the command results do not depend on the local configuration or compilation options. The inconvenience of having these link dependencies in the TMS SeD is alleviated by a plugin manager: the TMS SeD is not statically linked to these libraries, but instead dynamically loads them at startup depending on a configuration file.
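As a rough illustration of this configuration-driven plugin idea (Vishnu itself implements it in C++ with dynamically loaded libraries), the Python sketch below selects a scheduler backend module at startup based on a configuration file; the file name, key and plugins package are hypothetical.

```python
# Hypothetical sketch of configuration-driven backend loading: the scheduler
# binding is chosen at startup from a config file rather than hard-wired at
# build time, mirroring the plugin-manager idea described above.
import importlib
import json

def load_batch_backend(config_path: str = "tms_config.json"):
    with open(config_path) as f:
        backend_name = json.load(f)["batch_backend"]  # e.g. "slurm", "lsf", "torque"
    # Only the selected plugin module (plugins.slurm, plugins.lsf, ...) is loaded.
    return importlib.import_module(f"plugins.{backend_name}")

# backend = load_batch_backend()
# backend.submit("job.sh")   # each plugin exposes the same submission interface
```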
(iv) Operating principles
When coping with heterogeneous platforms, it is most of the time impossible to rely on a system that would provide single sign-on (SSO) on the whole platform; each cluster or supercomputer can be managed independently, with its own user management system: some use LDAP, some don't. Thus, users may have several logins and passwords, or several ssh keys to manage, in order to connect to the infrastructure. Vishnu manages and hides this heterogeneity, under the strong constraint that Vishnu cannot have privileged access to the resources (no root or sudo account) but still needs to execute the commands under the correct user account. These issues have been solved in the following way:
- Users connect to Vishnu using a single login/password. These credentials are checked either against Vishnu's database or against one or several LDAP directories. If the credentials are correct, a Vishnu session is opened, and the user gains access to the other commands to interact with the platform. The session key is checked every time a command is issued.
- To allow Vishnu to execute commands under the right account, the user first declares their local accounts on the platform (the username they have on the different resources and the path to their home directory) and grants Vishnu ssh access to these local accounts using ssh keys. Vishnu can then connect to the user's local account with this ssh key. Identity is thus preserved, and no overlay for user management needs to be installed: Vishnu relies on local policies, without modifying them.
This use case is quite common: when interconnecting several computational centers, each of them can have local policies regarding user management, but ssh connections are quite often the only common means of access.
Here is a summary of the process: (1) the user opens a session, providing their credentials (these can be given via the CLI, or stored in a ~/.netrc file); (2) the credentials are checked against Vishnu's database or an LDAP directory (the checking order can be changed), and if they are valid a session is created; (3) the user can interact with the system; (4) each time a command is issued, the user's global ID (their login) and the session key are forwarded to the relevant daemon; (5) the daemon connects through ssh to the correct user's local account and executes the command; (6) when the user has finished, they can close their session (otherwise it expires after a pre-defined period).
Vishnu also provides, through TMS, means to abstract the underlying batch schedulers. In order to execute a script on a remote scheduler, the user just has to call the vishnu_submit command with their script. Apart from the various options available through the different APIs to control the submission process, the script itself can contain a set of variables understood by Vishnu and replaced by the corresponding options of the target batch scheduler. Thus, a script can either contain options specific to a given batch scheduler, or scheduler-independent Vishnu options. Here are a few examples of Vishnu options: #% vishnu_working_dir (the job's remote working directory), #% vishnu_wallclocklimit (the estimated time for the job execution), #% vishnu_nb_cpu (the number of CPUs per node), and many more to specify memory, mail notification, queue, etc.
In addition to these variables meant to interact with the batch scheduler, Vishnu also provides a set of variables to retrieve information about the environment in which the job is executed: VISHNU_BATCHJOB_ID (ID assigned to the job by the batch system), VISHNU_BATCHJOB_NODEFILE (path to the file containing the list of nodes assigned to the job), VISHNU_BATCHJOB_NUM_NODES (total number of nodes allocated to the job), VISHNU_INPUT/OUTPUT_DIR...
Finally, the user can also define their own variables, meant to provide input data to the script. The system allows strings and files to be transmitted as parameters to the script. They are provided through the APIs in the form of key=value couples, where the key is the name of the variable in the script. If the value is a string, it is provided to the script as is; if it is a file, the system first transfers the input file onto the remote cluster, and the variable contains the local path to the transferred file.
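As a minimal sketch of what such a submission could look like in practice, the Python snippet below writes a job script using the Vishnu directives and environment variables mentioned above and hands it to the vishnu_submit command-line tool; the directive value syntax, the machine identifier and the CLI argument order are assumptions and should be checked against the Vishnu documentation.

```python
# Illustrative only: builds a scheduler-independent job script with the Vishnu
# directives described in the article and submits it via the vishnu_submit CLI.
# The machine identifier and argument layout are guesses, not documented syntax.
import subprocess
from pathlib import Path

JOB_SCRIPT = """#!/bin/sh
#% vishnu_working_dir=/scratch/demo_user/run01
#% vishnu_wallclocklimit=01:00:00
#% vishnu_nb_cpu=8

echo "Batch job id: $VISHNU_BATCHJOB_ID"
echo "Nodes allocated: $VISHNU_BATCHJOB_NUM_NODES"
mpirun -machinefile "$VISHNU_BATCHJOB_NODEFILE" ./my_solver
"""

def submit(machine_id: str = "cluster_1") -> None:
    script = Path("job_vishnu.sh")
    script.write_text(JOB_SCRIPT)
    # vishnu_submit is the command named in the article; the arguments below
    # (machine id, script path) are assumptions for this sketch.
    subprocess.run(["vishnu_submit", machine_id, str(script)], check=True)

if __name__ == "__main__":
    submit()
```

The same kind of submission could also be driven through the C++, Java or Python APIs mentioned earlier.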
By default, TMS provides a backend that is not tied to any batch scheduler, thus allowing users to submit jobs to machines that are not managed by a batch scheduler. This comes in handy if you have spare desktop computers that you want to use, or if you need to execute something on the gateway of your cluster instead of submitting a job to the batch scheduler (e.g., compilation processes). In this case, the submission process is exactly the same as with a batch scheduler, and the scripts have access to the same options and variables.
b. SysFera-DS WebBoard
The WebBoard does not necessarily rely on Vishnu to operate (see Figure 3). It can be deployed directly on top of a single cluster (in this case the WebBoard interacts directly with the batch scheduler and the underlying filesystem); if the infrastructure is more complex, it can interact with Vishnu, in which case the WebBoard only sends commands to Vishnu, which handles jobs and files.

Figure 3. The WebBoard can be deployed on top of one or several clusters.
The WebBoard has been developed with the Grails framework, and also relies on the following technologies: Spring Security, Hibernate/GORM, RabbitMQ and jQuery.
Basically, the WebBoard provides the same set of functionalities as Vishnu through a graphical interface: single sign-on, file management, job management and monitoring. Apart from providing a graphical interface, and thus further abstracting the usage of the infrastructure, the WebBoard also provides higher-level functionalities, which are described in the following sections.
(i) Applications
While Vishnu only handles jobs, the WebBoard is able to provide higher-level services on top of jobs: it gives access to applications through dedicated interfaces.
Instead of dealing with scripts, users can directly interact with pre-defined applications. The administrators can register applications in the WebBoard that will be available within projects. An application is still a script that will be submitted to a batch scheduler, but this script comes with a set of parameters that the users have to provide to execute a job. The WebBoard automatically generates the corresponding submission webpage. On that page, the user only sees the parameters: they do not have to care about the way the script has been written, nor about the resources on which it will be executed; they only need to concentrate on their simulations.
Applications can be defined globally (available to all projects), or locally to a project.
(ii) Projects
A notion of project is also provided. A project is a triplet of applications, users and machines: it provides a set of applications, available on a set of machines, to a set of users. Within a project, users are assigned a role. A role defines the actions a user has access to: submitting/canceling a job, seeing the project statistics, seeing the other members' jobs, managing the project's applications/resources/users, etc.
(iii) Remote file management
Once a simulation is done, the user needs to access the output files. The WebBoard provides a graphical interface to remotely browse the clusters' file systems and manage remote files. The user can visualize remote text files, copy/move/delete them (within a single cluster or between several), create/delete directories and manage access rights. They can also download files, or upload new ones.
(iv) Visualization
As retrieving large output files is not always possible, or desirable, the WebBoard also provides remote visualization directly from an HTML5-compatible web browser. Using a simple calendar, the user can reserve a visualization session on the visualization cluster. The session is highly configurable; the user can set the following options:
- the name of the session;
- the session start and end times;
- the resolution of the remote screen;
- whether the session, if shared, is in view-only mode for the collaborators, or fully accessible, thus allowing collaborative work (in both cases a password is required to access a shared session);
- whether an application should be automatically launched at startup;
- the number of computing resources, thus allowing MPI visualization applications to be run.
The WebBoard manages the whole process, from the reservation on the remote cluster to the encapsulation of the visualization flow into an https flow. Thus the user does not need to install any third-party software to access the visualization session; only an HTML5-compatible browser is required. From the administrators' point of view, this solution does not require opening new ports in the firewalls; only the https port is needed, which is a great advantage in terms of security. Figure 4 presents the visualization web page (in this example, ParaView has been launched at startup).

Figure 4. Remote visualization: ParaView embedded in the WebBoard. The whole desktop is also accessible.
3. Integration of Vishnu in SALOME
a. Presentation
SALOME is a generic platform for pre- and post-processing for numerical simulation. It is based on an open and flexible architecture made of reusable components.
SALOME is a cross-platform solution that has been developed mainly by EDF R&D, CEA and Open Cascade since September 2000. Available in the most popular Linux distributions, it is distributed as open-source software under the terms of the GNU LGPL license.
At EDF R&D, SALOME is a reference platform, widely used by developers and engineers either as a standalone application for generating CAD models, preparing them for numerical calculations and post-processing the calculation results, or as a platform for integrating external third-party numerical codes to produce a new application for the full life-cycle management of CAD models.
b. Integration with Vishnu



Until 2011, SALOME interacted directly with remote batch schedulers and had to adapt to every particularity of their configuration. To do so, SALOME implemented LibBatch, a library abstracting the LSF, SLURM, LoadLeveler and PBS schedulers.
In the latest version, SALOME 6.5, released in November 2012, LibBatch is now also linked to a Vishnu client. The interaction with Vishnu is handled by a Unix subprocess. In this process, the connection to the bus, the transfer of the input and result files, as well as the submission and monitoring of the remote jobs, use the Unix API of the Vishnu client installed on a machine that SALOME can access.
The benefit for SALOME is immediate access to any resource addressed through Vishnu: neither additional compilation nor configuration is needed. In this interaction, possible changes to the resources are also hidden from SALOME by Vishnu.
In this context, access to HPC resources is made easier for SALOME. For the end user as well as for the developer, Vishnu also guarantees a uniform and maintained interface, as well as potential access to future higher-level services such as resource reservation or the support of bags of tasks in Vishnu.
4. Conclusion and future work
The Vishnu middleware has been designed and developed for more than two years by SysFera and INRIA to ease and supply long-term access to a set of HPC resources. Already used on EDF's HPC infrastructure in beta tests, and soon to be deployed massively there, it is also currently used by several other French computing centers. With this middleware, the user no longer needs to care about the specific aspects of each computational resource. The benefits of this solution include the unification of machine naming, single sign-on, standardized job management, and the delegation of file-transfer management to the middleware. The middleware supplies several programming interfaces (C++, Python, Java) for accessing HPC resources, as well as a Unix command-line interface. In addition, a web interface has been developed that provides higher-level functionalities such as visualization tools embedded in the browser. It has been shown that Vishnu can easily be integrated with existing software, such as EDF's SALOME platform.
Currently, the R&D Division of EDF uses Vishnu and the WebBoard. In 2013, Vishnu is expected to be made available to all of EDF's other divisions and to increase its number of users significantly.
Contacts: Samuel.kortas@edf.fr, benjamin.depardon@sysfera.fr
Reading, taking part, training
Reading
- The January 2013 issue of the Ter@Tec Newsletter:
http://www.teratec.eu/actu/newsletter/2013_01_newsletter_teratec.html
- Issue 10 of the PRACE Newsletter:
http://www.prace-ri.eu/IMG/pdf/prace_nl_final21__12_12.pdf
- The new French magazine HighPerformanceComputing, dedicated to technologies, uses and research in HPC environments. Free subscription:
http://www.hpcmagazine.fr/
Taking part in France
- The Groupe Calcul celebrates its 10th anniversary this year and organises a special day dedicated to the "Histoire du Calcul" (history of computing), on 9 April 2013 at the IHP (Paris).
http://calcul.math.cnrs.fr/spip.php?article219
- The "Rencontres Numériques" of the ANR, 17-18 April in Paris:
http://www.agence-nationale-recherche.fr/Colloques/RencontresduNumerique2013/
- 11th VI-HPS Tuning Workshop, 22-25 April 2013, Maison de la Simulation, Saclay.
http://www.vi-hps.org/training/tws/tw11.html
- Journée Equip@Meso CFD 2013, 16 May in Rouen:
https://equipameso-cfd2013.crihan.fr/
- Ter@Tec: the 2013 forum will take place on 25-26 June at the École Polytechnique (Palaiseau).
http://www.teratec.eu
- HLPP 2013: International Symposium on High-level Parallel Programming and Applications. Paris, 1-2 July.
https://sites.google.com/site/hlpp2013/
- MUSAF II Colloquium: Multiphysics, Unsteady Simulations, Control and Optimization. Toulouse, 18-20 September.
http://www.cerfacs.fr/musaf/
Training
- Aristote seminar "Bibliothèques pour le calcul scientifique : outils, enjeux et écosystème" (libraries for scientific computing: tools, challenges and ecosystem). 15 May at the École Polytechnique.
http://www.association-aristote.fr/doku.php/public:seminaires:seminaire-2013-05-15
- CEA/EDF/Inria 2013 Computer Science Summer School: "Programming Heterogeneous Parallel Architectures". 24 June to 5 July in Cadarache, 13115 Saint-Paul-Lez-Durance.
http://www-hpc.cea.fr/SummerSchools2013-CS.htm

NEWS IN BRIEF
• How to Benefit from AMD, Intel and NVIDIA Accelerator Technologies in Scilab
CAPS has published an article explaining how to use the accelerator technologies from AMD, Intel and NVIDIA in the Scilab library in a flexible and portable way, thanks to the OpenHMPP technology developed by CAPS. The article can be downloaded from:
http://www.caps-entreprise.com/wp-content/uploads/2013/03/How-to-Benefit-from-AMD-Intel-and-Nvidia-Accelerator-Technologies-in-Scilab.pdf
• Green Computing Report
The Tabor Communications group, which publishes in particular HPCwire and "HPC in the Cloud", has announced a new portal focused on the energy and ecological efficiency of computing centres.
http://www.greencomputingreport.com
• Towards a BigData Top 100 list?
The SDSC (San Diego Supercomputer Center, University of California) is considering the creation, with the help of the scientific community, of a list of the most powerful systems in the field of large-scale data processing. A specific benchmark would be put in place. Information on this initiative is available at:
http://www.bigdatatop100.org/
An article published in the journal Big Data:
http://online.liebertpub.com/doi/pdfplus/10.1089/big.2013.1509
• Berkeley Lab prepares for Exascale
NERSC (the National Energy Research Scientific Computing Center) at Berkeley has started the installation of the "Edison" system (or NERSC-7), a Cray XC30 with a peak performance of more than 2 PFlops. The centre is already preparing the definition of the next generation, NERSC-8, which should be installed at the end of 2015, the last step before an exaflop-class system.
• The US DoE prepares for Exascale
The three main centres of the US DoE (Oak Ridge, Argonne and Lawrence Livermore) are taking a global approach to the evolution of their supercomputers. A call for tenders (or three coordinated calls) should be launched before the end of 2013 to deploy systems of more than 100 PFlops around 2016-2017. A very attractive market for Cray, IBM, and perhaps SGI. It is also the road to exascale.
• In China: 100 PFlops before Exascale
According to HPCwire, China is preparing the construction of a 100 PFlops computer that should be operational before the end of 2014. It would be based on Intel components: 100,000 Xeon Ivy Bridge-EP CPUs combined with 100,000 Xeon Phi coprocessors.
• India: inauguration of PARAM Yuva II
India is back in HPC with the inauguration of a hybrid system with a peak performance of more than 500 TFlops, called PARAM Yuva II, installed at the University of Pune.
• UK: $45M for the Hartree Centre
The Hartree Centre, at the Science and Technology Facilities Council (STFC) in Daresbury, has been inaugurated and has received $45M, of which €28M should be devoted to R&D on software for major scientific challenges and on software enabling companies to make better use of HPC. The centre has also become Unilever's partner in the field of HPC.
http://www.stfc.ac.uk/hartree/default.aspx
• An ENISA report on the Cloud
The European cyber-security agency ENISA has published a new report that examines Cloud Computing from the point of view of Critical Information Infrastructure Protection (CIIP), and underlines the growing importance of Cloud Computing given its users and data and its increasing use in critical sectors such as finance, healthcare and insurance.
http://www.enisa.europa.eu/activities/Resilience-and-CIIP/cloud-computing/critical-cloud-computing
• An application uses more than one million cores
The CTR (Stanford Engineering's Center for Turbulence Research) has set a new record by using more than one million cores for a complex fluid-mechanics application. It used the IBM BG/Q "Sequoia" machine at LLNL.
• IDC HPC Award Recognition Program
IDC is launching its annual programme to identify and reward the most representative HPC projects in the world. Deadline for initial submissions: 19 April.
https://www.hpcuserforum.com/innovationaward/applicationform.html
• Atipa Technologies
Atipa Technologies (based in Lawrence, Kansas) will supply a system with a peak performance of 3.4 PFlops to the US Department of Energy's Environmental Molecular Sciences Laboratory (EMSL). It combines 23,000 Intel processors with Intel Xeon Phi (MIC) accelerators.



• Bull
- Météo-France has chosen Bull as the supplier of supercomputers dedicated to weather forecasting and climate research. The selected models, Bullx B700 DLC systems, will be installed in Toulouse from the first quarter of 2013. The total peak performance should be 5 PFlops.
- Bull and the Dresden University of Technology (Germany) have signed an agreement under which Bull will supply a supercomputer with a peak performance of more than 1 PFlops.
- On 22 March 2013, Bull launched its Centre of Excellence for Parallel Programming, located in Grenoble, which will deliver a high level of expertise to help laboratories and companies optimise their applications for the new manycore technologies. The Centre will provide a broad portfolio of services, including analysis, consulting, code parallelisation and optimisation. It will also benefit from the expertise of two companies: Allinea and CAPS.
• ClusterVision
ClusterVision has installed a 200 TFlops cluster at the University of Paderborn (Germany). With 10,000 cores, the cluster includes Intel Xeon E5-2670 processors (16 cores) and NVIDIA Tesla K20 GPUs.
• Cray
- Cray has received a $39M contract from the HLRN (North-German Supercomputing Alliance) to install two Cray XC30 systems (formerly codenamed "Cascade") at the Zuse Institute (Berlin) and at Leibniz Universität Hannover. These systems will be operated jointly and will provide a peak performance of more than 1 PFlops.
- Cray will supply two Cray XC30 supercomputers and two Cray Sonexion 1600 storage systems to the German national weather service, located in Offenbach. The contract is valued at $23M.
• Dell
TACC: a presentation given by Dr Karl Schulz, director of scientific applications at TACC, focuses on bringing the system into production and on the challenges met during its deployment. Available at:
http://www.hpcadvisorycouncil.com/events/2013/Switzerland-Workshop/Presentations/Day_1/7_TACC.pdf
• HP
HP has started the installation of a 1 PFlops system at NREL (the National Renewable Energy Laboratory) of the US Department of Energy. The HP servers use Intel Xeon processors and Xeon Phi coprocessors.
• IBM
EPFL (Switzerland) has acquired a BG/Q with a performance of 172 TFlops. It is among the 10 greenest systems in the world.


AGENDA
9-11 April - EASC 2013: Solving Software Challenges for Exascale (Edinburgh, United Kingdom)
15-16 April - 5th PRACE Executive Industrial Seminar (Stuttgart, Germany)
22-24 April - EE-LSDS 2013: Energy Efficiency in Large Scale Distributed Systems (Vienna, Austria)
8-10 May - CLOSER 2013: 3rd International Conference on Cloud Computing and Services Science (Aachen, Germany)
13-16 May - CCGRID 2013: The 13th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (Delft, Netherlands)
13-16 May - Extreme Grid Workshop: Extreme Green & Energy Efficiency in Large Scale Distributed Systems (Delft, Netherlands)
20 May - HiCOMB 2013: 12th IEEE International Workshop on High Performance Computational Biology (Boston, MA, USA)
20 May - CASS 2013: The 3rd Workshop on Communication Architecture for Scalable Systems (Boston, MA, USA)
20 May - HPDIC 2013: 2013 International Workshop on High Performance Data Intensive Computing (Boston, MA, USA)
20 May - HCW 2013: Twenty-second International Heterogeneity in Computing Workshop (Boston, MA, USA)
20 May - EduPar 2013: Third NSF/TCPP Workshop on Parallel and Distributed Computing Education (Boston, MA, USA)
20-24 May - IPDPS 2013: 27th IEEE International Parallel & Distributed Processing Symposium (Boston, MA, USA)
24 May - VIPES 2013: 1st Workshop on Virtual Prototyping of Parallel and Embedded Systems (Boston, MA, USA)
24 May - PCO 2013: Third Workshop on Parallel Computing and Optimization (Boston, MA, USA)
27-30 May - ECMS 2013: 27th European Conference on Modelling and Simulation (Aalesund University College, Norway)
27 May - 1 June - Cloud Computing 2013: The Fourth International Conference on Cloud Computing, GRIDs, and Virtualization (Valencia, Spain)
27 May - 1 June - Future Computing 2013: The Fifth International Conference on Future Computational Technologies and Applications (Valencia, Spain)
27 May - 1 June - Computational Tools 2013: The Fourth International Conference on Computational Logics, Algebras, Programming, Tools, and Benchmarking (Valencia, Spain)
27 May - 1 June - Adaptive 2013: The Fifth International Conference on Adaptive and Self-Adaptive Systems and Applications (Valencia, Spain)
30-31 May - CAL 2013: 7th Conference on Software Architectures (Toulouse, France)
5-7 June - ICCS 2013: International Conference on Computational Science: Computation at the Frontiers of Science (Barcelona, Spain)
5-7 June - TPDACS 2013: 13th Workshop on Tools for Program Development and Analysis in Computational Science (Barcelona, Spain)
5-7 June - ALCHEMY Workshop: Architecture, Languages, Compilation and Hardware support for Emerging ManYcore systems (Barcelona, Spain)
6-10 June - GECCO 2013: Genetic and Evolutionary Computation Conference (Amsterdam, Netherlands)
10-14 June - ICS 2013: International Conference on Supercomputing (Eugene, OR, USA)
10 June - ROSS 2013: International Workshop on Runtime and Operating Systems for Supercomputers (Eugene, OR, USA)
16-20 June - ISC 2013: International Supercomputing Conference (Leipzig, Germany)
17-18 June - VTDC 2013: The 7th International Workshop on Virtualization Technologies in Distributed Computing (New York, NY, USA)
17-21 June - HPDC 2013: The 22nd International ACM Symposium on High Performance Parallel and Distributed Computing (New York, NY, USA)
17-21 June - FTXS 2013: 3rd International Workshop on Fault-Tolerance for HPC at Extreme Scale (New York, NY, USA)
17-20 June - PROMASC 2013: The Second Track on Provisioning and Management of Service Oriented Architecture and Cloud Computing (Hammamet, Tunisia)
20-23 June - CEC 2013: Evolutionary algorithms for Cloud computing systems (Cancun, Mexico)
24-27 June - AHS 2013: 2013 NASA/ESA Conference on Adaptive Hardware and Systems (Turin, Italy)
25-26 June - Ter@Tec 2013 (Palaiseau, France)
27-29 June - IGCC 2013: The Fourth International Green Computing Conference (Arlington, VA, USA)
27-30 June - ISPDC 2013: The 12th International Symposium on Parallel and Distributed Computing (Bucharest, Romania)
27 June - 2 July - CLOUD 2013: The 6th IEEE International Conference on Cloud Computing (Santa Clara, CA, USA)
27 June - 2 July - BigData 2013: The 2013 International Congress on Big Data (Santa Clara, CA, USA)
1-2 July - HLPP 2013: International Symposium on High-level Parallel Programming and Applications (Paris, France)
1-5 July - ECSA 2013: 7th European Conference on Software Architecture (Montpellier, France)
1-5 July - HPCS 2013: The International Conference on High Performance Computing & Simulation (Helsinki, Finland)
14-20 July - ACACES 2013: Ninth International Summer School on Advanced Computer Architecture and Compilation for High-Performance and Embedded Systems (Fiuggi, Italy)
16-18 July - ISPA 2013: The 11th IEEE International Symposium on Parallel and Distributed Processing with Applications (Melbourne, Australia)
22-25 July - WorldComp 2013: The 2013 World Congress in Computer Science, Computer Engineering, and Applied Computing (Las Vegas, NV, USA)
22-25 July - PDPTA 2013: The 2013 International Conference on Parallel and Distributed Processing Techniques and Applications (Las Vegas, NV, USA)
22-25 July - DMIN 2013: The 2013 International Conference on Data Mining (Las Vegas, NV, USA)
26-30 August - EuroPar 2013: Parallel and distributed computing (Aachen, Germany)
28-29 August - Globe 2013: 6th International Conference on Data Management in Cloud, Grid and P2P Systems (Prague, Czech Republic)
10-11 September - BigData 2013: Big Data Summit Europe (Sintra, Portugal)
15-18 September - EuroMPI 2013 (Madrid, Spain)

The websites of these events are listed on the ORAP server (Agenda section).

If you wish to share information about your activities in the field of high-performance computing, please contact Jean-Loic.Delhaye@inria.fr

The issues of BI-ORAP are available in PDF format on the ORAP website.


ORAP
A collaboration structure created by
CEA, CNRS and INRIA

Secretariat: Chantal Le Tonquèze
INRIA, campus de Beaulieu, 35042 Rennes
Tel: 02 99 84 75 33, fax: 02 99 84 74 99
chantal.le_tonqueze@inria.fr
http://www.irisa.fr/orap