
INGENIOUS:

Using next generation computers and algorithms for modeling the dynamics of large biomolecular systems

Makoto Taiji
Computational Biology Research Core, RIKEN Quantitative Biology Center
Processor Research Team, RIKEN Advanced Institute for Computational Science
taiji@riken.jp

Our future targets

Cilia of mouse embryo
N. Hirokawa, Y. Okada, Y. Tanaka, "Fluid dynamic mechanism responsible for breaking the left-right symmetry of the human body: the nodal flow", Annual Review of Fluid Mechanics 41, 53-72 (2009). https://www.youtube.com/watch?v=3y_P67KwuvU

Bacterial Flagellum https://www.youtube.com/watch?v=vxiwhfgzL0Q

Challenges in Molecular Dynamics simulations of biomolecules


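[Figure: accessible system sizes and time scales. Labels: Target Region; Strong Scaling (Anton/MDG4); Weak Scaling (K computer)]

Reaching the target region directly: 30,000 years at ExaFLOPS, 10^21 J of energy (~3x10^18 J is spent in Japan each year).
Multiscale approach is essential.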

S. O. Nielsen, et al, J. Phys. (Condens. Matter.), 15 (2004) R481

Organization
- Molecular Fluctuations (Aston Group)
- Molecular Fluctuations Fluid Dynamics (Moscow Group)
- Multiscale Fluid Dynamics (Univ. London / Cambridge Group)
- HPC (RIKEN)

Mercedes-Benz water as a bridge to the macroscale
Arturs Scukins and Dmitry Nerukh
Implementation in molecular dynamics

Why MB water?
- A relatively simple 2D model that has all the features and peculiarities of real water
- Being 2D, it scales much better with size: allows reaching the spatial sizes of hydrodynamics
- Well developed theoretically (started by Ben-Naim in the early seventies)
- Computationally well studied by Monte Carlo, but no investigations by Molecular Dynamics
- Well suited for our purpose of developing a hybrid Molecular Dynamics / Fluid Dynamics approach

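Mercedes-Benz potential: $U(\mathbf{X}_i, \mathbf{X}_j) = U_{LJ}(r_{ij}) + U_{HB}(\mathbf{X}_i, \mathbf{X}_j)$, where $U_{LJ}(r_{ij}) = 4\varepsilon_{LJ}\left[(\sigma_{LJ}/r_{ij})^{12} - (\sigma_{LJ}/r_{ij})^{6}\right]$ is the Lennard-Jones potential, $U_{HB}(\mathbf{X}_i, \mathbf{X}_j) = \varepsilon_{HB}\, G(r_{ij} - r_{HB}) \sum_{k,l=1}^{3} G(\mathbf{i}_k \cdot \mathbf{u}_{ij} - 1)\, G(\mathbf{j}_l \cdot \mathbf{u}_{ij} + 1)$ is the orientation-dependent potential, and $G(x) = \exp(-x^2 / 2\sigma^2)$ is a Gaussian function.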
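As an illustration of how such a pair potential can be evaluated, here is a minimal Python sketch of the standard Mercedes-Benz form: a Lennard-Jones term plus a hydrogen-bond term built from Gaussians in distance and arm orientation. The parameter values are the reduced-unit values commonly quoted for the MB model and are an assumption here, since the slide does not list them. The function names are illustrative only.

```python
import numpy as np

# Reduced-unit MB parameters commonly quoted in the literature.
# ASSUMPTION: these values are not given on the slide.
EPS_HB = -1.0     # hydrogen-bond energy scale
R_HB = 1.0        # hydrogen-bond length
EPS_LJ = 0.1      # Lennard-Jones well depth
SIGMA_LJ = 0.7    # Lennard-Jones contact distance
SIGMA_G = 0.085   # width of the Gaussian functions


def gauss(x):
    """Unnormalised Gaussian G(x) = exp(-x^2 / (2 sigma^2))."""
    return np.exp(-np.asarray(x) ** 2 / (2.0 * SIGMA_G ** 2))


def arms(theta):
    """Unit vectors of the three arms of a molecule, 120 degrees apart."""
    angles = theta + np.arange(3) * 2.0 * np.pi / 3.0
    return np.stack([np.cos(angles), np.sin(angles)], axis=1)


def mb_pair_energy(pos_i, theta_i, pos_j, theta_j):
    """Pair energy of two 2D MB molecules (position, orientation angle)."""
    rij = np.asarray(pos_j, float) - np.asarray(pos_i, float)
    r = np.linalg.norm(rij)
    u = rij / r                                   # unit vector from i to j
    lj = 4.0 * EPS_LJ * ((SIGMA_LJ / r) ** 12 - (SIGMA_LJ / r) ** 6)
    # The double sum over arms factorises into a product of two sums.
    g_i = gauss(arms(theta_i) @ u - 1.0).sum()    # arms of i pointing at j
    g_j = gauss(arms(theta_j) @ u + 1.0).sum()    # arms of j pointing at i
    hb = EPS_HB * gauss(r - R_HB) * g_i * g_j
    return lj + hb
```

Two molecules at the hydrogen-bond distance with arms pointing at each other, for example `mb_pair_energy([0, 0], 0.0, [1, 0], np.pi)`, give the full hydrogen-bond energy plus the Lennard-Jones contribution. Rotating one molecule by 60 degrees switches the hydrogen bond off.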

We have derived the formulas for calculating thermodynamics from MD trajectories: temperature, pressure, heat capacity, thermal expansion coefficient, and compressibility.

In these formulas $\langle \cdot \rangle$ is a time average, V is an area, N is the number of molecules, T is the temperature, K is the kinetic energy, and $\rho$ is the density.

Results

Structure
The RDF qualitatively differs from the Lennard-Jones RDF but coincides with the results obtained using Monte Carlo
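For reference, a radial distribution function of a 2D configuration can be estimated from particle positions as sketched below. This is a generic minimal estimator, assuming a square periodic box and the minimum-image convention; it is not the authors' analysis code, and the function name `rdf_2d` is hypothetical.

```python
import numpy as np


def rdf_2d(positions, box, n_bins=50):
    """Estimate g(r) for N points in a square periodic box of side `box`.

    ASSUMPTIONS: square box, periodic boundaries, r limited to box / 2.
    Returns bin centres and g(r).
    """
    positions = np.asarray(positions, float)
    n = len(positions)
    # All pair separations with the minimum-image convention.
    d = positions[:, None, :] - positions[None, :, :]
    d -= box * np.round(d / box)
    r = np.sqrt((d ** 2).sum(axis=-1))[np.triu_indices(n, k=1)]
    hist, edges = np.histogram(r, bins=n_bins, range=(0.0, box / 2.0))
    # Expected pair count per annulus for an ideal (uncorrelated) gas.
    shell_area = np.pi * (edges[1:] ** 2 - edges[:-1] ** 2)
    ideal = 0.5 * n * (n - 1) * shell_area / box ** 2
    centres = 0.5 * (edges[1:] + edges[:-1])
    return centres, hist / ideal
```

For uncorrelated random points the estimate fluctuates around 1. A structured liquid such as MB water would instead show peaks at the preferred neighbour distances.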

Conclusions
- The 2D Mercedes-Benz model mimics real water behaviour.
- It captures the minimum of pressure (volume), the negative expansion coefficient, the minimum of compressibility, and the high heat capacity.
- The RDF qualitatively differs from the Lennard-Jones RDF.

Towards accurate modeling across different scales: high-resolution methods for Fluctuating Hydrodynamics equations

V.Y.Glotov, V.M.Goloviznin, A.V.Danilin


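One dimensional case

$$\frac{\partial \rho}{\partial t} + \frac{\partial (\rho u)}{\partial x} = 0$$

$$\frac{\partial (\rho u)}{\partial t} + \frac{\partial (\rho u^2 + P)}{\partial x} - \frac{4}{3}\mu \frac{\partial^2 u}{\partial x^2} - \frac{\partial s}{\partial x} = 0$$

$$\frac{\partial (\rho E)}{\partial t} + \frac{\partial \left[(\rho E + P) u\right]}{\partial x} - \frac{\partial}{\partial x}\left(\frac{4}{3}\mu u \frac{\partial u}{\partial x} + \kappa \frac{\partial T}{\partial x}\right) - \frac{\partial (q + u s)}{\partial x} = 0$$

$$E = c_v T + \frac{u^2}{2};$$

$$\langle s(x,t)\, s(x',t') \rangle = \frac{8 k \mu T}{3}\, \delta(x - x')\, \delta(t - t'); \qquad \langle q(x,t)\, q(x',t') \rangle = 2 k \kappa T^2\, \delta(x - x')\, \delta(t - t');$$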

Characteristic form of LL-NS equations


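$$\left(\frac{\partial u}{\partial t} + \frac{1}{\sqrt{\rho^2 c^2 - (\gamma - 1)\rho s}} \frac{\partial P}{\partial t}\right) + \left(u + \sqrt{c^2 - \frac{(\gamma - 1) s}{\rho}}\right)\left(\frac{\partial u}{\partial x} + \frac{1}{\sqrt{\rho^2 c^2 - (\gamma - 1)\rho s}} \frac{\partial P}{\partial x}\right) = G_1;$$

$$\left(\frac{\partial u}{\partial t} - \frac{1}{\sqrt{\rho^2 c^2 - (\gamma - 1)\rho s}} \frac{\partial P}{\partial t}\right) + \left(u - \sqrt{c^2 - \frac{(\gamma - 1) s}{\rho}}\right)\left(\frac{\partial u}{\partial x} - \frac{1}{\sqrt{\rho^2 c^2 - (\gamma - 1)\rho s}} \frac{\partial P}{\partial x}\right) = G_2;$$

$$\left(\frac{\partial}{\partial t} \ln\frac{P}{\rho^{\gamma}} - \frac{s}{c_v T} \frac{\partial}{\partial t}\frac{1}{\rho}\right) + u \left(\frac{\partial}{\partial x} \ln\frac{P}{\rho^{\gamma}} - \frac{s}{c_v T} \frac{\partial}{\partial x}\frac{1}{\rho}\right) = G_3;$$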

Condition for hyperbolicity

$$s < \frac{\rho c^2}{\gamma - 1}$$

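Stochastic fluxes
Stochastic fluxes approximation

$$\langle s(x,t)\, s(x',t') \rangle = \frac{8 k \mu T}{3}\, \delta(x - x')\, \delta(t - t'); \qquad \langle q(x,t)\, q(x',t') \rangle = 2 k \kappa T^2\, \delta(x - x')\, \delta(t - t');$$

$$s_h(x,t) = \sqrt{\frac{8 k \mu T}{3\, \Delta x\, \Delta t}}\; \mathrm{Gauss}(0,1); \qquad q_h(x,t) = \sqrt{\frac{2 k \kappa T^2}{\Delta x\, \Delta t}}\; \mathrm{Gauss}(0,1);$$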
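A minimal sketch of how the discrete stochastic fluxes might be sampled on a grid, assuming the delta functions are replaced by 1/(Δx Δt) so that each cell and time step receives an independent Gaussian number scaled by the square root of the correlation amplitude. The symbol names `mu` (viscosity) and `kappa` (thermal conductivity) and the function name are assumptions made for this illustration.

```python
import numpy as np

K_B = 1.380649e-23  # Boltzmann constant, J/K


def stochastic_fluxes(temperature, mu, kappa, dx, dt, rng):
    """Sample discrete stochastic stress s_h and heat flux q_h per cell.

    ASSUMPTIONS: mu is the viscosity, kappa the thermal conductivity,
    and the amplitudes are sqrt(8 k mu T / (3 dx dt)) and
    sqrt(2 k kappa T^2 / (dx dt)), each multiplied by Gauss(0, 1).
    """
    temperature = np.asarray(temperature, float)
    s_amp = np.sqrt(8.0 * K_B * mu * temperature / (3.0 * dx * dt))
    q_amp = np.sqrt(2.0 * K_B * kappa * temperature ** 2 / (dx * dt))
    s_h = s_amp * rng.standard_normal(temperature.shape)
    q_h = q_amp * rng.standard_normal(temperature.shape)
    return s_h, q_h
```

Because the amplitudes grow as the cell size and time step shrink, the stochastic forcing becomes large at fine resolution, which is the regime the slide describes as challenging.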

For high value of stochastic forcing (large s and q fluxes) the solution of the LL Navier-Stokes equations is very challenging

Our choice: Compact Accurately Boundary Adjusting high-REsolution Technique (CABARET)


Iserles 1986; Roe 1998; Samarskii and Goloviznin 1998; Goloviznin and Karabasov 1998; Karabasov, Hynes and Goloviznin 2001; Tran and Scheurer 2002; Kim 2004; Goloviznin 2005; Karabasov and Goloviznin 2007, 2009

$$\frac{\partial \varphi}{\partial t} + c \frac{\partial \varphi}{\partial x} = 0$$

- Explicit, second-order in space and time
- Non-dissipative and low-dispersive
- Very compact stencil
- Conservation form
- Staggered variables: one-cell stencil in space and time
- Nonlinear flux correction based on maximum principle
- Nonlinear flux reconstruction based on the minimum solution variation
- Highly scalable method, already used successfully in unsteady convection-dominated flow modelling
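The following Python sketch shows one time step of a CABARET-type scheme for the linear advection equation $\partial\varphi/\partial t + c\,\partial\varphi/\partial x = 0$ with $c > 0$ on a periodic grid. It uses the three-phase form usually given in the CABARET literature: conservative predictor, characteristic extrapolation with a maximum-principle limiter, and conservative corrector. The array layout and names are assumptions for illustration, not the authors' implementation.

```python
import numpy as np


def cabaret_step(cell, face, cfl):
    """One CABARET step for linear advection with c > 0, periodic grid.

    cell[i] : conservative variable in cell i (between face i and i + 1)
    face[i] : flux variable at the left face of cell i
    cfl     : c * dt / dx, with 0 < cfl <= 1
    """
    right = np.roll(face, -1)                      # face[i + 1]
    # Phase 1: conservative predictor over half a time step.
    half = cell - 0.5 * cfl * (right - face)
    # Phase 2: extrapolate along the characteristic to the outflow face,
    # then limit by the local min/max (maximum principle).
    new_right = 2.0 * half - face
    lower = np.minimum(np.minimum(face, right), cell)
    upper = np.maximum(np.maximum(face, right), cell)
    new_right = np.clip(new_right, lower, upper)
    new_face = np.roll(new_right, 1)               # new value at face[i]
    # Phase 3: conservative corrector over the second half step.
    new_cell = half - 0.5 * cfl * (new_right - new_face)
    return new_cell, new_face
```

Both conservative phases are in flux form, so the sum of the cell values is preserved on a periodic grid. With `cfl = 1` and cell values initialised as the average of the adjacent face values, the step reduces to an exact shift by one cell.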

Comparison of several computational schemes for the Bell problem

Variance in conserved quantities at equilibrium

$\langle \delta\rho^2 \rangle$, exact value: 2.35 x 10^-8

Scheme                        Value (x 10^-8)   Error
MacCormack scheme             2.01              -14.3%
Piecewise parabolic method    1.97              -16.0%
Third-order Runge-Kutta       2.34              -1.3%
CABARET                       2.31              -1.7%
Molecular Simulation          2.35              0%

$\langle \delta J^2 \rangle$, exact value: 13.34

Scheme                        Value             Error
MacCormack scheme             13.31             -0.3%
Piecewise parabolic method    13.27             -0.5%
Third-order Runge-Kutta       13.65             2.3%
CABARET                       13.18             -1.2%
Molecular Simulation          13.21             -1%

$\langle \delta E^2 \rangle$, exact value: 2.84 x 10^10

Scheme                        Value (x 10^10)   Error
MacCormack scheme             2.61              -8.4%
Piecewise parabolic method    2.58              -9.4%
Third-order Runge-Kutta       2.87              0.9%
CABARET                       2.75              -3.2%
Molecular Simulation          2.78              -2.1%

A MULTI-SPACE-TIME ALGORITHM FOR CONCURRENT LARGE/SMALL SCALE FLUID DYNAMICS SIMULATIONS


Anton Markesteijn and Sergey Karabasov

Towards micro- and nano-scales


- Temperature fluctuations important, large density and velocity fluctuations
- Acoustics: ultra-sound / Biological applications: coupling with MD

Interesting phenomena concurrently occur at small and larger scale, both in time and space
- Numerically difficult to deal efficiently with large time/space differences
- A multi-space-time algorithm is demonstrated

Multi Space-Time algorithm - Overview

Fluctuating Hydrodynamics (Landau & Lifshitz)
- Mimic microscopic behaviour at macroscopic scales
- Dissipative fluxes treated as stochastic variables
- Thermodynamics: Fluctuation-Dissipation theorem

Scale Function: a (pre)defined meshless zoom value
- The value of this function is increased where small time and space phenomena are dominant
- The scale function also determines the actual computational grid

Equation transformations both in space and time
- Transformations are dependent on the scale function
- Transformed (Computational Domain) / Untransformed (Physical Domain)

Special time marching (local and global time)
- Local time step controlled by scale function
- Cells only updated when necessary (local < global time)
- (CFL curse): increased efficiency, decreased error
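The local/global time marching idea can be sketched as follows: each cell carries its own local time and a local time step equal to the global step divided by the scale function, and a cell is advanced only while its local time lags behind the global time. This is a schematic illustration under those assumptions, not the authors' algorithm, which also involves the equation transformations described above. The names `march` and `update_cell` are hypothetical.

```python
import numpy as np


def march(scale, dt_global, n_global_steps, update_cell):
    """Advance cells with local time steps dt_global / scale.

    scale       : array of scale-function values (>= 1) per cell
    update_cell : callback update_cell(index, local_dt) doing the physics
    Returns the number of updates performed in each cell.
    """
    scale = np.asarray(scale, float)
    local_dt = dt_global / scale
    local_time = np.zeros_like(scale)
    updates = np.zeros(scale.shape, dtype=int)
    tol = 1e-12 * dt_global              # guards against round-off drift
    for step in range(1, n_global_steps + 1):
        global_time = step * dt_global
        while True:
            lagging = local_time < global_time - tol
            if not lagging.any():
                break
            # Only cells whose local time is behind the global time move.
            for i in np.flatnonzero(lagging):
                update_cell(i, local_dt[i])
            local_time[lagging] += local_dt[lagging]
            updates[lagging] += 1
    return updates
```

With scale values 1, 5 and 25, the finest cell is updated 25 times per global step while the coarsest is updated once. A uniform fine time step would instead update every cell 25 times.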

Some Examples of Scale Functions

1D: scale difference of 5
- Both mesh size and local time scaled
- Computational domain: simple Cartesian mesh

2D mesh: radial, 25 to 1 (200x200 mesh)

2D Example: Fluctuating Hydrodynamics

Scale difference of 100, on a 200x200 mesh

Probe in centre of domain
- Measure density transient
- Acoustic signal recovered from noise

Time ensemble

Variables are Maxwellian

Fluctuating Hydrodynamics vs MD: density fluctuations and the speed of sound
- Domain 250x40, scale function 1 to 25 to 1 in plateaus
- Smallest volume 0.6x0.6x0.6 nm^3 (liquid water)
- Speed of sound obtained by fit (~1510 m/s)
- Continuum results compared to MD results

Scaling challenges in MD
- 50,000 FLOP/particle/step
- Typical system size: N = 10^5
- 5 GFLOP/step
- 5 TFLOPS effective performance: 1 msec/step = 170 nsec/day. Rather easy
- 5 PFLOPS effective performance: 1 μsec/step = 200 μsec/day??? Difficult, but important
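The arithmetic behind these figures can be checked with a few lines of Python. The 2 fs integration time step is an assumption taken from the MDGRAPE-4 target ("2fsec/step") quoted later in the slides; the function name is illustrative.

```python
SECONDS_PER_DAY = 86_400.0


def md_throughput(flop_per_particle_step, n_particles, effective_flops,
                  timestep_s=2e-15):
    """Return (FLOP per step, wall seconds per step, simulated seconds per day).

    ASSUMPTION: timestep_s = 2 fs of simulated time per MD step.
    """
    flop_per_step = flop_per_particle_step * n_particles
    wall_per_step = flop_per_step / effective_flops
    simulated_per_day = SECONDS_PER_DAY / wall_per_step * timestep_s
    return flop_per_step, wall_per_step, simulated_per_day


# 5 TFLOPS effective: about 1 ms per step, about 173 ns of simulated time per day.
print(md_throughput(5e4, 1e5, 5e12))
# 5 PFLOPS effective: about 1 us per step, about 173 us of simulated time per day.
print(md_throughput(5e4, 1e5, 5e15))
```

The slide rounds the results to 170 nsec/day and 200 μsec/day. The difficulty at 5 PFLOPS is that the whole step, including communication, must fit in about one microsecond.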

Scaling of MD on K Computer
Strong scaling 50 atoms/core ~3M atoms/Pflops

Since K Computer is still under development, the result shown here is tentative.

1,674,828 atoms


GRAPE: special-purpose computer for classical particle simulations


- GRAvity PipE
- Originally proposed by Prof. Chikada, NAOJ
- Special-purpose accelerator
  - Astrophysical N-body simulations
  - Molecular Dynamics simulations

[Diagram: the host computer sends particle data to GRAPE and receives the results. GRAPE performs most of the calculation; the host computer does the rest.]

J. Makino & M. Taiji, Scientific Simulations with Special-Purpose Computers, John Wiley & Sons, 1997.

Problem in Heterogeneous System - GRAPE/GPUs
- In small systems: good acceleration, high performance/cost
- In massively-parallel systems: scaling is often limited by the host-host network and the host-accelerator interface

[Diagram: Typical Accelerator System: accelerators attached to host computers through a low-bandwidth, high-latency interface, with host computers linked by a high-latency host network. SoC-based System: general-purpose cores, accelerators and embedded memories on one System-on-Chip with high-bandwidth, low-latency connections, linked by a low-latency network.]

Anton
- D. E. Shaw Research
- Special-purpose pipeline + general-purpose CPU core + specialized network
- Anton showed the importance of optimization in the communication system

R. O. Dror et al., Proc. Supercomputing 2009, in USB memory.

MDGRAPE-4
- Special-purpose computer for MD simulation
- Test platform for special-purpose machines
- Target performance
  - 20 μsec/step for 100K atom system
  - 8.6 μsec/day (2 fsec/step)
- Target application: GROMACS
- Completion: ~2013
- Enhancement from MDGRAPE-3
  - 130 nm to 40 nm process
  - Integration of Network / CPU

MDGRAPE-4 System
- MDGRAPE-4 SoC: total 512 chips (8x8x8)
- Node (2U box): total 64 nodes (4x4x4) = 4 pedestals
- 12 lane 6 Gbps electric links = 7.2 GB/s (after 8B10B encoding)
- 12 lane 6 Gbps optical links, 48 optical fibers
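The link bandwidth and the chip count per node follow from simple arithmetic, sketched here in Python. 8B10B encoding carries 8 payload bits in every 10 transmitted bits; the function name is illustrative.

```python
def serdes_payload_gbytes_per_s(lanes, gbps_per_lane,
                                payload_bits=8, coded_bits=10):
    """Payload bandwidth in GB/s of a multi-lane link with 8B10B encoding."""
    raw_gbps = lanes * gbps_per_lane                  # raw line rate, Gbit/s
    payload_gbps = raw_gbps * payload_bits / coded_bits
    return payload_gbps / 8.0                         # bits to bytes


link_bandwidth = serdes_payload_gbytes_per_s(12, 6.0)   # about 7.2 GB/s
chips = 8 * 8 * 8                                       # 512 chips
nodes = 4 * 4 * 4                                       # 64 nodes
chips_per_node = chips // nodes                         # 8 chips per node
```

The raw rate of 12 x 6 = 72 Gbit/s becomes 57.6 Gbit/s of payload, which is the 7.2 GB/s quoted on the slide.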

MDGRAPE-4 System-on-Chip
- 40 nm (Hitachi HDL4S), ~230 mm^2
- 64 force calculation pipelines @ 0.8 GHz, ~2.5 TFLOPS equivalent
- 64 general-purpose processors, Tensilica Xtensa LX4 @ 0.6 GHz
- 72 lane SERDES @ 6 GHz
- 65 W

SoC Block Diagram

[Diagram: Control GP with its instruction memory (CGP); pipeline blocks with 8 pipelines each; GP blocks of cores, each core with IMem and DMem, served by instruction memories (1) and (2); message queue; two bus arbiters / DMAC; global memory; network unit (6 Gbps x 12 x 6); FPGA IF (100 MHz x 128).]

Embedded Global Memories in SoC

- ~1.8 MB
- 4 blocks (GM4 block, 460 KB each)
- For each block
  - 128 bit x 2 for general-purpose core
  - 192 bit x 2 for pipeline
  - 64 bit x 6 for network
  - 256 bit x 2 for inter-block

[Diagram: four GM4 blocks of 460 KB connected to 2 pipeline blocks, the network, and 2 GP blocks.]

General-Purpose Core
- Tensilica LX @ 0.6 GHz
- 32-bit integer / 32-bit floating point
- 4 KB I-cache / 4 KB D-cache
- 8 KB local memory
  - DMA or PIF access
- 8 KB local instruction memory
  - DMA read from 512 KB instruction memory

[Diagram: core with 4 KB I-cache, 4 KB D-cache, 8 KB I-ram, 8 KB D-ram, integer and floating-point units, and queue. GP block: eight cores with instruction memory, instruction DMAC, barrier, DMAC, PIF and queue IF, connected to the global memory and the control processor.]

Software evaluation platform for MDGRAPE-4: RTL model
- RTL-based simulator on Cadence NCSim
- Cycle accurate
- Slow (>10 ms/cycle)

Software evaluation platform for MDGRAPE-4:
- Under construction (4Q 2012)
- Tensilica XTMP based multicore processor simulator (non-free)
- Includes behavior models of
  - Network
  - Special-purpose pipeline
  - Memories (latency can be considered)
- Two levels
  - Precise memory models for instruction
  - Infinite memory for instruction

Software evaluation platform for MDGRAPE-4 (2)
- Programming language
  - C
  - C++ (without malloc)
- Direct control of network units
- No operating system, with simple monitor

Evaluation platform based on MDGRAPE-4 simulator
- Extend MDGRAPE-4 simulator
- Change balance
  - More resources for general-purpose cores
- General-purpose cores
  - Shared on-chip memory for 8-16 cores
  - Off-chip memory
  - Synchronization mechanism
- Special-purpose pipelines
- Network interface

[Diagram: several chips, each containing a special-purpose block, general-purpose cores, local/cache memories and an on-chip network, with off-chip memory attached and the chips connected by an off-chip network.]

Toward Exascale
For Molecular Dynamics
- Single-chip system
  - >1/30 of the MDGRAPE-4 system can be embedded with 11 nm process
  - Local MD + Multiscale
  - A network is still necessary inside the SoC
- For further strong scaling for MD
  - # of operations / step / 20K atoms ~ 10^9
  - # of arithmetic units in system ~ 10^6 / PFLOPS
  - At ExaFLOPS (10^3 PFLOPS) this gives ~10^9 arithmetic units, about one operation per unit per step
  - Exascale means Flash (one-path) calculation
  - More specialization is required

Meetings & Visits

- Past
  - Nov 2011 @ Cambridge, UK
  - PI (Dr. Nerukh)'s visit to RIKEN for a month in Dec 2011
  - Sep 2012 @ Kobe
  - UK researcher (Mr. Scukins) at RIKEN (Sep-Nov 2012)
- Future related events
  - Dec 2012: UK-Japan bilateral workshop at the British Embassy in Tokyo (supported by the British Embassy Japan)
  - Jul 2013: Royal Society Kavli Seminar in UK, "Multiscale systems: linking quantum chemistry, molecular dynamics, and microfluidic hydrodynamics"
  - Project workshops in UK and/or Japan (2013)
