High Performance Computing for non-programmers

G. Ivashkevych, Kharkiv Institute of Physics and Technology

Summary: Rapid advances in computing and the ever-growing scale of scientific problems require solid knowledge of software development from both scientists and engineers. Such knowledge is often outside the scope of the university curriculum. The proposed course introduces the basics of HPC software development and the necessary tools, as well as some more advanced techniques, such as GPU computing.

Motivation. In recent years computing resources have become broadly available, which brings substantial benefits to many branches of science and engineering. Tasks that were once either intractable or extremely costly are now solvable. With the fast-moving commoditization of computing, storage and networks, and the emergence of modern computing architectures such as CUDA, building a desktop supercomputer has become a reality. Using such architectures in a cloud environment makes it possible to build extremely powerful computing systems in a very short time. The wide range of open source software packages and the maturity of the Linux operating system amplify this trend even more. There is every reason to believe that this trend towards low-cost and highly available computing resources will continue and accelerate.

Developing efficient, flexible and maintainable scientific software is not a trivial task. It requires a broad range of skills and involves a variety of tools. Effective and efficient use of new hardware and software requires not only knowledge of programming per se, but also an understanding of many related topics, especially techniques for software optimization and parallelization. Indeed, a modern scientist or engineer needs to preprocess data, develop and compile code, possibly link it with third-party libraries, perform a calculation and, finally, postprocess and analyze the results. For this chain to be high-performance, the scientist must not only have technical skills, but also understand how to divide a computational problem into smaller chunks and process them efficiently using the most appropriate computing technologies and architectures.

At the same time, since many scientists work in explicit or implicit teams (e.g., by using or contributing to open source projects), teamwork skills are a must for efficient collaborative development of reliable high-performance software. These include revision control, documentation and testing, leaving aside more general organizational skills. Unfortunately, engineers and scientists often do not acquire the necessary knowledge and skills during their training.

The proposed course aims to close some of the gaps in the HPC skills of scientists, first of all physicists. It covers the basics of programming and more complex topics in software development (testing and debugging, build tools, revision control systems, design patterns), the design of parallel applications and the building of computing systems.

Logistics. The course is designed to run over two terms and contains 31 main and 5 optional topics. Each topic generally takes two lectures, given within one week. Students in their third year and above can attend the course, given the curriculum in computer science disciplines for physicists at the Department of Physics and Technology of Kharkiv V.N. Karazin University.

The material is largely based on real-life examples. Although some of them are not directly related to high-performance scientific computing, they illustrate general principles and may be interesting to students. The course does not include labs. Instead, students are provided with a virtual machine image containing Linux and the necessary open source software. A range of cloud technologies is also involved, such as Amazon Web Services and Google App Engine.

Homework assignments are optional. They require a moderate amount of time to complete, and it is highly recommended to at least attempt them. Assignments are based on, and extend, the corresponding real-life examples provided during the lectures. Three group projects are proposed towards the end of the course (topics 28-30). These projects are not expected to result in world-class software packages, but rather to give practical experience with previously covered topics.

The author is aware that it is impossible to train an HPC professional in just two terms. Thus, the goals of the course are more specific. First, we aim to establish a foundation for everyday software development, including working with Linux and the Python programming language. Second, we aim to demonstrate at least the fundamental techniques for software optimization and parallelization and to glance at more specific technologies such as CUDA. And, finally, we want to demonstrate that HPC software development is a learnable skill that can be mastered with proper effort.

Syllabus
1. Introduction: how computing is changing our lives. Computing resources have never been as available as they are today. Influence on science, engineering and medicine. Supercomputers, cloud computing and mobile devices. Big Data. Why it's important to understand the basic principles instead of just programming languages.
Real-life story: Formula One and computational fluid dynamics.

2. Python programming language: overview. What can Python be used for? Scripting languages and compiled languages. Syntax, data types. Modules and packages. Classes, objects and object-oriented programming. Using IPython.
Real-life example: First web application with Python and Google App Engine.

3. Python programming language: advanced techniques. How to use Python efficiently? Magic methods. Working with lists. Iterators. Decorators. Metaprogramming. Jinja2 template engine.
Real-life example: Python and modeling of epidemics.
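
A minimal sketch of the decorator and iterator ideas listed in this topic, assuming Python 3. The names `count_calls`, `square` and `squares_up_to` are illustrative only and do not come from the course material.

```python
import functools


def count_calls(func):
    """Decorator that records how many times `func` has been called."""
    @functools.wraps(func)  # keep the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        wrapper.calls += 1
        return func(*args, **kwargs)
    wrapper.calls = 0
    return wrapper


@count_calls
def square(x):
    return x * x


def squares_up_to(n):
    """Generator: yields squares lazily instead of building a list."""
    for i in range(n):
        yield square(i)


total = sum(squares_up_to(5))  # consumes the iterator: 0 + 1 + 4 + 9 + 16
```

After this runs, `total` is 30 and `square.calls` is 5, since the generator calls the decorated function once per item it yields.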

4. GNU/Linux operating system: introduction. Why Linux? Linux structure: booting, kernel and kernel modules. Processes and threads. Filesystems.
Real-life story: Is Android a Linux?

5. GNU/Linux operating system: practical overview. Installation and configuration. Repositories and software installation. Network configuration and usage. ssh.
Real-life example: Set up a server in Amazon cloud.

6. Bash: basic operations. How to work without a UI? List files and directories. Copy, move and delete files. Redirection. Substitutions. Functions, loops and conditionals. Text processing.
Real-life examples: bash tricks.

6a. bash and Python for system administration. How to simplify routine operations? Command line arguments. Changing file permissions. Working with logs. screen. virtualenv.
Real-life example: Managing our server at Amazon cloud.

7. Development tools: revision control systems and documentation. Why use revision control? git. Repositories, commits and branches. Merging. How to work in a team. GitHub and Bitbucket. Documenting with Sphinx.
Real-life example: Setting up a local git server.

8. Development tools: testing and debugging. Why test? Unit tests and integration tests. nose Python testing framework. pdb Python debugger.
Real-life story: Test Driven Development.
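
To make the unit-testing topic concrete, here is a small sketch that uses the standard-library `unittest` module rather than nose (a deliberate substitution, so that the example has no third-party dependencies). The function `mean` and the test names are hypothetical.

```python
import unittest


def mean(values):
    """Arithmetic mean; raises ValueError on empty input."""
    if not values:
        raise ValueError("mean() of empty sequence")
    return sum(values) / len(values)


class MeanTest(unittest.TestCase):
    def test_typical_values(self):
        self.assertAlmostEqual(mean([1.0, 2.0, 3.0]), 2.0)

    def test_empty_input_is_rejected(self):
        with self.assertRaises(ValueError):
            mean([])


def run_tests():
    """Run the suite programmatically and return the TestResult."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(MeanTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling `run_tests()` executes both test cases. In a real project the same file would more commonly be run from the command line with `python -m unittest`.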

9. Main data structures and algorithms. Why is it important to choose the right data structures? Arrays, linked lists, stacks and queues. How to store sparse matrices. Search algorithms.
Real-life example: Data structures in Python standard library.
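
One common answer to "how to store sparse matrices" is the compressed sparse row (CSR) layout. The sketch below is a plain-Python illustration under that assumption; the syllabus does not say which formats the course covers, and the function names are hypothetical.

```python
def dense_to_csr(dense):
    """Convert a dense row-major matrix (list of lists) to CSR arrays."""
    values, col_index, row_ptr = [], [], [0]
    for row in dense:
        for j, x in enumerate(row):
            if x != 0:
                values.append(x)      # non-zero entries, row by row
                col_index.append(j)   # column of each stored entry
        row_ptr.append(len(values))   # offset where the next row starts
    return values, col_index, row_ptr


def csr_matvec(values, col_index, row_ptr, vec):
    """Multiply a CSR matrix by a dense vector, touching only non-zeros."""
    result = []
    for i in range(len(row_ptr) - 1):
        start, end = row_ptr[i], row_ptr[i + 1]
        result.append(sum(values[k] * vec[col_index[k]] for k in range(start, end)))
    return result


dense = [
    [4, 0, 0],
    [0, 0, 2],
    [1, 0, 3],
]
values, col_index, row_ptr = dense_to_csr(dense)
product = csr_matvec(values, col_index, row_ptr, [1, 2, 3])
```

For this matrix, `values` is `[4, 2, 1, 3]`, `col_index` is `[0, 2, 0, 2]`, `row_ptr` is `[0, 1, 2, 4]`, and `product` is `[4, 6, 10]`. Storage and multiplication cost scale with the number of non-zero entries rather than with the full matrix size.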

9a. Java review. How does the JVM work? Cross-platform development. Java syntax. Stack and heap. Classes, objects and methods. Data structures in Java class library.
Real-life example: Inside Java Virtual Machine.

10. Design patterns: overview. Why is software design so important? What are design patterns and what are their benefits? At which stage should we start to think about the architecture of an application?
Real-life story: The Deadline by Tom DeMarco.

11. Design patterns: creational patterns. Why is it not always efficient to use constructors explicitly? Factory pattern. Prototype. Singleton.
Real-life example: Creational patterns in Android API.
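
As a rough illustration of the Factory pattern in a scientific setting, the sketch below creates a numerical integrator by name, so calling code never refers to a concrete class. The classes `Euler` and `Midpoint` and the function `make_integrator` are hypothetical and are not taken from the course or from the Android API.

```python
class Euler:
    """Explicit Euler step for dx/dt = f(x)."""
    def step(self, f, x, dt):
        return x + dt * f(x)


class Midpoint:
    """Explicit midpoint (second-order Runge-Kutta) step."""
    def step(self, f, x, dt):
        half = x + 0.5 * dt * f(x)
        return x + dt * f(half)


# Registry mapping user-facing names to concrete classes.
_INTEGRATORS = {"euler": Euler, "midpoint": Midpoint}


def make_integrator(name):
    """Factory: callers ask for an integrator by name, not by class."""
    try:
        return _INTEGRATORS[name]()
    except KeyError:
        raise ValueError(f"unknown integrator: {name!r}") from None
```

A call such as `make_integrator("midpoint").step(lambda x: -x, 1.0, 0.1)` advances the decay equation by one step. Adding a new scheme only requires a new class and one registry entry.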

12. Design patterns: structural patterns. How to deal with complex architecture? And what about legacy code? Proxy, Adapter and Decorator. Bridge pattern.
Real-life example: Structural patterns in Android API.

13. Design patterns: behavioral patterns. How to create flexible applications? Command, Template and Strategy patterns.
Real-life example: Behavioral patterns in Android API.

14. Computer architecture: review. What's inside your computer? Von Neumann architecture. CPU and RAM. Cores and caches. Representation of data and instructions. GPU.
Real-life story: Power wall and development of multi-core processors.

15. C/C++ review. When is C needed? C and C++ syntax. Arrays, pointers and pointer arithmetic.
Real-life story: How C was born and why it's important.

16. Development tools: compilers. How to build an executable? Preprocessing, compilation, linking. GCC. Main options. Headers and libraries look-up. Static and dynamic linking. extern. Basic GNU make.
Real-life example: Using GNU Scientific Library.

16a. Development tools: build automation. How to build a large project? GNU make. Targets and rules. Embedded scripting.
Real-life example: Building Linux kernel.

17. C/C++ profiling and optimization. Profiling and optimization. valgrind and perf. C++ vs. Java vs. Python: choosing the right tool for the job. Memory and cache optimizations. Knowing your compiler.
Real-life example: Step-by-step real-world code profiling tutorial.

18. Patterns for parallel applications. How to create parallel applications? Parallel vs. sequential programs. Different kinds of parallelism and scalability. Flynn's taxonomy. Shared memory and message passing. Ensuring data integrity. Deadlocks and race conditions. Blocking and synchronization.
Real-life example: MapReduce concept and Google App Engine.
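
The race-condition and mutual-exclusion ideas in this topic can be sketched with Python's standard `threading` module. The class `Counter` and the function `run_workers` are illustrative names; the lock makes the read-modify-write in `increment` atomic, so no updates are lost.

```python
import threading


class Counter:
    """Shared counter whose increment is protected by a lock."""
    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Without the lock, two threads could read the same old value
        # and one of the two updates would be lost (a race condition).
        with self._lock:
            self.value += 1


def run_workers(n_threads=4, n_increments=10_000):
    """Increment one shared counter from several threads."""
    counter = Counter()

    def work():
        for _ in range(n_increments):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait for every worker before reading the result
    return counter.value
```

With the lock in place, `run_workers(4, 10_000)` returns 40000 on every run. Removing the `with self._lock:` line makes the result depend on thread scheduling.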

19. Threads in C++. Using threads in C++11: starting and joining threads. What to synchronize? Mutual exclusion and locks. Support by compilers.
Real-life example: Why GUI libraries are not thread safe and how to deal with it.

20. Introduction to OpenMP. OpenMP model. Main pragmas. Thread synchronization. Different scheduling approaches and load balancing. How to build an OpenMP program.
Real-life example: Parallel matrix multiplication and memory wall.

21. Introduction to MPI. MPI concept. Differences from shared memory model. Main operations. MPI and Python.
Real-life example: Building computing cluster in Amazon cloud.

22. CUDA: concept and architecture overview. Why are GPUs so well suited to parallel computing? History of graphics processing units. Data parallelism. Hybrid computing. GPU architecture. Tools.
Real-life story: Avatar movie and CUDA.

23. CUDA: programming model. How is the CUDA architecture organized? Threads, thread blocks and grid. Kernels. Memory types and data transfer between CPU and GPU. Thread synchronization and kernel synchronization. Data types.
Real-life example: Converting picture to b/w on GPU.

24. CUDA: practical approach. How to choose the right number of threads and blocks? Overlapping computations and data transfers: streams. cuBLAS and cuFFT libraries.
Real-life example: Matrix multiplication on GPU: naive approach.

25. CUDA: profiling and optimization. How to profile GPU code? Events. NVIDIA Visual Profiler. Efficient use of memory. Use of shared memory. How GPU code is executed. Warps. Minimizing thread divergence.
Real-life example: Matrix multiplication on GPU with shared memory.

25a. CUDA and OpenCL: commonalities and distinctions. CUDA - OpenCL dictionary. Data types in OpenCL. Choice between CUDA and OpenCL. Comparing performance. Intel OpenCL SDK.
Real-life story: OpenCL in open source projects.

26. Computing with Python. Linear algebra with NumPy. Multidimensional array creation and usage. Vectorization. Data visualization with matplotlib.
Real-life story: PyCUDA for chaotic dynamics simulations.

26a. Interfacing C/C++ with Python. How to speed up a Python program? Python C API. ctypes. Calling functions from C/C++ libraries. scipy.weave: a simple approach.
Real-life example: Global Interpreter Lock and calls to libraries.

27. GPU computing with PyCUDA. Main operations and abstractions. GPUArray class and its applications. pyFFT. Creating and running GPU code at runtime.
Real-life story: PyCUDA for chaotic dynamics simulations.

28. Example: Amazon EC2. Cloud computing. Amazon Web Services architecture. Elastic Compute Cloud - EC2. Types of instances. Python and AWS: boto.
Project: Developing a simple Dropbox alternative.

29. Example: neural networks. How the brain processes information. Machine learning and types of learning. Artificial neural networks: structure of the model. ANN training: back-propagation algorithm.
Project: Machine learning with neural networks on GPU.

30. Example: molecular dynamics and N-body simulations. Algorithms for particle simulation. Force calculation. Known implementations.
Project: Stellar dynamics on GPU with PyCUDA.
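
The force-calculation step at the heart of an N-body code can be sketched as a direct O(N^2) sum over all pairs. This is a plain-Python CPU illustration, not the PyCUDA implementation the project targets. The function name, the unit choice `G=1.0` and the softening length are assumptions made for the example.

```python
import math


def accelerations(positions, masses, G=1.0, softening=1e-3):
    """Direct-sum gravitational accelerations, O(N^2) in the particle count.

    positions: list of (x, y, z) tuples; masses: list of floats.
    G and softening are illustrative defaults, not values from the course.
    """
    n = len(positions)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # a particle exerts no force on itself
            dx = [positions[j][k] - positions[i][k] for k in range(3)]
            # Softening keeps the force finite when two particles get close.
            r2 = sum(d * d for d in dx) + softening ** 2
            inv_r3 = 1.0 / (r2 * math.sqrt(r2))
            for k in range(3):
                acc[i][k] += G * masses[j] * dx[k] * inv_r3
    return acc
```

For two bodies of mass 1 and 2 placed one unit apart along the x axis, with `softening=0.0`, the function returns accelerations of 2 and -1 along x, so total momentum is conserved. Each particle's acceleration depends only on the positions, which is what makes this loop a natural candidate for one GPU thread per particle.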

31. Topics we didn't cover. Structure of networks. Software Defined Networking. Data storage. HDF5 and netCDF. Parallel filesystems. And what about Fortran? Ten tools you should look at.
